,---------------.
| Contributions |
`---------------´

,------------------------------------------
| 2025-09-16 11:19:01 GeraldWodni wrote:
| proposal - Non parsing CREATE
| see: https://forth-standard.org/proposals/non-parsing-create#contribution-413
`------------------------------------------

## Author:
Gerald Wodni

## Change Log:

## Problem:
There is no standard way to set the name of a word defined by `create` via a string. However, each Forth system has such a word as a factor of `create`. Given that this functionality is exposed and documented, it is requested by some users. Personally I need it to create multiple words with a common prefix.

## Solution:
Expose the underlying word(s) using a standard name.
- In VFXForth it is called `($create)`
- In Gforth its equivalent is `nextname create`

> This gives the rationale for specific decisions you have taken in the proposal (often in response to comments), or discusses specific issues that have not been decided yet.

## Typical use:
Create a store and fetch word pair for a register
```
32 buffer: next-name
: register: ( addr "prefix" -- )
    \ *G create 2 words, with a common prefix
    dup                                 \ save address for both definitions
    parse-name >r next-name r@ cmove    \ save prefix
    '!' r@ next-name + c!               \ add '!' suffix
    \ create store-word
    next-name r@ 1+ [: ($create) , does> @ ! ;] execute
    '@' r@ next-name + c!               \ replace with '@' suffix
    \ create fetch-word
    next-name r> 1+ [: ($create) , does> @ @ ;] execute ;

variable uart-addr  \ pretend to be a hardware register
uart-addr register: uart

'h' uart! uart@ emit
```

## Proposal:
Add a new word `non-parsing-create` with references to [create](https://forth-standard.org/standard/core/CREATE). The behavior is like `create` except that the name is taken as a string `( caddr len -- )` from the stack instead of being parsed from the input stream.

## Reference implementation:
System specific, hence standardization is required.

,---------.
| Replies |
`---------´

,------------------------------------------
| 2025-09-13 17:33:10 ruv replies:
| proposal - Recognizer committee proposal 2025-09-11
| see: https://forth-standard.org/proposals/recognizer-committee-proposal-2025-09-11#reply-1543
`------------------------------------------

> it would also be nice if there were an easy way to discard the entire stack effect of a given translation, if the result of a recognizer produces a different translation than desired. Maybe `discard ( translation -- )`

Yes, it would be useful and convenient. But then we should associate more information with data type identifiers. I think I would implement this.

In this proposal, "translation" (yes, the name is inappropriate) in stack diagrams is a data type symbol for the data type that is a subtype of `( ut td )`, which is a pair of _ut_ (unqualified token) and _td_ (token descriptor) at the top, **where**:
- `td => x`, i.e., the token descriptor (a data type) is a subtype of the unspecified cell;
- `ut => ( F: j*r ; S: i*x ; )`, i.e., the unqualified token (a data type) is an arbitrary tuple (possibly empty) of _r_ and _x_ values, so that _r_ values reside on the floating point stack, _x_ values reside on the data stack.

In the language of type theories, `( ut td )` is a **dependent pair** type, since each **member** of _td_ is associated with some particular **subtype** of _ut_. In other words, the **value** of the stack parameter _td_ determines the **type** of the stack parameter _ut_.

Therefore, a member of the type `( ut td )` can be interpreted as a tagged data object, in which the value _td_ is a tag for the value _ut_.

Thus, each member of _td_ is also an identifier of some data type. The Forth system associates with this identifier information about how to translate (interpret or compile) the members of the corresponding data type (a subtype of _ut_).
Of course, this identifier may also be associated with information about how to remove the members of this data type from the stacks.

Since the user can effectively define their own data types, we should provide a way to create a token descriptor (a member of _td_) and associate **various** information with it in several steps. The main piece of information is about translation of the data type members. Information about postponing and discarding (removing from the stacks) may be optional.

Regarding terminology/naming. We can use the term "data type identifier" instead of "token descriptor", but 1) this name is longer, 2) then there will be a number of terms that look very similar: "data type", "data type symbol", "data type identifier". Therefore, I would prefer more distinguishable terms. As an option, instead of "token descriptor" we can use "type descriptor".

-----

An alternative solution to remove a qualified token from the stack is to determine its size. This can be done by storing the stack depth before the qualified token is placed on the stack and then calculating the change in stack depth. For example, see the word `apply-recognizer-filter` in my [recognizer/filter.fth](https://github.com/ForthHub/fep-recognizer/blob/master/implementation/variant.gamma/recognizer/filter.fth) and its use in the word `available-xt` in [example.text-translator.fth](https://github.com/ForthHub/fep-recognizer/blob/4e74119a11b177f9422dce25e9315318ef0e032c/implementation/variant.gamma/example.text-translator.fth#L91).

,------------------------------------------
| 2025-09-14 15:27:43 AntonErtl replies:
| proposal - Recognizer committee proposal 2025-09-11
| see: https://forth-standard.org/proposals/recognizer-committee-proposal-2025-09-11#reply-1544
`------------------------------------------

The mentions of `translate-num` in the examples are oversights and should be `translate-cell`.
The way that the example `rec-tick` deals with the case where the word is not found does not work with a non-zero `translate-none`. A correct implementation is:

````
: rec-tick ( addr u -- translation ) \ gforth-experimental
    over c@ '`' = if
        1 /string find-name dup if
            name>interpret translate-cell exit then
        drop translate-none exit then
    rec-none ;
````

,------------------------------------------
| 2025-09-14 20:59:38 EricBlake replies:
| proposal - Recognizer committee proposal 2025-09-11
| see: https://forth-standard.org/proposals/recognizer-committee-proposal-2025-09-11#reply-1545
`------------------------------------------

> : rec-tick ( addr u -- translation ) \ gforth-experimental
> over c@ '`' = if

The `\ gforth-experimental` comment can be dropped.

Is a recognizer guaranteed that u will be non-zero, or is this `c@` at risk of reading beyond the bounds of the input argument? And if the recognizer is called on the length-1 string "\`", should this example be relying on the implementation-defined results of `c-addr 0 find-name` (likely 0, but possibly an xt if the implementation allows for an empty-length dictionary entry)?

,------------------------------------------
| 2025-09-15 19:54:13 ruv replies:
| proposal - New words: latest-name and latest-name-in
| see: https://forth-standard.org/proposals/new-words-latest-name-and-latest-name-in#reply-1546
`------------------------------------------

## Author

Ruv

## Change Log

- 2023-10-22 Initial revision
- 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
- 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
- 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for `LATEST-NAME-IN`, change the status to "formal".
- 2024-06-20 Add a test case to check that `LATEST-NAME` returns different value after the compilation word list is switched.
- 2024-06-20 Simplify the normative text description, and add a rationale for this simplification.
- 2025-09-15 Add clause about findable words, add rationale sections in the proposal, address a [question](https://forth-standard.org/proposals/new-words-latest-name-and-latest-name-in#reply-1502) re `immediate`, note a bug in `traverse-wordlist` in some Forth systems, make some rewording and minor corrections, add a more general reference implementation.

## Problem

In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and demanded.

To make such programs portable, we should introduce a standard method to obtain the most recently added word.

For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the recent definition, instead of redefining a lot of words to define such an access method yourself, or juggling with the input buffer and search.

One simple example. If we want to have variables that are initialized to zero, we can use:
```
: var ( "name" -- )
  variable 0 latest-name name> execute !
;
```

A number of specific examples is provided in my [post](https://github.com/ForthHub/discussion/discussions/153#discussioncomment-7418639) on ForthHub (those examples are not inserted here so as not to bloat the text).

Additionally, there have been many discussions regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather [wrote](https://groups.google.com/g/comp.lang.forth/c/RsJQVnEQQuw/m/M1PrzPAcE-0J) on 2011-12-09 in `comp.lang.forth`:

> AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.
>
> Although it's a system necessity, I haven't found this of much value in application programming.
>
> Elizabeth D. Rather

It's true: depending on the system, an internal method can return the recent word regardless of the compilation word list, or depending on the compilation word list, a completed definition, or not yet completed definition, also unnamed definition, or only named definition, etc.

The value in application programming is shown by me above.

Some known internal methods: `latest ( -- nt|0 )`, `last @ ( -- nt|0 )`, `latestxt ( -- xt|0 )`, etc.

Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition. But such a portable method is actually very useful, as shown in my examples.

## Solution

Let's introduce the following words:
- `LATEST-NAME-IN ( wid -- nt|0 )`
- `LATEST-NAME ( -- nt )`

The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.

The second word returns the name token for the definition whose name was placed most recently into **the compilation word list**, or throws an exception if there is no such definition.

These words do not expose or limit any internal mechanism of the compiler. They just provide information about word lists, like the words `FIND-NAME-IN`, `FIND-NAME`, and `TRAVERSE-WORDLIST` do. It's a kind of introspection/reflection.

These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal `LAST`, `LATEST`, or whatever it was using before.

It seems, the best place for these words is the section [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2), where `TRAVERSE-WORDLIST` is also placed.

### Rationale

#### Connection with word lists

By considering definitions in the frame of a word list only, we solve several problems, namely:

1.
A word list contains only completed definitions (see the accepted proposal #153 [Traverse-wordlist does not find unnamed/unfinished definitions](https://forth-standard.org/proposals/traverse-wordlist-does-not-find-unnamed-unfinished-definitions?hideDiff#reply-487)). This eliminates the question of whether the word of returned _nt_ is finished: yes, it is always finished (completed).
2. Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).
3. An extension or library can create definitions in its internal word list for internal purposes. And it will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from `latest-name` (regardless of what words are created by this library for internal purposes on the fly).

For example, when different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that `latest-name` returns.

#### Return values

As a matter of practice, almost all the use cases for the word `LATEST-NAME` imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return `0` by this word only burdens users with having to analyze this zero, or redefine this word as:
```
: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;
```

If the user needs to handle the case where the compilation word list is empty, they can use the word `latest-name-in` as:
```
get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then
```

#### Implementation options

If the word list structure in a Forth system contains information about the latest placed definition, the implementations for the proposed words are trivial.
In some plausible Forth systems the word list structure doesn't directly contain information about which definition was placed into the word list most recently, and this information cannot be obtained indirectly. Such systems might not provide the proposed words, or they can be changed to keep this information in the word list structure.

It seems, in most systems the word list structure directly contains this information, or this information can be obtained indirectly.

Some checked systems:
- Gforth, minForth, ikForth, SP-Forth, Post4: a word list keeps information about the definition that was placed in it most recently;
- SwiftForth, VFX: the most recently placed word in a word list can be correctly obtained from the strands/threads (since _nt_ values increase monotonically);
- lxf/ntf 2017: the most recently placed word in a word list can be obtained using `travarse-wordlist` (since _nt_ values increase monotonically).

Note that some systems have a bug in `traverse-wordlist` so it can return the _nt_ for a definition that cannot be found (namely, for the current definition). This is incorrect (see a [testcase](https://forth-standard.org/standard/tools/TRAVERSE-WORDLIST?hideDiff#reply-1123)).

If a system does not implement the optional Search-Order word set, it might not provide the word `LATEST-NAME-IN`.

#### Naming

The names `LATEST-NAME-IN` and `LATEST-NAME` of new words are similar to `FIND-NAME-IN` and `FIND-NAME` by the form. Stack effects are also similar.

The difference is that **find** is a verb, but **latest** is an adjective (or sometimes a noun, see [Wiktionary](https://en.wiktionary.org/wiki/latest#Adjective)). Both are historical in their use in naming words. As well as "NAME".

In Forth-84 "NAME" _in word names_ denoted **NFA** (name field address), and now it denotes a **name token**, which is the successor of NFA. In all standard words, e.g. `FIND-NAME`, `NAME>STRING`, `NAME>COMPILE`, etc.
(except `PARSE-NAME`), "NAME" denotes a name token.

NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see [Wiktionary](https://en.wiktionary.org/wiki/token#Noun)).

#### Normative text description

The proposed normative text description is based on:
- [16.2](https://forth-standard.org/standard/search#section.16.2): "compilation word list: The word list into which new definition _**names**_ are _**placed**_",
- [15.3.1](https://forth-standard.org/standard/tools#subsection.15.3.1): "A _**name token**_ is a single-cell value that _**identifies**_ a named word",
- [3.4.3](https://forth-standard.org/standard/usage#usage:semantics): "[Semantics] are _**largely specified by the stack notation**_ in the glossary entries, which shows what values shall be consumed and produced. The prose in each glossary entry _**further**_ specifies the definition's behavior" (there is no need to repeat in the text description what is already indicated in the stack diagrams).

(emphasis added)

#### Throw code description

If the throw code description states that there is no latest name, it can be confusing, since a latest name in _some sense_ probably always exists. Therefore, it's better to say: "the compilation word list is empty", which is what actually happens.

#### Motivation for `LATEST-NAME-IN`

1. It's a natural factor for `LATEST-NAME`. It's **always possible** to extract this factor from the implementation of `LATEST-NAME`, because the latter returns _nt_ from **the compilation word list**, and the system should take _wid_ of the compilation word list and extract the most recent _nt_ from this word list.
2. It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
3.
In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as entry point, like `main`, or as the default exported word from a module).
4. Both of these words are optional. And if `LATEST-NAME-IN` is not provided, it can be implemented in a portable way via `LATEST-NAME` as:
```
: latest-name-in ( wid -- nt|0 )
  get-current >r set-current
  ['] latest-name catch if 0 then
  r> set-current
;
```

## Things to discuss

Is it worth introducing the word `LATEST-NAME-XT ( -- xt )`?

If `name>interpret` never returns `0` (see my [comment](https://forth-standard.org/proposals/name-interpret-wording?hideDiff#reply-1118)), this word can be implemented as:
```
: latest-name-xt ( -- xt ) latest-name name>interpret ;
```

The desired (and much discussed) pattern is:
```
defer bar
: foo ... ; latest-name-xt is bar
```

Sometimes the name "`it`" has been suggested for this word, but this name is too short and is more likely to cause conflicts.

Guido Draheim [wrote](https://groups.google.com/g/comp.lang.forth/c/_WCxDv1qd2M/m/dZp-OAryvt8J) in `comp.lang.forth` on 2003-03-16:

> I think that everyone has been thinking of using `IT` for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.
>
> I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implemenation but coincides with a common term of the application.

This seems convincing to me.
## Typical use

```forth
: struct: ( "name" -- wid.compilation.prev u.offset )
  get-current vocabulary also latest-name name> execute definitions 0
;
: ;struct ( wid.compilation.prev u.offset -- )
  s" __size" ['] constant execute-parsing set-current
;
```

```forth
\ In the application's vocabulary
: it ( -- xt ) latest-name name>interpret ;

defer foo
\ ...
: bar ... ; it is foo
```

## Proposal

### Changes in existing sections

Add the following line into the [Table 9.1: THROW code assignments](https://forth-standard.org/standard/exception#table:throw):

> `-80` the compilation word list is empty

Editorial note: the actual throw code may change.

### New glossary sections

Add the following sections into [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2):

#### 15.6.2.xxxx `LATEST-NAME-IN` TOOLS EXT

_( wid -- nt|0 )_

If the word list identified by _wid_ is empty, then the returned value is `0`; otherwise, the name token _nt_ identifies the definition whose name was placed most recently into the word list _wid_.

Note: _nt_ can only be returned for a definition that can be found in _wid_.

See also: 15.6.2.xxxx `LATEST-NAME`, [15.6.2.2297 `TRAVERSE-WORDLIST`](https://forth-standard.org/standard/tools/TRAVERSE-WORDLIST), [6.1.0460 `;`](https://forth-standard.org/standard/core/Semi).

#### 15.6.2.xxxx `LATEST-NAME` TOOLS EXT

_( -- nt )_

If the compilation word list is not empty, the name token _nt_ identifies the definition whose name was placed most recently into this word list. Otherwise, the exception code `-80` is thrown.

Note: _nt_ can only be returned for a definition that can be found in the compilation word list.

See also: 15.6.2.xxxx `LATEST-NAME-IN`, [15.6.2.2297 `TRAVERSE-WORDLIST`](https://forth-standard.org/standard/tools/TRAVERSE-WORDLIST), [6.1.0460 `;`](https://forth-standard.org/standard/core/Semi).
### New rationale sections

Add the following sections into [A.15.6 Glossary](https://forth-standard.org/standard/rationale#subsection.A.15.6):

#### A.15.6.2.xxxx `LATEST-NAME-IN`

The word `latest-name-in` cannot return an _nt_ that cannot be obtained using `search-wordlist` or `traverse-wordlist` applied to the same word list.

See also: A.15.6.2.xxxx `LATEST-NAME`.

#### A.15.6.2.xxxx `LATEST-NAME`

The word `latest-name` cannot return an _nt_ that cannot be obtained using `search-wordlist` or `traverse-wordlist` applied to the compilation word list.

In some Forth systems the word `:` (colon) places an _nt_ into the compilation word list and makes it hidden (unfindable). This _nt_ must not be available for `traverse-wordlist` and for `latest-name`. Thus, **formally**, only the words `;` (semicolon) and `does>` are allowed to add the _nt_ of a definition created with `:` (colon) to the compilation word list.

After execution of `immediate`, `latest-name` shall return the same value as before this execution.

If a Forth system does not provide the optional Search-Order word set, and in that Forth system the word `immediate` moves an _nt_ from one **internal** word list to another, this must not affect what `latest-name` returns, and this must not affect what `find-name` returns (for example, consider a case where two last words have the same name and `immediate` is used for the latest one).

Typical use

```forth
: var ( "name" -- )
  variable 0 latest-name name>interpret execute !
;
```

## Reference implementation

In this implementation for `latest-name-in` we assume that a _wid_ is an address that contains _nt_ of the most recently placed definition name into this word list.
```
: latest-name-in ( wid -- nt|0 ) @ ;
```

In this implementation for `latest-name-in` we assume that the values of _nt_ interpreted as unsigned numbers increase monotonically (it works on most systems):

```forth
: umax ( u1 u2 -- u.max ) 2dup u< if swap then drop ;
: latest-name-in ( wid -- nt|0 )
  >r 0 [: umax true ;] r> traverse-wordlist
;
```

An implementation for `latest-name`:

```forth
: latest-name ( -- nt )
  get-current latest-name-in dup if exit then -80 throw
;
```

## Testing

```
: IT ( -- xt ) LATEST-NAME NAME>INTERPRET ;
WORDLIST CONSTANT WL1

T{ : LN1 ; IT ' LN1 = -> TRUE }T
T{ GET-CURRENT LATEST-NAME-IN ' LN1 = -> TRUE }T
T{ :NONAME [ IT ] LITERAL ; EXECUTE ' LN1 = -> TRUE }T
T{ : LN2 [ IT ] LITERAL ; LN2 ' LN1 = -> TRUE }T
T{ WL1 LATEST-NAME-IN -> 0 }T

GET-CURRENT WL1 SET-CURRENT ( wid.prev )
T{ ' LATEST-NAME CATCH -> -80 }T
T{ : LN3 ; -> }T
SET-CURRENT
T{ IT ' LN2 = -> TRUE }T
```

,------------------------------------------
| 2025-09-15 20:46:29 EricBlake replies:
| proposal - New words: latest-name and latest-name-in
| see: https://forth-standard.org/proposals/new-words-latest-name-and-latest-name-in#reply-1547
`------------------------------------------

Suggested typo and wording fixes:

> Problem

The example here duplicates the example given later in the Rationale "Typical Use", except that `0 latest-name name> execute !` should use `name>interpret` instead of `name>`.

"examples is provided" should be "are"

> Rationale Implementation Options

"ixl ... travarse-wordlist" should be "traverse-wordlist"

> Typical Use

In `struct:`, "also latest-name name> execute definitions" - `name>` is not a standard word; is this intended to be `name>interpret`? Likewise, `execute-parsing` used in `;struct` is not standard, but appears to be well-known.

> New rationale section

In addition to `;` and `does>`, it is possible that `;code` can change the latest definition in a wid.
> Testing

It is odd that most of this proposal (as of this revision) uses lower-case, but the Testing section (still) uses upper-case (yes, I know that there are also proposals on how case (in-)sensitivity should be adjusted for future standards).

,------------------------------------------
| 2025-09-16 14:00:54 EricBlake replies:
| proposal - Non parsing CREATE
| see: https://forth-standard.org/proposals/non-parsing-create#reply-1548
`------------------------------------------

Possible reference implementation, shown here with a dependency on the memory word set (a BUFFER: version would also be possible):

```
: NON-PARSING-CREATE ( c-addr u -- )
  DUP 7 + ALLOCATE THROW
  DUP S" CREATE " ROT SWAP CMOVE
  DUP >R 7 + SWAP DUP 7 + >R CMOVE
  2R> OVER SWAP EVALUATE FREE THROW ;
```

Obviously, this reference implementation is probably less efficient than reusing an implementation's existing factor of create. Also, this implementation leaks an allocated buffer if the evaluate fails (such as if create throws because the dictionary has run out of space).

,------------------------------------------
| 2025-09-16 17:18:37 AntonErtl replies:
| proposal - Non parsing CREATE
| see: https://forth-standard.org/proposals/non-parsing-create#reply-1549
`------------------------------------------

Out of 8 uses of `nextname` in Gforth's image, only one is with `create`. So it might be better to have `nextname`.

Even more general: `execute-parsing` ([Implementation in standard Forth](http://theforth.net/package/compat/current-view/execute-parsing.fs); take a look at the example in that file).
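
For illustration, the `execute-parsing` route could look roughly like this. It is a minimal sketch, assuming `execute-parsing ( ... c-addr u xt -- ... )` is available; it is not a standard word, but Gforth provides it and the linked compat file implements it in standard Forth. The name `demo-cell` is hypothetical and used only for the example.

```forth
\ Sketch only: assumes the non-standard EXECUTE-PARSING ( ... c-addr u xt -- ... ),
\ which makes the string c-addr u the parse area while xt executes.
: non-parsing-create ( c-addr u -- )
  ['] create execute-parsing ;

\ Hypothetical usage: define DEMO-CELL from a string and reserve one cell for it.
s" demo-cell" non-parsing-create 42 ,
demo-cell @ .   \ fetch and print the cell stored after the created word
```

The same pattern applies to other parsing defining words, for example `['] constant execute-parsing` as used in `;struct` in the `latest-name` proposal above.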