,---------------.
| Contributions |
`---------------´

,------------------------------------------
| 2024-06-20 12:04:57 ruv wrote:
| proposal - Allow the text interpreter to use `WORD` and the pictured numeric output
| see: https://forth-standard.org/proposals/allow-the-text-interpreter-to-use-word-and-the-pictured-numeric-output#contribution-345
`------------------------------------------

## Author

Ruv

## Change Log

- 2024-06-20 Initial revision

## Problem

In traditional implementations, the Forth text interpreter itself uses the word [`WORD`](https://forth-standard.org/standard/core/WORD) and thus it clobbers a transient region containing the parsed lexeme. But the standard does not allow the Forth text interpreter to clobber this region.

Ditto for the words [`<#`](https://forth-standard.org/standard/core/num-start) and [`#>`](https://forth-standard.org/standard/core/num-end) when, for example, the text interpreter shows the stack items after the input string is interpreted.

This problem was pointed out in the contribution [[315] WORD and the text interpreter](https://forth-standard.org/standard/core/WORD#contribution-315) by Anton Ertl.

## Solution

The proposed options are:

1. Fix the systems to avoid clobbering the WORD buffer in the text interpreter.
2. Change the standard to allow clobbering the WORD buffer by parsing in the text interpreter.

Since the word `WORD` will be obsolete in any case, there is little point in fixing the systems. Therefore, I propose to change the standard so the text interpreter can overwrite this buffer.

Concerning the pictured numeric output string buffer, it can be used when numbers are displayed, according to [3.2.1.3 Free-field number display](https://forth-standard.org/standard/usage#subsubsection.3.2.1.3): "Number display may use the pictured numeric output string buffer".
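The practical consequence for programs can be sketched as follows: a string returned by `WORD` should be consumed, or copied, before control returns to the text interpreter. This is only an illustration; the names `name-buf` and `save-word` are hypothetical and do not come from the proposal.

```forth
\ Hypothetical helper: copy the parsed word out of the transient WORD buffer
\ before the text interpreter gets a chance to reuse that buffer.
create name-buf 32 chars allot     \ counted string: 1 length char + up to 31 chars

: save-word ( "name" -- )
  bl word count                    \ ( c-addr u ) still valid inside this definition
  31 min dup name-buf c!           \ store the (clipped) length
  name-buf char+ swap chars move ; \ copy the characters into our own buffer

save-word hello
name-buf count type                \ reads the copy, not the WORD buffer
```

Because the copy is made inside one definition, no step of the text interpreter loop runs between `WORD` and the use of its result.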
An additional benefit is that the user will be allowed to implement their own limited but standard-compliant text interpreter using `WORD` (this approach is probably used in some old programs) and displaying numbers.

## Proposal

Add into the section [3.3.3.6 Other transient regions](https://forth-standard.org/standard/usage#usage:transient), after the second paragraph (that ends with "could also corrupt the regions.") the following paragraph:

> The data space regions identified by `WORD` and `#>` may become invalid after any step in the Forth text interpreter loop.

,------------------------------------------
| 2024-06-20 15:25:45 ruv wrote:
| requestClarification - Eliminating ambiguous conditions for Tick
| see: https://forth-standard.org/standard/core/Tick#contribution-346
`------------------------------------------

I'm interested in how we want to define Tick's behavior in edge cases, as part of the general trend of reducing the number of ambiguous conditions.

#### Undefined word

When a word is not found (regardless of STATE), Tick shall throw an exception `-13` ("undefined word"). It's obvious.

#### Undefined interpretation semantics

When interpretation semantics for the word are not defined by the standard, Tick shall do one of the following:

- return _xt_ for a system-defined behavior (so, an ambiguous condition exists if this _xt_ is executed);
- throw exception `-32` "invalid _name_ argument";
- throw exception `-13` "undefined word" (I think this option is undesirable. Can we exclude it?).

This case is for words like `if`, `>r`, `exit`, `begin`, etc.

#### Interpretation semantics are defined, but execution semantics undefined

The main question is this: should we allow Tick to throw an exception if interpretation semantics for the word are defined, but execution semantics are not defined? At the moment, this applies to five standard words: `s"`, `s\"`, `to`, `is`, `action-of`.
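Which of these behaviors a given system actually exhibits can be checked with a small probe. This is a sketch only: the name `tick-code` is hypothetical, and it assumes the system reports failures of Tick via `THROW` rather than aborting directly.

```forth
\ Hypothetical probe: apply ' to the next name in the input stream.
\ Returns 0 if an xt came back, otherwise the throw code.
: tick-code ( "name" -- n )
  ['] ' catch          \ ( xt 0 | n )
  dup 0= if nip then ; \ on success, discard the xt and keep the 0

tick-code dup .        \ an ordinary word
tick-code if  .        \ interpretation semantics not defined by the standard
tick-code s"  .        \ interpretation semantics defined, execution semantics not
```

The printed codes are system-dependent for the second and third lines, which is exactly the variation the question is about.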
Most systems in this case return an _xt_ that, when executed in interpretation state, performs the interpretation semantics for the word. Behavior in compilation state varies from system to system. So, in fact, an ambiguous condition exists **only** when this _xt_ is executed **in compilation state**.

What do you think?

,---------.
| Replies |
`---------´

,------------------------------------------
| 2024-06-18 07:17:32 AntonErtl replies:
| proposal - Support for single line comments during `evaluate`
| see: https://forth-standard.org/proposals/support-for-single-line-comments-during-evaluate-#reply-1238
`------------------------------------------

What about other parsing words, e.g., `s"`, or user-defined parsing words?

If you want to support multi-line `evaluate`, wouldn't it be better to extend `evaluate` to only present the first line to parsing, and then after `refill` the next line, and so on? Then `\` should work automatically as intended.

,------------------------------------------
| 2024-06-18 07:50:22 ruv replies:
| proposal - Special memory access words
| see: https://forth-standard.org/proposals/special-memory-access-words#reply-1239
`------------------------------------------

> For l-family and x-family words, it should be mentioned in a note that the word can be only provided by the system if the cell size is not less than 32 and 64 bits correspondingly.

I mean, it should be mentioned in some **normative part**: either in the glossary entry for the word, or somewhere in the section "Optional Special Memory Access word set".

,------------------------------------------
| 2024-06-18 08:15:46 ruv replies:
| proposal - Support for single line comments during `evaluate`
| see: https://forth-standard.org/proposals/support-for-single-line-comments-during-evaluate-#reply-1240
`------------------------------------------

> What about other parsing words, e.g., s", or user-defined parsing words?
It can be solved by adding: "When parsing from a **text string** using a space delimiter, control characters shall be treated the same as the space character" (i.e., the same as for a file).

So, `parse-name` will skip a line terminator in a string being evaluated too. (It seems like most systems already behave this way.)

`s"` will be able to parse multiple text lines in a string being evaluated (if `"` is not found in the current text line), but I don't see any problem with that.

> If you want to support multi-line evaluate, wouldn't it be better to extend evaluate to only present the first line to parsing, and then after refill the next line and so on. Then \ should work automatically as intended,

1. My idea is that a program should not depend on `refill`, i.e., on whether the parse area contains a single text line or multiple text lines.
2. Changing `\` is almost portable, i.e., it can be implemented via a polyfill (a portable module).
3. This approach is slightly more efficient, since we don't need to break the text into lines before feeding it to the text interpreter.
4. Less internal state, easier implementation (compared to support for refilling when evaluating a string).

,------------------------------------------
| 2024-06-18 13:50:08 GeraldWodni replies:
| proposal - Special memory access words
| see: https://forth-standard.org/proposals/special-memory-access-words#reply-1241
`------------------------------------------

__Yes, thanks!__ I cannot believe we have not yet standardized this; let's check for system compliance and get that standardized.

There is a small typo: It says `w` is `Wyde`. I guess that should be `Word` or `Wide`?

One little bit of bike-shedding I want to point out is that words like `w>s` look like conversion words. `lbe` does not. I would prefer `>lbe`, but this should be discussed face to face.
,------------------------------------------
| 2024-06-18 17:32:06 AntonErtl replies:
| proposal - Special memory access words
| see: https://forth-standard.org/proposals/special-memory-access-words#reply-1242
`------------------------------------------

The sign-extending words `w>s` etc. are specified as ( x -- n ) because these can be considered as bitwise operations like `lshift`. Alternatively, they could be specified as ( u -- n ) because the result of the fetching words and the byte-order words are unsigned. But it makes no difference how they are specified.

The fetching words and byte-order words have unsigned results because they zero-extend (if anything) the data that they fetch or reorder. Specifying u here indicates that.

The addresses are specified as c-addr to point out that the addresses are not required to be aligned. This is less clear with addr.

About the wording of the sign-extension words: good point. I will change it to

> Sign-extend the low-order 8 bits in x to the full cell width.

Concerning the l and x words: I expect that these words will be optional. It should be obvious to the implementor of a 32-bit system that the x words don't make sense for their system, but just in case we could mention that in the Rationale. I don't see a point in putting this in the normative part, and it would make the text more verbose and less usable.

There is a section "Larger address units" that discusses the case of address units larger than 8 bits. I don't see how the proposal (or any other practically usable wordset) could work with `w@` `w!` that read the low-order 16 bits of an address unit and still result in portable code. There is a reason why byte-addressed architectures have won. The section "Larger address units" outlines an approach where `w@` `w!` etc. use only 8 bits per address unit. I don't want to prescribe this approach (maybe there are other ways, although I don't see them), so `w@` `w!` etc. are not specified in this way; the current specification is also much more readable for the vast majority of users (those on byte-addressed systems) than one that prescribes accesses to 8 bits per au. Addressing systems with larger aus in the rationale looks good enough to me; and maybe the implementors of those systems don't want to implement any of these words anyway; there's a reason why they went for a system that is not byte-addressed. But maybe I should be putting in some normative text in the proposal that prescribes 8 bits per address unit (maybe associated with a type b-addr).

`Octet` is a term from telecommunications that has no place in computing since 8-bit bytes won >50 years ago.

`c>s` sign-extends 8-bit bytes. There is no need for b words, even on systems with larger address units than bytes. See the section "Larger address units". However, with possible systems where `c@` zero-extends bigger units than 8 bits, we need a `c>u` that zero-extends 8-bit units. I will add this to the proposal.

I will replace "bottom" with "least significant".

As for addressing larger address units, see the discussion above.

"Wyde" is not a typo, but a word I have from Bernd Paysan; just as "byte" is not a typo for "bite", and "nybble" is not a typo for "nibble" (but "nibble" is more common). "w" may originally have come from DEC/Motorola/Intel usage, which is based on the idea of a 16-bit word (as in the PDP-11, 6800, and 8080), but other architectures started with 32-bit words, so "word" would not only conflict with other usage of "word" in Forth, but also with other usage in various computer architectures, and with the general use in computer architecture (where "word" means the same as "cell" means in Forth).

`lbe` is a conversion word, but it converts to and from big-endian order, so how should we place the `>`? `l>be` or `lbe>`? Either one is wrong for one of the two usages, so it was just called `lbe`.
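For readers following the thread, the sign-extension and byte-order operations discussed above can be sketched in standard Forth. This is only an illustration, assuming a little-endian, two's-complement system with cells of at least 32 bits; the names `sketch-w>s` and `sketch-wbe` are hypothetical and are not the proposal's reference implementation.

```forth
\ Hypothetical sketch: sign-extend the low-order 16 bits of x to a full cell.
: sketch-w>s ( x -- n )
  $ffff and                     \ keep only the low-order 16 bits
  dup $8000 and if $10000 - then ; \ sign bit set: subtract 2^16

\ Hypothetical sketch: swap the two low-order bytes. On a little-endian host
\ this converts a 16-bit value to or from big-endian order.
: sketch-wbe ( x1 -- x2 )
  dup $ff and 8 lshift          \ low byte moves up
  swap 8 rshift $ff and         \ high byte moves down
  or ;
```

Applying `sketch-wbe` twice restores the original value, which illustrates why a single name can serve both conversion directions.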
,------------------------------------------
| 2024-06-18 21:19:30 ruv replies:
| proposal - Special memory access words
| see: https://forth-standard.org/proposals/special-memory-access-words#reply-1243
`------------------------------------------

> 'lbe' is a conversion word, but it converts to and from big-endian order, so how should we place the `>`?

I agree with the form like `lbe`. It can be read as a modifier like `bin ( fam1 -- fam2 )`.

-----

> But maybe I should be putting in some normative text in the proposal that prescribes 8 bits per address unit (maybe associated with a type b-addr).

Introducing _b-addr_ seems like a good idea! It will then be obvious that these words are for byte-addressed systems only (by design).

> The addresses are specified as c-addr to point out that the addresses are not required to be aligned. This is less clear with addr.

Can't agree. According to [3.1.1 Data-type relationships](https://forth-standard.org/standard/usage#subsection.3.1.1):

_a-addr ⇒ c-addr ⇒ addr ⇒ u_

Where "_c-addr_" is a symbol for the "**character-aligned** address" data type. See also [3.3.3.1 Address alignment](https://forth-standard.org/standard/usage#subsubsection.3.3.3.1).

In Forth-2012, the _c-addr_ data type may not be equal to the _addr_ data type; this means that the address returned by `align here 1+` does not belong to _c-addr_ in the general case (because, in some plausible system, a character consists of four address units). So it's **absolutely clear** that a parameter of the _addr_ data type is not required to be aligned at all.

In Forth-2019, _c-addr_ is equal to _addr_, but "c-addr" is still defined as "**character-aligned** address".

Thus, a better way is to either introduce _b-addr_, or use _addr_.

-----

> The fetching words and byte-order words have unsigned results because they zero-extend (if anything) the data that they fetch or reorder. Specifying _u_ here indicates that.

To indicate zero-extending, another data type should be introduced.
By design, the data type _u_ does not indicate zero-extending. This data type only indicates that the operation interprets a parameter of this type as an unsigned number, and the result will be incorrect if the parameter is interpreted by the user as, for example, a negative number.

So, for a word that is specified as `wbe ( u1 -- u2 )`, the stack parameter in the position _u2_ should always be interpreted by the user as a particular unsigned number (ditto for _u1_). But this is wrong in the general case for the word `wbe`, because after the byte order is changed (or before it is changed), **the parameter is just a tuple of bits**.

Also, it is possible that even in the native byte order a parameter is not interpreted as a number at all (that is, neither a signed number, nor an unsigned number). For example, _xt_ is formally not a number (and _xt ⇒ x_). And when you specify _u_, you exclude such a use case.

,------------------------------------------
| 2024-06-19 08:30:04 ruv replies:
| proposal - Special memory access words
| see: https://forth-standard.org/proposals/special-memory-access-words#reply-1244
`------------------------------------------

> after the byte order is changed (or before it is changed), **the parameter is just a tuple of bits**.

An improved version of the parameter data type specifications (via stack diagrams):

```
w@ ( addr -- x )
wbe ( u1 -- x2 | x1 -- u2 )
w>s ( x1 -- n2 )
```

But I'm still not happy enough with it, because the following sequence:

```
w@ ( addr -- x ) wbe ( x -- u ) w>s ( x -- n )
```

is formally incorrect: on the last step we actually convert the parameter of the _u_ data type to the parameter of the _n_ data type.
If we have:

```
w@ ( addr -- x )
wbe ( x1 -- x2 )
w>s ( x1 -- n2 )
```

The following sequence is correct (in terms of data types matching):

```
w@ ( addr -- x ) wbe ( x -- x ) w>s ( x -- n )
```

,------------------------------------------
| 2024-06-19 11:57:08 PeterFalth replies:
| proposal - Special memory access words
| see: https://forth-standard.org/proposals/special-memory-access-words#reply-1245
`------------------------------------------

The following sequence is correct (in terms of data types matching):
w@ ( addr -- x ) wbe ( x -- x ) w>s ( x -- n )

Does this sequence make any sense? How can w>s know that the stack item is in big-endian in this case and work properly?

I think it is a mistake to have separate words for sign extending. In lxf/ntf I have

When searching for a word with NDCS, what XT should be returned?

NDCS means "Non-Default Compilation Semantics". Immediate words (in the normative sense) are NDCS words, but an NDCS word can be not immediate (in the normative sense).

At the moment my position is as follows.

### Meaning of _xt_

If a word is found by `search-wordlist`, the returned _xt_ identifies the execution semantics for the word (standard or system-specific), and it is the same _xt_ regardless of STATE.

### Ambiguous conditions

An ambiguous condition exists if interpretation semantics for the word are not defined by the standard and _xt_ is executed (by any means).

An ambiguous condition exists if execution semantics for the word are not defined by the standard and _xt_ is executed in compilation state.

### Meaning of the code (the top value)

When a word is found, the top value shall be `-1` if the word is an ordinary word (i.e., it has default interpretation semantics and default compilation semantics), and it shall be `1` otherwise.

Thus, the top value is `1` if the found word is an NDCS word. This will not have any unexpected effects concerning immediacy due to the mentioned ambiguous conditions.
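Under the position described above, a program could classify a found word from the top value alone. A minimal sketch follows; the name `found-code` is hypothetical, and the value returned for any particular word depends on how the system implements that word.

```forth
\ Hypothetical helper: reduce the SEARCH-WORDLIST result to its top value.
\ 0 = not found, -1 = ordinary word, 1 = otherwise (an NDCS word under this position).
: found-code ( c-addr u wid -- n )
  search-wordlist      \ ( 0 | xt 1 | xt -1 )
  dup if nip then ;    \ discard the xt when the word was found

s" dup" forth-wordlist found-code .
s" if"  forth-wordlist found-code .
```

The _xt_ is deliberately discarded here, since executing it for a word like `if` would be an ambiguous condition.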
### Edge cases explanation

If interpretation semantics or execution semantics are not defined by the standard, _xt_ identifies the **system-specific** execution semantics for the word, with the following constraint: if this _xt_ is executed in interpretation state, the interpretation semantics for the word (the standard behavior, or system-specific if not defined by the standard) shall be performed.

For words like `to` and `s"`, for which interpretation semantics are defined and execution semantics are not defined, the returned _xt_, when executed, shall perform the corresponding interpretation semantics in interpretation state regardless of implementation details, and it is not allowed to perform this _xt_ in compilation state because the behavior depends on implementation details.

### Ignoring an existing word

It is perhaps conceivable for `search-wordlist` to return `0` for a word for which interpretation semantics are not defined by the standard, since a standard program cannot execute the returned _xt_ anyway; examples of such words are `if`, `>r`, etc.

But a more reliable option is to return an _xt_ that simply throws an exception when executed (or performs a more useful system-defined behavior, if any; see [A.3.4.3.2 Interpretation semantics](https://forth-standard.org/standard/rationale#paragraph.A.3.4.3.2)).

,------------------------------------------
| 2024-06-19 13:18:13 ruv replies:
| requestClarification - NDCS xt
| see: https://forth-standard.org/standard/search/SEARCH-WORDLIST#reply-1247
`------------------------------------------

One important thing is that `search-wordlist` provides information on whether the found word is an ordinary word (or _implemented_ as an ordinary word), or not.

The later standardized words `find-name-in`, `name>compile`, `name>interpret` don't allow obtaining this information at the moment. So, `search-wordlist` cannot be implemented via these words.
,------------------------------------------
| 2024-06-19 13:31:34 ruv replies:
| proposal - Special memory access words
| see: https://forth-standard.org/proposals/special-memory-access-words#reply-1248
`------------------------------------------

`w>s` accepts its argument only in the native endianness of the Forth system.

```
w@ ( addr -- x ) wbe ( x -- x ) w>s ( x -- n )
```

In this sequence we know that the wyde **in memory** is in big-endian. `w@` reads this wyde as is, and `wbe` converts it from big-endian to the native endianness. And `w>s` interprets the 16 least significant bits of its parameter as a **signed 16-bit two's complement value**, and extends the sign to the full cell.

Probably we should also specify two's complement in some normative parts.

,------------------------------------------
| 2024-06-20 09:35:36 ruv replies:
| proposal - New words: latest-name and latest-name-in
| see: https://forth-standard.org/proposals/new-words-latest-name-and-latest-name-in#reply-1249
`------------------------------------------

## Author

Ruv

## Change Log

- 2023-10-22 Initial revision
- 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
- 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
- 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for `LATEST-NAME-IN`, change the status to "formal".

## Problem

In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and in demand. To make such programs portable, we should introduce a standard method to obtain the most recently added word.
For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the recent definition, instead of redefining a lot of words to define such an access method yourself, or juggling with the input buffer and search.

One simple example. If we want to have variables that are initialized to zero, we can use:

```
: var ( "name" -- ) variable 0 latest-name name> execute ! ;
```

A number of specific examples are provided in my [post](https://github.com/ForthHub/discussion/discussions/153#discussioncomment-7418639) on ForthHub (those examples are not inserted here so as not to bloat the text).

Additionally, there has been much discussion regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather [wrote](https://groups.google.com/g/comp.lang.forth/c/RsJQVnEQQuw/m/M1PrzPAcE-0J) on 2011-12-09 in `comp.lang.forth`:

> AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.
>
> Although it's a system necessity, I haven't found this of much value in application programming.
>
> Elizabeth D. Rather

It's true: depending on the system, an internal method can return the recent word regardless of the compilation word list, or depending on the compilation word list; a completed definition, or a not yet completed definition; also an unnamed definition, or only a named definition, etc. I have shown the value in application programming above.

Some known internal methods: `latest ( -- nt|0 )`, `last @ ( -- nt|0 )`, `latestxt ( -- xt|0 )`, etc.

Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition.
But such a portable method is actually very useful, as shown in my examples.

## Solution

Let's introduce the following words:

- `LATEST-NAME-IN ( wid -- nt|0 )`
- `LATEST-NAME ( -- nt )`

The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.

The second word returns the name token for the definition whose name was placed most recently into **the compilation word list**, or throws an exception if there is no such definition.

These words do not expose or limit any internal mechanism of the compiler. They just provide information about word lists, like the words `FIND-NAME-IN`, `FIND-NAME`, and `TRAVERSE-WORDLIST` do. It's a kind of introspection/reflection.

These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal `LAST`, `LATEST`, or whatever it was using before.

It seems the best place for these words is the section [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2), where `TRAVERSE-WORDLIST` is also placed.

### Rationale

#### Connection with word lists

By considering definitions in the frame of a word list only, we solve several problems, namely:

1. A word list contains only completed definitions (see the accepted proposal #153 [Traverse-wordlist does not find unnamed/unfinished definitions](https://forth-standard.org/proposals/traverse-wordlist-does-not-find-unnamed-unfinished-definitions?hideDiff#reply-487)). This eliminates the question of whether the word of the returned _nt_ is finished: yes, it is always finished (completed).
2. Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).
3. An extension or library can create definitions in its internal word list for internal purposes. And it will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from `latest-name` (regardless of what words are created by this library for internal purposes on the fly). For example, when different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that `latest-name` returns.

#### Return values

As a matter of practice, almost all the use cases for the word `LATEST-NAME` imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return `0` by this word only burdens users with having to analyze this zero, or redefine this word as:

```
: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;
```

If the user needs to handle the case where the compilation word list is empty, they can use the word `latest-name-in` as:

```
get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then
```

#### Implementation options

If the word list structure in a Forth system contains information about the latest placed definition, the implementations for the proposed words are trivial.

In some plausible Forth systems, the word list structure doesn't contain any information about the definition that was placed into this word list most recently. Such systems might not provide the proposed words, or they might be changed to keep the mentioned information in the word list structure.

It seems that in most systems the word list structure contains this information. Some checked systems:

- SwiftForth, VFX, Gforth, minForth, ikForth, SP-Forth: a word list keeps information about the definition that was placed in it most recently;
- lxf/ntf 2017: it seems it doesn't keep this information.
If a system does not implement the optional Search-Order word set, it might not provide the word `LATEST-NAME-IN`.

#### Naming

The names `LATEST-NAME-IN` and `LATEST-NAME` of the new words are similar in form to `FIND-NAME-IN` and `FIND-NAME`. Stack effects are also similar.

The difference is that **find** is a verb, but **latest** is an adjective (or sometimes a noun, see [Wiktionary](https://en.wiktionary.org/wiki/latest#Adjective)). Both are historical in their use in naming words, as is "NAME".

In Forth-84 "NAME" _in word names_ denoted **NFA** (name field address), and now it denotes a **name token**, which is the successor of NFA. In all standard words, e.g. `FIND-NAME`, `NAME>STRING`, `NAME>COMPILE`, etc. (except `PARSE-NAME`), "NAME" denotes a name token.

NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see [Wiktionary](https://en.wiktionary.org/wiki/token#Noun)).

#### Throw code description

If the throw code description states that there is no latest name, it can be confusing since the latest name in _some sense_ probably always exists. Therefore, it's better to say: "the compilation word list is empty", which is what actually happens.

#### Motivation for `LATEST-NAME-IN`

1. It's a natural factor for `LATEST-NAME`. It's **always possible** to extract this factor from the implementation of `LATEST-NAME`, because the latter returns _nt_ from **the compilation word list**, and the system should take the _wid_ of the compilation word list and extract the most recent _nt_ from this word list.
2. It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
3. In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as an entry point, like `main`, or as the default exported word from a module).
4. Both of these words are optional. And if `LATEST-NAME-IN` is not provided, it can be implemented in a portable way via `LATEST-NAME` as:

```
: latest-name-in ( wid -- nt|0 )
  get-current >r set-current
  ['] latest-name catch if 0 then
  r> set-current
;
```

## Things to discuss

Is it worth introducing the word `LATEST-NAME-XT ( -- xt )`?

If `name>interpret` never returns `0` (see my [comment](https://forth-standard.org/proposals/name-interpret-wording?hideDiff#reply-1118)), this word can be implemented as:

```
: latest-name-xt ( -- xt ) latest-name name>interpret ;
```

The desired (and much discussed) pattern is:

```
defer bar
: foo ... ; latest-name-xt is bar
```

Sometimes the name "`it`" has been suggested for this word, but this name is too short and has more chance for conflicts.

Guido Draheim [wrote](https://groups.google.com/g/comp.lang.forth/c/_WCxDv1qd2M/m/dZp-OAryvt8J) in `comp.lang.forth` on 2003-03-16:

> I think that everyone has been thinking of using `IT` for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.
>
> I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implementation but coincides with a common term of the application.

This seems convincing to me.

## Typical use

```
: STRUCT: ( "name" -- wid.current.old u.offset )
  GET-CURRENT VOCABULARY ALSO LATEST-NAME NAME> EXECUTE DEFINITIONS 0
;
```

```
\ In the application's vocabulary
: IT ( -- xt ) LATEST-NAME NAME>INTERPRET ;

DEFER FOO
: BAR ... ; IT IS FOO
```

## Proposal

Add the following line into the [Table 9.1: THROW code assignments](https://forth-standard.org/standard/exception#table:throw):

> `-80` the compilation word list is empty

Add the following sections into [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2):

#### 15.6.2.2541 LATEST-NAME-IN

_( wid -- nt|0 )_

Remove the word list identifier _wid_ from the stack. If the corresponding word list is empty, then return `0`; otherwise, return the name token _nt_ for the definition whose name was placed most recently into this word list.

#### 15.6.2.2542 LATEST-NAME

_( -- nt )_

Return the name token _nt_ for the definition whose name was placed most recently into the compilation word list, if such a definition exists. Otherwise, throw the exception code `-80`.

## Reference implementation

In this implementation we assume that _wid_ is an address that contains the _nt_ of the definition whose name was placed most recently into the word list _wid_.
```
: LATEST-NAME-IN ( wid -- nt|0 ) @ ;

: LATEST-NAME ( -- nt )
  GET-CURRENT LATEST-NAME-IN DUP IF EXIT THEN
  -80 THROW
;
```

## Testing

```
: IT ( -- xt ) LATEST-NAME NAME>INTERPRET ;
WORDLIST CONSTANT WL1

T{ : LN1 ; IT ' LN1 = -> TRUE }T
\ LATEST-NAME-IN returns an nt, so convert it to an xt before comparing
T{ GET-CURRENT LATEST-NAME-IN NAME>INTERPRET ' LN1 = -> TRUE }T
T{ :NONAME [ IT ] LITERAL ; EXECUTE ' LN1 = -> TRUE }T
T{ : LN2 [ IT ] LITERAL ; LN2 ' LN1 = -> TRUE }T
T{ WL1 LATEST-NAME-IN -> 0 }T
T{ GET-CURRENT WL1 SET-CURRENT ' LATEST-NAME CATCH SWAP SET-CURRENT -> -80 }T
```

,------------------------------------------
| 2024-06-20 09:47:59 ruv replies:
| proposal - New words: latest-name and latest-name-in
| see: https://forth-standard.org/proposals/new-words-latest-name-and-latest-name-in#reply-1250
`------------------------------------------

## Author

Ruv

## Change Log

- 2023-10-22 Initial revision
- 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
- 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
- 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for `LATEST-NAME-IN`, change the status to "formal".
- 2024-06-20 Add a test case to check that `LATEST-NAME` returns a different value after the compilation word list is switched.

## Problem

In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and in demand. To make such programs portable, we should introduce a standard method to obtain the most recently added word.

For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the recent definition, instead of redefining a lot of words to define such an access method yourself, or juggling with the input buffer and search.

One simple example.
If we want to have variables that are initialized to zero, we can use:

```
: var ( "name" -- ) variable 0 latest-name name> execute ! ;
```

A number of specific examples are provided in my [post](https://github.com/ForthHub/discussion/discussions/153#discussioncomment-7418639) on ForthHub (those examples are not inserted here so as not to bloat the text).

Additionally, there have been many discussions regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather [wrote](https://groups.google.com/g/comp.lang.forth/c/RsJQVnEQQuw/m/M1PrzPAcE-0J) on 2011-12-09 in `comp.lang.forth`:

> AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.
>
> Although it's a system necessity, I haven't found this of much value in application programming.
>
> Elizabeth D. Rather

It's true: depending on the system, an internal method can return the recent word regardless of the compilation word list or depending on it, a completed definition or a not yet completed one, also an unnamed definition or only a named one, etc. The value in application programming is shown in my examples above.

Some known internal methods: `latest ( -- nt|0 )`, `last @ ( -- nt|0 )`, `latestxt ( -- xt|0 )`, etc.

Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition. But such a portable method is actually very useful, as shown in my examples.

## Solution

Let's introduce the following words:

- `LATEST-NAME-IN ( wid -- nt|0 )`
- `LATEST-NAME ( -- nt )`

The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.
The second word returns the name token for the definition whose name was placed most recently into **the compilation word list**, or throws an exception if there is no such definition.

These words do not expose or limit any internal mechanism of the compiler. They just provide information about word lists, like the words `FIND-NAME-IN`, `FIND-NAME`, and `TRAVERSE-WORDLIST` do. It's a kind of introspection/reflection.

These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal `LAST`, `LATEST`, or whatever it was using before.

It seems the best place for these words is the section [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2), where `TRAVERSE-WORDLIST` is also placed.

### Rationale

#### Connection with word lists

By considering definitions in the frame of a word list only, we solve several problems, namely:

1. A word list contains only completed definitions (see the accepted proposal #153 [Traverse-wordlist does not find unnamed/unfinished definitions](https://forth-standard.org/proposals/traverse-wordlist-does-not-find-unnamed-unfinished-definitions?hideDiff#reply-487)). This eliminates the question of whether the word of the returned _nt_ is finished: yes, it is always finished (completed).
2. Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).
3. An extension or library can create definitions in its internal word list for internal purposes. This will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from `latest-name` (regardless of what words are created by this library for internal purposes on the fly).
For example, when different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that `latest-name` returns.

#### Return values

As a matter of practice, almost all the use cases for the word `LATEST-NAME` imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return `0` by this word only burdens users with having to analyze this zero, or redefine this word as:

```
: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;
```

If the user needs to handle the case where the compilation word list is empty, they can use the word `latest-name-in` as:

```
get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then
```

#### Implementation options

If the word list structure in a Forth system contains information about the latest placed definition, the implementations for the proposed words are trivial.

In some plausible Forth systems, the word list structure doesn't contain any information about the definition that was placed into this word list most recently. Such systems might not provide the proposed words, or they can be changed to keep the mentioned information in the word list structure.

It seems that in most systems the word list structure contains this information. Some checked systems:

- SwiftForth, VFX, Gforth, minForth, ikForth, SP-Forth: a word list keeps information about the definition that was placed in it most recently;
- lxf/ntf 2017: it seems it doesn't keep this information.

If a system does not implement the optional Search-Order word set, it might not provide the word `LATEST-NAME-IN`.

#### Naming

The names `LATEST-NAME-IN` and `LATEST-NAME` of the new words are similar in form to `FIND-NAME-IN` and `FIND-NAME`. Stack effects are also similar.
The difference is that **find** is a verb, but **latest** is an adjective (or sometimes a noun, see [Wiktionary](https://en.wiktionary.org/wiki/latest#Adjective)). Both have a history of use in word names, as does "NAME".

In Forth-84 "NAME" _in word names_ denoted **NFA** (name field address), and now it denotes a **name token**, which is the successor of NFA. In all standard words, e.g. `FIND-NAME`, `NAME>STRING`, `NAME>COMPILE`, etc. (except `PARSE-NAME`), "NAME" denotes a name token.

NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see [Wiktionary](https://en.wiktionary.org/wiki/token#Noun)).

#### Throw code description

If the throw code description states that there is no latest name, it can be confusing since the latest name in _some sense_ probably always exists. Therefore, it's better to say "the compilation word list is empty", which is what actually happens.

#### Motivation for `LATEST-NAME-IN`

1. It's a natural factor for `LATEST-NAME`. It's **always possible** to extract this factor from the implementation of `LATEST-NAME`, because the latter returns _nt_ from **the compilation word list**, and the system should take the _wid_ of the compilation word list and extract the most recent _nt_ from this word list.
2. It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
3. In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as an entry point, like `main`, or as the default exported word from a module).
4. Both of these words are optional.
And if `LATEST-NAME-IN` is not provided, it can be implemented in a portable way via `LATEST-NAME` as:

```
: latest-name-in ( wid -- nt|0 )
  get-current >r set-current
  ['] latest-name catch if 0 then
  r> set-current
;
```

## Things to discuss

Is it worth introducing the word `LATEST-NAME-XT ( -- xt )`?

If `name>interpret` never returns `0` (see my [comment](https://forth-standard.org/proposals/name-interpret-wording?hideDiff#reply-1118)), this word can be implemented as:

```
: latest-name-xt ( -- xt ) latest-name name>interpret ;
```

The desired (and much discussed) pattern is:

```
defer bar
: foo ... ; latest-name-xt is bar
```

Sometimes the name "`it`" has been suggested for this word, but this name is too short and is more likely to cause conflicts.

Guido Draheim [wrote](https://groups.google.com/g/comp.lang.forth/c/_WCxDv1qd2M/m/dZp-OAryvt8J) in `comp.lang.forth` on 2003-03-16:

> I think that everyone has been thinking of using `IT` for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.
>
> I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implemenation but coincides with a common term of the application.

This seems convincing to me.

## Typical use

```
: STRUCT: ( "name" -- wid.current.old u.offset )
  GET-CURRENT VOCABULARY ALSO LATEST-NAME NAME> EXECUTE DEFINITIONS 0
;
```

```
\ In the application's vocabulary
: IT ( -- xt ) LATEST-NAME NAME>INTERPRET ;

DEFER FOO
: BAR ... ; IT IS FOO
```

## Proposal

Add the following line into the [Table 9.1: THROW code assignments](https://forth-standard.org/standard/exception#table:throw):

> `-80` the compilation word list is empty

Add the following sections into [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2):

#### 15.6.2.2541 LATEST-NAME-IN

_( wid -- nt|0 )_

Remove the word list identifier _wid_ from the stack. If the corresponding word list is empty, then return `0`; otherwise, return the name token _nt_ for the definition whose name was placed most recently into this word list.

#### 15.6.2.2542 LATEST-NAME

_( -- nt )_

Return the name token _nt_ for the definition whose name was placed most recently into the compilation word list, if such a definition exists. Otherwise, throw the exception code `-80`.

## Reference implementation

In this implementation we assume that _wid_ is an address that contains the _nt_ of the definition whose name was placed most recently into the word list _wid_.
```
: LATEST-NAME-IN ( wid -- nt|0 ) @ ;
: LATEST-NAME ( -- nt ) GET-CURRENT LATEST-NAME-IN DUP IF EXIT THEN -80 THROW ;
```

## Testing

```
: IT ( -- xt ) LATEST-NAME NAME>INTERPRET ;
WORDLIST CONSTANT WL1

T{ : LN1 ; IT ' LN1 = -> TRUE }T
T{ GET-CURRENT LATEST-NAME-IN ' LN1 = -> TRUE }T
T{ :NONAME [ IT ] LITERAL ; EXECUTE ' LN1 = -> TRUE }T
T{ : LN2 [ IT ] LITERAL ; LN2 ' LN1 = -> TRUE }T
T{ WL1 LATEST-NAME-IN -> 0 }T

GET-CURRENT WL1 SET-CURRENT ( wid.prev )
T{ ' LATEST-NAME CATCH -> -80 }T
T{ : LN3 ; -> }T
SET-CURRENT
T{ IT ' LN2 = -> TRUE }T
```

,------------------------------------------
| 2024-06-20 11:03:25 ruv replies:
| proposal - New words: latest-name and latest-name-in
| see: https://forth-standard.org/proposals/new-words-latest-name-and-latest-name-in#reply-1251
`------------------------------------------

## Author

Ruv

## Change Log

- 2023-10-22 Initial revision
- 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
- 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
- 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for `LATEST-NAME-IN`, change the status to "formal".
- 2024-06-20 Add a test case to check that `LATEST-NAME` returns different value after the compilation word list is switched.
- 2024-06-20 Simplify the normative text description, and add a rationale for this simplification.

## Problem

In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and in demand. To make such programs portable, we should introduce a standard method to obtain the most recently added word.

For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the recent definition, instead of redefining a lot of words to define such an access method yourself, or juggling with the input buffer and search.

One simple example. If we want to have variables that are initialized to zero, we can use:

```
: var ( "name" -- ) variable 0 latest-name name> execute ! ;
```

A number of specific examples are provided in my [post](https://github.com/ForthHub/discussion/discussions/153#discussioncomment-7418639) on ForthHub (those examples are not inserted here so as not to bloat the text).

Additionally, there have been many discussions regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather [wrote](https://groups.google.com/g/comp.lang.forth/c/RsJQVnEQQuw/m/M1PrzPAcE-0J) on 2011-12-09 in `comp.lang.forth`:

> AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.
>
> Although it's a system necessity, I haven't found this of much value in application programming.
>
> Elizabeth D. Rather

It's true: depending on the system, an internal method can return the recent word regardless of the compilation word list or depending on it, a completed definition or a not yet completed one, also an unnamed definition or only a named one, etc. The value in application programming is shown in my examples above.

Some known internal methods: `latest ( -- nt|0 )`, `last @ ( -- nt|0 )`, `latestxt ( -- xt|0 )`, etc.

Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition.
But such a portable method is actually very useful, as shown in my examples.

## Solution

Let's introduce the following words:

- `LATEST-NAME-IN ( wid -- nt|0 )`
- `LATEST-NAME ( -- nt )`

The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.

The second word returns the name token for the definition whose name was placed most recently into **the compilation word list**, or throws an exception if there is no such definition.

These words do not expose or limit any internal mechanism of the compiler. They just provide information about word lists, like the words `FIND-NAME-IN`, `FIND-NAME`, and `TRAVERSE-WORDLIST` do. It's a kind of introspection/reflection.

These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal `LAST`, `LATEST`, or whatever it was using before.

It seems the best place for these words is the section [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2), where `TRAVERSE-WORDLIST` is also placed.

### Rationale

#### Connection with word lists

By considering definitions in the frame of a word list only, we solve several problems, namely:

1. A word list contains only completed definitions (see the accepted proposal #153 [Traverse-wordlist does not find unnamed/unfinished definitions](https://forth-standard.org/proposals/traverse-wordlist-does-not-find-unnamed-unfinished-definitions?hideDiff#reply-487)). This eliminates the question of whether the word of the returned _nt_ is finished: yes, it is always finished (completed).
2. Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).
3. An extension or library can create definitions in its internal word list for internal purposes. This will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from `latest-name` (regardless of what words are created by this library for internal purposes on the fly). For example, when different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that `latest-name` returns.

#### Return values

As a matter of practice, almost all the use cases for the word `LATEST-NAME` imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return `0` by this word only burdens users with having to analyze this zero, or redefine this word as:

```
: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;
```

If the user needs to handle the case where the compilation word list is empty, they can use the word `latest-name-in` as:

```
get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then
```

#### Implementation options

If the word list structure in a Forth system contains information about the latest placed definition, the implementations for the proposed words are trivial.

In some plausible Forth systems, the word list structure doesn't contain any information about the definition that was placed into this word list most recently. Such systems might not provide the proposed words, or they can be changed to keep the mentioned information in the word list structure.

It seems that in most systems the word list structure contains this information. Some checked systems:

- SwiftForth, VFX, Gforth, minForth, ikForth, SP-Forth: a word list keeps information about the definition that was placed in it most recently;
- lxf/ntf 2017: it seems it doesn't keep this information.
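
To illustrate the first case (a word list structure that already records the latest placed definition), here is a minimal toy model in standard Forth. It is only a sketch under stated assumptions: `TOY-WORDLIST`, `TOY-PLACE`, and `TOY-LATEST-IN` are hypothetical names, the nodes carry no names or code, and no real system's dictionary layout is implied. The `T{ ... }T` lines assume the same test harness as in the Testing section.

```
\ Toy model only; not a real dictionary layout.
\ A toy word list is one cell that holds the address of the node placed
\ most recently, or 0 while the list is empty.
: TOY-WORDLIST ( "name" -- )  CREATE 0 , ;

\ Allot a one-cell node that links to the previous head,
\ then make this node the new head.
: TOY-PLACE ( twid -- )
  HERE >R        \ address of the new node
  DUP @ ,        \ the node's link cell := previous head (or 0)
  R> SWAP ! ;    \ head := new node

\ Under this layout the accessor is a single fetch.
: TOY-LATEST-IN ( twid -- node|0 )  @ ;

TOY-WORDLIST TW
T{ TW TOY-LATEST-IN -> 0 }T
T{ HERE TW TOY-PLACE  TW TOY-LATEST-IN = -> TRUE }T
T{ TW TOY-LATEST-IN  TW TOY-PLACE  TW TOY-LATEST-IN @ = -> TRUE }T
```

A system whose word lists have no such head cell would need an equivalent field updated whenever a name is placed into the list.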

If a system does not implement the optional Search-Order word set, it might not provide the word `LATEST-NAME-IN`.

#### Naming

The names `LATEST-NAME-IN` and `LATEST-NAME` of the new words are similar in form to `FIND-NAME-IN` and `FIND-NAME`. Stack effects are also similar. The difference is that **find** is a verb, but **latest** is an adjective (or sometimes a noun, see [Wiktionary](https://en.wiktionary.org/wiki/latest#Adjective)). Both have a history of use in word names, as does "NAME".

In Forth-84 "NAME" _in word names_ denoted **NFA** (name field address), and now it denotes a **name token**, which is the successor of NFA. In all standard words, e.g. `FIND-NAME`, `NAME>STRING`, `NAME>COMPILE`, etc. (except `PARSE-NAME`), "NAME" denotes a name token.

NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see [Wiktionary](https://en.wiktionary.org/wiki/token#Noun)).

#### Normative text description

The proposed normative text description is based on:

- [16.2](https://forth-standard.org/standard/search#section.16.2): "compilation word list: The word list into which new definition _**names**_ are _**placed**_",
- [15.3.1](https://forth-standard.org/standard/tools#subsection.15.3.1): "A _**name token**_ is a single-cell value that _**identifies**_ a named word",
- [3.4.3](https://forth-standard.org/standard/usage#usage:semantics): "[Semantics] are _**largely specified by the stack notation**_ in the glossary entries, which shows what values shall be consumed and produced. The prose in each glossary entry _**further**_ specifies the definition's behavior" (there is no need to repeat in the text description what is already indicated in the stack diagrams).

(emphasis added)

#### Throw code description

If the throw code description states that there is no latest name, it can be confusing since the latest name in _some sense_ probably always exists.
Therefore, it's better to say "the compilation word list is empty", which is what actually happens.

#### Motivation for `LATEST-NAME-IN`

1. It's a natural factor for `LATEST-NAME`. It's **always possible** to extract this factor from the implementation of `LATEST-NAME`, because the latter returns _nt_ from **the compilation word list**, and the system should take the _wid_ of the compilation word list and extract the most recent _nt_ from this word list.
2. It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
3. In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as an entry point, like `main`, or as the default exported word from a module).
4. Both of these words are optional. And if `LATEST-NAME-IN` is not provided, it can be implemented in a portable way via `LATEST-NAME` as:

```
: latest-name-in ( wid -- nt|0 )
  get-current >r set-current
  ['] latest-name catch if 0 then
  r> set-current
;
```

## Things to discuss

Is it worth introducing the word `LATEST-NAME-XT ( -- xt )`?

If `name>interpret` never returns `0` (see my [comment](https://forth-standard.org/proposals/name-interpret-wording?hideDiff#reply-1118)), this word can be implemented as:

```
: latest-name-xt ( -- xt ) latest-name name>interpret ;
```

The desired (and much discussed) pattern is:

```
defer bar
: foo ... ; latest-name-xt is bar
```

Sometimes the name "`it`" has been suggested for this word, but this name is too short and is more likely to cause conflicts.

Guido Draheim [wrote](https://groups.google.com/g/comp.lang.forth/c/_WCxDv1qd2M/m/dZp-OAryvt8J) in `comp.lang.forth` on 2003-03-16:

> I think that everyone has been thinking of using `IT` for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.
>
> I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implemenation but coincides with a common term of the application.

This seems convincing to me.

## Typical use

```
: STRUCT: ( "name" -- wid.current.old u.offset )
  GET-CURRENT VOCABULARY ALSO LATEST-NAME NAME> EXECUTE DEFINITIONS 0
;
```

```
\ In the application's vocabulary
: IT ( -- xt ) LATEST-NAME NAME>INTERPRET ;

DEFER FOO
: BAR ... ; IT IS FOO
```

## Proposal

Add the following line into the [Table 9.1: THROW code assignments](https://forth-standard.org/standard/exception#table:throw):

> `-80` the compilation word list is empty

Add the following sections into [15.6.2 Programming-Tools extension words](https://forth-standard.org/standard/tools#subsection.15.6.2):

#### 15.6.2.xxxx `LATEST-NAME-IN` TOOLS EXT

_( wid -- nt|0 )_

If the word list identified by _wid_ is empty, then the returned value is `0`; otherwise, the name token _nt_ identifies the definition whose name was placed most recently into the word list _wid_.

#### 15.6.2.xxxx `LATEST-NAME` TOOLS EXT

_( -- nt )_

If the compilation word list is not empty, the name token _nt_ identifies the definition whose name was placed most recently into this word list. Otherwise, the exception code `-80` is thrown.

## Reference implementation

In this implementation we assume that _wid_ is an address that contains the _nt_ of the definition whose name was placed most recently into the word list _wid_.
```
: LATEST-NAME-IN ( wid -- nt|0 ) @ ;
: LATEST-NAME ( -- nt ) GET-CURRENT LATEST-NAME-IN DUP IF EXIT THEN -80 THROW ;
```

## Testing

```
: IT ( -- xt ) LATEST-NAME NAME>INTERPRET ;
WORDLIST CONSTANT WL1

T{ : LN1 ; IT ' LN1 = -> TRUE }T
T{ GET-CURRENT LATEST-NAME-IN ' LN1 = -> TRUE }T
T{ :NONAME [ IT ] LITERAL ; EXECUTE ' LN1 = -> TRUE }T
T{ : LN2 [ IT ] LITERAL ; LN2 ' LN1 = -> TRUE }T
T{ WL1 LATEST-NAME-IN -> 0 }T

GET-CURRENT WL1 SET-CURRENT ( wid.prev )
T{ ' LATEST-NAME CATCH -> -80 }T
T{ : LN3 ; -> }T
SET-CURRENT
T{ IT ' LN2 = -> TRUE }T
```
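
As a further illustration of the use cases named in the motivation for `LATEST-NAME-IN` (checking whether a word list is empty, and using the latest word of a particular word list as an entry point), the following is a minimal sketch. It assumes `LATEST-NAME-IN` behaves as proposed; `WORDLIST-EMPTY?`, `RUN-LATEST`, `WL2`, and `MAIN` are hypothetical names used only for this example, and the check on the result of `NAME>INTERPRET` is there because that word may return `0`.

```
\ Sketch only; assumes LATEST-NAME-IN as specified in this proposal.
\ WORDLIST-EMPTY? and RUN-LATEST are hypothetical helper names.

: WORDLIST-EMPTY? ( wid -- flag )
  LATEST-NAME-IN 0= ;

\ Execute the most recently placed word of the word list wid.
: RUN-LATEST ( i*x wid -- j*x )
  LATEST-NAME-IN
  DUP 0= ABORT" word list is empty"
  NAME>INTERPRET
  DUP 0= -14 AND THROW    \ -14: interpreting a compile-only word
  EXECUTE ;

WORDLIST CONSTANT WL2
T{ WL2 WORDLIST-EMPTY? -> TRUE }T

GET-CURRENT WL2 SET-CURRENT ( wid.prev )
: MAIN ( -- n ) 42 ;
SET-CURRENT

T{ WL2 WORDLIST-EMPTY? -> FALSE }T
T{ WL2 RUN-LATEST -> 42 }T
```

Note that `MAIN` is never added to the search order here; it is reached only through the name token that `LATEST-NAME-IN` returns.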