Digest #319 2025-10-08

Contributions

[414] 2025-10-07 08:43:07 ruv wrote:

comment - Interpretation of the top input parameter of PICK

The word pick has the type ( x.0 u.cnt*x u.cnt -- x.0 u.cnt*x x.0 ).

The word roll has the type ( x.0 u.cnt*x u.cnt -- u.cnt*x x.0 ).

The discussed word poke has the type ( x.0 u.cnt*x x.1 u.cnt -- x.1 u.cnt*x ).

In my other comment I wrote that pick "interprets the underneath stack items as an array on which it operates", and then u.cnt is an index in this array.

A more fundamental interpretation is that the input parameter u.cnt (in pick, poke, roll) represents the number of stack items that need to be "skipped" to locate the target input parameter x.0 (that is copied, taken, or overwritten).

A consequence of this is that:

  • 2pick must have the type ( xd.0 u.cnt*x u.cnt -- xd.0 u.cnt*x xd.0 );
  • 2poke must have the type ( xd.0 u.cnt*x xd.1 u.cnt -- xd.1 u.cnt*x );
  • 2roll must have the type ( xd.0 u.cnt*x u.cnt -- u.cnt*x xd.0 );

Rationale: the number of "skipped" stack items should not depend on the data type of the target parameter.
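
To make this interpretation concrete, here is a minimal sketch of 2pick written in terms of the standard pick, assuming the stack effect given above (u.cnt counts the skipped single cells, not cell pairs). 2pick is the discussed word, not a standard one, and this definition is only an illustration rather than a proposed reference implementation.

```forth
\ Sketch only: 2pick is not a standard word.
\ u.cnt is the number of single cells skipped above xd.0.
: 2pick ( xd.0 u.cnt*x u.cnt -- xd.0 u.cnt*x xd.0 )
  1+ dup >r  pick   \ copy the deeper cell of xd.0 (u.cnt+1 cells down)
  r> pick           \ the copy added one cell, so the same index now reaches the upper cell
;
```

With this definition, 0 2pick behaves like 2dup and 2 2pick behaves like 2over, because the count of skipped cells does not depend on the width of the target.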

See also my other comment on ForthHub (2024-12-02) in this regard.


[415] 2025-10-07 15:53:02 obijohn wrote:

comment - Formatting

I'm not exactly sure how to mention this without leaving a comment here, since it affects this definition. Draft 21.1, specifically the PDF version uploaded last month, has a formatting issue in the text of the EKEY? definition on page 102. The second line of the last sentence overflows into the footer at the bottom of the page, making the text and footer (including the page number) unreadable.

I don't think Draft PDF formatting issues fall into any Proposal category. If there's a more appropriate means of conveying the information, please let me know.

Replies

[r1550] 2025-09-17 08:43:47 BerndPaysan replies:

proposal - Non parsing CREATE

What's used even more frequently in Gforth is noname, which allows creating unnamed words (using latestxt afterwards to access the xt). And a lot of words with nextname or noname are users of create, that exist anyways. That's why Gforth uses this one-shot modified header creation instead of a variant of create.

execute-parsing works for the named part, but noname won't, unless we e.g. would allow that s" " ['] create execute-parsing would create an unnamed word instead of complaining about the missing name.


[r1551] 2025-09-17 10:03:52 AntonErtl replies:

proposal - Non parsing CREATE

Execute-parsing cannot work for parsing an empty name to create because regular create produces an error when it tries to parse a name and there is none there. This is a difference to nextname: nextname replaces the parsing, so you can give it an empty name or a name containing spaces. OTOH, execute-parsing works for all parsing words (also for multiple invocations of parse-name or other basic parsing words), not just defining words, whereas nextname only works for defining words.

A possible generalization of both is to let a nextname-like word precharge the next invocation of parse-name or parse, and maybe allow precharging several invocations of these words in some way.


[r1552] 2025-09-19 11:26:42 ruv replies:

requestClarification - What should the behavior be if the system has no hard-coded limit on size?

is it better to include the word UNUSED and have it return -1 to indicate unknown (or rather, the maximum unsigned integer since the prototype uses u), or to omit the word UNUSED from the core extension set?

Of these two options, it is better to omit unused, since this word has type ( -- u ) and so it cannot return -1.

The program interprets the result of unused as an unsigned number, and the system must ensure that the corresponding amount of memory can be reserved in one or more subsequent calls to allot. Note that allot has type ( n -- ), and it reserves memory if n > 0, and releases memory if n < 0.

Thus, the system should pass the following testcase:

: u/ ( u u\0 -- u ) 0 swap um/mod nip ;

s" MAX-N" environment? invert [if] -1 2 u/ 1 - [then]
constant largest-n

: reserve ( u -- )
  dup largest-n u< if ( +n )        allot exit then  dup 2 u/  dup recurse  - recurse
;
: release ( u -- )
  dup largest-n u< if ( +n ) negate allot exit then  dup 2 u/  dup recurse  - recurse
;

t{ here unused  2dup reserve here swap - over =  over release  ->  here unused true }t

Regarding unused and dynamic memory or unknown memory size: an alternative option is to provide a specific (possibly configured) amount of memory, which can be reserved with allot, and update the available memory on creating a definition.

An example (derived from another comment of mine):

\ private

2variable dictspace-dp \ the data pointer and border
: assume-dictspace ( addr u -- ) over + swap dictspace-dp 2! ;

\ public

: unused  ( -- u    ) dictspace-dp 2@ - ;
: here    ( -- addr ) dictspace-dp @    ;
: allot   ( n --    ) dictspace-dp +!   ;

\ private

500 1024 * constant dictspace-unused-initial
1   1024 * constant dictspace-unused-low

: ensure-dictspace-reservation ( -- )
  unused dictspace-unused-low u> if exit then
  dictspace-unused-initial  dup allocate throw  swap  assume-dictspace
;

\ initialization
0 0 assume-dictspace ensure-dictspace-reservation

\ public

: ; ( colon-sys -- )  postpone ;  ensure-dictspace-reservation  ; immediate

I think, a standard Forth system is allowed to provide such an implementation.


[r1553] 2025-09-19 12:16:19 ruv replies:

proposal - New words: latest-name and latest-name-in

Eric, thanks for the suggested corrections.

It is odd that most of this proposal (as of this revision) uses lower-case, but the Testing section (still) uses upper-case

Currently, all standard word names in the text of the Standard are spelled in uppercase. Therefore, I tried to use uppercase for word names in the parts of my proposal that should be included in the Standard. But I'm reluctant to use uppercase in programs and prose when possible.


[r1554] 2025-09-19 13:17:55 ruv replies:

proposal - New words: latest-name and latest-name-in

Author

Ruv

Change Log

  • 2023-10-22 Initial revision
  • 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
  • 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
  • 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for LATEST-NAME-IN, change the status to "formal".
  • 2024-06-20 Add a test case to check that LATEST-NAME returns different value after the compilation word list is switched.
  • 2024-06-20 Simplify the normative text description, and add a rationale for this simplification.
  • 2025-09-15 Add clause about findable words, add rationale sections in the proposal, address a question re immediate, note a bug in traverse-wordlist in some Forth systems, make some rewording and minor corrections, add a more general reference implementation.
  • 2025-09-19 Make corrections from Eric Blake, mention find-name instead of search-wordlist, use lowercase in the test cases and in prose when possible, make some rewording in the prose, add some links.

Problem

In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and demanded.

To make such programs portable, we should introduce a standard method to obtain the most recently added word.

For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the recent definition, instead of redefining a lot of words to define such an access method yourself, or juggling with the input buffer and search.

See some examples in the Typical use section.

Also, a number of specific examples are provided in my post on ForthHub (those examples are not inserted here so as not to bloat the text).

Additionally, there have been many discussions regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather wrote on 2011-12-09 in comp.lang.forth:

AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.

Although it's a system necessity, I haven't found this of much value in application programming.

Elizabeth D. Rather

Indeed, depending on the system, the internal method may return the recent word depending on the compilation word list or independent of the compilation word list, a completed definition or an incomplete definition, an unnamed definition or only a named definition, and so on.

However, I believe that a standardized method has significant value for libraries and DSLs in application programming, as my examples should demonstrate.

Some known internal methods: latest ( -- nt|0 ), last @ ( -- nt|0 ), latestxt ( -- xt|0 ), etc.

Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition.

Solution

Let's introduce the following words:

  • latest-name-in ( wid -- nt|0 )
  • latest-name ( -- nt )

The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.

The second word returns the name token for the definition whose name was placed most recently into the compilation word list, or throws an exception if there is no such definition.

These words do not expose or limit any internal mechanism of the Forth system. They just provide information about word lists, like the words find-name-in, find-name, and traverse-wordlist do. It's a kind of introspection/reflection.

These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal last, latest, or whatever it was using before.

It seems, the best place for these words is the section 15.6.2 Programming-Tools extension words, where traverse-wordlist is also placed.

Rationale

Connection with word lists

By considering definitions in the frame of a word list only, we solve several problems, namely:

  1. A word list contains only completed definitions (see the accepted proposal #153 Traverse-wordlist does not find unnamed/unfinished definitions). This eliminates the question of whether the word identified by the returned nt is finished: it is always finished (completed).

  2. Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).

  3. An extension or library can create definitions in its internal word list for internal purposes, and this will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from latest-name (regardless of what words are created by this library for internal purposes on the fly). For example, when different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that latest-name returns.

Returned values

As a matter of practice, almost all the use cases for the word latest-name imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return 0 by this word only burdens users with having to analyze this zero, or redefine this word as:

: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;

If the user needs to handle the case where the compilation word list is empty, they can use the word latest-name-in as:

get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then

Implementation options

If the word list structure in a Forth system contains information about the latest placed definition, the implementations for the proposed words are trivial.

In some plausible Forth systems the word list structure doesn't directly contain information about which definition was placed into the word list most recently, and this information cannot be obtained indirectly. Such systems might not provide the proposed words, or they might be changed to keep this information in the word list structure. It seems, in most systems the word list structure directly contains this information, or this information can be obtained indirectly.

Some checked systems:

  • Gforth, minForth, ikForth, SP-Forth, Post4: a word list keeps information about the definition that was placed in it most recently;
  • SwiftForth, VFX: the most recently placed word in a word list can be correctly obtained from the strands/threads (since nt values are monotonically increased);
  • lxf/ntf 2017: the most recently placed word in a word list can be obtained using traverse-wordlist (since nt values are monotonically increased).

Note that some systems have a bug in traverse-wordlist so it can return the nt for a definition that cannot be found (namely, for the current definition). This is incorrect (see a testcase).

If a system does not implement the optional Search-Order word set, it might not provide the word latest-name-in.

Naming

The names latest-name-in and latest-name of the new words are similar to find-name-in and find-name by the form. Stack effects are also similar.

The difference is that find-xxx is a verb phrase that starts with a verb, but latest-xxx is a noun phrase that starts with an adjective (see Wiktionary/latest).

Both the English words "find" and "latest" have historically been used in Forth word names, as is "name".

In Forth-84 "name" in word names denoted NFA (Name Field Address), and now it denotes a name token, which is the successor of NFA. In all standard words, e.g. find-name, name>string, name>compile, etc. (except parse-name), "name" denotes a name token.

NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see Wiktionary).

Normative text description

The proposed normative text description is based on:

  • 16.2: "compilation word list: The word list into which new definition names are placed",
  • 15.3.1: "A name token is a single-cell value that identifies a named word",
  • 3.4.3: "[Semantics] are largely specified by the stack notation in the glossary entries, which shows what values shall be consumed and produced. The prose in each glossary entry further specifies the definition's behavior" (there is no need to repeat in the text description what is already indicated in the stack diagrams). (emphasis added)

Throw code description

If the throw code description states that there is no latest name, it can be confusing since latest name in some sense probably always exists.

Therefore, it's better to say: "the compilation word list is empty" — it is what actually happens.

Motivation for latest-name-in

  1. It's a natural factor for latest-name. It's always possible to extract this factor from the implementation of latest-name, because the latter returns nt from the compilation word list, and the system should take the wid of the compilation word list and extract the most recent nt from this word list.
  2. It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
  3. In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as entry point, like main, or as the default exported word from a module).
  4. Both of these words are optional. And if latest-name-in is not provided, it can be implemented in a portable way via latest-name as:
    : latest-name-in ( wid -- nt|0 )
      get-current >r set-current
      ['] latest-name catch if 0 then
      r> set-current
    ;
    

Things to discuss

Is it worth introducing the word latest-name-xt ( -- xt )?

If name>interpret never returns 0 (see my comment), this word can be implemented as:

: latest-name-xt ( -- xt ) latest-name name>interpret ;

The desired (and much discussed) pattern is:

defer bar

: foo ... ; latest-name-xt is bar

Sometimes the name "it" has been suggested for this word, but this name is too short and has more chance for conflicts. Guido Draheim wrote in comp.lang.forth on 2003-03-16:

I think that everyone has been thinking of using IT for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.

I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implementation but coincides with a common term of the application.

This seems convincing to me.

Typical use

: struct: ( "name" -- wid.compilation.prev u.offset )
  get-current  vocabulary
  also  latest-name name>interpret execute  definitions
  0
;
: ;struct ( wid.compilation.prev u.offset -- )
  s" __size" ['] constant execute-parsing
  set-current
;

The word execute-parsing ( i*x c-addr u xt -- j*x ) is a well-known word, see an implementation at https://theforth.net/.

\ In the application's vocabulary
: it ( -- xt ) latest-name name>interpret ;

defer foo
\ ...

: bar ... ; it is foo

Proposal

Changes in existing sections

Add the following line into the Table 9.1: THROW code assignments:

-80 the compilation word list is empty

Editorial note: the actual throw code may change.

New glossary sections

Add the following sections into 15.6.2 Programming-Tools extension words:

15.6.2.xxxx LATEST-NAME-IN TOOLS EXT

( wid -- nt|0 )
If the word list identified by wid is empty, then the returned value is 0; otherwise, the name token nt identifies the definition whose name was placed most recently into the word list wid.

Note: nt can only be returned for a definition that can be found in wid.

See also: 15.6.2.xxxx LATEST-NAME, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.

15.6.2.xxxx LATEST-NAME TOOLS EXT

( -- nt )
If the compilation word list is not empty, the name token nt identifies the definition whose name was placed most recently into this word list. Otherwise, the exception code -80 is thrown.

Note: nt can only be returned for a definition that can be found in the compilation word list.

See also: 15.6.2.xxxx LATEST-NAME-IN, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.

New rationale sections

Add the following sections into A.15.6 Glossary:

A.15.6.2.xxxx LATEST-NAME-IN

The word latest-name-in cannot return an nt that cannot be obtained using find-name or traverse-wordlist applied to the same word list.

See also: A.15.6.2.xxxx LATEST-NAME.

A.15.6.2.xxxx LATEST-NAME

The word latest-name cannot return an nt that cannot be obtained using find-name or traverse-wordlist applied to the compilation word list.

In some Forth systems the word : (colon) places an nt into the compilation word list and makes it hidden (unfindable). This nt must not be available for traverse-wordlist and for latest-name. Thus, formally, only the words ; (semicolon), does>, and ;code are allowed to add the nt of a definition created with : (colon) to the compilation word list.

If a Forth system does not provide the optional Search-Order word set, and in that Forth system the word immediate moves an nt from one internal word list to another, this must not affect what latest-name returns, and this must not affect what find-name returns (for example, consider a case where two last words have the same name and immediate is used for the latest one). Thus, after execution of immediate, latest-name shall return the same value as before this execution.

Typical use

: var ( "name" -- )
  variable
  0  latest-name name>interpret execute  !
;

Reference implementation

In this implementation for latest-name-in we assume that a wid is an address that contains the nt of the definition whose name was placed most recently into this word list:

: latest-name-in ( wid -- nt|0 ) @ ;

In this implementation for latest-name-in we assume that the values of nt, interpreted as unsigned numbers, monotonically increase when sorted chronologically (it works on most systems):

: umax ( u1 u2 -- u.max ) 2dup u< if swap then drop ;

: latest-name-in ( wid -- nt|0 )
  >r 0 [: umax true ;] r> traverse-wordlist
;

An implementation for latest-name:

: latest-name ( -- nt )
  get-current latest-name-in  dup if exit then  -80 throw
;

Testing

: it ( -- xt ) latest-name name>interpret ;

wordlist constant wl1

t{ : ln1 ; it  ' ln1 =  -> true }t
t{ get-current latest-name-in ' ln1 =  -> true }t
t{ :noname [ it ] literal ; execute  ' ln1 =  -> true }t
t{ : ln2 [ it ] literal ; ln2  ' ln1 =  -> true }t
t{ wl1 latest-name-in -> 0 }t
get-current wl1 set-current ( wid.prev )
t{ ' latest-name catch -> -80 }t
t{ : ln3 ;  -> }t
set-current
t{ it ' ln2 = -> true }t
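
The rationale for LATEST-NAME requires that immediate does not change what latest-name returns. A possible additional test case for that requirement is sketched below. The name ln-imm is hypothetical, the t{ ... }t harness is assumed to be loaded as for the tests above, and the guarded fallback definitions merely repeat the traverse-wordlist based reference implementation (with its assumption that nt values increase monotonically) for systems that lack the proposed words.

```forth
[undefined] latest-name [if]
: umax ( u1 u2 -- u.max ) 2dup u< if swap then drop ;
: latest-name-in ( wid -- nt|0 )
  >r 0 [: umax true ;] r> traverse-wordlist
;
: latest-name ( -- nt )
  get-current latest-name-in  dup if exit then  -80 throw
;
[then]

\ immediate must not change the value returned by latest-name
t{ : ln-imm ;  latest-name  immediate  latest-name =  -> true }t
```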

[r1555] 2025-09-21 13:40:49 ruv replies:

proposal - New words: latest-name and latest-name-in

Errata

In the fragments:

  • The word latest-name-in cannot return an nt that cannot be obtained using find-name or traverse-wordlist

  • The word latest-name cannot return an nt that cannot be obtained using find-name or traverse-wordlist

Read find-name as find-name-in.

Note: in general, find-name can also be applied to a specific word list by specifying only that word list into the search order.
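
A minimal sketch of that technique, assuming the system provides find-name and the Search-Order word set. The name find-name-in-order is hypothetical, and the sketch does not restore the search order if find-name throws.

```forth
\ Hypothetical helper: look up a name in one word list by temporarily
\ making that word list the whole search order.
: find-name-in-order ( c-addr u wid -- nt|0 )
  >r 2>r  get-order      ( widn ... wid1 n ) ( R: wid c-addr u )
  2r> r>  1 set-order    ( widn ... wid1 n c-addr u )
  find-name              ( widn ... wid1 n nt|0 )
  >r  set-order  r>      ( nt|0 ) \ restore the saved search order
;
```

Where find-name-in is available, it is the simpler choice; this variant only shows that the search order can serve the same purpose.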


[r1556] 2025-09-24 18:52:49 ruv replies:

proposal - Non parsing CREATE

@GeraldWodni wrote:

Personally I need it to create multiple words with a common prefix.

I used execute-parsing, and holds to compose a name string (example).

So, I would implement your register: in the following way:

: register: ( addr "prefix" -- )
    >r \ save the address
    parse-name 2dup
    \ create a store-word
    "set-" <# 2swap holds holds 0 0 #>
    ['] create execute-parsing r@ , [: does> @ ! ;] execute
    \ create a fetch-word
    ['] create execute-parsing r@ , [: does> @ @ ;] execute
    rdrop
;

\ Example of use
variable uart-addr \ pretend to be a hardware register
uart-addr register: uart
'h' set-uart
uart emit

This solution is already standard-compliant, because execute-parsing can be implemented in a standard program.

(see also my rationale against uart! and uart@ names for this case).


In VFXForth it is called ($create)

SP-Forth provides the word created ( sd.name -- ), so the pair (create, created) is similar to the pair of standard words (include, included) by the form.

But if we want to standardize the pure postfix variant of create, why don't we standardize the postfix variants of other defining words?

I believe that standardizing execute-parsing makes more sense than standardizing non-parsing-create, since execute-parsing solves the problem for all defining words, including user-defined words.

As for words whose name is an empty string, I don't see any demand for creating such words in programs. Can anyone provide some examples?

In general, more useful Forth words (than non-parsing-create) are:

  • a Forth word that builds a new named definition from a name (a string) and an xt that identifies the execution semantics for the new definition;
    • E.g.: enlist-word ( xt sd.name -- )
    • It may allow sd.name be empty.
  • a Forth word that builds a new anonymous definition from an xt1 and x2 producing xt2, so applying >body to xt2 gives x2, and executing xt2 places x2 on the stack and executes xt1;
    • E.g.: bind ( x xt1 -- xt2 )

These words provide the functionality of create ... does> ... ; and a little more.


Regarding the reference implementation of non-parsing-create via evaluate — it's incorrect because it depends on the search order.

The following test case should be passed:

t{ : foo get-order  0 set-order  s" bar" non-parsing-create  set-order  does> drop 123 ;  foo bar -> 123 }t

[r1557] 2025-09-26 07:36:57 BerndPaysan replies:

proposal - New words: latest-name and latest-name-in

Why does LATEST-NAME-IN return 0 in case of empty wordlist, and LATEST-NAME throws an error? I would expect a consistent interface: Either both throw or none throw.


[r1558] 2025-09-26 07:59:24 ruv replies:

proposal - New words: latest-name and latest-name-in

Bernd, see the Rationale / Returned values sub-section, it says:

As a matter of practice, almost all the use cases for the word latest-name imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return 0 by this word only burdens users with having to analyze this zero

Typically, when we use latest-name, we assume that the compilation word list is not empty and don't analyze the returned value. It is better if this assumption is formally supported.


[r1559] 2025-09-26 09:10:26 ruv replies:

proposal - PLACE +PLACE

the underlying system might provide PLACE or +PLACE, but with a different behaviour, rendering the rest of the standard program invalid.

A possible option is to introduce an optional word set or status "Deprecated" (or "Discouraged") specifically for such historical words. Then, in this section/status specify place and +place. This will reserve the names of these words and prevent Forth systems from providing words with these names but with different behavior.


[r1560] 2025-09-26 16:04:45 ruv replies:

proposal - Recognizer committee proposal 2025-09-11

I think, we should fix the following problems.

The term "translation"

The term translation is not suitable to denote the general type of a recognizer's result, since "translation" is either an act of translating, or a product of translating (not recognizing). Even the term "recognition" is more suitable, if someone likes it.

Another possible option: "recognized", which will be used as a nominalized adjective (i.e., a noun).

We also need a separate term to denote the type of the topmost x value of a successful recognizing result.

The scheme translate-something

The naming scheme translate-something is not suitable for words that have type ( -- x ) and are constants.

  • Effectively, any member of this naming scheme is a verb phrase; this scheme was intended for words that perform translation (interpretation or compilation), which is an active action with possible side effects.
    • For example, translate-nt ( i*x nt -- j*x ).

A word that is a constant should have the name that is a noun or a noun phrase.

This naming scheme should be aligned with the corresponding general data type name/symbol.

The names get-recs and set-recs

The pair of words ( get-recs, set-recs ) is similar to the pair of standard words ( get-order, set-order ) by the form of their names, but very different conceptually, since they accept the object on the top. This is an inconsistency in naming conventions.

Better naming options are:

  • recs@ and recs!
    • "recs" in these names denotes the pair of types at once
      • the type of a data object that is fetched or stored
      • the type of a target data object
    • it is also similar to some extent to the pairs of standard words (defer@, defer!), (c@, c!), (2@, 2!)
      • see also my post on ForthHub in this regard.
  • fetch-recs and store-recs

The names translate: and rec-sequence:

The corresponding words are proposed as defining words.

Traditionally, a colon was only used in the names of standard defining words that have a counterpart word with a semicolon in the name. So, this name is inconsistent with other names. Note that this tradition was broken by the new "*field:" words (but not +field).

  • Can we avoid a colon in the defining words that don't have a counterpart word with a semicolon?

The name rec-sequence: is too close to rec-sequence that is a member of the rec-something naming scheme. This is inconsistent and confusing.

  • A possible option: recs — an abbreviation of "recognizers sequence", which is "sequence of recognizers".
    • Maybe it is better if this word was like wordlist, which produces a new identifier on the stack without creating a word.

[r1561] 2025-09-26 16:09:45 ruv replies:

proposal - Recognizer committee proposal 2025-09-11

Erratum:

see also my post on ForthHub in this regard.

The correct link


[r1562] 2025-09-28 22:21:31 ruv replies:

proposal - Clarification for execution token

Author

Ruv

Change Log

(the latest at the top)

  • 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests.
  • 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
  • 2022-08-13 Initial version

Preceding history

(the latest at the top)

Problem

By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.

Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.

Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.

Solution

Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.

It is so because for any execution token there exists at least one named or unnamed Forth definition the execution semantics of which are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. These semantics need not be connected to the same Forth definition. Also, consequently, it's impossible that an execution token identifies some compilation semantics, or some interpretation semantics, but doesn't identify the equivalent execution semantics.

Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have an unnamed (anonymous) definition with equivalent execution semantics.

Some examples of semantics that cannot be identified by an execution token:

  • typically, the run-time semantics of if (an instance of);
  • in some systems (in which compile, is equivalent to postpone literal postpone execute), the execution semantics of >r;
  • in some systems (where FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname.

Of course, the standard allows such implementations and disallows programs to obtain an execution token of the corresponding semantics, or even does not provide a way to obtain it.


To solve the initial problem we can formally state these basics explicitly in a normative part, and specify what semantics are identified by the execution token of a word.

Also, we should update the definition of "execution token" term to say what it identifies.

Example

: foo postpone if ;
:noname postpone if ; ( xt )

The execution semantics of foo are equivalent to the compilation semantics for if. At the same time, a Forth system may provide system-dependent execution semantics for if that are not equivalent to the execution semantics of foo.

xt, which is left on the stack in the second line, identifies the execution semantics of an anonymous Forth definition, and these execution semantics are equivalent to the compilation semantics for if.

Typical use

  • "xt identifies the compilation semantics for the word FOO"

    • It means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
    • At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
  • "the execution token for the word BAR"

    • It means that this execution token identifies the execution semantics of the word BAR.
    • Note: if the standard does not define interpretation semantics for BAR, the execution token of BAR could identify some system-specific execution semantics, because an ambiguous condition could occur (4.1.2) when the program obtains the execution token of BAR.
  • "xt identifies the interpretation semantics for the word BAZ"

    • It means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
    • At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
  • The execution semantics identified by xt are equivalent to the interpretation semantics for the word BAZ.

    • This seems pretty clear.

Incorrect use

Actually, the standard contains only one place where the "execution token" term is used ambiguously in a normative part: the glossary entry for FIND. The problem is that it says that FIND returns the execution token for the word ("its execution token", so it should identify the execution semantics of the word), but:

  • FIND may return two different tokens (one while compiling and another while not compiling) for the same word, which may identify different semantics (then at least one of them does not identify the execution semantics of the word);
  • in some cases, none of the returned execution tokens identifies the specified execution semantics of the word (for example, for the word >r in some systems).

In another glossary entry — for NAME>INTERPRET — the language is just slightly non normative, since it uses the form "xt represents" instead of the form "xt identifies".

These glossary entries also have some other problems, so they should be corrected anyway; my other proposals for that are in progress.

Proposal

Update "execution token" term

In the section 2.1 Definitions of terms, change:

execution token: A value that can be passed to EXECUTE (6.1.1370)

into

execution token: A value that identifies the execution semantics of a definition.

Update "execution token" data type description

In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:

For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.

The execution semantics identified by an execution token can be equivalent to the interpretation semantics or compilation semantics for some word, or to some run-time semantics. In such a case this execution token also identifies the corresponding interpretation semantics, compilation semantics, or run-time semantics.

It is not required that every specified semantics be identified by some execution token in the system.

The execution token of a Forth definition, if available, identifies the execution semantics of that definition, which are either specified by this standard or implementation dependent (if permitted).

Update "Execution semantics" notion

In the section 3.4.3.1 Execution semantics,

Change the paragraph:

The execution semantics of each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

into

The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

Rationale: for some Forth definitions execution semantics are not specified.

After that, add the following two paragraphs:

If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise the execution token of that definition, if available, identifies the implementation dependent execution semantics.

The implementation dependent execution semantics of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition. An ambiguous condition exists if they are performed in compilation state.

Rationale: until words such as >r are defined using an "Execution" section, we have to rely on the absence of an "Interpretation:" section to refer to ordinary words (commented on 2019-06-21, 2020-08-30).

Update glossary entries

In the glossary entries 6.2.2295 TO, 6.2.1725 IS, 6.2.0698 ACTION-OF,

replace the phrase:

An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied to TO.

with the phrase:

An ambiguous condition exists if POSTPONE or [COMPILE] are applied to TO.

Testing

t{ ' s"  execute abc"  s" abc"  compare -> 0 }t
t{ 1 value x  2  ' to  execute x  x -> 2 }t
t{ : (to) [ ' to compile, ] ;  3 (to) x  x -> 3 }t

[r1563] 2025-09-28 22:29:10 ruv replies:

proposal - Revert rewording the term "execution token"

No one has written down any objections to this proposal for three years. I have now incorporated it into my proposal [251] Clarification for execution token.


[r1564] 2025-09-29 05:45:56 AntonErtl replies:

proposal - Fix stack comments for N>R and NR>

We discussed your request for clarification for a while, covering various reasons why changing from +n to u would not make a difference, and under what circumstances it would. Eventually I suggested that we just follow your suggestion, and we reached consensus on that.

Thinking about it again: In general, u is preferable to +n, because +n leaves it undefined what happens for negative n. +n is the right choice in cases where system behaviour varies for negative n, but otherwise, we should either specify u or n with a specific behaviour for negative n.
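
As a small illustration of the count parameter, here is a minimal sketch, assuming a system that provides N>R and NR> from the optional Programming-Tools extension word set. The word name n>r-demo is made up for this example. It only exercises a zero count and a small positive count, where +n and u agree.

```forth
\ Hypothetical demo word: moves u items plus the count to the return
\ stack and brings them straight back.  N>R and NR> are used inside
\ a definition because they manipulate the return stack.
: n>r-demo ( i*x u -- i*x u )
  n>r      \ data stack: ( i*x u -- )   return stack: ( -- i*x u )
  nr> ;    \ data stack: ( -- i*x u )   return stack: ( i*x u -- )

1 2 3  3 n>r-demo   \ leaves 1 2 3 3 on the data stack
2drop 2drop         \ clean up
0 n>r-demo drop     \ a zero count moves only the count itself
```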


[r1565] 2025-09-29 07:12:58 ruv replies:

proposal - Clarification for execution token

Author

Ruv

Change Log

(the latest at the top)

  • 2025-09-29 Better wording in some places; corrections; update in ambiguous conditions.
  • 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests.
  • 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
  • 2022-08-13 Initial version

Preceding history

(the latest at the top)

Problem

By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.

Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.

Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.

Solution

Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.

This is so because for any execution token there exists at least one named or unnamed Forth definition whose execution semantics are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then the token identifies them too. They need not be connected to the same Forth definition. Consequently, it is impossible for an execution token to identify some compilation semantics, or some interpretation semantics, without also identifying the equivalent execution semantics.

Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have a Forth definition with equivalent execution semantics.

Some examples of semantics that cannot be identified by an execution token:

  • typically, the run-time semantics of if (an instance of);
  • in some systems (in which compile, is equivalent to postpone literal postpone execute), the execution semantics of >r;
  • in some systems (where FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname.

Of course, the standard allows such implementations and disallows programs to obtain an execution token of the corresponding semantics, or even does not provide a way to obtain it.


To solve the initial problem we can formally state these basics explicitly in a normative part, and specify what semantics are identified by the execution token of a word.

Also, we should update the definition of "execution token" term to say what it identifies.

Example

: foo postpone if ;
:noname postpone if ; ( xt )

The execution semantics of foo are equivalent to the compilation semantics for if. At the same time, a Forth system may provide system-dependent execution semantics for if that are not equivalent to the execution semantics of foo.

xt, which is left on the stack in the second line, identifies the execution semantics of an unnamed Forth definition, and these execution semantics are equivalent to the compilation semantics for if.

Typical use

  • "xt identifies the compilation semantics for the word FOO"

    • It means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
    • At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
  • "the execution token for the word BAR"

    • It means that this execution token identifies the execution semantics of the word BAR.
    • Note: if the standard does not define interpretation semantics for BAR, the execution token of BAR could identify some system-specific execution semantics, because an ambiguous condition could occur (4.1.2) when the program obtains the execution token of BAR.
  • "xt identifies the interpretation semantics for the word BAZ"

    • It means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
    • At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
  • The execution semantics identified by xt are equivalent to the interpretation semantics of BAZ.

    • This seems pretty clear.

Incorrect use

Actually, the standard contains only one place where the "execution token" term is used ambiguously in a normative part: the glossary entry for FIND. The problem is that it says that FIND returns the execution token of the Forth definition ("its execution token", so it should identify the execution semantics of that definition), but:

  • FIND may return two different tokens (one while compiling and another while not compiling) for the same string (and, formally, the same Forth definition), which may identify different semantics; then at least one of them does not identify the execution semantics of the definition;
  • in some cases, none of the returned execution tokens identifies the specified execution semantics of the word (for example, for the word >r in some systems).

In another glossary entry — for NAME>INTERPRET — the language is just slightly non normative, since it uses the form "xt represents" instead of the form "xt identifies".

These glossary entries also have some other problems, so they should be corrected anyway; my other proposals for that are in progress.

Proposal

Update "execution token" term

In the section 2.1 Definitions of terms, change:

execution token: A value that can be passed to EXECUTE (6.1.1370)

into

execution token: A value that identifies the execution semantics of a definition.

Update "execution token" data type description

In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:

For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.

The execution semantics identified by an execution token may be equivalent to the interpretation semantics or compilation semantics for some word, or to some run-time semantics. In such a case this execution token also identifies that interpretation semantics, compilation semantics, or run-time semantics.

It is not required that every specified semantics be identified by some execution token in the system.

The execution token of a Forth definition, if available, identifies the execution semantics that are either specified by this standard for that definition or are implementation dependent (if permitted).

If the interpretation semantics for a Forth definition are defined by this standard, the execution token of that definition shall be available.

Update "Execution semantics" notion

In the section 3.4.3.1 Execution semantics,

Change the paragraph:

The execution semantics of each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

into

The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

Rationale: for some Forth definitions execution semantics are not specified.

After that, add the following two paragraphs:

If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise the execution token of that definition, if available, identifies the implementation dependent execution semantics.

The implementation dependent execution semantics of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition. An ambiguous condition exists if they are performed in compilation state.

Rationale: until words such as >r are defined using an "Execution" section, we have to rely on the absence of an "Interpretation:" section to refer to ordinary words (commented on 2019-06-21, 2020-08-30).

Update ambiguous conditions

In the section 4.1.2 Ambiguous conditions,

replace the phrase:

attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;

with the phrase:

attempting to obtain the execution token with 6.1.0070 ' or 6.1.2510 ['] of a definition with undefined interpretation semantics;

Rationale: find (in interpretation state) and search-wordlist return either the execution token of the word or zero, and there is no ambiguous condition in this regard.

Update glossary entries

In the glossary entries 6.2.2295 TO, 6.2.1725 IS, 6.2.0698 ACTION-OF,

replace the phrase:

An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied.

with the phrase:

An ambiguous condition exists if POSTPONE or [COMPILE] are applied.

Testing

t{ ' s"  execute abc"  s" abc"  compare -> 0 }t
t{ 1 value x  2  ' to  execute x  x -> 2 }t
t{ : (to) [ ' to compile, ] ;  3 (to) x  x -> 3 }t

[r1566] 2025-09-30 00:28:33 Josef replies:

proposal - Recognizer committee proposal 2025-09-11

I agree with @ruv that "translation" doesn't quite fit and finding suitable terms is a real challenge. I utilized this proposal, BerndPaysan's retired recognizer proposal, FORTH Inc.'s recognizer page, and the comments here and on the mailing list.

Suggestions short summary

Remove the "translation" term because it obfuscates the possible outputs, explained below.

Because a "translation token" is a table of run-time actions, "run-time action table" (rat) would seem appropriate, explained below.

A recognizer definition is proposed below.

It doesn't seem that the connection between a recognizer's pattern and the rest of the steps is really discussed. Matching a text token to the pattern is the first step. The parameters are then fetched according to the recognizer's pattern. The run-time action table is associated with a specific pattern parameter.

Recognizer term

From this proposal and FORTH, Inc.'s write up, the following seems to be how to design a recognizer:

  1. Determine the text pattern of the data.
    • E.g. complex numbers follow an "a+bi" pattern.
  2. Create a pattern matching algorithm for the text pattern.
  3. Determine pattern parameters to be fetched.
  4. Determine the run-time action tables to pair with the fetched pattern parameters.

Recognizer definition proposal: A recognizer attempts to match a text token to a pattern. A successful text token match invokes fetching the pattern parameters and the associated run-time action table. A failed matching attempt outputs a rat-none. The text interpreter (and other users, such as postpone) utilizes the run-time action table to perform either the interpreting run-time, compiling run-time, or postponing run-time.

make-rec-sequence ( xtu .. xt1 u "name" -- ) (instead of rec-sequence:)

rec-name ( c-addr u -- xt rat | rat-none )

rec-num ( c-addr u -- i*x rat | rat-none )

rec-none ( c-addr u -- rat-none )

I agree with @ruv's suggestions for get-recs and set-recs, i.e. recs@ & recs!.

Translation Term

Translation seems to hide information. The relationship between the pattern parameters and the run-time action table is fixed. Because different recognizers produce different outputs, using "translation" as a catchall obscures the output, rather than listing the output as i*x rat, xt rat, etc.

run-time action table (instead of translation token): Single-cell item that contains the run-time actions associated with specific pattern parameters, i.e. interpreting run-time, compiling run-time, and postponing run-time. (This has formerly been called a rectype or a translation token. It's a table of run-time actions.)

make-rat ( xt-int xt-comp xt-post "name" -- ) (instead of translate:)

rat-word ( -- rat ) (instead of translate-word)

pattern parameters: is the optional set of data fetched after a successful text token match. The set is on various stacks below the run-time action table. (This could use a better name, not sure if it's really needed, but it helped my thinking.)
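
To make the four design steps concrete, here is a minimal, self-contained sketch in standard Forth. It is a toy model only, not the proposal's implementation: make-rat and rat-none follow the names suggested above, while the three-cell table layout and the words rat-int, rat-comp, rat-post, keep-cell, comp-cell, post-cell, rat-char, and rec-char are assumptions made up for this illustration. The pattern recognized is a quoted character such as 'A', and the single pattern parameter is the character code.

```forth
\ Assumption: a "rat" is the address of three cells holding
\ xt-int, xt-comp, xt-post.  A failed match is modelled as 0.
0 constant rat-none

: make-rat ( xt-int xt-comp xt-post "name" -- )
  create  rot , swap , , ;       \ stored in the order int, comp, post

: rat-int  ( rat -- xt-int )   @ ;
: rat-comp ( rat -- xt-comp )  cell+ @ ;
: rat-post ( rat -- xt-post )  2 cells + @ ;

\ Run-time actions for one single-cell pattern parameter.
: keep-cell ( x -- x ) ;                    \ interpreting: leave x as is
: comp-cell ( x -- )  postpone literal ;    \ compiling: compile x as a literal
: post-cell ( x -- )                        \ postponing: compile code that
  postpone literal  ['] comp-cell compile, ;  \ later compiles x as a literal

' keep-cell ' comp-cell ' post-cell make-rat rat-char

\ Steps 1-3: match the three-character pattern 'c' and fetch c.
\ Step 4: pair the fetched character with rat-char.
: rec-char ( c-addr u -- char rat | rat-none )
  3 = if
    dup c@ [char] ' =  over 2 chars + c@ [char] ' =  and
    if  char+ c@  rat-char  exit  then
  then
  drop rat-none ;
```

For instance, s" 'A'" rec-char leaves the character code 65 and rat-char; performing rat-int execute on that pair leaves 65 on the stack, while a string such as s" abc" yields rat-none. In this sketch the text interpreter's role is played by whoever picks rat-int, rat-comp, or rat-post.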

I walk through the examples below with the notes above.

Example: REC-NAME

FORTH, Inc. has this example:

' EXECUTE ' COMPILE, ' POSTPONE, TRANSLATE: TRANSLATE-WORD
' EXECUTE ' EXECUTE  ' COMPILE,  TRANSLATE: TRANSLATE-IMM
: REC-NAME ( c-addr len -- xt addr1 | addr2 )
    (FIND) CASE
        -1 OF  TRANSLATE-WORD  ENDOF
        1 OF  TRANSLATE-IMM  ENDOF
        0 OF  TRANSLATE-NONE  ENDOF
    ENDCASE ;

Compared to the steps above:

  1. Data to be handled is "words in general".
  2. The pattern is a word that is in the dictionary.
  3. The pattern parameters fetched could be:
    • xt 1
    • xt -1
    • c-addr 0
  4. Pattern parameters are associated to rats as follows:
    • -1 to TRANSLATE-WORD
    • 1 to TRANSLATE-IMM
    • 0 to TRANSLATE-NONE (originally, NOTFOUND).

(FIND) completes both Steps 2 & 3. The rat output is based on the pattern parameters fetched, not the pattern being matched.

Example: REC-TICK

From the proposal:

: rec-tick ( addr u -- translation ) \ gforth-experimental
    over c@ '`' = if
        1 /string find-name dup if
            name>interpret translate-cell exit then
        drop translate-none exit then
    rec-none ;

Walking through the steps:

  1. The data to be handled is a ticked word.
  2. The pattern is a name in the dictionary.
  3. The pattern parameters fetched by 1 /string find-name could be:
    • nt
    • 0
  4. Pattern parameters are associated to rats as follows:
    • nt to translate-cell
    • 0 to translate-none

The rat output is based on the pattern parameters fetched, not the pattern being matched. 2drop translate-none seems clearer than rec-none. I keep getting caught looking at the rec-tick example thinking "what is rec-none recognizing?"

Example Observations

  • Pattern matching and pattern parameter fetching can be combined or separate words.
  • It would be reasonable to have a failed pattern parameter fetch be an error. A pitfall of creating recognizers is ensuring there is little to no overlap of patterns.
    • E.g. 'bob is the name as defined, processed by rec-name. 'stan is the ticked version of stan, processed by rec-tick.
  • rec-none could be the final recognizer in recognizer sequences, exiting any further evaluation. Instead of creating a new sequence, one could move rec-none earlier in the sequence.

Thank you for reading this far, hopefully there is more food for thought, than madness.


[r1567] 2025-09-30 07:22:07 AntonErtl replies:

proposal - Fix stack comments for N>R and NR>

The committee has accepted this as a non-substantive change in the 2025 meeting with vote #39: 8Y:0N:0A.


[r1568] 2025-09-30 09:32:40 ruv replies:

proposal - Clarification for execution token

Author

Ruv

Change Log

(the latest at the top)

  • 2025-09-30 Add some rationale; correct some typos and grammar mistakes; minor rewording; add consequences.
  • 2025-09-29 Better wording in some places; corrections; update in ambiguous conditions.
  • 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests. According to my comment on 2024-09-24, this was planned.
  • 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
  • 2022-08-13 Initial version

Preceding history

(the latest at the top)

Problem

By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.

Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.

Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.

Solution

Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.

Example 1

:noname postpone if ; ( xt )
  1. xt, which is left on the stack, identifies the execution semantics of this unnamed Forth definition.
  2. The execution semantics of this definition are equivalent to the compilation semantics for if.
  3. Then, this xt also identifies the compilation semantics for if.

Example 2

: foo postpone if ;
  1. xt of foo identifies the execution semantics of foo.
  2. The execution semantics of foo are equivalent to the compilation semantics for if (this follows from the standard).
  3. Then, this xt also identifies the compilation semantics for if.

Note that the Forth system may provide system-dependent execution semantics for if that are not equivalent to the execution semantics of foo.
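
The two examples can be exercised side by side. The sketch below is only an illustration under stated assumptions: the names [foo], if-xt, [if-xt], via-foo, via-xt, and via-if are made up here, and the immediate wrappers are just one conventional way to perform an xt while a definition is being compiled, not something the proposal prescribes.

```forth
: foo  postpone if ;                     \ Example 2
:noname  postpone if ;  constant if-xt   \ Example 1, xt kept in a constant

\ Immediate wrappers so that both can be performed in compilation state.
: [foo]    foo ;             immediate
: [if-xt]  if-xt execute ;   immediate

\ All three definitions are compiled with the compilation semantics for IF.
: via-foo ( flag -- n )  [foo]    1 else 2 then ;
: via-xt  ( flag -- n )  [if-xt]  1 else 2 then ;
: via-if  ( flag -- n )  if       1 else 2 then ;
```

On a standard system the three words are expected to behave identically: each returns 1 for a true flag and 2 for a false flag.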

Reasoning

Thus, for any execution token there exists at least one Forth definition (named or unnamed) whose execution semantics are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally (or intentionally) these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then the token identifies them too. They need not be connected to the same Forth definition. Consequently, it is impossible for an execution token to identify some compilation semantics, or some interpretation semantics, without also identifying the equivalent execution semantics.

Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have a Forth definition with equivalent execution semantics.

Examples of semantics that cannot be identified by an execution token:

  • typically, the run-time semantics of if (an instance of);
  • in some systems (in which the phrase postpone literal postpone execute is equivalent to compile,), the formally specified execution semantics of >r;
  • in some systems (where FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname.

Of course, the standard allows such implementations and disallows programs to obtain an execution token of the corresponding semantics, or even does not provide a way to obtain it.

Roadmap

To solve the initial problem we can

  • formally and explicitly state the basics described above,
  • specify what particular semantics are identified by the execution token of a word (in what cases are they defined by the standard and when by the implementation, and to what extent).

Also, we should update the definition of "execution token" term to say what it identifies.

Typical use

  • "The execution semantics identified by xt are equivalent to the interpretation semantics of BAZ"

    • This seems pretty clear.
  • "xt identifies the interpretation semantics for the word BAZ"

    • This means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
    • At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
  • "xt identifies the compilation semantics for the word FOO"

    • This means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
    • At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
  • "xt of the word BAR"

    • This means that xt identifies the execution semantics of the word BAR.
    • Whether this xt also identifies the compilation semantics, or the interpretation semantics, or both of them, or neither, for the word BAR depends on the word BAR (on how it is defined or specified).
    • Regardless of how BAR is defined, executing xt in interpretation state performs the interpretation semantics for BAR.
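
A minimal sketch of the "xt of the word BAR" case, assuming nothing beyond standard core words. All names (bar, bar-xt, via-execute, compile-bar, via-compile, and baz) are made up for illustration. It shows the xt of an ordinary word being performed directly with execute and indirectly after being compiled with compile,, and the xt of an immediate word being executed in interpretation state.

```forth
: bar ( n -- 2n )  2* ;                 \ an ordinary word
' bar constant bar-xt

: via-execute ( n -- 2n )  bar-xt execute ;

\ COMPILE, is used inside an immediate word, so it runs in compilation state.
: compile-bar ( -- )  bar-xt compile, ;  immediate
: via-compile, ( n -- 2n )  compile-bar ;

: baz ( -- n )  7 ;  immediate          \ an immediate word
' baz execute                           \ leaves 7, as typing baz would
drop
```

Both via-execute and via-compile, double their argument, since in each case the execution semantics of bar are what the xt identifies.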

Incorrect use

Actually, the standard contains only one place where the "execution token" term is used ambiguously in a normative part — the glossary entry for FIND. The problem is that it says that FIND returns the execution token of the Forth definition ("its execution token", so it should identify the execution semantics of that definition), but:

  • FIND may return two different tokens (one while compiling and another while not compiling) for the same string (and, formally, the same Forth definition), which may identify different semantics; then at least one of them does not identify the execution semantics of the definition (despite the statement).

In the glossary entry for NAME>INTERPRET, the language is just slightly non normative, since it uses the form "xt represents" instead of the form "xt identifies".

These glossary entries also have some other problems, so they should be corrected anyway; my other proposals on this matter are in progress.

Proposal

Update "execution token" term

In the section 2.1 Definitions of terms, change (as of 2021-09-13):

execution token: A value that can be passed to EXECUTE (6.1.1370)

into

execution token: A value that identifies the execution semantics of a definition.

Update "execution token" data type description

In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:

For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.

The execution semantics identified by an execution token may be equivalent to the interpretation semantics, compilation semantics, or other semantics for some named Forth definition. In such cases, the execution token also identifies those interpretation, compilation, or other semantics.

The system does not need to identify every specified semantics by any execution token.

The execution token of a Forth definition, if available, identifies the execution semantics that are either specified by this standard for that definition or are implementation dependent (if permitted).

If the interpretation semantics for a Forth definition are defined by this standard, the execution token of that definition shall be available.

Rationale

  • We use the clause "valid" in "any valid execution token" because an execution token may become invalid after using words like forget and children of marker.
  • The "if available" clause is used since find (in interpretation state) and name>interpret may return zero for some existing (but not user-defined) words; this effectively means that the execution token is not available for those words, and search-wordlist should also return zero for them.
  • The last paragraph guarantees that any word that is allowed to be Ticked has an execution token.

Update "Execution semantics" notion

In the section 3.4.3.1 Execution semantics,

Change the paragraph:

The execution semantics of each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

into

The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

Rationale: for some words, the execution semantics are not specified by the standard.

After that, add the following two paragraphs:

If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise the execution token of that definition, if available, identifies the implementation dependent execution semantics.

The implementation dependent execution semantics of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition. An ambiguous condition exists if they are performed in compilation state.

Rationale

  • We have to rely on the absence of an "Interpretation:" section to refer to ordinary and immediate words until words like >r are defined using a "Run-Time:" section instead of an "Execution:" section in their glossary entries (commented on 2019-06-21, 2020-08-30).
  • In this section, we use the term "implementation dependent" rather than "implementation defined" because this allows implementors to avoid documenting some system-specific words altogether.
  • When the standard allows a system to implement system-specific interpretation semantics for a standard word (by specifying that the interpretation semantics are undefined), and the system does so, executing the xt of the word (if this xt is available) in interpretation state shall perform the system-specific interpretation semantics. The behavior of this xt in compilation state is not restricted by the standard. In classic Forth systems, in compilation state, it performs the specified compilation semantics for the word.

Update ambiguous conditions

In the section 4.1.2 Ambiguous conditions,

replace the phrase:

attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;

with the phrase:

attempting to obtain the execution token with 6.1.0070 ' or 6.1.2510 ['] of a definition with undefined interpretation semantics;

Rationale: find (in interpretation state) and search-wordlist return either the execution token of the word or zero, and there is no ambiguous condition in this regard.

Update glossary entries

In the glossary entries 6.2.2295 TO, 6.2.1725 IS, 6.2.0698 ACTION-OF,

replace the phrase:

An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied

with the phrase:

An ambiguous condition exists if POSTPONE or [COMPILE] are applied

Consequences

This change specifies all the cases in which an xt returned by search-wordlist, ' (Tick), ['] (Bracket Tick), can (or cannot) be used by a standard program in the general case.

This change removes the prohibition on Ticking the words to, is, action-of, and specifies that executing the returned xt in interpretation state shall perform the interpretation semantics for the word. Note that the execution can be performed directly by execute, or indirectly by executing the definition into which this xt is compiled using compile,.

If the system throws an error on ticking these words, or does not provide a correct xt for them, it should be updated to be compliant.

All classic Forth systems comply with this change.

This change does not affect the existing standard programs.

Testing

t{ ' s"  execute abc"  s" abc"  compare -> 0 }t
t{ 1 value x  2  ' to  execute x  x -> 2 }t
t{ : (to) [ ' to compile, ] ;  3 (to) x  x -> 3 }t

[r1569] 2025-09-30 20:37:51 EricBlake replies:

proposal - Clarification for execution token

Reasoning

typically, the run-time semantics of if (an instance of); - I'm still not sure how to parse this. My first thought was maybe you meant something like "typically, the interpretation semantics of if (assuming the implementation defines interpretation semantics)". But re-reading the page on if, maybe what is meant is more like "typically, the run-time semantics that are appended into the current definition when an instance of if is compiled (that is, the run-time semantics that if appends to the current compilation need not correspond to an execution token)"

Typical Use

typo: "may differ form" should be "may differ from"

Proposal 3.1.3.5...

typo: "exisitng" should be "existing"


[r1570] 2025-10-01 11:05:51 ruv replies:

proposal - Recognizer committee proposal 2025-09-11

@Josef wrote:

I agree with @ruv that "translation" doesn't quite fit and finding suitable terms is a real challenge.

Because a "translation token" is a table of run-time actions, "run-time action table" (rat) would seem appropriate, explained below.

I'm making one more attempt on this matter.

The language of the Standard already uses concepts such as data object, data type, typed data object, and subtyping (see 3.1 Data types).

Using these concepts, we can describe a successful recognition result as a pair consisting of a data object and its corresponding data type.

On the stack, data types must be represented by specific identifiers, similar to how semantics elements are represented by xt identifiers. We might refer to such an identifier as a type descriptor (symbol td).

  • Note: "type descriptor" is preferred over "type identifier" because, in the language of the Standard, we will need expressions like "type descriptor td identifies ...". Using "type identifier" would lead to awkward repetitions such as "type identifier ti identifies ...".
  • Another option for this term could be "type token" (seems less preferable).

Additionally, we might define a qualified data object (symbol qdo) as a pair consisting of a data object and the type descriptor that identifies that object's data type.

  • Note. This concept should be distinguished from the existing concept of a "typed data object".

The elegance and strength of this approach lie in the following points:

  • It builds upon existing terminology, with only slight extensions.
  • It incorporates existing data type symbols into naming conventions.
  • It leverages subtyping relationships between data types to reduce redundancy (adhering to the DRY principle).

Type descriptors can be used to:

  • Translate data objects (into the body of a Forth definition when compiling or side effects when interpreting).
  • Convert data objects to different data types (casting).
    • E.g., getting xt from nt (for example, of an ordinary word only)
  • Check subtyping relationships between data types (or of a qualified data object).
  • Define new type descriptors.

These features can be designed independently of recognizers, and recognizers only rely on them when returning a qualified data object or analyzing a qualified data object from another recognizer.


[r1571] 2025-10-01 19:40:13 ruv replies:

proposal - Clarification for execution token

@EricBlake wrote:

But re-reading the page on if, maybe what is meant is more like "typically, the run-time semantics that are appended into the current definition when an instance of if is compiled (that is, the run-time semantics that if appends to the current compilation need not correspond to an execution token)"

Yes. I literally mean the semantics specified in the "Run-time:" section in the if glossary entry.

I used the "instance of" clause because each time the compilation semantics of if are performed, they append a distinct instance of the run-time semantics to the current definition, since the orig value differs (in the general case). Isolating that instance and providing an execution token for it is technically difficult.

If this example seems too confusing, I will delete it and maybe add another one.

Thanks also for pointing out the typos. These corrections, along with other changes, will be included in the next version.


[r1572] 2025-10-03 07:23:55 ruv replies:

proposal - Clarification for execution token

Author

Ruv

Change Log

(the latest at the top)

  • 2025-10-03 Correct some normative statements; add ambiguous conditions; make better wording. Add more rationale. Correct some typos.
  • 2025-09-30 Add some rationale; correct some typos and grammar mistakes; minor rewording; add consequences.
  • 2025-09-29 Better wording in some places; corrections; update in ambiguous conditions.
  • 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests. According to my comment on 2024-09-24, this was planned.
  • 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
  • 2022-08-13 Initial version

Preceding history

(the latest at the top)

Problem

By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g., some interpretation semantics or compilation semantics? It's unclear at first glance.

Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.

Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.

Solution

Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.

Example 1

:noname postpone if ; ( xt )
  1. xt, which is left on the stack, identifies the execution semantics of this unnamed Forth definition.
  2. The execution semantics of this definition are equivalent to the compilation semantics for if (this follows from the standard).
  3. Then, this xt also identifies the compilation semantics for if.

Example 2

: foo postpone if ;
  1. xt of foo identifies the execution semantics of foo.
  2. The execution semantics of foo are equivalent to the compilation semantics for if (this follows from the standard).
  3. Then, this xt also identifies the compilation semantics for if.

Note that the Forth system may provide system-dependent execution semantics for if that are not equivalent to the execution semantics of foo.
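
A minimal sketch of what Example 2 implies in practice: since the xt of foo identifies the compilation semantics for if, executing that xt while another definition is being compiled should behave like if. The names foo-xt, my-if, and choose are hypothetical and introduced only for this illustration.

```forth
: foo  postpone if ;        \ as in Example 2

' foo constant foo-xt       \ hypothetical name: holds the xt of foo

\ Hypothetical immediate word: while the enclosing definition is being
\ compiled, it executes foo-xt, i.e. performs the compilation semantics for if.
: my-if  foo-xt execute ; immediate

\ my-if is closed by the standard else and then, exactly like if.
: choose ( flag -- n )  my-if 1 else 2 then ;
```

Here the xt is executed from an immediate word rather than directly in interpretation state, so the sketch does not depend on how a system treats compilation semantics performed while interpreting.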

Reasoning

Thus, for any execution token there exists at least one Forth definition (named or unnamed) the execution semantics of which are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally (or intentionally) these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. These semantics need not be connected to the same Forth definition. Also, consequently, it's impossible that an execution token identifies some compilation semantics, or some interpretation semantics, but doesn't identify the equivalent execution semantics.

Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have a Forth definition with equivalent execution semantics.

Examples of semantics that cannot be identified by any execution token:

  • in some systems (in which the phrase postpone literal postpone execute is equivalent to compile,), the formally specified execution semantics of >r;
  • in some systems (where FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname;
  • typically, the run-time semantics of if (which are appended to the current definition by the compilation semantics of if);

Of course, the standard allows such implementations and disallows programs to obtain an execution token of the corresponding semantics, or even does not provide a way to obtain it.

Roadmap

To solve the initial problem, we can:

  • formally and explicitly state the basics described above,
  • specify what particular semantics are identified by the execution token of a word (in what cases are they defined by the standard and when by the implementation, and to what extent),
  • update the definition of the "execution token" term to say what these tokens identify.

Typical use

  • "The execution semantics identified by xt are equivalent to the interpretation semantics of BAZ"

    • This seems pretty clear.
  • "xt identifies the interpretation semantics for the word BAZ"

    • This means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
    • At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
  • "xt identifies the compilation semantics for the word FOO"

    • This means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
    • At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
  • "xt of the word BAR"

    • This means that xt identifies the execution semantics of the word BAR.
    • Whether this xt also identifies the compilation semantics, or the interpretation semantics, or both of them, or neither, for the word BAR depends on the word BAR (on how it is defined or specified).
    • Regardless of how BAR is defined, executing xt in interpretation state performs the interpretation semantics for BAR.

Incorrect use

Actually, the standard contains only one place where the "execution token" term is used ambiguously in a normative part — the glossary entry for FIND. The problem is that it says that FIND returns the execution token of the Forth definition ("its execution token", so it should identify the execution semantics of that definition), but:

  • FIND may return two different tokens (one while compiling and another while not compiling) for the same string (and, formally, the same Forth definition), which may identify different semantics; then at least one of them does not identify the execution semantics of the definition (despite the statement).

In the glossary entry for NAME>INTERPRET, the language is just slightly non-normative, since it uses the form "xt represents" instead of the form "xt identifies".

These glossary entries also have some other problems, so they should be corrected anyway; my other proposals on this matter are in progress.

Proposal

Update "execution token" term

In the section 2.1 Definitions of terms, change (as of 2021-09-13):

execution token: A value that can be passed to EXECUTE (6.1.1370)

into

execution token: A value that identifies the execution semantics of a definition.

Rationale: this is necessary to correctly describe the "execution token" data type.

Update "execution token" data type description

In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:

For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.

The execution semantics identified by an execution token may be equivalent to the interpretation semantics, compilation semantics, or other semantics for some named Forth definition. In such cases, the execution token also identifies those interpretation, compilation, or other semantics.

The system does not need to identify every specified semantics by any execution token.

The execution token of a Forth definition, if available, identifies the execution semantics that are either specified by this standard for that definition or are implementation dependent (if permitted).

If the interpretation semantics for a Forth definition are defined by this standard, the execution token of that definition shall be available.

Rationale

  • We use the clause "valid" in "any valid execution token" because an execution token may become invalid after using words like forget and children of marker.
  • The "if available" clause is used since find (in interpretation state) and name>interpret may return zero for some existing (but not user-defined) words; this effectively means that the execution token is not available for those words, and search-wordlist should also return zero for them.
  • The last paragraph guarantees that any word that is allowed to be Ticked has an execution token.

Add the following paragraph at the end:

See also: A.3.1.3.5 Execution tokens.

Update "Execution semantics" notion

In the section 3.4.3.1 Execution semantics,

Change the first paragraph:

The execution semantics of each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

into

The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.

Rationale: for some words, the execution semantics are not specified by the standard.

After that, insert the following paragraphs:

If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise, the execution token of that definition, if available, identifies the implementation dependent execution semantics.

Rationale

  • We have to rely on the absence of an "Interpretation:" section to refer to ordinary and immediate words until words like >r are defined using a "Run-Time:" section instead of an "Execution:" section in their glossary entries (commented on 2019-06-21, 2020-08-30).
  • We use the term "implementation dependent" rather than "implementation defined" because this allows implementors to avoid documenting the provided execution semantics of standard words whose execution semantics are not specified by the standard.

The execution semantics identified by the execution token of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition.

Rationale: this guarantees that, while interpreting, ' foo execute is equivalent to foo, even when the execution token (xt) of foo identifies implementation dependent execution semantics.

An ambiguous condition exists if the execution semantics of a Forth definition are not specified by this standard and its execution token is executed in compilation state.

Rationale: In all classic Forth systems, this performs the compilation semantics for the word, but in most dual-xt Forth systems, this performs the interpretation semantics for the word.

An ambiguous condition exists if the interpretation semantics of a Forth definition are undefined by this standard and its execution token is executed.

Rationale

  • The execution semantics of the phrase s" if" forth-wordlist search-wordlist if execute then are always ambiguous, regardless of whether they are performed while interpreting or compiling, because they execute (at least on some systems) the xt of the word if, whose interpretation semantics are undefined by this standard.

Update ambiguous conditions

In the section 4.1.2 Ambiguous conditions,

replace the phrase:

attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;

with the phrase:

attempting to obtain, using the words 6.1.0070 ' or 6.1.2510 ['], the execution token of a definition with undefined interpretation semantics;

Rationale: find and "etc." are excluded from the list, because find (in interpretation state), search-wordlist, and name>interpret return either the execution token of the word or zero, and there is no ambiguous condition in this regard.

Update glossary entries

In the glossary entries 6.2.2295 TO, 6.2.1725 IS, 6.2.0698 ACTION-OF,

replace the fragment:

An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied

with the fragment:

An ambiguous condition exists if POSTPONE or [COMPILE] are applied

Note: in the first case the form “are applied” is used, in the second and third cases the form “is applied” is used. They should be harmonized/corrected.

Rationale: it is now specified what semantics the execution token identifies for these words.

Consequences

This change specifies all the cases in which an xt returned by search-wordlist, ' (Tick), ['] (Bracket Tick), can (or cannot) be used by a standard program in the general case.

This change removes the prohibition on Ticking the words to, is, action-of, and specifies that executing the returned xt in interpretation state shall perform the interpretation semantics for the word. Note that the execution can be performed directly by execute, or indirectly by executing the definition into which this xt is compiled using compile,.

If the system throws an error on ticking these words, or does not provide a correct xt for them, it should be updated to be compliant.

According to Anton Ertel's testing of five Forth systems (comment [r885]) in 2022, iForth 5.0.27 and VfxForth 5.11 did not comply with this change with respect to the execution tokens of the word to (and, probably, is and action-of). However, they were already noncompliant without this change, as they returned an incorrect execution token for s" (ticking of which is allowed).

Note that all classic Forth systems natively comply with this change.

This change does not affect the existing standard programs.

Testing

t{ ' s"  execute abc"  s" abc"  compare -> 0 }t
t{ 1 value x  2  ' to  execute x  x -> 2 }t
t{ : (to) [ ' to compile, ] ;  3 (to) x  x -> 3 }t

[r1573] 2025-10-03 18:29:04 ruv replies:

proposal - New words: latest-name and latest-name-in

Author

Ruv

Change Log

  • 2023-10-22 Initial revision
  • 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
  • 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
  • 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for LATEST-NAME-IN, change the status to "formal".
  • 2024-06-20 Add a test case to check that LATEST-NAME returns different value after the compilation word list is switched.
  • 2024-06-20 Simplify the normative text description, and add a rationale for this simplification.
  • 2025-09-15 Add clause about findable words, add rationale sections in the proposal, address a question re immediate, note a bug in traverse-wordlist in some Forth systems, make some rewording and minor corrections, add a more general reference implementation.
  • 2025-09-19 Make corrections from Eric Blake, mention find-name instead of search-wordlist, use lowercase in the test cases and in prose when possible, make some rewording in the prose, add some links.
  • 2025-10-03 Rationale for returning zero and throwing an exception. Minor rewording in some places.

Problem

In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and in demand.

To make such programs portable, we should introduce a standard method to obtain the most recently added word.

For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the most recent definition, instead of redefining a lot of words to provide such access yourself, or juggling with the input buffer and search.

See some examples in the Typical use section.

Also, a number of specific examples are provided in my post on ForthHub (those examples are not inserted here so as not to bloat the text).

Additionally, there has been much discussion regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather wrote on 2011-12-09 in comp.lang.forth:

AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.

Although it's a system necessity, I haven't found this of much value in application programming.

Elizabeth D. Rather

Indeed, depending on the system, the internal method may return the recent word depending on the compilation word list or independent of the compilation word list, a completed definition or an incomplete definition, an unnamed definition or only a named definition, and so on.

However, I believe that a standardized method has significant value for libraries and DSLs in application programming, as my examples should demonstrate.

Some known internal methods: latest ( -- nt|0 ), last @ ( -- nt|0 ), latestxt ( -- xt|0 ), etc.

Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition.

Solution

Let's introduce the following words:

  • latest-name-in ( wid -- nt|0 )
  • latest-name ( -- nt )

The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.

The second word returns the name token for the definition whose name was placed most recently into the compilation word list, or throws an exception if there is no such definition.

These words do not expose or limit any internal mechanism of the Forth system. They just provide information about word lists, like the words find-name-in, find-name, and traverse-wordlist do. It's a kind of introspection/reflection.

These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal last, latest, or whatever it was using before.

It seems that the best place for these words is the section 15.6.2 Programming-Tools extension words, where traverse-wordlist is also placed.

Rationale

Connection with word lists

By considering definitions in the frame of a word list only, we solve several problems, namely:

  1. A word list contains only completed definitions (see the accepted proposal #153 Traverse-wordlist does not find unnamed/unfinished definitions). This eliminates the question of whether the word of returned nt is finished — yes, it is always finished (completed).

  2. Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).

  3. An extension or library can create definitions in its internal word list for internal purposes, and this will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from latest-name (regardless of what words are created by this library for internal purposes on the fly). For example, once different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that latest-name returns.

Returned values

As a matter of practice, almost all the use cases for the word latest-name imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return 0 by this word only burdens users with having to analyze this zero, or redefine this word as:

: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;

If the user needs to handle the case where the compilation word list is empty, they can use the word latest-name-in as:

get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then

Throwing an exception in latest-name is slightly inconsistent with the other words that return 0 when the requested definition does not exist.

But this follows the general principle: throw an exception when, in most use cases, the situation is surprising or cannot be handled locally; return a value when, in most use cases, the situation is anticipated or may require local handling.

In most use cases of latest-name, the absence of definitions in the compilation word list is unexpected and cannot be handled locally.

As for latest-name-in, there is no practice of using it yet. I can only imagine one use case: in a package/module framework, the framework obtains the latest word from a provided word list for a special purpose (like "main"), and applies special actions if the word list is empty (e.g., it might use some default word of the same type, or add the default word to the provided word list, or check for a different word list if multiple word lists are provided, etc.). Thus, the empty word list situation in this use case is expected and may require local handling.

Implementation options

If the word list structure in a Forth system contains information about the most recently placed definition, the implementations of the proposed words are trivial.

In some plausible Forth systems the word list structure doesn't directly contain information about which definition was placed into the word list most recently, and this information cannot be obtained indirectly. Such systems might not provide the proposed words, or they could be changed to keep this information in the word list structure. It seems that in most systems the word list structure directly contains this information, or this information can be obtained indirectly.

Some checked systems:

  • Gforth, minForth, ikForth, SP-Forth, Post4: a word list keeps information about the definition that was placed in it most recently;
  • SwiftForth, VFX: the most recently placed word in a word list can be correctly obtained from the strands/threads (since nt values are monotonically increasing);
  • lxf/ntf 2017: the most recently placed word in a word list can be obtained using traverse-wordlist (since nt values are monotonically increasing).

Note that some systems have a bug in traverse-wordlist so it can return the nt for a definition that cannot be found (namely, for the current definition). This is incorrect (see a testcase).
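
For systems of the last kind, the traverse-wordlist approach could look like the following sketch. It rests on two assumptions that the standard does not guarantee: nt values grow monotonically as definitions are added (compared as unsigned numbers), and traverse-wordlist does not report the current, unfinished definition. The helper name later-nt is hypothetical.

```forth
\ Keep the numerically larger of two name tokens; the true flag tells
\ traverse-wordlist to continue with the next node.
: later-nt ( nt1|0 nt2 -- nt3 true )
  2dup u< if nip else drop then  true ;

\ Assumption: a later definition always has a larger nt.
\ The initial 0 is returned unchanged if the word list is empty.
: latest-name-in ( wid -- nt|0 )
  0 swap  ['] later-nt swap  traverse-wordlist ;
```

On an empty word list the callback is never called, so the initial 0 is returned, which matches the proposed stack effect ( wid -- nt|0 ). The cost is a full traversal of the word list on every call, which is why keeping the information in the word list structure is preferable where the system allows it.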

If a system does not implement the optional Search-Order word set, it might not provide the word latest-name-in.

Naming

The names latest-name-in and latest-name of the new words are similar in form to find-name-in and find-name. Stack effects are also similar.

The difference is that find-xxx is a verb phrase that starts with a verb, but latest-xxx is a noun phrase that starts with an adjective (see Wiktionary/latest).

Both the English words "find" and "latest" have historically been used in Forth word names, as is "name".

In Forth-84 "name" in word names denoted NFA (Name Field Address), and now it denotes a name token, which is the successor of NFA. In all standard words, e.g. find-name, name>string, name>compile, etc. (except parse-name), "name" denotes a name token.

NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see Wiktionary).

Normative text description

The proposed normative text description is based on:

  • 16.2: "compilation word list: The word list into which new definition names are placed",
  • 15.3.1: "A name token is a single-cell value that identifies a named word",
  • 3.4.3: "[Semantics] are largely specified by the stack notation in the glossary entries, which shows what values shall be consumed and produced. The prose in each glossary entry further specifies the definition's behavior" (there is no need to repeat in the text description what is already indicated in the stack diagrams). (emphasis added)

Throw code description

If the throw code description states that there is no latest name, it can be confusing, since a latest name, in some sense, probably always exists.

Therefore, it's better to say: "the compilation word list is empty" — it is what actually happens.

Motivation for latest-name-in

  1. It's a natural factor for latest-name. It's always possible to extract this factor from the implementation of latest-name, because the latter returns nt from the compilation word list, and the system should take wid of the compilation word list and extract most recent nt from this word list.
  2. It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
  3. In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as entry point, like main, or as the default exported word from a module).
  4. Both of these words are optional. If latest-name-in is not provided, it can be implemented in a portable way via latest-name as:
    : latest-name-in ( wid -- nt|0 )
      get-current >r set-current
      ['] latest-name catch if 0 then
      r> set-current
    ;

Things to discuss

Is it worth introducing the word latest-name-xt ( -- xt )?

If name>interpret never returns 0 (see my comment), this word can be implemented as:

: latest-name-xt ( -- xt ) latest-name name>interpret ;

The desired (and much discussed) pattern is:

defer bar

: foo ... ; latest-name-xt is bar

Sometimes the name "it" has been suggested for this word, but this name is too short and has more chance for conflicts. Guido Draheim wrote in comp.lang.forth on 2003-03-16:

I think that everyone has been thinking of using IT for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.

I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implementation but coincides with a common term of the application.

This seems convincing to me.

Typical use

: struct: ( "name" -- wid.compilation.prev u.offset )
  get-current  vocabulary
  also  latest-name name>interpret execute  definitions
  0
;
: ;struct ( wid.compilation.prev u.offset -- )
  s" __size" ['] constant execute-parsing
  set-current
;

The word execute-parsing ( i*x c-addr u "ccc" -- j*x ) is a well-known word; see an implementation at https://theforth.net/.

\ In the application's vocabulary
: it ( -- xt ) latest-name name>interpret ;

defer foo
\ ...

: bar ... ; it is foo

Proposal

Changes in existing sections

Add the following line into the Table 9.1: THROW code assignments:

-80 the compilation word list is empty

Editorial note: the actual throw code may change.

New glossary sections

Add the following sections into 15.6.2 Programming-Tools extension words:

15.6.2.xxxx LATEST-NAME-IN TOOLS EXT

( wid -- nt|0 ) If the word list identified by wid is empty, then the returned value is 0; otherwise, the name token nt identifies the definition whose name was placed most recently into the word list wid.

Note: nt can only be returned for a definition that can be found in wid.

See also: 15.6.2.xxxx LATEST-NAME, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.

15.6.2.xxxx LATEST-NAME TOOLS EXT

( -- nt ) If the compilation word list is not empty, the name token nt identifies the definition whose name was placed most recently into this word list. Otherwise, the exception code -80 is thrown.

Note: nt can only be returned for a definition that can be found in the compilation word list.

See also: 15.6.2.xxxx LATEST-NAME-IN, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.

New rationale sections

Add the following sections into A.15.6 Glossary:

A.15.6.2.xxxx LATEST-NAME-IN

The word latest-name-in cannot return an nt that cannot be obtained using find-name-in or traverse-wordlist applied to the specified word list.

The word latest-name-in returns 0 if the specified word list is empty, allowing the program to handle this situation locally.

See also: A.15.6.2.xxxx LATEST-NAME.

A.15.6.2.xxxx LATEST-NAME

The word latest-name cannot return an nt that cannot be obtained using find-name-in or traverse-wordlist applied to the compilation word list.

In some Forth systems the word : (colon) places an nt into the compilation word list and makes it hidden (unfindable). This nt must not be available for traverse-wordlist and for latest-name. Thus, formally, only the words ; (semicolon), does>, and ;code are allowed to add the nt of a definition created with : (colon) to the compilation word list.

If a Forth system does not provide the optional Search-Order word set, and in that Forth system the word immediate moves an nt from one internal list to another, this must not affect what latest-name returns, and this must not affect what find-name returns (for example, consider a case where the last two words have the same name and immediate is used for the latest one). Thus, after execution of immediate, latest-name shall return the same value as before this execution.

Typical use

: var ( "name" -- )
  variable
  0  latest-name name>interpret execute  !
;

The word latest-name throws an exception if the compilation word list is empty, since in typical use cases this situation is surprising and is not handled locally.

Reference implementation

In the following implementation for latest-name-in we assume that a word list identifier wid is an address that contains the nt of the definition whose name was most recently placed into this word list.

: latest-name-in ( wid -- nt|0 ) @ ;

In the following implementation for latest-name-in we assume that the values of nt, interpreted as unsigned numbers, monotonically increase when sorted chronologically (it works on most systems):

: umax ( u1 u2 -- u.max ) 2dup u< if swap then drop ;

: latest-name-in ( wid -- nt|0 )
  >r 0 [: umax true ;] r> traverse-wordlist
;

An implementation for latest-name:

: latest-name ( -- nt )
  get-current latest-name-in  dup if exit then  -80 throw
;

Testing

: it ( -- xt ) latest-name name>interpret ;

wordlist constant wl1

t{ : ln1 ; it  ' ln1 =  -> true }t
t{ get-current latest-name-in ' ln1 =  -> true }t
t{ :noname [ it ] literal ; execute  ' ln1 =  -> true }t
t{ : ln2 [ it ] literal ; ln2  ' ln1 =  -> true }t
t{ wl1 latest-name-in -> 0 }t
get-current wl1 set-current ( wid.prev )
t{ ' latest-name catch -> -80 }t
t{ : ln3 ;  -> }t
set-current
t{ it ' ln2 = -> true }t

[r1574] 2025-10-05 03:40:13 NickMessenger replies:

referenceImplementation - Example implementation for PICK

Discussion of the PLACE proposal is not relevant to the word that Mr. Peterson posted. To reduce distraction I'll rename it to BURY [see note], except I wish to change the API slightly, adding 1 to the index. It makes the example implementation more clumsy but I have at least one reason below:

: bury ( x_u x_u-1 ... x_1 x_0 u -- x_0 x_u-1 ... x_1 )
  dup 0 = if 2drop exit then
  dup 1 = if drop nip exit then
  rot >r 1- recurse r>
;

\ Rationale: replace x_u with x_0 then drop x_0:
T{ 44 33 22 11 00 4 bury -> 00 33 22 11 }T
T{ 00 0 bury -> }T

PICK fetches a value from down the stack, BURY stores a value to down the stack. I suspect most systems can feasibly implement constant time BURY, one cell load and store, just like PICK and unlike words like SWAP ROT and especially ROLL.

PICK and BURY are sufficient to express every possible stack operation. No need for even >R R>:

: dup   0 pick ;  : drop  0 bury ;
: over  1 pick ;  : nip   1 bury ;

: 2nip  2 bury 2 bury ;
: swap  dup 2 pick 2nip ;

: 3nip  3 bury 3 bury 3 bury ;
: rot   over over 4 pick 3nip ;
( etc etc )
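
The derived definitions above can be checked without shadowing the system's own stack words. The sketch below repeats the recursive BURY from this post and defines a few of the derived words under hypothetical my- prefixed names; the prefix is an assumption made only to keep the standard words intact.

```forth
\ BURY with the index convention of this post: 0 BURY is DROP, 1 BURY is NIP.
: bury ( x_u x_u-1 ... x_1 x_0 u -- x_0 x_u-1 ... x_1 )
  dup 0 = if 2drop exit then
  dup 1 = if drop nip exit then
  rot >r 1- recurse r>
;

\ Hypothetical my- names, built only from PICK, BURY and the standard DUP and OVER.
: my-nip   ( a b -- b )              1 bury ;
: my-2nip  ( a b c d -- c d )        2 bury 2 bury ;
: my-swap  ( a b -- b a )            dup 2 pick my-2nip ;
: my-3nip  ( a b c d e f -- d e f )  3 bury 3 bury 3 bury ;
: my-rot   ( a b c -- b c a )        over over 4 pick my-3nip ;
```

With these definitions, 1 2 my-swap leaves 2 1 and 1 2 3 my-rot leaves 2 3 1, matching SWAP and ROT.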

Some of the rarer ones could be demoted to be optional, since the system user can implement them with PICK BURY.

[note] About the name BURY: I seem to recall seeing some system somewhere that used the names DIG and BURY to mean ROT and -ROT. I thought it was retroforth but no, that has a ROT. As an aside: since I am reusing BURY to mean something different, ideally it'd be nice to rename PICK to DIG to match. I assume that's not appropriate for the standard though. The name is important but the functionality is useful whatever its name.


[r1575] 2025-10-05 15:35:34 ruv replies:

referenceImplementation - Example implementation for PICK

@NickMessenger wrote:

I wish to change the API slightly, adding 1 to the index. It makes the example implementation more clumsy but I have at least one reason below

PICK and BURY are sufficient to express every possible stack operation.

I see, but it seems to make bury more clumsy to use.

Could you give some practical examples where the word bury is useful? Especially, where you need to simply drop the argument ( x 0 ).


[r1576] 2025-10-05 18:22:32 NickMessenger replies:

referenceImplementation - Example implementation for PICK

0 BURY is just another way to express DROP, so it is not useful in and of itself, except that it makes PICK/BURY sufficient to express all stack operations. You could picture a super-minimal system that has no stack words at all except these two, so DROP DUP NIP would be written in Forth as above. Implementation-wise, you could imagine that 123 0 BURY would store 123 in the same cell where it already is and then DROP it, so there is no need for any special cases. It's the dropping effect we're looking for.

As for practical examples, I listed the above words. I can list some more:

: -rot  dup 3 pick 3 pick 3nip ;
: flip ( a b c -- c b a ) dup 3 pick 2 bury 3 bury ;

: 6nip  6 bury 6 bury 6 bury 6 bury 6 bury 6 bury ;
: 2over  3 pick 3 pick ;
: 2rot  2over 2over 9 pick 9 pick 6nip ;

I feel like PICK BURY could lessen perceived need for local variables. Ideally you rewrite your words to need fewer parameters but if you can't then there's a need to move values somewhere, be it locals, globals, the rstack, etc. SWAP ROT ROLL require N loads and stores so are less efficient, but this would allow you to PICK what you need to work with, BURY the results where they go and DROP any intermediates.


[r1577] 2025-10-05 19:23:08 NickMessenger replies:

referenceImplementation - Example implementation for PICK

A system like gforth with SP@:

: pick ( xu..x0 u -- xu..x0 xu ) 1+ cells sp@ + @ ;
: bury ( xu xu-1..x1 x0 u -- x0 xu-1..x1 ) 1+ cells sp@ + ! ;
 
cr 22 11 00 2 pick .s \ <4> 22 11 0 22
cr 77 2 bury .s \ <4> 22 11 77 22

[r1578] 2025-10-06 17:16:44 ruv replies:

referenceImplementation - Example implementation for PICK

this would allow you to PICK what you need to work with, BURY the results where they go and DROP any intermediates.

Do you have any programs that use bury? I mean, other than a super-minimal system that doesn't provide any stack words other than pick and bury—that's interesting, but impractical )


[r1579] 2025-10-07 01:39:21 NickMessenger replies:

referenceImplementation - Example implementation for PICK

gforth-internal does in fact define the 2 + variant which they call STICK and use a few times. My favorite system, durexForth, uses a split stack and so cannot support SP@ but does have PICK. I wrote a BURY in its provided assembler wordset, then used BURY to implement 2SWAP 2ROT 2NIP.

...I don't actually have any programs that use those words though. And I just benchmarked it and the PICK/BURY version of 2SWAP took about triple the time of the ROT >R ROT R> version. If I use VALUEs instead of literals it's only double the time. So maybe it's pointless. I just think it's weird that the standard has a fetch-from-downstack but no store-to-downstack.


[r1580] 2025-10-07 07:36:21 ruv replies:

referenceImplementation - Example implementation for PICK

I just think it's weird that the standard has a fetch-from-downstack but no store-to-downstack.

Yes, this can be considered as a slight gap in orthogonality. But since such a word can be defined using standard words, I don't think there's any point in standardizing it before it's used in practice.

Concerning the stack effect (API)

I would prefer the original one, because it is similar to other fetch/store words like (@, !), (2@, 2!), (defer@, defer!), etc.

The general stack diagrams of these words are:

  • fetch ( identifier -- data-object )
  • store ( data-object identifier -- )

And if you store a data object by an identifier, you should then read the same data by the same identifier. Your option violates this rule.

A feature of pick is that it interprets the underneath stack items as an array on which it operates. The same should be for the word that stores a value.

Concerning naming

In comp.lang.forth, this word was mentioned (Andrew Haley, 2013-01-29) as poke.

In the paper "Symbolic Stack Addressing" (Adin Tevet, 1989), this word is called post. The idea there is that a phrase n pick n post, where n is a non-negative integer less than the stack depth, does not change the state of the stack.
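
The fetch/store symmetry can be sketched in standard words. The definition below is only an illustration that assumes the original index convention, ( x.0 u.cnt*x x.1 u.cnt -- x.1 u.cnt*x ), where u counts the items skipped below the new value; it is not taken from any of the cited sources. In this convention u poke does what u+1 bury does in the shifted one.

```forth
\ Sketch of POKE with the original index convention: 0 POKE is NIP.
\ The u items between the target x.0 and the new value x.1 are left untouched.
: poke ( x.0 u*x x.1 u -- x.1 u*x )
  dup 0= if drop nip exit then   \ nothing to skip: overwrite the item just below
  rot >r 1- recurse r>           \ set one skipped item aside, recurse, restore it
;
```

With this definition, 10 20 30 2 pick 2 poke leaves 10 20 30 unchanged, while 10 20 30 99 2 poke leaves 99 20 30.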

I would prefer the name poke because:

  • it is a historical name for a function that stores a value in memory, so it has the corresponding semantic association (unlike post, stick and bury);
  • it is mainly a verb (unlike stick, which is mostly a noun);
  • its sound is closer to pick than post and bury;

The second, but much less preferable, option is stick.


[r1581] 2025-10-07 12:58:33 EricBlake replies:

comment - Interpretation of the top input parameter of PICK

The word pick has the type ( x.0 u.cnt*x u.cnt -- x.0 u.cnt*x x.0 ).

and it additionally documents that 0 pick is the same as dup, and 1 pick is the same as over.

The word roll has the type ( x.0 u.cnt*x u.cnt -- u.cnt*x x.0 ).

and it additionally documents that 1 roll is the same as swap, and 2 roll is the same as rot.

2pick must have the type ( xd.0 u.cnt*x u.cnt -- xd.0 u.cnt*x xd.0 );

2roll must have the type ( xd.0 u.cnt*x u.cnt -- u.cnt*x xd.0 );

But the standard also already has 2dup, 2over, 2swap, and 2rot. If I try to extrapolate the same mappings on the larger 2-cell types, without paying attention to your proposed stack diagram for 2pick and 2roll, my first guess would have been:

0 2pick would be the same as 2dup, and 1 2pick the same as 2over. 1 2roll would be the same as 2swap, and 2 2roll the same as 2rot.

That is, I would have assumed that since existing 2XXX words have stack effects that force the entire stack to be paired off up to the point of impact, that the same pairing effect would be present in your proposed 2pick and 2roll.

But I was pleasantly surprised to note that your proposal does NOT do that, but is actually more powerful by letting me control how many single-cell slots to skip before finally accessing a 2-cell object.

In particular, while 0 2pick does end up being the same as 2dup, your 1 2pick has an interesting single-cell stack effect ( x.a x.b x.c -- x.a x.b x.c x.a x.b ) that is not available from any one single standard word, but which I have wanted in my own code (I've ended up with things like dup 2over rot drop or over 3 pick swap to get the same effect). And it is not until you get to 2 2pick that you get the effect of 2over.

Similarly, your 1 2roll has the single-cell stack effect ( x.a x.b x.c -- x.c x.a x.b ) which is the well-known -rot (aka rot rot in the standard), and it is not until you get to 2 2roll that you finally accomplish 2swap, or 4 2roll that you accomplish 2rot.
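
These mappings can be tried out with a minimal sketch in standard words. The definitions below assume the stack diagrams quoted above, with u counting single cells to skip; neither word is part of the standard, and this is only one possible way to write them.

```forth
\ Sketch: copy the cell pair that lies below u single cells.
\ After the first PICK the stack is one cell deeper, so the same index
\ reaches the second cell of the pair.
: 2pick ( xd.0 u*x u -- xd.0 u*x xd.0 )  1+ dup >r pick r> pick ;

\ Sketch: move the cell pair that lies below u single cells to the top.
\ The first ROLL keeps the depth the same, so the same index works again.
: 2roll ( xd.0 u*x u -- u*x xd.0 )  1+ dup >r roll r> roll ;
```

Here 1 2 3 1 2pick leaves 1 2 3 1 2, 1 2 3 1 2roll leaves 3 1 2 (the -rot effect), and 1 2 3 4 5 6 4 2roll leaves 3 4 5 6 1 2 (the 2rot effect).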


[r1582] 2025-10-07 13:20:53 BerndPaysan replies:

proposal - Clarification for execution token

One additional problem with is, to and action-of is that there are two different implementation styles: one parses, one sets a state. Both work with your tests, but I'm sure you can come up with a distinguisher, e.g.

3 value x
: check-x  ['] to execute  x ;
t{ 5 check-x x -> 5 }t

The parsing variant will consume x during execution of to, the non-parsing one won't. Note that the standard mandates a parsing to, anyhow (at least by how it is worded), and we should not conflate obtaining an xt of to and executing it with suggested backdoors for non-conforming implementations.


[r1583] 2025-10-07 18:56:10 ruv replies:

proposal - Clarification for execution token

there are two different implementation styles: One parses, one sets a state.

Excellent point, thank you! This should also be noted in the rationale. A distinguisher should use two value-flavored variables; I will add a test case.

Formally, non-parsing implementations don't conform to either Forth-94 or Forth-2012, regardless of the xt of these words, since they fail in compilation state if their immediate argument is an immediate word (testcase). In practice, such cases simply do not occur.

Note that any non-parsing implementation can easily be converted to a parsing implementation. For example, if to is an immediate word:

: to  ['] to execute  parse-name evaluate ; immediate

I think that after 30 years, the standard should stop making concessions to implementations that don't do parsing in to, is, action-of (i.e., stop including the corresponding ambiguous conditions).

Rationale:

  • this allows programs to redefine these words;
  • this makes the standard Forth clearer and more consistent;
  • this supports efforts to reduce the number of ambiguous conditions;
  • changes in the affected Forth systems seem simple;

If there are no objections, I will add this into the next version.