Digest #319 2025-10-08
Contributions
The word pick has the type ( x.0 u.cnt*x u.cnt -- x.0 u.cnt*x x.0 ).
The word roll has the type ( x.0 u.cnt*x u.cnt -- u.cnt*x x.0 ).
The discussed word poke has the type ( x.0 u.cnt*x x.1 u.cnt -- x.1 u.cnt*x ).
In my other comment I wrote that pick "interprets the underneath stack items as an array on which it operates", and then u.cnt is an index in this array.
A more fundamental interpretation is that the input parameter u.cnt (in pick, poke, roll) represents the number of stack items that need to be "skipped" to locate the target input parameter x.0 (that is copied, taken, or overwritten).
A consequence of this is that:
- 2pick must have the type ( xd.0 u.cnt*x u.cnt -- xd.0 u.cnt*x xd.0 );
- 2poke must have the type ( xd.0 u.cnt*x xd.1 u.cnt -- xd.1 u.cnt*x );
- 2roll must have the type ( xd.0 u.cnt*x u.cnt -- u.cnt*x xd.0 ).
Rationale: the number of "skipped" stack items should not depend on the data type of the target parameter.
See also my other comment on ForthHub (2024-12-02) in this regard.
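The stack effect given for 2pick can be sketched on top of the standard pick. This is only an illustration under the interpretation above (u.cnt counts skipped single-cell items); the definition is not taken from any system or proposal.

```forth
\ Sketch: 2pick where u.cnt is the number of skipped cells,
\ independent of the size of the target item xd.0.
: 2pick ( xd.0 u.cnt*x u.cnt -- xd.0 u.cnt*x xd.0 )
  1+ dup >r   \ index of the deeper cell of xd.0
  pick        \ copy the deeper cell to the top
  r> pick     \ the upper cell of xd.0 is now at the same index
;

\ 1 2  10 20 30  3 2pick   leaves  1 2 10 20 30 1 2
\ 1 2  0 2pick             leaves  1 2 1 2   (same as 2dup)
```

With this definition, 0 2pick behaves like 2dup, just as 0 pick behaves like dup.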
I'm not exactly sure how to mention this without leaving a comment here, since it affects this definition. Draft 21.1, specifically the PDF version uploaded last month, has a formatting issue in the text of the EKEY? definition on page 102. The second line of the last sentence overflows into the footer at the bottom of the page, making the text and footer (including the page number) unreadable.
I don't think Draft PDF formatting issues fall into any Proposal category. If there's a more appropriate means of conveying the information, please let me know.
Replies
What's used even more frequently in Gforth is noname, which allows creating unnamed words (using latestxt afterwards to access the xt). And a lot of words with nextname or noname are users of create, that exist anyways. That's why Gforth uses this one-shot modified header creation instead of a variant of create.
execute-parsing works for the named part, but noname won't, unless we e.g. would allow that s" " ['] create execute-parsing would create an unnamed word instead of complaining about the missing name.
Execute-parsing cannot work for parsing an empty name to create because regular create produces an error when it tries to parse a name and there is none there. This is a difference from nextname: nextname replaces the parsing, so you can give it an empty name or a name containing spaces. OTOH, execute-parsing works for all parsing words (also for multiple invocations of parse-name or other basic parsing words), not just defining words, whereas nextname only works for defining words.
A possible generalization of both is to let a nextname-like word precharge the next invocation of parse-name or parse, and maybe allow precharging several invocations of these words in some way.
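The difference between the three mechanisms discussed in these replies can be sketched as follows. This is a Gforth-specific illustration: nextname, noname, latestxt and execute-parsing are Gforth words, not standard ones, and anon-xt is a hypothetical name introduced only for this example.

```forth
\ nextname replaces the parsing, so the name may contain a space.
s" foo bar" nextname create 1 ,

\ execute-parsing lets create parse "baz" from the given string.
s" baz" ' create execute-parsing 2 ,

\ noname makes the next definition unnamed; latestxt gives its xt.
noname create 3 ,
latestxt constant anon-xt

s" foo bar" find-name name>interpret execute @ .   \ 1
baz @ .                                            \ 2
anon-xt execute @ .                                \ 3
```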
requestClarification - What should the behavior be if the system has no hard-coded limit on size?
Is it better to include the word UNUSED and have it return -1 to indicate unknown (or rather, the maximum unsigned integer since the prototype uses u), or to omit the word UNUSED from the core extension set?
Of these two options, it is better to omit unused, since this word has type ( -- u ) and so it cannot return -1.
The program interprets the result of unused as an unsigned number, and the system must ensure that the corresponding amount of memory can be reserved in one or more subsequent calls to allot. Note that allot has type ( n -- ), and it reserves memory if n > 0, and releases memory if n < 0.
Thus, the system should pass the following testcase:
: u/ ( u u\0 -- u ) 0 swap um/mod nip ;
s" MAX-N" environment? invert [if] -1 2 u/ 1 - [then]
constant largest-n
: reserve ( u -- )
dup largest-n u< if ( +n ) allot exit then dup 2 u/ dup recurse - recurse
;
: release ( u -- )
dup largest-n u< if ( +n ) negate allot exit then dup 2 u/ dup recurse - recurse
;
t{ here unused 2dup reserve here swap - over = over release -> here unused true }t
Regarding unused and dynamic memory or unknown memory size: an alternative option is to provide a specific (possibly configured) amount of memory, which can be reserved with allot, and update the available memory on creating a definition.
An example (derived from another comment of mine):
\ private
2variable dictspace-dp \ the data pointer and border
: assume-dictspace ( addr u -- ) over + swap dictspace-dp 2! ;
\ public
: unused ( -- u ) dictspace-dp 2@ - ;
: here ( -- addr ) dictspace-dp @ ;
: allot ( n -- ) dictspace-dp +! ;
\ private
500 1024 * constant dictspace-unused-initial
1 1024 * constant dictspace-unused-low
: ensure-dictspace-reservation ( -- )
unused dictspace-unused-low u> if exit then
dictspace-unused-initial dup allocate throw swap assume-dictspace
;
\ initialization
0 0 assume-dictspace ensure-dictspace-reservation
\ public
: ; ( colon-sys -- ) postpone ; ensure-dictspace-reservation ; immediate
I think a standard Forth system is allowed to provide such an implementation.
Eric, thanks for the suggested corrections.
It is odd that most of this proposal (as of this revision) uses lower-case, but the Testing section (still) uses upper-case
Currently, all standard word names in the text of the Standard are spelled in uppercase. Therefore, I tried to use uppercase for word names in the parts of my proposal that should be included in the Standard. But I'm reluctant to use uppercase in programs and prose when possible.
Author
Ruv
Change Log
- 2023-10-22 Initial revision
- 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
- 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
- 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for LATEST-NAME-IN, change the status to "formal".
- 2024-06-20 Add a test case to check that LATEST-NAME returns different value after the compilation word list is switched.
- 2024-06-20 Simplify the normative text description, and add a rationale for this simplification.
- 2025-09-15 Add clause about findable words, add rationale sections in the proposal, address a question re immediate, note a bug in traverse-wordlist in some Forth systems, make some rewording and minor corrections, add a more general reference implementation.
- 2025-09-19 Make corrections from Eric Blake, mention find-name instead of search-wordlist, use lowercase in the test cases and in prose when possible, make some rewording in the prose, add some links.
Problem
In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and demanded.
To make such programs portable, we should introduce a standard method to obtain the most recently added word.
For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the recent definition, instead of redefining a lot of words to define such an access method yourself, or juggling with the input buffer and search.
See some examples in the Typical use section.
Also, a number of specific examples are provided in my post on ForthHub (those examples are not inserted here so as not to bloat the text).
Additionally, there have been many discussions regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather wrote on 2011-12-09 in comp.lang.forth:
AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.
Although it's a system necessity, I haven't found this of much value in application programming.
Elizabeth D. Rather
Indeed, depending on the system, the internal method may return the recent word depending on the compilation word list or independent of the compilation word list, a completed definition or an incomplete definition, an unnamed definition or only a named definition, and so on.
However, I believe that a standardized method has significant value for libraries and DSLs in application programming, as my examples should demonstrate.
Some known internal methods: latest ( -- nt|0 ), last @ ( -- nt|0 ), latestxt ( -- xt|0 ), etc.
Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition.
Solution
Let's introduce the following words:
latest-name-in ( wid -- nt|0 )
latest-name ( -- nt )
The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.
The second word returns the name token for the definition whose name was placed most recently into the compilation word list, or throws an exception if there is no such definition.
These words do not expose or limit any internal mechanism of the Forth system. They just provide information about word lists, like the words find-name-in, find-name, and traverse-wordlist do.
It's a kind of introspection/reflection.
These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal last, latest, or whatever it was using before.
It seems the best place for these words is the section 15.6.2 Programming-Tools extension words, where traverse-wordlist is also placed.
Rationale
Connection with word lists
By considering definitions in the frame of a word list only, we solve several problems, namely:
- A word list contains only completed definitions (see the accepted proposal #153 Traverse-wordlist does not find unnamed/unfinished definitions). This eliminates the question of whether the word of the returned nt is finished: yes, it is always finished (completed).
- Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).
- An extension or library can create definitions in its internal word list for internal purposes, and this will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from latest-name (regardless of what words are created by this library for internal purposes on the fly). For example, when different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that latest-name returns.
Returned values
As a matter of practice, almost all the use cases for the word latest-name imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return 0 by this word only burdens users with having to analyze this zero, or redefine this word as:
: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;
If the user needs to handle the case where the compilation word list is empty, they can use the word latest-name-in as:
get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then
Implementation options
If the word list structure in a Forth system contains information about the latest placed definition, the implementations for the proposed words are trivial.
In some plausible Forth systems the word list structure doesn't directly contain information about which definition was placed into the word list most recently, and this information cannot be obtained indirectly. Such systems might not provide the proposed words, or they can be changed to keep this information in the word list structure. It seems that in most systems the word list structure directly contains this information, or this information can be obtained indirectly.
Some checked systems:
- Gforth, minForth, ikForth, SP-Forth, Post4: a word list keeps information about the definition that was placed in it most recently;
- SwiftForth, VFX: the most recently placed word in a word list can be correctly obtained from the strands/threads (since nt values are monotonically increased);
- lxf/ntf 2017: the most recently placed word in a word list can be obtained using traverse-wordlist (since nt values are monotonically increased).
Note that some systems have a bug in traverse-wordlist so it can return the nt for a definition that cannot be found (namely, for the current definition). This is incorrect (see a testcase).
If a system does not implement the optional Search-Order word set, it might not provide the word latest-name-in.
Naming
The names latest-name-in and latest-name of the new words are similar in form to find-name-in and find-name. Stack effects are also similar.
The difference is that find-xxx is a verb phrase that starts with a verb, but latest-xxx is a noun phrase that starts with an adjective (see Wiktionary/latest).
Both the English words "find" and "latest" have historically been used in Forth word names, as is "name".
In Forth-84 "name" in word names denoted NFA (Name Field Address), and now it denotes a name token, which is the successor of NFA.
In all standard words, e.g. find-name, name>string, name>compile, etc. (except parse-name), "name" denotes a name token.
NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see Wiktionary).
Normative text description
The proposed normative text description is based on:
- 16.2: "compilation word list: The word list into which new definition names are placed",
- 15.3.1: "A name token is a single-cell value that identifies a named word",
- 3.4.3: "[Semantics] are largely specified by the stack notation in the glossary entries, which shows what values shall be consumed and produced. The prose in each glossary entry further specifies the definition's behavior" (there is no need to repeat in the text description what is already indicated in the stack diagrams). (emphasis added)
Throw code description
If the throw code description states that there is no latest name, it can be confusing, since a latest name in some sense probably always exists.
Therefore, it's better to say: "the compilation word list is empty" — it is what actually happens.
Motivation for latest-name-in
- It's a natural factor for latest-name. It's always possible to extract this factor from the implementation of latest-name, because the latter returns nt from the compilation word list, and the system should take wid of the compilation word list and extract the most recent nt from this word list.
- It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
- In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as entry point, like main, or as the default exported word from a module).
- Both of these words are optional. And if latest-name-in is not provided, it can be implemented in a portable way via latest-name as:
: latest-name-in ( wid -- nt|0 ) get-current >r set-current ['] latest-name catch if 0 then r> set-current ;
Things to discuss
Is it worth introducing the word latest-name-xt ( -- xt )?
If name>interpret never returns 0 (see my comment), this word can be implemented as:
: latest-name-xt ( -- xt ) latest-name name>interpret ;
The desired (and much discussed) pattern is:
defer bar
: foo ... ; latest-name-xt is bar
Sometimes the name "it" has been suggested for this word, but this name is too short and has more chance for conflicts. Guido Draheim wrote in comp.lang.forth on 2003-03-16:
I think that everyone has been thinking of using IT for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.
I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implemenation but coincides with a common term of the application.
This seems convincing to me.
Typical use
: struct: ( "name" -- wid.compilation.prev u.offset )
get-current vocabulary
also latest-name name>interpret execute definitions
0
;
: ;struct ( wid.compilation.prev u.offset -- )
s" __size" ['] constant execute-parsing
set-current
;
The word execute-parsing ( i*x c-addr u "ccc" -- j*x ) is a well-known word; see an implementation at https://theforth.net/.
\ In the application's vocabulary
: it ( -- xt ) latest-name name>interpret ;
defer foo
\ ...
: bar ... ; it is foo
Proposal
Changes in existing sections
Add the following line into the Table 9.1: THROW code assignments:
-80    the compilation word list is empty
Editorial note: the actual throw code may change.
New glossary sections
Add the following sections into 15.6.2 Programming-Tools extension words:
15.6.2.xxxx LATEST-NAME-IN
TOOLS EXT
( wid -- nt|0 )
If the word list identified by wid is empty, then the returned value is 0; otherwise, the name token nt identifies the definition whose name was placed most recently into the word list wid.
Note: nt can only be returned for a definition that can be found in wid.
See also: 15.6.2.xxxx LATEST-NAME, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.
15.6.2.xxxx LATEST-NAME
TOOLS EXT
( -- nt )
If the compilation word list is not empty, the name token nt identifies the definition whose name was placed most recently into this word list.
Otherwise, the exception code -80 is thrown.
Note: nt can only be returned for a definition that can be found in the compilation word list.
See also: 15.6.2.xxxx LATEST-NAME-IN, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.
New rationale sections
Add the following sections into A.15.6 Glossary:
A.15.6.2.xxxx LATEST-NAME-IN
The word latest-name-in cannot return an nt that cannot be obtained using find-name or traverse-wordlist applied to the same word list.
See also: A.15.6.2.xxxx LATEST-NAME.
A.15.6.2.xxxx LATEST-NAME
The word latest-name cannot return an nt that cannot be obtained using find-name or traverse-wordlist applied to the compilation word list.
In some Forth systems the word : (colon) places an nt into the compilation word list and makes it hidden (unfindable).
This nt must not be available for traverse-wordlist and for latest-name.
Thus, formally, only the words ; (semicolon), does>, and ;code are allowed to add the nt of a definition created with : (colon) to the compilation word list.
If a Forth system does not provide the optional Search-Order word set, and in that Forth system the word immediate moves an nt from one internal word list to another, this must not affect what latest-name returns, and this must not affect what find-name returns (for example, consider a case where two last words have the same name and immediate is used for the latest one).
Thus, after execution of immediate, latest-name shall return the same value as before this execution.
Typical use
: var ( "name" -- )
variable
0 latest-name name>interpret execute !
;
Reference implementation
In this implementation for latest-name-in we assume that a wid is the address of a cell that contains the nt of the definition whose name was placed most recently into this word list.
: latest-name-in ( wid -- nt|0 ) @ ;
In this implementation for latest-name-in we assume that the values of nt, interpreted as unsigned numbers, monotonically increase when sorted chronologically (it works on most systems):
: umax ( u1 u2 -- u.max ) 2dup u< if swap then drop ;
: latest-name-in ( wid -- nt|0 )
>r 0 [: umax true ;] r> traverse-wordlist
;
An implementation for latest-name:
: latest-name ( -- nt )
get-current latest-name-in dup if exit then -80 throw
;
Testing
: it ( -- xt ) latest-name name>interpret ;
wordlist constant wl1
t{ : ln1 ; it ' ln1 = -> true }t
t{ get-current latest-name-in ' ln1 = -> true }t
t{ :noname [ it ] literal ; execute ' ln1 = -> true }t
t{ : ln2 [ it ] literal ; ln2 ' ln1 = -> true }t
t{ wl1 latest-name-in -> 0 }t
get-current wl1 set-current ( wid.prev )
t{ ' latest-name catch -> -80 }t
t{ : ln3 ; -> }t
set-current
t{ it ' ln2 = -> true }t
Errata
In the fragments:
The word latest-name-in cannot return an nt that cannot be obtained using find-name or traverse-wordlist
The word latest-name cannot return an nt that cannot be obtained using find-name or traverse-wordlist
Read find-name as find-name-in.
Note: in general, find-name can also be applied to a specific word list by placing only that word list in the search order.
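The note above can be sketched as follows. The word find-name-via-order is a hypothetical name for this illustration; it assumes that find-name ( c-addr u -- nt|0 ) and the Search-Order word set are available. If find-name throws, the search order is not restored in this simple version.

```forth
\ Apply find-name to a single word list by temporarily making it
\ the only word list in the search order.
: find-name-via-order ( c-addr u wid -- nt|0 )
  >r 2>r get-order 2r> r>   ( widn..wid1 n c-addr u wid )
  1 set-order find-name     ( widn..wid1 n nt|0 )
  >r set-order r>           \ restore the saved search order
;
```

On systems that provide find-name-in, the same result is obtained directly, without touching the search order.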
@GeraldWodni wrote:
Personally I need it to create multiple words with a common prefix.
I used execute-parsing and holds to compose a name string (example).
So, I would implement your register: in the following way:
: register: ( addr "prefix" -- )
>r \ save the address
parse-name 2dup
\ create a store-word
"set-" <# 2swap holds holds 0 0 #>
['] create execute-parsing r@ , [: does> @ ! ;] execute
\ create a fetch-word
['] create execute-parsing r@ , [: does> @ @ ;] execute
rdrop
;
\ Example of use
variable uart-addr \ pretend to be a hardware register
uart-addr register: uart
'h' set-uart
uart emit
This solution is already standard-compliant, because execute-parsing can be implemented in a standard program.
(see also my rationale against uart! and uart@ names for this case).
In VFXForth it is called ($create)
SP-Forth provides the word created ( sd.name -- ), so the pair (create, created) is similar in form to the pair of standard words (include, included).
But if we want to standardize the pure postfix variant of create, why don't we standardize the postfix variants of other defining words?
I believe that standardizing execute-parsing makes more sense than standardizing non-parsing-create, since execute-parsing solves the problem for all defining words, including user-defined words.
As for words whose name is an empty string, I don't see any demand for creating such words in programs. Can anyone provide some examples?
In general, more useful Forth words (than non-parsing-create) are:
- a Forth word that builds a new named definition from a name (a string) and an xt that identifies the execution semantics for the new definition;
  - E.g.: enlist-word ( xt sd.name -- )
  - It may allow sd.name to be empty.
- a Forth word that builds a new anonymous definition from an xt1 and x2 producing xt2, so applying >body to xt2 gives x2, and executing xt2 places x2 on the stack and executes xt1;
  - E.g.: bind ( x xt1 -- xt2 )
These words provide the functionality of create ... does> ... ; and a little more.
Regarding the reference implementation of non-parsing-create via evaluate: it's incorrect because it depends on the search order.
The following test case should be passed:
t{ : foo get-order 0 set-order s" bar" non-parsing-create set-order does> drop 123 ; foo bar -> 123 }t
proposal - New words: latest-name and latest-name-in
Why does LATEST-NAME-IN return 0 in case of empty wordlist, and LATEST-NAME throws an error? I would expect a consistent interface: Either both throw or none throw.
Bernd, see the Rationale / Returned values sub-section, it says:
As a matter of practice, almost all the use cases for the word latest-name imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return 0 by this word only burdens users with having to analyze this zero
Typically, when we use latest-name, we assume that the compilation word list is not empty and don't analyze the returned value. It is better if this assumption is formally supported.
the underlying system might provide PLACE or +PLACE, but with a different behaviour, rendering the rest of the standard program invalid.
A possible option is to introduce an optional word set or status "Deprecated" (or "Discouraged") specifically for such historical words. Then, in this section/status specify place and +place. This will reserve the names of these words and prevent Forth systems from providing words with these names but with different behavior.
I think we should fix the following problems.
The term "translation"
The term translation is not suitable to denote the general type of a recognizer's result, since "translation" is either an act of translating or a product of translating (not recognizing). Even the term "recognition" is more suitable, if someone likes it.
Another possible option: "recognized", which will be used as a nominalized adjective (i.e., a noun).
We also need a separate term to denote the type of the topmost x value of a successful recognizing result.
The scheme translate-something
The naming scheme translate-something is not suitable for words that have type ( -- x ) and are constants.
- Effectively, any member of this naming scheme is a verb phrase; this scheme was intended for words that perform translation (interpretation or compilation), which is an active action with possible side effects.
  - For example, translate-nt ( i*x nt -- j*x ).
A word that is a constant should have a name that is a noun or a noun phrase.
This naming scheme should be aligned with the corresponding general data type name/symbol.
The names get-recs and set-recs
The pair of words (get-recs, set-recs) is similar to the pair of standard words (get-order, set-order) in the form of their names, but very different conceptually, since they accept the object on the top. This is an inconsistency in naming conventions.
Better naming options are:
- recs@ and recs!
  - "recs" in these names denotes the pair of types at once:
    - the type of a data object that is fetched or stored
    - the type of a target data object
  - it is also similar to some extent to the pairs of standard words (defer@, defer!), (c@, c!), (2@, 2!)
    - see also my post on ForthHub in this regard.
- fetch-recs and store-recs
The names translate: and rec-sequence:
The corresponding words are proposed as defining words.
Traditionally, a colon was only used in the names of standard defining words that have a counterpart word with a semicolon in the name.
So, this name is inconsistent with other names. Note that this tradition was broken by the new "*field:" words (but not +field).
- Can we avoid a colon in the defining words that don't have a counterpart word with a semicolon?
The name rec-sequence: is too close to rec-sequence that is a member of the rec-something naming scheme. This is inconsistent and confusing.
- A possible option: recs, an abbreviation of "recognizers sequence", which is "sequence of recognizers".
  - Maybe it is better if this word was like wordlist, which produces a new identifier on the stack without creating a word.
Author
Ruv
Change Log
(the latest at the top)
- 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests.
- 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
- 2022-08-13 Initial version
Preceding history
(the latest at the top)
2022-08-12 [249] Revert rewording the term "execution token" (proposal, retracted on 2025-09-28).
2021-09-08 [157] Reword the term "execution token" (proposal, accepted on 2021-09-13).
2021-09-08 [212] Tick and undefined execution semantics - 2 (proposal, considered on 2024-09-26).
2020-10-29 [163] Tick and undefined execution semantics (proposal, retracted on 2025-09-12)
2020-09-03 An attempt to solve the problem in NAME>INTERPRET by changing the meaning of "execution token": [157] Reword the term "execution token"
2020-02-20 Pointing out a problem in NAME>INTERPRET wording
2019-10-08 [122] Clarify FIND, more classic approach (proposal, in progress)
Problem
By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.
Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.
Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.
Solution
Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.
It is so because for any execution token there exists at least one named or unnamed Forth definition the execution semantics of which are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. They need not be connected to the same Forth definition. Also, consequently, it's impossible that an execution token identifies some compilation semantics, or some interpretation semantics, but doesn't identify the equivalent execution semantics.
Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have an unnamed (anonymous) definition with equivalent execution semantics.
Some examples of semantics that cannot be identified by an execution token:
- typically, the run-time semantics of if (an instance of);
- in some systems (in which compile, is equivalent to postpone literal postpone execute), the execution semantics of >r;
- in some systems (where FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname.
Of course, the standard allows such implementations and disallows programs to obtain an execution token of the corresponding semantics, or even does not provide a way to obtain it.
To solve the initial problem we can formally state these basics explicitly in a normative part, and specify what semantics are identified by the execution token of a word.
Also, we should update the definition of "execution token" term to say what it identifies.
Example
: foo postpone if ;
:noname postpone if ; ( xt )
The execution semantics of foo are equivalent to the compilation semantics for if.
At the same time, a Forth system may provide system-dependent execution semantics for if that are not equivalent to the execution semantics of foo.
xt, which is left on the stack in the second line, identifies the execution semantics of an anonymous Forth definition, and these execution semantics are equivalent to the compilation semantics for if.
Typical use
- "xt identifies the compilation semantics for the word FOO"
  - It means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
  - At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
- "the execution token for the word BAR"
  - It means that this execution token identifies the execution semantics of the word BAR.
  - Note: if the standard does not define interpretation semantics for BAR, the execution token of BAR could identify some system-specific execution semantics, because an ambiguous condition could occur (4.1.2) when the program obtains the execution token of BAR.
- "xt identifies the interpretation semantics for the word BAZ"
  - It means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
  - At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
- The execution semantics identified by xt are equivalent to the interpretation semantics for the word BAZ.
  - This seems pretty clear.
Incorrect use
Actually, the standard contains only one place where the term "execution token" is used ambiguously in a normative part: the glossary entry for FIND.
It says that FIND returns the execution token for the word ("its execution token", so it should identify the execution semantics of the word), but:
- FIND may return two different tokens (one while compiling and another while not compiling) for the same word, which may identify different semantics (then at least one of them does not identify the execution semantics of the word);
- in some cases, none of the returned execution tokens identifies the specified execution semantics of the word (for example, for the word >r in some systems).
In another glossary entry, for NAME>INTERPRET, the language is just slightly non-normative, since it uses the form "xt represents" instead of the form "xt identifies".
These glossary entries also have some other problems, so they should be corrected anyway; my other proposals for that are in progress.
Proposal
Update "execution token" term
In the section 2.1 Definitions of terms, change:
execution token: A value that can be passed to EXECUTE (6.1.1370)
into
execution token: A value that identifies the execution semantics of a definition.
Update "execution token" data type description
In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:
For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.
The execution semantics identified by an execution token can be equivalent to the interpretation semantics or compilation semantics for some word, or to some run-time semantics. In such a case this execution token also identifies the corresponding interpretation semantics, compilation semantics, or run-time semantics.
It is not required that every specified semantics be identified by some execution token in the system.
The execution token of a Forth definition, if available, identifies the execution semantics of that definition, which are either specified by this standard or implementation dependent (if permitted).
Update "Execution semantics" notion
In the section 3.4.3.1 Execution semantics,
Change the paragraph:
The execution semantics of each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
into
The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
Rationale: for some Forth definitions execution semantics are not specified.
After that, add the following two paragraphs:
If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise the execution token of that definition, if available, identifies the implementation dependent execution semantics.
The implementation dependent execution semantics of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition. An ambiguous condition exists if they are performed in compilation state.
Rationale: until words like >r are defined using an "Execution" section, we have to rely on the absence of an "Interpretation:" section to refer to ordinary words (commented on 2019-06-21, 2020-08-30).
Update glossary entries
In the glossary entries 6.2.2295 TO, 6.2.1725 IS, and 6.2.0698 ACTION-OF, replace the phrase:
An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied to TO.
with the phrase:
An ambiguous condition exists if POSTPONE or [COMPILE] are applied to TO.
Testing
t{ ' s" execute abc" s" abc" compare -> 0 }t
t{ 1 value x 2 ' to execute x x -> 2 }t
t{ : (to) [ ' to compile, ] ; 3 (to) x x -> 3 }t
No one has written down any objections to this proposal for three years. I have now incorporated it into my proposal [251] Clarification for execution token.
We discussed your request for clarification for a while, going over various reasons why changing from +n to u would not make a difference, or under what circumstances it would. Eventually I suggested that we just follow your suggestion, and we reached consensus on that.
Thinking about it again: In general, u is preferable to +n, because +n leaves it undefined what happens for negative n. +n is the right choice in cases where system behaviour varies for negative n, but otherwise, we should either specify u or n with a specific behaviour for negative n.
Author
Ruv
Change Log
(the latest at the top)
- 2025-09-29 Better wording in some places; corrections; update in ambiguous conditions.
- 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests.
- 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
- 2022-08-13 Initial version
Preceding history
(the latest at the top)
2022-08-12 [249] Revert rewording the term "execution token" (proposal, retracted on 2025-09-28).
2021-09-08 [157] Reword the term "execution token" (proposal, accepted on 2021-09-13).
2021-09-08 [212] Tick and undefined execution semantics - 2 (proposal, considered on 2024-09-26).
2020-10-29 [163] Tick and undefined execution semantics (proposal, retracted on 2025-09-12)
2020-09-03 An attempt to solve the problem in NAME>INTERPRET by changing the meaning of "execution token": [157] Reword the term "execution token"
2020-02-20 Pointing out a problem in NAME>INTERPRET wording
2019-10-08 [122] Clarify FIND, more classic approach (proposal, in progress)
Problem
By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.
Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.
Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.
Solution
Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.
This is so because for any execution token there exists at least one named or unnamed Forth definition whose execution semantics are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. They need not be connected to the same Forth definition. Also, consequently, it's impossible that an execution token identifies some compilation semantics, or some interpretation semantics, but doesn't identify the equivalent execution semantics.
Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have a Forth definition with equivalent execution semantics.
Some examples of semantics that cannot be identified by an execution token:
- typically, the run-time semantics of if (an instance of);
- in some systems (in which compile, is equivalent to postpone literal postpone execute), the execution semantics of >r;
- in some systems (where the FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname.
Of course, the standard allows such implementations: it either disallows programs from obtaining an execution token for the corresponding semantics, or provides no way to obtain one at all.
To solve the initial problem we can formally state these basics explicitly in a normative part, and specify what semantics are identified by the execution token of a word.
Also, we should update the definition of "execution token" term to say what it identifies.
Example
: foo postpone if ;
:noname postpone if ; ( xt )
The execution semantics of foo are equivalent to the compilation semantics for if.
At the same time, a Forth system may provide system-dependent execution semantics for if that are not equivalent to the execution semantics of foo.
xt, which is left on the stack in the second line, identifies the execution semantics of an unnamed Forth definition, and these execution semantics are equivalent to the compilation semantics for if.
Typical use
"xt identifies the compilation semantics for the word FOO"
- It means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
- At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
"the execution token for the word BAR"
- It means that this execution token identifies the execution semantics of the word BAR.
- Note: if the standard does not define interpretation semantics for BAR, the execution token of BAR could identify some system-specific execution semantics, because an ambiguous condition could occur (4.1.2) when the program obtains the execution token of BAR.
"xt identifies the interpretation semantics for the word BAZ"
- It means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
- At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
"The execution semantics identified by xt are equivalent to the interpretation semantics of BAZ."
- This seems pretty clear.
Incorrect use
Actually, the standard contains only one place where the term "execution token" is used ambiguously in a normative part: the glossary entry for FIND.
It says that FIND returns the execution token of the Forth definition ("its execution token", so it should identify the execution semantics of that definition), but:
- FIND may return two different tokens (one while compiling and another while not compiling) for the same string (and, formally, the same Forth definition), which may identify different semantics; then at least one of them does not identify the execution semantics of the definition;
- in some cases, none of the returned execution tokens identifies the specified execution semantics of the word (for example, for the word >r in some systems).
In another glossary entry, for NAME>INTERPRET, the language is just slightly non-normative, since it uses the form "xt represents" instead of the form "xt identifies".
These glossary entries also have some other problems, so they should be corrected anyway; my other proposals for that are in progress.
Proposal
Update "execution token" term
In the section 2.1 Definitions of terms, change:
execution token: A value that can be passed to EXECUTE (6.1.1370)
into
execution token: A value that identifies the execution semantics of a definition.
Update "execution token" data type description
In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:
For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.
The execution semantics identified by an execution token may be equivalent to the interpretation semantics or compilation semantics for some word, or to some run-time semantics. In such a case this execution token also identifies that interpretation semantics, compilation semantics, or run-time semantics.
It is not required that every specified semantics be identified by some execution token in the system.
The execution token of a Forth definition, if available, identifies the execution semantics that are either specified by this standard for that definition or are implementation dependent (if permitted).
If the interpretation semantics for a Forth definition are defined by this standard, the execution token of that definition shall be available.
Update "Execution semantics" notion
In the section 3.4.3.1 Execution semantics,
Change the paragraph:
The execution semantics of each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
into
The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
Rationale: for some Forth definitions execution semantics are not specified.
After that, add the following two paragraphs:
If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise the execution token of that definition, if available, identifies the implementation dependent execution semantics.
The implementation dependent execution semantics of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition. An ambiguous condition exists if they are performed in compilation state.
Rationale: until words like >r are defined using an "Execution" section, we have to rely on the absence of an "Interpretation:" section to refer to ordinary words (commented on 2019-06-21, 2020-08-30).
Update ambiguous conditions
In the section 4.1.2 Ambiguous conditions, replace the phrase:
attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;
with the phrase:
attempting to obtain the execution token with 6.1.0070 ' or 6.1.2510 ['] of a definition with undefined interpretation semantics;
Rationale: find (in interpretation state) and search-wordlist return either the execution token of the word or zero, and there is no ambiguous condition in this regard.
Update glossary entries
In the glossary entries 6.2.2295 TO, 6.2.1725 IS, and 6.2.0698 ACTION-OF, replace the phrase:
An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied.
with the phrase:
An ambiguous condition exists if POSTPONE or [COMPILE] are applied.
Testing
t{ ' s" execute abc" s" abc" compare -> 0 }t
t{ 1 value x 2 ' to execute x x -> 2 }t
t{ : (to) [ ' to compile, ] ; 3 (to) x x -> 3 }t
I agree with @ruv that "translation" doesn't quite fit and finding suitable terms is a real challenge. I utilized this proposal, BerndPaysan's retired recognizer proposal, FORTH Inc.'s recognizer page, and the comments here and on the mailing list.
Suggestions short summary
Remove the "translation" term because it obfuscates the possible outputs, explained below.
Because a "translation token" is a table of run-time actions, "run-time action table" (rat) would seem appropriate, explained below.
A recognizer definition is proposed below.
It doesn't seem that the connection between a recognizer's pattern and the rest of the steps is really discussed. Matching a text token to the pattern is the first step. The parameters are then fetched according to the recognizer's pattern. The run-time action table is associated with a specific pattern parameter.
Recognizer term
From this proposal and FORTH, Inc.'s write up, the following seems to be how to design a recognizer:
- Determine the text pattern of the data.
- E.g. complex numbers follow an "a+bi" pattern.
- Create a pattern matching algorithm for the text pattern.
- Determine pattern parameters to be fetched.
- Determine the run-time action tables to pair with the fetched pattern parameters.
Recognizer definition proposal: A recognizer attempts to match a text token to a pattern. A successful text token match invokes fetching the pattern parameters and the associated run-time action table. A failed matching attempt outputs a rat-none. The text interpreter (and other users, such as postpone) utilizes the run-time action table to perform either the interpreting run-time, compiling run-time, or postponing run-time.
make-rec-sequence ( xtu .. xt1 u "name" -- ) (formerly rec-sequence:)
rec-name ( c-addr u -- xt rat | rat-none )
rec-num ( c-addr u -- i*x rat | rat-none )
rec-none ( c-addr u -- rat-none )
I agree with @ruv's suggestions for get-recs and set-recs, i.e. recs@ & recs!.
Translation Term
Translation seems to hide information. The relationship between the pattern parameters and the run-time action table is fixed. Because different recognizers produce different outputs, using "translation" as a catchall obscures the output, rather than listing the output i*x rat, xt rat, etc.
run-time action table (formerly "translation token"): Single-cell item that contains the run-time actions associated with specific pattern parameters, i.e. interpreting run-time, compiling run-time, and postponing run-time. (This has formerly been called a rectype or translation token. It's a table of run-time actions.)
make-rat ( xt-int xt-comp xt-post "name" -- ) (formerly translate:)
rat-word ( -- rat ) (formerly translate-word)
pattern parameters: the optional set of data fetched after a successful text token match. The set is on various stacks below the run-time action table. (This could use a better name, not sure if it's really needed, but it helped my thinking.)
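To make the "table of run-time actions" idea concrete, here is a minimal sketch in standard Forth. It assumes that a rat is simply the address of three consecutive cells holding the three xts. The cell layout, the accessors rat-int, rat-comp, rat-post, the word perform-rat, and the demonstration table rat-demo with its helpers keep, lit, and post-lit, are all hypothetical names invented for this sketch; none of them comes from the proposal or from FORTH, Inc.'s write-up.

```forth
\ Sketch only: the cell layout and all names except make-rat are assumptions.
: make-rat ( xt-int xt-comp xt-post "name" -- )
  create  rot , swap , , ;          \ cells in order: int, comp, post

: rat-int  ( rat -- xt )  @ ;
: rat-comp ( rat -- xt )  cell+ @ ;
: rat-post ( rat -- xt )  2 cells + @ ;

\ Pick the interpreting or compiling action according to STATE.
: perform-rat ( i*x rat -- j*x )
  state @ if rat-comp else rat-int then  execute ;

\ Demonstration table for a single-cell number:
\ interpreting keeps it on the stack, compiling compiles a literal,
\ postponing compiles code that will later compile the literal.
: keep      ( x -- x ) ;
: lit,      ( x -- )  postpone literal ;
: post-lit, ( x -- )  lit,  ['] lit, compile, ;

' keep ' lit, ' post-lit, make-rat rat-demo
```

With this layout, rat-demo leaves the table address, so in interpretation state 42 rat-demo perform-rat simply leaves 42 on the stack. A real implementation may well choose a different representation.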
I walk through the examples below with the notes above.
Example: REC-NAME
FORTH, Inc. has this example:
' EXECUTE ' COMPILE, ' POSTPONE, TRANSLATE: TRANSLATE-WORD
' EXECUTE ' EXECUTE ' COMPILE, TRANSLATE: TRANSLATE-IMM
: REC-NAME ( c-addr len -- xt addr1 | addr2 )
(FIND) CASE
-1 OF TRANSLATE-WORD ENDOF
1 OF TRANSLATE-IMM ENDOF
0 OF TRANSLATE-NONE ENDOF
ENDCASE ;
Compared to the steps above:
- Data to be handled is "words in general".
- The pattern is a word that is in the dictionary.
- The pattern parameters fetched could be:
  - xt 1
  - xt -1
  - c-addr 0
- Pattern parameters are associated to rats as follows (matching the CASE clauses above):
  - -1 to TRANSLATE-WORD
  - 1 to TRANSLATE-IMM
  - 0 to TRANSLATE-NONE (originally, NOTFOUND).
(FIND) completes both Steps 2 & 3. The rat output is based on the pattern parameters fetched, not the pattern being matched.
Example: REC-TICK
From the proposal:
: rec-tick ( addr u -- translation ) \ gforth-experimental
over c@ '`' = if
1 /string find-name dup if
name>interpret translate-cell exit then
drop translate-none exit then
rec-none ;
Walking through the steps:
- The data to be handled is a ticked word.
- The pattern is a name in the dictionary.
- The pattern parameters fetched by 1 /string find-name could be:
  - nt
  - 0
- Pattern parameters are associated to rats as follows:
  - nt to translate-cell
  - 0 to translate-none
The rat output is based on the pattern parameters fetched, not the pattern being matched.
2drop translate-none seems clearer than rec-none. I keep getting caught looking at the rec-tick example thinking "what is rec-none recognizing?"
Example Observations
- Pattern matching and pattern parameter fetching can be combined or separate words.
- It would be reasonable to have a failed pattern parameter fetch be an error. A pitfall of creating recognizers is ensuring there is little to no overlap of patterns.
  - E.g. 'bob is the name as defined, processed by rec-name. 'stan is the ticked version of stan, processed by rec-tick.
  - E.g. rec-none could be the final recognizer in recognizer sequences, exiting any further evaluation. Instead of creating a new sequence, one could move rec-none earlier in the sequence.
Thank you for reading this far, hopefully there is more food for thought than madness.
The committee has accepted this as a non-substantive change in the 2025 meeting with vote #39: 8Y:0N:0A.
Author
Ruv
Change Log
(the latest at the top)
- 2025-09-30 Add some rationale; correct some typos and grammar mistakes; minor rewording; add consequences.
- 2025-09-29 Better wording in some places; corrections; update in ambiguous conditions.
- 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests. According to my comment on 2024-09-24, this was planned.
- 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
- 2022-08-13 Initial version
Preceding history
(the latest at the top)
2022-08-12 [249] Revert rewording the term "execution token" (proposal, retracted on 2025-09-28).
2021-09-08 [157] Reword the term "execution token" (proposal, accepted on 2021-09-13).
2021-09-08 [212] Tick and undefined execution semantics - 2 (proposal, considered on 2024-09-26).
2020-10-29 [163] Tick and undefined execution semantics (proposal, retracted on 2025-09-12)
2020-09-03 [157] Reword the term "execution token" (proposal, replaced on 2021-09-08; an attempt to solve the problem by changing the term)
2020-02-20 [129] NAME>INTERPRET wording (proposal, in progress; indication of a problem)
2019-10-08 [122] Clarify FIND, more classic approach (proposal, in progress)
Problem
By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.
Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.
Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.
Solution
Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.
Example 1
:noname postpone if ; ( xt )
- xt, which is left on the stack, identifies the execution semantics of this unnamed Forth definition.
- The execution semantics of this definition are equivalent to the compilation semantics for if.
- Then, this xt also identifies the compilation semantics for if.
Example 2
: foo postpone if ;
- xt of foo identifies the execution semantics of foo.
- The execution semantics of foo are equivalent to the compilation semantics for if (this follows from the standard).
- Then, this xt also identifies the compilation semantics for if.
Note that the Forth system may provide system-dependent execution semantics for if that are not equivalent to the execution semantics of foo.
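As a hedged illustration of what "this xt also identifies the compilation semantics for if" means in practice, the sketch below keeps the xt from Example 1 in a constant and executes it from an immediate word, so that the compilation semantics for if are performed while a definition is being compiled. The names if-comp-xt, my-if, and classify are made up for this sketch and do not come from the proposal.

```forth
\ Sketch only: if-comp-xt, my-if and classify are hypothetical names.
:noname postpone if ; constant if-comp-xt   \ the xt from Example 1

\ Executing the xt while compiling performs the compilation semantics for IF.
: my-if ( C: -- orig )  if-comp-xt execute ; immediate

\ MY-IF can now stand in for IF inside a definition.
: classify ( n -- -1|1 )  0< my-if -1 else 1 then ;
```

Here -5 classify leaves -1 and 7 classify leaves 1, exactly as if the definition had been written with if itself.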
Reasoning
Thus, for any execution token there exists at least one Forth definition (named or unnamed) whose execution semantics are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally (or intentionally) these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. They need not be connected to the same Forth definition. Also, consequently, it's impossible that an execution token identifies some compilation semantics, or some interpretation semantics, but doesn't identify the equivalent execution semantics.
Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have a Forth definition with equivalent execution semantics.
Examples of semantics that cannot be identified by an execution token:
- typically, the run-time semantics of if (an instance of);
- in some systems (in which the phrase postpone literal postpone execute is equivalent to compile,), the formally specified execution semantics of >r;
- in some systems (where the FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname.
Of course, the standard allows such implementations: it either disallows programs from obtaining an execution token for the corresponding semantics, or provides no way to obtain one at all.
Roadmap
To solve the initial problem we can
- formally and explicitly state the basics described above,
- specify what particular semantics are identified by the execution token of a word (in what cases are they defined by the standard and when by the implementation, and to what extent).
Also, we should update the definition of "execution token" term to say what it identifies.
Typical use
"The execution semantics identified by xt are equivalent to the interpretation semantics of BAZ"
- This seems pretty clear.
"xt identifies the interpretation semantics for the word BAZ"
- This means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
- At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
"xt identifies the compilation semantics for the word FOO"
- This means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
- At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
"xt of the word BAR"
- This means that xt identifies the execution semantics of the word BAR.
- Whether this xt also identifies the compilation semantics, or the interpretation semantics, or both of them, or neither, for the word BAR depends on the word BAR (on how it is defined or specified).
- Regardless of how BAR is defined, executing xt in interpretation state performs the interpretation semantics for BAR.
Incorrect use
Actually, the standard contains only one place where the term "execution token" is used ambiguously in a normative part: the glossary entry for FIND.
The problem is that it says that FIND returns the execution token of the Forth definition ("its execution token", so it should identify the execution semantics of that definition), but FIND may return two different tokens (one while compiling and another while not compiling) for the same string (and, formally, the same Forth definition), which may identify different semantics; then at least one of them does not identify the execution semantics of the definition (despite the statement).
In the glossary entry for NAME>INTERPRET, the language is just slightly non-normative, since it uses the form "xt represents" instead of the form "xt identifies".
These glossary entries also have some other problems, so they should be corrected anyway; my other proposals on this matter are in progress.
Proposal
Update "execution token" term
In the section 2.1 Definitions of terms, change (as of 2021-09-13):
execution token: A value that can be passed to EXECUTE (6.1.1370)
into
execution token: A value that identifies the execution semantics of a definition.
Update "execution token" data type description
In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:
For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.
The execution semantics identified by an execution token may be equivalent to the interpretation semantics, compilation semantics, or other semantics for some named Forth definition. In such cases, the execution token also identifies those interpretation, compilation, or other semantics.
The system does not need to identify every specified semantics by any execution token.
The execution token of a Forth definition, if available, identifies the execution semantics that are either specified by this standard for that definition or are implementation dependent (if permitted).
If the interpretation semantics for a Forth definition are defined by this standard, the execution token of that definition shall be available.
Rationale
- We use the qualifier "valid" in "any valid execution token" because an execution token may become invalid after using words like forget and children of marker.
- The "if available" clause is used since find (in interpretation state) and name>interpret may return zero for some existing (but not user-defined) words; this effectively means that the execution token is not available for those words, and search-wordlist should also return zero for them.
- The last paragraph guarantees that any word that is allowed to be Ticked has an execution token.
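A program could probe whether an execution token is available for a word along the following lines. This is only a sketch: xt-available? is a hypothetical name, and it assumes that the system provides the optional Search-Order word set (forth-wordlist and search-wordlist) and that the word of interest lives in the FORTH word list.

```forth
\ Sketch only: xt-available? is a hypothetical name.
\ search-wordlist returns 0, or xt 1, or xt -1.
: xt-available? ( c-addr u -- flag )
  forth-wordlist search-wordlist   \ 0 | xt 1 | xt -1
  dup if nip then                  \ drop the xt when one was returned
  0<> ;                            \ normalize to a well-formed flag
```

For an ordinary standard word such as DUP the result is true; for a name that is not in the word list, or for a word whose xt the system does not make available, the result is false.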
Update "Execution semantics" notion
In the section 3.4.3.1 Execution semantics,
Change the paragraph:
The execution semantics of each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
into
The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
Rationale: for some words, the execution semantics are not specified by the standard.
After that, add the following two paragraphs:
If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise the execution token of that definition, if available, identifies the implementation dependent execution semantics.
The implementation dependent execution semantics of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition. An ambiguous condition exists if they are performed in compilation state.
Rationale
- We have to rely on the absence of an "Interpretation:" section to refer to ordinary and immediate words until words like >r are defined using a "Run-Time:" section instead of an "Execution:" section in their glossary entries (commented on 2019-06-21, 2020-08-30).
- In this section, we use the term "implementation dependent" rather than "implementation defined" because this allows implementors to avoid documenting some system-specific words altogether.
- When the standard allows a system to implement system-specific interpretation semantics for a standard word (by specifying that the interpretation semantics are undefined), and the system does so, executing the xt of the word (if this xt is available) in interpretation state shall perform the system-specific interpretation semantics. The behavior of this xt in compilation state is not restricted by the standard. In classic Forth systems, in compilation state, it performs the specified compilation semantics for the word.
Update ambiguous conditions
In the section 4.1.2 Ambiguous conditions,
replace the phrase:
attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;
with the phrase:
attempting to obtain the execution token with 6.1.0070 ' or 6.1.2510 ['] of a definition with undefined interpretation semantics;
Rationale: find
(in interpretation state) and search-wordlist
return either
the execution token of the word or zero,
and there is no ambiguous condition in this regard.
Update glossary entries
In the glossary entries
6.2.2295 TO
,
6.2.1725 IS
,
6.2.0698 ACTION-OF
,
replace the phrase:
An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied
with the phrase:
An ambiguous condition exists if POSTPONE or [COMPILE] are applied
Consequences
This change specifies all the cases
in which an xt returned by search-wordlist
,
'
(Tick), [']
(Bracket Tick),
can (or cannot) be used by a standard program
in the general case.
This change removes the prohibition on Ticking the words to, is, action-of, and specifies that executing the returned xt in interpretation state shall perform the interpretation semantics for the word.
Note that the xt can be executed directly by execute, or indirectly, by using compile, to compile this xt into a definition and then executing that definition.
If the system throws an error on ticking these words, or does not provide a correct xt for them, it should be updated to be compliant.
All classic Forth systems comply with this change.
This change does not affect the existing standard programs.
Testing
t{ ' s" execute abc" s" abc" compare -> 0 }t
t{ 1 value x 2 ' to execute x x -> 2 }t
t{ : (to) [ ' to compile, ] ; 3 (to) x x -> 3 }t
Reasoning
typically, the run-time semantics of if (an instance of);
- I'm still not sure how to parse this. My first thought was maybe you meant something like "typically, the interpretation semantics of if
(assuming the implementation defines interpretation semantics)". But re-reading the page on if
, maybe what is meant is more like "typically, the run-time semantics that are appended into the current definition when an instance of if
is compiled (that is, the run-time semantics that if
appends to the current compilation need not correspond to an execution token)"
Typical Use
typo: "may differ form" should be "may differ from"
Proposal 3.1.3.5...
typo: "exisitng" should be "existing"
@Josef wrote:
I agree with @ruv that "translation" doesn't quite fit and finding suitable terms is a real challenge.
Because a "translation token" is a table of run-time actions, "run-time action table" (rat) would seem appropriate, explained below.
I'm making one more attempt on this matter.
The language of the Standard already uses concepts such as data object, data type, typed data object, and subtyping (see 3.1 Data types).
Using these concepts, we can describe a successful recognition result as a pair consisting of a data object and its corresponding data type.
On the stack, data types must be represented by specific identifiers,
similar to how semantics elements are represented by xt identifiers.
We might refer to such an identifier as a type descriptor (symbol td
).
- Note: "type descriptor" is preferred over "type identifier" because, in the language of the Standard, we will need expressions like "type descriptor td identifies ...". Using "type identifier" would lead to awkward repetitions such as "type identifier ti identifies ...".
- Another option for this term could be "type token" (seems less preferable).
Additionally, we might define
a qualified data object (symbol qdo
)
as a pair consisting of a data object
and the type descriptor that identifies that object's data type.
- Note. This concept should be distinguished from the existing concept of a "typed data object".
The elegance and strength of this approach lie in the following points:
- It builds upon existing terminology, with only slight extensions.
- It incorporates existing data type symbols into naming conventions.
- It leverages subtyping relationships between data types to reduce redundancy (adhering to the DRY principle).
Type descriptors can be used to:
- Translate data objects (into the body of a Forth definition when compiling or side effects when interpreting).
- Convert data objects to different data types (casting).
- E.g., getting xt from nt (for example, of an ordinary word only)
- Check subtyping relationships between data types (or of a qualified data object).
- Define new type descriptors.
These features can be designed independently of recognizers, and recognizers only rely on them when returning a qualified data object or analyzing a qualified data object from another recognizer.
@ErikBlake wrote:
But re-reading the page on if, maybe what is meant is more like "typically, the run-time semantics that are appended into the current definition when an instance of if is compiled (that is, the run-time semantics that if appends to the current compilation need not correspond to an execution token)"
Yes. I literally mean the semantics specified in the "Run-time:" section of the if glossary entry.
I used the "instance of" clause because each time the compilation semantics of if are performed, they append a distinct instance of the run-time semantics to the current definition, since the orig value differs (in the general case). Isolating that instance and providing an execution token for it is technically difficult.
If this example seems too confusing, I will delete it and maybe add another one.
Thanks also for pointing out the typos. These corrections, along with other changes, will be included in the next version.
Author
Ruv
Change Log
(the latest at the top)
- 2025-10-03 Correct some normative statements; add ambiguous conditions; make better wording. Add more rationale. Correct some typos.
- 2025-09-30 Add some rationale; correct some typos and grammar mistakes; minor rewording; add consequences.
- 2025-09-29 Better wording in some places; corrections; update in ambiguous conditions.
- 2025-09-28 Huge update; incorporate proposals [249], [212] (partially), [163], [122] (partially); add tests. According to my comment on 2024-09-24, this was planned.
- 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
- 2022-08-13 Initial version
Preceding history
(the latest at the top)
2022-08-12 [249] Revert rewording the term "execution token" (proposal, retracted on 2025-09-28).
2021-09-08 [157] Reword the term "execution token" (proposal, accepted on 2021-09-13).
2021-09-08 [212] Tick and undefined execution semantics - 2 (proposal, considered on 2024-09-26).
2020-10-29 [163] Tick and undefined execution semantics (proposal, retracted on 2025-09-12)
2020-09-03 [157] Reword the term "execution token" (proposal, replaced on 2021-09-08; an attempt to solve the problem by changing the term)
2020-02-20 [129] NAME>INTERPRET wording (proposal, in progress; indication of a problem)
2019-10-08 [122] Clarify FIND, more classic approach (proposal, in progress)
Problem
By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g., some interpretation semantics or compilation semantics? It's unclear at first glance.
Another problem is that, following the unfortunate change to the term "execution token" in the very quickly accepted proposal [157], the standard does not formally state that an execution token identifies anything at all.
Yet another problem is that it is unclear what behavior is identified by the execution token of a word whose execution semantics are not specified by the standard.
Solution
Actually, an execution token can identify (and does identify) other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.
Example 1
:noname postpone if ; ( xt )
- xt, which is left on the stack, identifies the execution semantics of this unnamed Forth definition.
- The execution semantics of this definition are equivalent to the compilation semantics for if (this follows from the standard).
- Then, this xt also identifies the compilation semantics for if.
Example 2
: foo postpone if ;
- xt of foo identifies the execution semantics of foo.
- The execution semantics of foo are equivalent to the compilation semantics for if (this follows from the standard).
- Then, this xt also identifies the compilation semantics for if.
Note that the Forth system may provide
system-dependent execution semantics for if
that are not equivalent to the execution semantics of foo
.
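As a minimal sketch of Example 2 in use: the xt of foo can stand in for the compilation semantics of if when it is executed while compiling. The names my-if and choose are hypothetical and introduced here only for illustration; the sketch assumes a standard system in which executing foo in compilation state is permitted, as in any immediate word built with postpone.

```forth
\ foo: its execution semantics are equivalent to
\ the compilation semantics for if.
: foo ( C: -- orig ) postpone if ;

\ my-if (hypothetical name) reaches those semantics only through
\ the xt of foo; it is immediate, so it runs while compiling.
: my-if ( C: -- orig ) ['] foo execute ; immediate

\ choose (hypothetical name) is compiled with my-if in place of if.
: choose ( flag -- n ) my-if 1 else 2 then ;
```

With these definitions, true choose leaves 1 and false choose leaves 2, exactly as if choose had been written with if directly.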
Reasoning
Thus, for any execution token there exists at least one Forth definition (named or unnamed) the execution semantics of which are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but accidentally (or intentionally) these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. These semantics need not belong to the same Forth definition. Consequently, it's also impossible for an execution token to identify some compilation semantics, or some interpretation semantics, without identifying the equivalent execution semantics.
Note that there are cases where the semantics cannot be identified by an execution token in a Forth system, because the implementation of the system does not have or cannot have a Forth definition with equivalent execution semantics.
Examples of semantics that cannot be identified by any execution token:
- in some systems (in which the phrase postpone literal postpone execute is equivalent to compile, ), the formally specified execution semantics of >r;
- in some systems (where FVM does not have access to the underlying return stack, e.g. WAForth), the initiation semantics of :noname;
- typically, the run-time semantics of if (which are appended to the current definition by the compilation semantics of if);
Of course, the standard allows such implementations and disallows programs to obtain an execution token of the corresponding semantics, or even does not provide a way to obtain it.
Roadmap
To solve the initial problem, we can:
- formally and explicitly state the basics described above,
- specify what particular semantics are identified by the execution token of a word (in what cases are they defined by the standard and when by the implementation, and to what extent),
- update the definition of the "execution token" term to say what these tokens identify.
Typical use
"The execution semantics identified by xt are equivalent to the interpretation semantics of BAZ"
- This seems pretty clear.
"xt identifies the interpretation semantics for the word BAZ"
- This means that the execution token xt identifies the execution semantics which are equivalent to the interpretation semantics for the word BAZ.
- At the same time, the execution semantics of the word BAZ may differ from the execution semantics identified by this xt.
"xt identifies the compilation semantics for the word FOO"
- This means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
- At the same time, the execution semantics of the word FOO may differ from the execution semantics identified by this xt.
"xt of the word BAR"
- This means that xt identifies the execution semantics of the word BAR.
- Whether this xt also identifies the compilation semantics, or the interpretation semantics, or both of them, or neither, for the word BAR depends on the word BAR (on how it is defined or specified).
- Regardless of how BAR is defined, executing xt in interpretation state performs the interpretation semantics for BAR.
Incorrect use
Actually, the standard contains only one place
where the "execution token" term is used ambiguously
in a normative part —
the glossary entry for FIND
.
The problem is that it says that FIND
returns
the execution token of the Forth definition
("its execution token", so it should identify
the execution semantics of that definition),
but:
FIND
may return two different tokens (one while compiling and another while not compiling) for the same string (and, formally, the same Forth definition), which may identify different semantics; then at least one of them does not identify the execution semantics of the definition (despite the statement).
In the glossary entry for NAME>INTERPRET, the language is just slightly non-normative, since it uses the form "xt represents" instead of the form "xt identifies".
These glossary entries also have some other problems, so they should be corrected anyway; my other proposals on this matter are in progress.
Proposal
Update "execution token" term
In the section 2.1 Definitions of terms, change (as of 2021-09-13):
execution token: A value that can be passed to
EXECUTE
(6.1.1370)
into
execution token: A value that identifies the execution semantics of a definition.
Rationale: this is necessary to correctly describe the "execution token" data type.
Update "execution token" data type description
In the section 3.1.3.5 Execution tokens, add the following paragraphs to the beginning:
For any valid execution token in the system, there is at least one Forth definition (named or unnamed) whose execution semantics are identified by that execution token.
The execution semantics identified by an execution token may be equivalent to the interpretation semantics, compilation semantics, or other semantics for some named Forth definition. In such cases, the execution token also identifies those interpretation, compilation, or other semantics.
The system does not need to identify every specified semantics by any execution token.
The execution token of a Forth definition, if available, identifies the execution semantics that are either specified by this standard for that definition or are implementation dependent (if permitted).
If the interpretation semantics for a Forth definition are defined by this standard, the execution token of that definition shall be available.
Rationale
- We use the clause "valid" in "any valid execution token" because an execution token may become invalid after using words like forget and children of marker.
- The "if available" clause is used since find (in interpretation state) and name>interpret may return zero for some existing (but not user-defined) words; this effectively means that the execution token is not available for those words, and search-wordlist should also return zero for them.
- The last paragraph guarantees that any word that is allowed to be Ticked has an execution token.
Add the following paragraph at the end:
See also: A.3.1.3.5 Execution tokens.
Update "Execution semantics" notion
In the section 3.4.3.1 Execution semantics,
Change the first paragraph:
The execution semantics of
each Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
into
The execution semantics of a Forth definition are specified in an "Execution:" section of its glossary entry. When a definition has only one specified behavior, the label is omitted.
Rationale: for some words, the execution semantics are not specified by the standard.
After that, insert the following paragraphs:
If the execution semantics for a Forth definition are specified by this standard and the glossary entry of that definition does not have an "Interpretation:" section, the execution token of that definition identifies the specified execution semantics. Otherwise, the execution token of that definition, if available, identifies the implementation dependent execution semantics.
Rationale
- We have to rely on the absence of an "Interpretation:" section to refer to ordinary and immediate words until words like >r will be defined using a "Run-Time:" section instead of an "Execution:" section in their glossary entries (commented on 2019-06-21, 2020-08-30).
- We use the term "implementation dependent" rather than "implementation defined" because this allows implementors to avoid documenting the provided execution semantics of standard words whose execution semantics are not specified by the standard.
The execution semantics identified by the execution token of a Forth definition, when they are performed in interpretation state, shall perform the interpretation semantics of that definition.
Rationale:
this guarantees that, while interpreting,
' foo execute
is equivalent to foo
,
even when the execution token (xt) of foo
identifies implementation dependent execution semantics.
An ambiguous condition exists if the execution semantics of a Forth definition are not specified by this standard and its execution token is executed in compilation state.
Rationale: In all classic Forth systems, this performs the compilation semantics for the word, but in most dual-xt Forth systems, this performs the interpretation semantics for the word.
An ambiguous condition exists if the interpretation semantics of a Forth definition are undefined by this standard and its execution token is executed.
Rationale
- The execution semantics of the phrase s" if" forth-wordlist search-wordlist if execute then are always ambiguous, regardless of whether they are performed while interpreting or compiling, because they execute (at least on some systems) the xt of the word if, whose interpretation semantics are undefined by this standard.
Update ambiguous conditions
In the section 4.1.2 Ambiguous conditions,
replace the phrase:
attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;
with the phrase:
attempting to obtain, using the words 6.1.0070 ' or 6.1.2510 ['], the execution token of a definition with undefined interpretation semantics;
Rationale: find
and "etc." are excluded from the list,
because find
(in interpretation state),
search-wordlist
, and name>interpret
return either
the execution token of the word or zero,
and there is no ambiguous condition in this regard.
Update glossary entries
In the glossary entries
6.2.2295 TO
,
6.2.1725 IS
,
6.2.0698 ACTION-OF
,
replace the fragment:
An ambiguous condition exists if any of POSTPONE, [COMPILE], ' or ['] are applied
with the fragment:
An ambiguous condition exists if POSTPONE or [COMPILE] are applied
Note: in the first case the form “are applied” is used, in the second and third cases the form “is applied” is used. They should be harmonized/corrected.
Rationale: it is now specified what semantics the execution token identifies for these words.
Consequences
This change specifies all the cases
in which an xt returned by search-wordlist
,
'
(Tick), [']
(Bracket Tick),
can (or cannot) be used by a standard program
in the general case.
This change removes the prohibition on Ticking the words to, is, action-of, and specifies that executing the returned xt in interpretation state shall perform the interpretation semantics for the word.
Note that the xt can be executed directly by execute, or indirectly, by using compile, to compile this xt into a definition and then executing that definition.
If the system throws an error on ticking these words, or does not provide a correct xt for them, it should be updated to be compliant.
According to Anton Ertel's testing of five Forth systems
(comment [r885])
in 2022,
iForth 5.0.27 and VfxForth 5.11 did not comply with this change
with respect to the execution tokens of the word to
(and, probably, is
and action-of
).
However, they were already noncompliant without this change,
as they returned an incorrect execution token for s"
(ticking of which is allowed).
Note that all classic Forth systems natively comply with this change.
This change does not affect the existing standard programs.
Testing
t{ ' s" execute abc" s" abc" compare -> 0 }t
t{ 1 value x 2 ' to execute x x -> 2 }t
t{ : (to) [ ' to compile, ] ; 3 (to) x x -> 3 }t
Author
Ruv
Change Log
- 2023-10-22 Initial revision
- 2023-10-23 Add testing, examples, a question to discuss, change the throw code description
- 2023-10-27 Some rationales and explanations added, the throw code description changed back, better wording in some places
- 2024-06-20 Fix some typos, make some wording and formatting better, add some examples and test cases, add motivation for LATEST-NAME-IN, change the status to "formal".
- 2024-06-20 Add a test case to check that LATEST-NAME returns a different value after the compilation word list is switched.
- 2024-06-20 Simplify the normative text description, and add a rationale for this simplification.
- 2025-09-15 Add clause about findable words, add rationale sections in the proposal, address a question re immediate, note a bug in traverse-wordlist in some Forth systems, make some rewording and minor corrections, add a more general reference implementation.
- 2025-09-19 Make corrections from Eric Blake, mention find-name instead of search-wordlist, use lowercase in the test cases and in prose when possible, make some rewording in the prose, add some links.
- 2025-10-03 Rationale for returning zero and throwing an exception. Minor rewording in some places.
Problem
In some applications, mainly in libraries and extensions, the capability to obtain the most recently added definition is very useful and in demand.
To make such programs portable, we should introduce a standard method to obtain the most recently added word.
For example, if we are creating a library for decoration, tracing, support for OOP, simple DSLs (e.g., to describe Finite State Machines), etc., it is always useful to have an accessor to the recent definition, instead of redefining a lot of words to define such an access method yourself, or juggling with the input buffer and search.
See some examples in the Typical use section.
Also, a number of specific examples are provided in my post on ForthHub (those examples are not inserted here so as not to bloat the text).
Additionally, there has been much discussion regarding standardization of such a method in recent decades. For example, Elizabeth D. Rather wrote on 2011-12-09 in comp.lang.forth:
AFAIK most if not all Forths have some method for knowing the latest definition, it's kinda necessary. The problem is, that they all do it differently (at different times, in different forms, etc.), which is why it hasn't been possible to standardize it.
Although it's a system necessity, I haven't found this of much value in application programming.
Elizabeth D. Rather
Indeed, depending on the system, the internal method may return the recent word depending on the compilation word list or independent of the compilation word list, a completed definition or an incomplete definition, an unnamed definition or only a named definition, and so on.
However, I believe that a standardized method has significant value for libraries and DSLs in application programming, as my examples should demonstrate.
Some known internal methods: latest ( -- nt|0 ), last @ ( -- nt|0 ), latestxt ( -- xt|0 ), etc.
Thus, although almost every Forth system contains such a method, there is no portable way for programs to obtain the latest definition.
Solution
Let's introduce the following words:
latest-name-in ( wid -- nt|0 )
latest-name ( -- nt )
The first word returns the name token for the definition whose name was placed most recently into the given word list, or zero if this word list is empty.
The second word returns the name token for the definition whose name was placed most recently into the compilation word list, or throws an exception if there is no such definition.
These words do not expose or limit any internal mechanism of the Forth system. They just provide information about word lists, like the words find-name-in, find-name, and traverse-wordlist do.
It's a kind of introspection/reflection.
These words are intended for programs. The system may use them, but is not required to do so. The system may continue to use its internal last, latest, or whatever it was using before.
It seems that the best place for these words is the section 15.6.2 Programming-Tools extension words, where traverse-wordlist is also placed.
Rationale
Connection with word lists
By considering definitions in the frame of a word list only, we solve several problems, namely:
A word list contains only completed definitions (see the accepted proposal #153 Traverse-wordlist does not find unnamed/unfinished definitions). This eliminates the question of whether the word of returned nt is finished — yes, it is always finished (completed).
Nameless definitions are not considered since they are not placed into the compilation word list (regardless of whether the system creates a name token for them, or places them into an internal system-specific word list).
An extension or library can create definitions in its internal word list for internal purposes, and this will not affect the compilation word list or other user-defined word lists. Thus, the user of such a library always gets the expected result from latest-name (regardless of what words are created by this library for internal purposes on the fly). For example, when different dictionary spaces are introduced, we can implement something like local variables (or local definitions) in a portable way, and creating such a definition will not affect the value that latest-name returns.
Returned values
As a matter of practice, almost all the use cases for the word latest-name
imply that the requested definition exists, and if it doesn't exist, only an error can be reported. So the option to return 0
by this word only burdens users with having to analyze this zero, or redefine this word as:
: latest-name ( -- nt ) latest-name dup 0= -80 and throw ;
If the user needs to handle the case where the compilation word list is empty, they can use the word latest-name-in
as:
get-current latest-name-in dup if ( nt ) ... else ( 0 ) drop ... then
Throwing an exception in latest-name
is slightly inconsistent
with the other words that return 0
when the requested
definition does not exist.
But this follows the general principle: throw an exception when, in most use cases, the situation is surprising or cannot be handled locally; return a value when, in most use cases, the situation is anticipated or may require local handling.
In most use cases of latest-name, the absence of definitions in the compilation word list is unexpected and cannot be handled locally.
As for latest-name-in
, there is no practice of using it yet.
I can only imagine one use case:
in a package/module framework, the framework obtains the latest word
from a provided word list for a special purpose (like "main"),
and applies special actions if the word list is empty
(e.g., it might use some default word of the same type,
or add the default word to the provided word list,
or check for a different word list if multiple word lists are provided,
etc.).
Thus, the empty word list situation in this use case is expected
and may require local handling.
Implementation options
If the word list structure in a Forth system contains information about the latest placed definition, the implementations for the proposed words are trivial.
In some plausible Forth systems the word list structure doesn't directly contain information about which definition was placed into the word list most recently, and this information cannot be obtained indirectly. Such systems might not provide the proposed words, or they can be changed to keep this information in the word list structure. It seems that in most systems the word list structure directly contains this information, or this information can be obtained indirectly.
Some checked systems:
- Gforth, minForth, ikForth, SP-Forth, Post4: a word list keeps information about the definition that was placed in it most recently;
- SwiftForth, VFX: the most recently placed word in a word list can be correctly obtained from the strands/threads (since nt values increase monotonically);
- lxf/ntf 2017: the most recently placed word in a word list can be obtained using traverse-wordlist (since nt values increase monotonically).
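For systems of the last kind, the proposed words might be sketched as below. This is only an illustration, not a reference implementation: it assumes (system-specifically, with no guarantee from the standard) that a newer definition always has a numerically larger nt. Because the order in which traverse-wordlist visits words is largely unspecified, the sketch keeps the largest nt seen rather than the first one. The helper name keep-newest is hypothetical.

```forth
\ Assumption (system-specific): a newer definition has a larger nt.

\ keep-newest (hypothetical helper): keep the larger of two nt values
\ and return true so that traverse-wordlist continues.
: keep-newest ( nt1|0 nt2 -- nt3 true )
  2dup u< if nip else drop then true ;

\ Scan the whole word list, starting from 0 (the "empty" result).
: latest-name-in ( wid -- nt|0 )
  0 ['] keep-newest rot traverse-wordlist ;

\ Throw -80 (the code suggested in this proposal) if the
\ compilation word list is empty.
: latest-name ( -- nt )
  get-current latest-name-in dup 0= -80 and throw ;
```

On a system where the assumption holds, wordlist latest-name-in returns 0 for the fresh, empty word list, and latest-name name>string type displays the name of the most recent definition in the compilation word list.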
Note that some systems have a bug in traverse-wordlist
so it can return the nt for a definition that cannot be found (namely, for the current definition). This is incorrect (see a testcase).
If a system does not implement the optional Search-Order word set, it might not provide the word latest-name-in
.
Naming
The names latest-name-in and latest-name of the new words are similar in form to find-name-in and find-name. Stack effects are also similar.
The difference is that find-xxx is a verb phrase that starts with a verb,
but latest-xxx
is a noun phrase that starts with an adjective (see Wiktionary/latest).
Both the English words "find" and "latest" have historically been used in Forth word names, as is "name".
In Forth-84 "name" in word names denoted NFA (Name Field Address), and now it denotes a name token, which is the successor of NFA.
In all standard words, e.g. find-name
, name>string
, name>compile
, etc. (except parse-name
), "name" denotes a name token.
NB: the term "token" in "name token" does not mean a character sequence! It's used in a general sense, like "something serving as an expression of something else" (see Wiktionary).
Normative text description
The proposed normative text description is based on:
- 16.2: "compilation word list: The word list into which new definition names are placed",
- 15.3.1: "A name token is a single-cell value that identifies a named word",
- 3.4.3: "[Semantics] are largely specified by the stack notation in the glossary entries, which shows what values shall be consumed and produced. The prose in each glossary entry further specifies the definition's behavior" (there is no need to repeat in the text description what is already indicated in the stack diagrams). (emphasis added)
Throw code description
If the throw code description states that there is no latest name, it can be confusing since latest name in some sense probably always exists.
Therefore, it's better to say: "the compilation word list is empty" — it is what actually happens.
Motivation for latest-name-in
- It's a natural factor for latest-name. It's always possible to extract this factor from the implementation of latest-name, because the latter returns nt from the compilation word list, and the system should take the wid of the compilation word list and extract the most recent nt from this word list.
- It's very important to specify the behavior of this word to avoid different behavior in different systems, since in many systems this word will exist (will be implemented as a natural factor).
- In some cases a program needs to check if a word list is empty, or obtain the latest word from a particular word list (for example, to use this word as an entry point, like main, or as the default exported word from a module).
- Both of these words are optional. If latest-name-in is not provided, it can be implemented in a portable way via latest-name as:
: latest-name-in ( wid -- nt|0 ) get-current >r set-current ['] latest-name catch if 0 then r> set-current ;
Things to discuss
Is it worth introducing the word latest-name-xt ( -- xt )
?
If name>interpret
never returns 0
(see my comment), this word can be implemented as:
: latest-name-xt ( -- xt ) latest-name name>interpret ;
The desired (and much discussed) pattern is:
defer bar
: foo ... ; latest-name-xt is bar
Sometimes the name "it" has been suggested for this word, but this name is too short and has more chance for conflicts. Guido Draheim wrote in comp.lang.forth on 2003-03-16:
I think that everyone has been thinking of using
IT
for something really clever, it's a nice short word - and I'd say that we should leave it for application usage.I want to support that argument also with real life experience in the telco world where there are a whole lot of abbreviations for various services, signals, connectors around. All too often now I see people making a SYNONYM at the file-start to get a second name for an ANS forth word that is needed in the implemenation but coincides with a common term of the application.
This seems convincing to me.
Typical use
: struct: ( "name" -- wid.compilation.prev u.offset )
get-current vocabulary
also latest-name name>interpret execute definitions
0
;
: ;struct ( wid.compilation.prev u.offset -- )
s" __size" ['] constant execute-parsing
set-current
;
The word execute-parsing ( i*x c-addr u "ccc" -- j*x ) is a well-known word; see an implementation at https://theforth.net/.
\ In the application's vocabulary
: it ( -- xt ) latest-name name>interpret ;
defer foo
\ ...
: bar ... ; it is foo
Proposal
Changes in existing sections
Add the following line into the Table 9.1: THROW code assignments:
-80    the compilation word list is empty
Editorial note: the actual throw code may change.
New glossary sections
Add the following sections into 15.6.2 Programming-Tools extension words:
15.6.2.xxxx LATEST-NAME-IN
TOOLS EXT
( wid -- nt|0 )
If the word list identified by wid is empty, then the returned value is 0; otherwise, the name token nt identifies the definition whose name was placed most recently into the word list wid.
Note: nt can only be returned for a definition that can be found in wid.
See also: 15.6.2.xxxx LATEST-NAME, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.
15.6.2.xxxx LATEST-NAME
TOOLS EXT
( -- nt )
If the compilation word list is not empty, the name token nt identifies the definition whose name was placed most recently into this word list.
Otherwise, the exception code -80 is thrown.
Note: nt can only be returned for a definition that can be found in the compilation word list.
See also: 15.6.2.xxxx LATEST-NAME-IN, 15.6.2.2297 TRAVERSE-WORDLIST, 6.1.0460 ;.
New rationale sections
Add the following sections into A.15.6 Glossary:
A.15.6.2.xxxx LATEST-NAME-IN
The word latest-name-in cannot return an nt that cannot be obtained using find-name-in or traverse-wordlist applied to the specified word list.
The word latest-name-in returns 0 if the specified word list is empty, allowing the program to handle this situation locally.
See also: A.15.6.2.xxxx LATEST-NAME.
A.15.6.2.xxxx LATEST-NAME
The word latest-name cannot return an nt that cannot be obtained using find-name-in or traverse-wordlist applied to the compilation word list.
In some Forth systems the word : (colon) places an nt into the compilation word list and makes it hidden (unfindable). This nt must not be available for traverse-wordlist and for latest-name. Thus, formally, only the words ; (semicolon), does>, and ;code are allowed to add the nt of a definition created with : (colon) to the compilation word list.
If a Forth system does not provide the optional Search-Order word set, and in that Forth system the word immediate moves an nt from one internal list to another, this must not affect what latest-name returns, and this must not affect what find-name returns (for example, consider a case where the last two words have the same name and immediate is used for the latest one). Thus, after execution of immediate, latest-name shall return the same value as before this execution.
Typical use
: var ( "name" -- )
variable
0 latest-name name>interpret execute !
;
The word latest-name throws an exception if the compilation word list is empty, since in typical use cases this situation is surprising and is not handled locally.
Reference implementation
In the following implementation for latest-name-in we assume that a word list identifier wid is an address that contains the nt of the definition whose name was most recently placed into this word list.
: latest-name-in ( wid -- nt|0 ) @ ;
In the following implementation for latest-name-in we assume that the values of nt, interpreted as unsigned numbers, monotonically increase when sorted chronologically (it works on most systems):
: umax ( u1 u2 -- u.max ) 2dup u< if swap then drop ;
: latest-name-in ( wid -- nt|0 )
>r 0 [: umax true ;] r> traverse-wordlist
;
An implementation for latest-name:
: latest-name ( -- nt )
get-current latest-name-in dup if exit then -80 throw
;
Testing
: it ( -- xt ) latest-name name>interpret ;
wordlist constant wl1
t{ : ln1 ; it ' ln1 = -> true }t
t{ get-current latest-name-in ' ln1 = -> true }t
t{ :noname [ it ] literal ; execute ' ln1 = -> true }t
t{ : ln2 [ it ] literal ; ln2 ' ln1 = -> true }t
t{ wl1 latest-name-in -> 0 }t
get-current wl1 set-current ( wid.prev )
t{ ' latest-name catch -> -80 }t
t{ : ln3 ; -> }t
set-current
t{ it ' ln2 = -> true }t
referenceImplementation - Example implementation for PICK
Discussion of the PLACE proposal is not relevant to the word that Mr. Peterson posted. To reduce distraction I'll rename it to BURY [see note], except I wish to change the API slightly, adding 1 to the index. It makes the example implementation more clumsy but I have at least one reason below:
: bury ( x_u x_u-1 ... x_1 x_0 u -- x_0 x_u-1 ... x_1 )
dup 0 = if 2drop exit then
dup 1 = if drop nip exit then
rot >r 1- recurse r>
;
\ Rationale: replace x_u with x_0 then drop x_0:
T{ 44 33 22 11 00 4 bury -> 00 33 22 11 }T
T{ 00 0 bury -> }T
PICK fetches a value from down the stack, BURY stores a value to down the stack. I suspect most systems can feasibly implement constant time BURY, one cell load and store, just like PICK and unlike words like SWAP ROT and especially ROLL.
PICK and BURY are sufficient to express every possible stack operation. No need for even >R R>:
: dup 0 pick ; : drop 0 bury ;
: over 1 pick ; : nip 1 bury ;
: 2nip 2 bury 2 bury ;
: swap dup 2 pick 2nip ;
: 3nip 3 bury 3 bury 3 bury ;
: rot over over 4 pick 3nip ;
( etc etc )
Some of the rarer ones could be demoted to be optional, since the system user can implement them with PICK BURY.
[note] About the name BURY: I seem to recall seeing some system somewhere that used the names DIG and BURY to mean ROT and -ROT. I thought it was retroforth but no, that has a ROT. As an aside: since I am reusing BURY to mean something different, it'd be nice to rename PICK to DIG to match. I assume that's not appropriate for the standard though. The name is important but the functionality is useful whatever its name.
@NickMessenger wrote:
I wish to change the API slightly, adding 1 to the index. It makes the example implementation more clumsy but I have at least one reason below
PICK and BURY are sufficient to express every possible stack operation.
I see, but it seems to make bury more clumsy to use.
Could you give some practical examples where the word bury is useful? Especially, where you need to simply drop the argument ( x 0 ).
referenceImplementation - Example implementation for PICK
0 BURY is just another way to express DROP, so not useful in-and-of-itself, again except for the fact that it makes PICK/BURY sufficient to express all stack operations. You could picture a super minimal system that has no stack words at all except these two, so DROP DUP NIP would be written in forth as above. Implementation-wise you could imagine 123 0 BURY would store 123 in the same cell where it already is then DROP it, so no need for any special cases. It's the dropping effect we're looking for.
As for practical examples, I listed the above words. I can list some more:
: -rot dup 3 pick 3 pick 3nip ;
: flip ( a b c -- c b a ) dup 3 pick 2 bury 3 bury ;
: 6nip 6 bury 6 bury 6 bury 6 bury 6 bury 6 bury ;
: 2over 3 pick 3 pick ;
: 2rot 2over 2over 9 pick 9 pick 6nip ;
I feel like PICK BURY could lessen perceived need for local variables. Ideally you rewrite your words to need fewer parameters but if you can't then there's a need to move values somewhere, be it locals, globals, the rstack, etc. SWAP ROT ROLL require N loads and stores so are less efficient, but this would allow you to PICK what you need to work with, BURY the results where they go and DROP any intermediates.
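A small sketch of that style, using the recursive BURY posted earlier in this thread (repeated here so the block is self-contained). The word discriminant is a hypothetical example, not taken from any real program: it picks its three inputs as needed, buries the result over the deepest input, and drops the rest.

```forth
\ BURY as posted earlier in this thread
: bury ( x_u x_u-1 ... x_1 x_0 u -- x_0 x_u-1 ... x_1 )
  dup 0 = if 2drop exit then
  dup 1 = if drop nip exit then
  rot >r 1- recurse r>
;

\ Hypothetical example: b*b - 4*a*c without locals or return-stack juggling
: discriminant ( a b c -- d )
  1 pick dup *          \ a b c b^2
  3 pick 2 pick * 4 *   \ a b c b^2 4ac
  -                     \ a b c d
  3 bury                \ d b c    (d overwrites a)
  2drop                 \ d
;
```

Whether this reads better than a version with locals is a matter of taste; the sketch only shows that the inputs never have to be reordered.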
referenceImplementation - Example implementation for PICK
A system like gforth with SP@:
: pick ( xu..x0 u -- xu..x0 xu ) 1+ cells sp@ + @ ;
: bury ( xu xu-1..x1 x0 u -- x0 xu-1..x1 ) 1+ cells sp@ + ! ;
cr 22 11 00 2 pick .s \ <4> 22 11 0 22
cr 77 2 bury .s \ <4> 22 11 77 22
this would allow you to PICK what you need to work with, BURY the results where they go and DROP any intermediates.
Do you have any programs that use bury? I mean, other than a super-minimal system that doesn't provide any stack words other than pick and bury (that's interesting, but impractical).
referenceImplementation - Example implementation for PICK
gforth-internal does in fact define the 2 + variant which they call STICK and use a few times. My favorite system, durexForth, uses a split stack and so cannot support SP@ but does have PICK. I wrote a BURY in its provided assembler wordset, then used BURY to implement 2SWAP 2ROT 2NIP.
...I don't actually have any programs that use those words though. And I just benchmarked it and the PICK/BURY version of 2SWAP took about triple the time of the ROT >R ROT R> version. If I use VALUEs instead of literals it's only double the time. So maybe it's pointless. I just think it's weird that the standard has a fetch-from-downstack but no store-to-downstack.
I just think it's weird that the standard has a fetch-from-downstack but no store-to-downstack.
Yes, this can be considered as a slight gap in orthogonality. But since such a word can be defined using standard words, I don't think there's any point in standardizing it before it's used in practice.
Concerning the stack effect (API)
I would prefer the original one, because it is similar to other fetch/store words like (@, !), (2@, 2!), (defer@, defer!), etc.
The general stack diagrams of these words are:
- fetch ( identifier -- data-object )
- store ( data-object identifier -- )
And if you store a data object by an identifier, you should then read the same data by the same identifier. Your option violates this rule.
A feature of pick is that it interprets the underneath stack items as an array on which it operates. The same should hold for the word that stores a value.
Concerning naming
In comp.lang.forth, this word was mentioned (Andrew Haley, 2013-01-29) as poke.
In the paper "Symbolic Stack Addressing" (Adin Tevet, 1989), this word is called post. The idea there is that a phrase n pick n post, where n is a non-negative integer less than the stack depth, does not change the state of the stack.
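That invariant can be checked with a small sketch. It assumes the poke stack effect ( x.0 u.cnt*x x.1 u.cnt -- x.1 u.cnt*x ) discussed in this thread, and builds poke on the recursive bury posted earlier (repeated here so that the block is self-contained). This is only one possible definition, not a proposed reference implementation.

```forth
\ BURY as posted earlier in this thread (index is one greater than for poke)
: bury ( x_u x_u-1 ... x_1 x_0 u -- x_0 x_u-1 ... x_1 )
  dup 0 = if 2drop exit then
  dup 1 = if drop nip exit then
  rot >r 1- recurse r>
;

\ Sketch: poke skips u items beneath x.1 and overwrites x.0 with x.1,
\ so it addresses the same item that u pick would fetch.
: poke ( x.0 u*x x.1 u -- x.1 u*x ) 1+ bury ;
```

With this definition, 1 2 3 1 pick 1 poke leaves 1 2 3 on the stack, and 10 20 30 99 2 poke leaves 99 20 30.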
I would prefer the name poke because:
- it is a historical name for a function that stores a value in memory, so it has the corresponding semantic association (unlike post, stick and bury);
- it is mainly a verb (unlike stick, which is mostly a noun);
- its sound is closer to pick than post and bury.
The second, but much less preferable, option is stick.
comment - Interpretation of the top input parameter of PICK
The word pick has the type ( x.0 u.cnt*x u.cnt -- x.0 u.cnt*x x.0 ).
and it additionally documents that 0 pick is the same as dup, and 1 pick is the same as over.
The word roll has the type ( x.0 u.cnt*x u.cnt -- u.cnt*x x.0 ).
and it additionally documents that 1 roll is the same as swap, and 2 roll is the same as rot.
2pick must have the type ( xd.0 u.cnt*x u.cnt -- xd.0 u.cnt*x xd.0 );
2roll must have the type ( xd.0 u.cnt*x u.cnt -- u.cnt*x xd.0 );
But the standard also already has 2dup, 2over, 2swap, and 2rot. If I try to extrapolate the same mappings on the larger 2-cell types, without paying attention to your proposed stack diagram for 2pick and 2roll, my first guess would have been:
0 2pick would be the same as 2dup, and 1 2pick the same as 2over.
1 2roll would be the same as 2swap, and 2 2roll the same as 2rot.
That is, I would have assumed that since existing 2XXX words have stack effects that force the entire stack to be paired off up to the point of impact, that the same pairing effect would be present in your proposed 2pick and 2roll.
But I was pleasantly surprised to note that your proposal does NOT do that, but is actually more powerful by letting me control how many single-cell slots to skip before finally accessing a 2-cell object.
In particular, while 0 2pick does end up being the same as 2dup, your 1 2pick has an interesting single-cell stack effect ( x.a x.b x.c -- x.a x.b x.c x.a x.b ) that is not available from any one single standard word, but which I have wanted in my own code (I've ended up with things like dup 2over rot drop or over 3 pick swap to get the same effect). And it is not until you get to 2 2pick that you get the effect of 2over.
Similarly, your 1 2roll has the single-cell stack effect ( x.a x.b x.c -- x.c x.a x.b ) which is the well-known -rot (aka rot rot in the standard), and it is not until you get to 2 2roll that you finally accomplish 2swap, or 4 2roll that you accomplish 2rot.
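The equivalences above can be checked with a sketch in standard Forth. It assumes the proposed stack effects, in which u.cnt counts the single cells to skip; the definitions below are only one possible way to express 2pick and 2roll via pick and roll, not a reference implementation.

```forth
\ Sketch: xd.0 occupies the two cells at indices u and u+1 (counted from the top)
: 2pick ( xd.0 u*x u -- xd.0 u*x xd.0 )
  1+ dup >r pick   \ copy the lower cell of xd.0
  r> pick          \ copy the upper cell: same index, the stack is one deeper now
;
: 2roll ( xd.0 u*x u -- u*x xd.0 )
  1+ dup >r roll   \ move the lower cell of xd.0 to the top
  r> roll          \ then move its upper cell on top of it
;
```

For instance, 1 2 3 1 2pick should leave 1 2 3 1 2, and 1 2 3 4 2 2roll should leave 3 4 1 2, matching 2swap.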
One additional problem with is, to and action-of is that there are two different implementation styles: one parses, one sets a state. Both work with your tests, but I'm sure you can come up with a distinguisher, e.g.
3 value x
: check-x ['] to execute x ;
t{ 5 check-x x -> 5 }t
The parsing variant will consume x during execution of to, the non-parsing one won't. Note that the standard mandates a parsing to, anyhow (at least by how it is worded), and we should not conflate obtaining an xt of to and executing it with suggested backdoors for non-conforming implementations.
there are two different implementation styles: One parses, one sets a state.
Excellent point, thank you! This should also be noted in the rationale. A distinguisher should use two value-flavored variables; I will add a testcase.
Formally, non-parsing implementations don't conform to either Forth-94 or Forth-2012, regardless of the xt of these words, since they fail in compilation state if their immediate argument is an immediate word (testcase). In practice, such cases simply do not occur.
Note that any non-parsing implementation can easily be converted to a parsing implementation.
For example, if to is an immediate word:
: to ['] to execute parse-name evaluate ; immediate
I think that after 30 years, the standard should stop making concessions to implementations that don't do parsing in to, is, action-of (i.e., stop including the corresponding ambiguous conditions).
Rationale:
- this allows programs to redefine these words;
- this makes the standard Forth clearer and more consistent;
- this supports efforts to reduce the number of ambiguous conditions;
- changes in the affected Forth systems seem simple.
If there are no objections, I will add this into the next version.