Digest #309 2025-08-13

Contributions

[400] 2025-08-12 12:25:58 JimPeterson wrote:

referenceImplementation - Possible Reference Implementation

I suggest the following (somewhat obscure) reference implementation for DNEGATE:

: DNEGATE ( d1 -- d2 ) >R NEGATE DUP 0<> R> - ;

It's not as straightforward as 0. 2SWAP D-, and that sort of brings up the question of the purpose of reference implementations in this document. Do we present these to the reader as an easy means of filling gaps in an implementation, or are we more attempting to directly and precisely describe a word's behavior? While I prefer the above implementation that I'm suggesting when attempting to implement DNEGATE without raw code, the 0. 2SWAP D- version is far more communicative about what the word does (not that it's that obscure, in this case) but is not too useful for implementing words if D- is then suggested to be implemented as DNEGATE D+.

It is certainly more interesting to find the "clever" and efficient implementations of a word, but it may not be as instructive.
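For readers tracing the stack, here is the same idea as a minimal sketch with per-step stack comments. It is written with R> before the final - so that the parked high cell comes back off the return stack, and it assumes twos-complement cells that use every bit. The names lo, hi, and borrow in the comments are only labels for this illustration.

```forth
\ Twos-complement DNEGATE, annotated (lo = low cell, hi = high cell).
: DNEGATE ( d1 -- d2 )
  >R          ( lo )              \ park the high cell on the return stack
  NEGATE      ( -lo )             \ negate the low cell
  DUP 0<>     ( -lo borrow )      \ borrow is -1 if the low cell is nonzero, else 0
  R> -        ( -lo borrow-hi )   \ high cell becomes -hi, less 1 when borrowing
;

\ Spot checks; each line prints a true flag (-1).
1. DNEGATE -1. D= .
0. DNEGATE 0. D= .
0 1 DNEGATE 0 -1 D= .
```

The third check uses a double whose low cell is zero, which is the one case where no borrow propagates into the high cell.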


[401] 2025-08-12 19:45:30 EricBlake wrote:

requestClarification - Is it ambiguous to execute the use of does> after :NONAME or FORGET?

The standard is clear that runtime effects of DOES> are ambiguous if the most recent definition was not defined with CREATE:

Replace the execution semantics of the most recent definition, referred to as name, with the name execution semantics given below. Return control to the calling definition specified by nest-sys1. An ambiguous condition exists if name was not defined with CREATE or a user-defined word that calls CREATE.

However, the standard is less clear on what actually constitutes the most recent definition, and what happens if the most recent definition lacks a name; it is not even clear whether there must be a most-recent definition at all times. The proposal on LATEST-NAME does not immediately help; it has a caveat "In some plausible Forth systems, the word list structure doesn't contain any information about the definition that was placed into this word list most recently." where the implementation of RECURSE and IMMEDIATE (two other words that require knowledge in some form of the most-recent definition) is accomplished by other means. While that proposal recommends throwing exception -80 when a compilation list is empty, it does not necessarily state whether there are other reasons that there might not be a latest name. But it does mention in feedback "Also, I would like to clarify DOES> to avoid ambiguity concerning what is the current definition after the compilation semantics of DOES> are performed." Also in feedback is this comment: "The first thing needed here is to really figure out what people actually want: The last element of a wordlist, the last incomplete definition of a wordlist (i.e. the element with the smudge bit set), or the last definition in time, regardless if it is completed or not, and what wordlist it goes into once completed."

I can think of at least two situations where the latest name might not exist:

First, where the latest definition exists but lacks a name. Right after an anonymous definition is created, implementations differ on whether space was reserved in the current compilation list, or whether the anonymous definition exists in some other region. The standard is already clear that use of IMMEDIATE after a :NONAME definition is ambiguous; and it is also clear that TRAVERSE-WORDLIST does not find nt for anonymous definitions even if the definition is stored in the same wordlist alongside named words (without regard to whether that is accomplished by a smudge bit, a separate wordlist, or some other implementation). I see nothing in the standard that precludes an implementation of :NONAME that (temporarily) switches over to a different wordlist, compiles a named definition, and then returns back to the original wordlist (as a way that I could implement :NONAME on top of a Forth that otherwise lacks it); and the act of changing the current wordlist is a situation in which the most recent definition becomes ambiguous. But I also see nothing that precludes an implementation that stores :NONAME with a smudge bit in the same compilation wordlist as named words, but which lacks the ability to track any more than the most recent addition (smudged or not) to that wordlist, and therefore the act of using :NONAME loses track of which named word was latest. By my understanding of the current wording in the LATEST-NAME proposal, we are supposed to have the situation that:

[UNDEFINED] latest-name [IF] .( latest-name proposal not yet implemented) [THEN]
: named ;
LATEST-NAME \ refers to named
:NONAME ;
LATEST-NAME \ _still_ refers to named

even though, in testing, SwiftForth 3.12.0 has LAST @ produce 0 after the anonymous definition (i.e. using : latest-name LAST @ ; would not comply with the current version of the proposal); similarly, gforth 0.7.9 has LATEST produce 0 after the anonymous definition, while LATESTNT produces an nt that is identical to the xt returned by the :NONAME (that is, gforth's implementation of LATEST specifically checks whether name>string produces an empty string; and neither LATEST nor LATESTNT would comply with the current version of the proposal).

So, my request for clarification includes figuring out whether this code should become well-defined or remain ambiguous:

CREATE a
: does1 DOES> DROP ." in does body" ;
:NONAME ." in anon" ;  ( xt1 )
does1
' a ( xt1 xt2 )
." a: " CATCH ( xt1 X ior ) 2DROP ( xt1 )
." anon xt: " EXECUTE

In testing, gforth 0.7.9 prints "in does body" after the "anon xt:" banner (i.e. DOES> treated the anonymous xt as the most recent defined word, and rerouted its runtime semantics to the does body, even though there was no name involved). SwiftForth 3.12.0 dumps core at the line does1 (including this typo: does1 Segmenation fault at 0804E4A1 NAME> +2) on the grounds that there is currently no most-recently-defined name (let alone a name that was defined by CREATE). But I could ALSO envision a system where the banner with "a:" outputs the "in does body" message, while the "anon xt:" line outputs "in anon", on the grounds that the most recently defined name (using proposed LATEST-NAME semantics) was indeed a, despite the anonymous definition in the meantime.

Second, I see a problem after executing a word defined with MARKER (or after executing FORGET). While it makes sense for an implementation to track which word was most recently added to a wordlist, the standard is already explicit that TRAVERSE-WORDLIST may visit words in any order provided that for two nt with the same name, the newer one is visited first. It is very easy to implement a hashtable structure for a wordlist that remembers only the most-recently added (plus a linked list so that entries with the same name are visited newest first), but NOT any other temporal ordering between names. And unless MARKER explicitly requires that reverting to a point in time must recover the most-recently-defined name in that wordlist at the time the marker was created, I could easily see an implementation wanting to behave as if NO word were most-recently-defined (easier to not have to track and restore that state). Back to examples, gforth 0.7.9 started interactively with LATEST . as the very first line of input prints 0 (even though the wordlist is populated, there is no most-recent definition until the user starts adding words); while repeating the same in SwiftForth LAST @ NAME>STRING TYPE outputs %SwiftForth/src/ide/ul/linux/hi.f in my test environment. Or trying harder,

\ SwiftForth
: a ." in a" ;
marker b
: c ." in c" ;
b
[defined] c .
last @ name>interpret execute

outputs 0 (c was undefined), but then "in c" (oops - last @ still pointed to the nt of c, rather than resetting it to zero or rewinding it to a - in this case, I've hit the ambiguity of something in the system, LAST @, still maintaining a stale reference to memory that was forgotten), even though the proposed LATEST-NAME sounds like it should be set back to a (as the most-recently-defined word remaining in the wordlist after the marker undoes things) or to nothing (on the grounds that undoing the dictionary is not required to undo the notion of which word was latest at the time the marker was created). gforth 0.7.3 lacked name>interpret, but changing the last line to latest name>int execute produces the same "in c" as SwiftForth; however, that was fixed for gforth 0.7.9, which now restores latest to the nt of a as a part of its marker implementation.

So back to the question: is the execution of a DOES> after a marker is executed supposed to reliably affect the most recent name (assuming that name was defined by CREATE) prior to the creation of the marker, or should the standard explicitly mention the ambiguity of not having a definitive most recent named word after a marker:

: does1 DOES> DROP ." in does body" ;
CREATE a
MARKER b
CREATE c
b
does1
a  \ should this print "in does body"?

Replies

[r1493] 2025-08-12 20:24:53 EricBlake replies:

requestClarification - Is it ambiguous to execute the use of does> after :NONAME or FORGET?

Another fun test when quotations are added to the mix:

: does1 DOES> DROP ." in does body" ;
: d ." in d " [: ." in quotation" ;] does1 execute ;
d
d
: e ." in e " [: ." in quotation" ;] [ does1 ] execute ;
e
e
: f ." in f " [: ." in quotation" [ does1 ] ;] execute ;
f
f

In gforth 0.7.9, the first execution of d outputs "in d in quotation" but assigns the does> body to the most recent completed definition (d was completed later than the quotation); the second execution of d outputs "in does body" (gforth allows you to use DOES> on a word created with : rather than CREATE). However, both lines of executing e output just "in does body", while both lines of executing f output "in f in does body". Apparently, gforth decided that at the time does1 is interpreted after the quotation in e, the most recent named word is "e", even though that word is not yet complete; and the behavior of e is changed before it can ever be shown with its original behavior. But at the time does1 is interpreted in the middle of the quotation of f, the current definition is the quotation rather than f proper, and even though the quotation is unnamed, gforth changed the runtime behavior of the quotation before f ever executes it. At any rate, these examples are less important - the standard is already clear that executing DOES> when the most recent definition was not made with CREATE is ambiguous (and these definitions were made with : or [:, not CREATE). On the other hand, to some extent, this is similar to the question on whether : foo ; : bar [ IMMEDIATE ] ; should be changed to have well-defined semantics of always making bar immediate, instead of the standard's current stance that it is ambiguous with existing implementations that set foo immediate; stemming back to the idea that IMMEDIATE is permitted (but not required) to modify the current wordlist, and that modifications to the current word list outside of normal compilation before ; is reached are ambiguous.

But having typed that, I wonder:

: factory ( c-addr u "name" -- ) ALIGN CREATE HERE 2! 2 CELLS ALLOT DOES> 2@ TYPE ;
: does2 DOES> DROP ." in does body" ;
S" in b" factory b
IMMEDIATE
does2

Are there any implementations where the act of making b IMMEDIATE moves it out of one wordlist (the list of normal words) and into another (the list of immediate words), such that the proposed wording of LATEST-NAME changes its view of which name is encountered first in search order, and therefore where does2 does not unambiguously redefine the behavior of the now-immediate b?


[r1494] 2025-08-12 21:26:40 EricBlake replies:

referenceImplementation - Possible Reference Implementation

: DNEGATE ( d1 -- d2 ) >R NEGATE DUP 0<> R> - ;

That assumes twos-complement utilizing all bits of the lower cell (future Forth is headed that way, but Forth-2012 still permits sign-magnitude or ones-complement, as well as implementations where valid u values are capped at fewer than the full bits in the cell). Maybe we could also show three implementations, along the lines of A.3.2.1:

: DNEGATE ( d1 -- d2 ) >R NEGATE DUP 0<> R> - ; \ correct for twos complement
: DNEGATE ( d1 -- d2 ) 2DUP OR IF SWAP INVERT SWAP INVERT THEN ; \ correct for ones complement, avoiding -0
: DNEGATE ( d1 -- d2 ) 2DUP OR IF HIGH-BIT XOR THEN ; \ correct for sign-magnitude, where HIGH-BIT is the mask of the sign bit, avoiding -0
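One way to choose among the three variants at load time is to probe how the system encodes -1, sketched below. ENCODING is a hypothetical name introduced for this illustration, and HIGH-BIT is defined here on the assumption that INVERT and RSHIFT operate on every bit of a cell. The low two bits of -1 are 11 in twos complement, 10 in ones complement, and 01 in sign-magnitude. The twos-complement branch pops the saved high cell with R> before the final subtraction.

```forth
\ Probe the integer encoding: 3 = twos complement,
\ 2 = ones complement, 1 = sign-magnitude.
-1 3 AND CONSTANT ENCODING                  \ hypothetical name

\ Mask with only the most significant bit set (assumes all cell bits are usable).
0 INVERT 1 RSHIFT INVERT CONSTANT HIGH-BIT

ENCODING 3 = [IF]
  : DNEGATE ( d1 -- d2 ) >R NEGATE DUP 0<> R> - ;
[THEN]
ENCODING 2 = [IF]
  : DNEGATE ( d1 -- d2 ) 2DUP OR IF SWAP INVERT SWAP INVERT THEN ;
[THEN]
ENCODING 1 = [IF]
  : DNEGATE ( d1 -- d2 ) 2DUP OR IF HIGH-BIT XOR THEN ;
[THEN]
```

Only the twos-complement branch can be exercised on common systems today, so the other two branches should be read as untested sketches.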

Showing a circular reference of:

: DNEGATE ( d1 -- d2 ) 0. 2SWAP D- ; : D- ( d1 d2 -- d3 ) DNEGATE D+ ;

would be "correct" regardless of encoding, but then leaves it to the implementer to be familiar with the rules of the encoding.

And all of the above still assumes that d has twice as many bits as n. As stated in A.3.2.1, "There is no requirement to implement circular unsigned arithmetic, nor to set the range of unsigned numbers to the full size of a cell." If n is not using all of the bits of a single cell, then I question whether any of the above implementations are correct for a double-cell d. (And as I mentioned here, I'm working on a niche environment where my n has at least 48 bits but no easy-to-determine most-significant bit, because the underlying implementation could be any of 64-bit twos-complement integers [overflow is circular], BigInt integers [usable as sign-magnitude, no overflow but no definitive sign bit], or IEEE doubles used as 53-bit integers [sign-magnitude, overflow loses precision but not magnitude]). Since Forth-2012 only requires n to use at least 15 bits plus sign, and d to use at least 31 bits plus sign, I'm still aiming to be a standard system by declaring that my n has at least 48 bits plus sign, and my d has at least 48 bits plus sign, where my d is encoded as all significant bits in the lower cell. My initial idea for DNEGATE in that environment is this, subject to change based on what I hit while coding other double words: : DNEGATE ( d1 -- d2 ) D>S -1 * S>D ;, where D>S includes -24 throw if the upper cell was not 0 or -1.
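To make that last idea concrete, here is a rough sketch of how it might look on a conventional system where S>D sign-extends. CHECKED-D>S and NARROW-DNEGATE are hypothetical names chosen so as not to shadow the standard words, and the range test used here (upper cell must equal the sign extension of the lower cell) is an assumption that is slightly stricter than "0 or -1".

```forth
\ Narrow a double to a single, throwing -24 (invalid numeric argument)
\ when the value does not fit in one cell.
: CHECKED-D>S ( d -- n )
  OVER S>D NIP            ( lo hi sign-ext )  \ the hi that a one-cell value would have
  <> IF -24 THROW THEN ;  ( lo )

\ Negate through single-cell arithmetic, then widen again.
: NARROW-DNEGATE ( d1 -- d2 ) CHECKED-D>S -1 * S>D ;

5. NARROW-DNEGATE D.      \ prints -5
```

A double such as 1 1 (upper cell neither 0 nor -1) makes NARROW-DNEGATE throw -24, which can be observed with CATCH.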

True, the reference implementations need not be universal. But when a reference implementation limits itself to a particular environmental constraint, like twos complement, it should be called out.