Digest #288 2024-12-04
Contributions
requestClarification - Return Stack Notation Mildly Inaccurate
The run-time stack effect is declared as:
( n1 | u1 n2 | u2 -- ) ( R: -- loop-sys )
but in actuality, the return stack may or may not have loop-sys added (unless I significantly misunderstand the function of this word). Is this more appropriate?:
( n1 | u1 n2 | u2 -- ) ( R: -- loop-sys | )
Replies
Since 0 wasn't a base switching prefix before, this will break code that has numbers starting with 0. I did a quick grep over the Gforth sources, and there are indeed places where 0-prefixed numbers are used, and only one of those is actually in a block that starts with 8 base !. The others are hex or decimal.
It's indeed a good idea to warn if a word like 10 or 42 is defined (and the development Gforth does that). So far, while some Forth systems do print warnings, the standard doesn't talk about warnings. Warnings can always be switched off, e.g. with 0 warning ! in SwiftForth, or 0 warnings ! in Gforth, so when using a dictionary as key-value storage, you can just turn the warnings off when defining there. Technically, when a standard word prints a warning, its behavior is outside the scope of the standard: it does something that's not specified, and there's e.g. no ambiguous condition for redefining a word that would allow a warning to be printed.
So there seems to be a need to talk about the concept of warnings.
How about NOTAWORD instead of XLERB ? Perhaps NOTAWORD is too obvious and not sufficiently geeky?
Do we know which Forth standard introduced number prefixes?
I use & as octal prefix
&10 . 8 ok
I did not invent that so there must be other systems also using it!
BR Peter
Forth-2012 introduced number prefixes.
I dimly remember seeing & in some non-Forth place as octal prefix, and indeed, Free Pascal uses that (but my memory is certainly not from Free Pascal; probably some assembly language). However, as documented in the number prefixes proposal, there is conflicting usage in Forth systems concerning the & prefix: BigForth, PFE, Gforth, and Win32Forth 6 use it as decimal prefix, SwiftForth and ntf/lxf use it for octal. So it's unlikely that we will standardize & as octal prefix.
Searching a little around some assembly languages, I found that some 68000 and 6502 assembly languages use @ as octal prefix.
Implementing another prefix would be cheap, but OTOH, the benefit also appears to be minuscule.
Use cases
Concerning the supposed lack of use cases: I have mentioned use cases where the state-based interface is at the very least cumbersome in r1038.
Do you mean this example: "you cannot implement postpone or ]]...[[ as standard-compliant code"?
Then it's unclear which specification for these words you cannot implement, because:
- You provided a portable (standard-compliant) implementation for ]] ... [[ (based on postpone). This implementation does not depend on how postpone is implemented.
- A standard postpone can be implemented using find or find-name. An advanced postpone can also be implemented in a standard-compliant way.
Could you please clarify?
Probably you mean that the user should be able to create a recognizer and assign it to the perceptor, and then postpone (and ]] ... [[ that uses this postpone) should be applicable to lexemes that this recognizer recognizes. But I do not see any connection to the state-based interface here either.
state is a bad idea, as demonstrated by the problems mentioned above.
I implemented postpone using four different approaches (see fep-recognizer/implementation/variant.gamma/postpone/index.fth) in my "gamma" reference implementation for the Recognizer API.
This reference implementation is portable and can be loaded in Gforth as
gforth implementation/index.fth
In every approach I defined the interpretation semantics for postpone, so postpone depends on state. In every approach the words compile-postpone-qtoken ( qt -- ) and translate-postpone-qtoken ( any qt -- any ) are provided. The former does not depend on state, the latter does depend on state.
In the variant postpone/auto.via-mmode.fth the macro-compilation mode is employed (one more state, if you like). This is the variant that is loaded by default in the current version (Commit f3b7d01). The macro-compilation mode is very useful because it also allows implementing a more useful and advanced variant than your construct ]] ... [[.
Could you please demonstrate a problem concerning state-dependency in any of these approaches?
As for opaque vs. transparent: Opaque would only be an option if the only use of translators was really in the text interpreter and in postpone. But if we want to support other use cases (and there are other use cases, as discussed above), we should do a transparent user interface.
Could you please provide a practical example when you need a transparent token descriptor structure?
so when using a dictionary as key-value storage, you can just turn the warnings off when defining there.
If this word list is not included in the search order, Forth systems typically do not print any warning. And a standard-compliant program may use nonstandard warning or warnings only if it tests the Forth system vendor/version.
I see a problem only in the mentioned ambiguous condition, because it allows the system to throw an exception on such a word, and disallows a standard program to place such a word into any word list.
So there seems to be a need to talk about the concept of warnings.
Agreed.
It's indeed a good idea to warn if a word like 10 or 42 is defined
This still does not make a Forth system non-standard if it provides such words. Also, a Forth system may provide a word def, and then a standard program fragment hex def decimal constant foo will not work correctly on this standard Forth system. It is unclear how this problem can be formally solved.
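To make the clash concrete, here is a minimal sketch. It assumes a system that does not define a word named def, so the first line behaves as the fragment's author intended; the names foo and bar are only illustrative. The second definition uses the Forth-2012 $ prefix, which does not remove the formal problem (a system could also define a word named $def) but makes a collision less likely and does not depend on BASE.

```forth
\ Fragile: the text interpreter searches the dictionary before trying
\ number conversion, so a system-provided word named DEF would be
\ executed here instead of the hex number being converted.
hex def decimal constant foo

\ Less fragile (still an assumption that no word named $DEF exists):
\ the Forth-2012 $ prefix selects hexadecimal regardless of BASE.
$def constant bar

decimal foo . bar .   \ on a system without DEF: 3567 3567
```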
@AntonErtl Thanks, I thought it was Forth-2012 that introduced the prefixes, but I'm a little surprised that an octal prefix was not agreed upon considering its prevalence in computing.
I note your comments about using & as a prefix and I see how it will cause problems with earlier implementations of Forth but isn't that inconsistent with the spirit of creating a standard? I would have thought the standard would govern the implementations rather than the implementations governing the standard.
In terms of using @ as a prefix, I have some reservations considering the standalone word @ (fetch). I could see some confusion being created if such a prefix were used.
Regarding your comment that the benefit of an octal prefix would be minuscule, I respectfully disagree. Some benefits of an octal prefix are:
- you can readily use octal whilst having a different base; and
- you can ensure that the number is octal notwithstanding any change to the base.
I find it quite common to mix binary, hexadecimal and octal numbers when writing low-level code; for example one would use binary when bit toggling flags, hexadecimal for memory addresses, and octal for encoding (especially x86 instructions).
Because of this, I make it a practice to ensure all binary, hexadecimal and decimal numbers are prefixed and all octal numbers are not prefixed but start with a 0.
The problem with my practice is that I cannot protect my code against inadvertent changes to the BASE and, therefore, I am frequently calling 8 BASE ! as a precaution.
In terms of & and/or 0 not being an acceptable prefix because of breaking existing Forth implementations, etc., perhaps a 'q' prefix (looks like an o but won't be misread as a 0) might be a suitable alternative?
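The asymmetry described in this post can be sketched as follows. The constant names are hypothetical; only the $, # and % prefixes are Forth-2012, and the octal line shows the save/set/restore dance that is needed today because no octal prefix is standardized.

```forth
decimal

\ Octal today: save BASE, switch to 8 (written as #8 so that the
\ literal itself does not depend on the current BASE), convert,
\ then restore the saved BASE.
base @  #8 base !  177 constant mask-octal  base !

\ The same value with the standard prefixes; these do not depend on BASE.
$7F      constant mask-hex
#127     constant mask-decimal
%1111111 constant mask-binary

mask-octal . mask-hex . mask-decimal . mask-binary .   \ 127 127 127 127
```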
Using EXECUTE instead of a special translator-specific word makes it possible to use the rest of the recognizer API for interpreters that don't have any state at all. This actually happens and is useful; e.g. the parser in net2o's chat system uses that. There's absolutely no need for any other mode than directly interpreting. And using EXECUTE does not mean you have to set STATE if you call a translator for a particular state (interpreting/compiling/postponing) directly. Though there are likely confusing results if you do so and the word executed is a state-smart word. The level of surprise is likely small, because so far, the only direct access method actually useful is the one for the postpone action. And that never executes the word found.
I don't want to mandate a particular implementation. Choose the implementation you like. I'll add an API that allows direct and default invocation of a translator. I'm not sure if I want this in the same proposal or split it into another one, so we can vote on those separately.
requestClarification - Return Stack Notation Mildly Inaccurate
Is this more appropriate?:
( n1|u1 n2|u2 -- ) ( R: -- loop-sys | )
There is no need to indicate the case when loop-sys is not placed on the return stack, because this case is already specified by the fact that loop-sys may take zero or more cells on the return stack, see 3.1.5.2 System-execution types.
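As an illustration, assuming the word under discussion is ?DO (the thread does not name it, but it is the loop word whose run-time may skip the loop entirely), the following sketch shows both outcomes. The word count-up is hypothetical.

```forth
\ count-up prints 0 .. n-1 using ?DO, which skips the loop body
\ when the limit and the initial index are equal.
: count-up ( n -- )  0 ?do i . loop ;

3 count-up   \ prints 0 1 2 : loop parameters are set up and the body runs
0 count-up   \ prints nothing: limit equals index, the loop is skipped
```

Either way the declared effect ( R: -- loop-sys ) holds, since loop-sys is allowed to occupy zero cells on the return stack.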