Digest #233 2023-09-23
Contributions
Rationale
Instead of mentioning SYNONYM in many other glossary entries, it's better to use another wording, which does not require that.
In some cases, it's better to specify a glossary entry in a more general manner than to mention SYNONYM in it (see the proposal below).
According to 2.2.3 Parsed-text notation, "name" has the specified meaning only when it's mentioned literally as "name". So, newname and oldname are not normative. Probably, we can use indexing as for data type symbols in stack diagrams. When an index is not a number, it is concatenated via a dot.
We should ensure that system-defined semantics are the same for synonyms.
If the standard defines some semantics, system-defined semantics shall be equivalent to them, so it's enough to mention system-defined semantics.
In some plausible Forth system implementations, the execution token for a word is identical to the name token for this word, and the execution tokens for different words are always different. So SYNONYM should not require the same execution token (if any) for name.new and name.old.
The part about deferred words is not formal enough.
Examples
A synonym of the word CREATE creates a word with a data field address, due to the same execution semantics as those of CREATE:
synonym mycreate create
mycreate foo 123 ,
A synonym of a word created via performing the execution semantics of CREATE returns the same data field address, due to the same execution semantics as the original word:
synonym bar foo
bar foo = . \ prints -1
Consequently, the execution token for a synonym is associated with the same data field address (if any) as the xt of the original word:
' bar >body ' foo >body = . \ prints -1
A synonym of a word created via performing the execution semantics of VALUE returns the same value, due to the same execution semantics as the original word:
1 value x
synonym y x
x y = . \ prints -1
Applying TO to a synonym of a word created via performing the execution semantics of VALUE changes the value assigned to the original word, due to the same "TO name Run-time" semantics:
2 to y
x y = . \ prints -1
Proposal (draft)
Re data-field address
In 6.1.0550 >BODY, 6.1.1250 DOES>:
Replace the phrase:
defined via CREATE
by the phrase:
defined via performing CREATE execution semantics
Replace the phrase:
defined with CREATE or a user-defined word that calls CREATE.
by the phrase:
defined via performing CREATE execution semantics
Re deferred words
In 6.2.1725 IS, 6.2.0698 ACTION-OF, 6.2.1177 DEFER@, 6.2.1175 DEFER!:
Replace the phrase:
defined by DEFER
by the phrase:
defined via performing DEFER execution semantics
Re SYNONYM
Replace the whole normative part of the glossary entry 15.6.2.2264 SYNONYM by the following paragraphs:
( "<spaces>name.new" "<spaces>name.old" -- )
Skip leading space delimiters, parse name.new delimited by a space, skip leading space delimiters, parse name.old delimited by a space.
Find name.old. Create a definition for name.new with the semantics defined below; name.new may be the same as name.old.
The execution token for name.new may differ from the execution token for name.old.
An ambiguous condition exists if name.old is not found or IMMEDIATE is performed when name.new is the most recent definition.
All the ambiguous conditions that exist when using name.old also exist when using name.new.
If execution semantics are defined by the system for name.old:
name.new Execution: Perform the execution semantics of name.old.
If interpretation semantics are defined by the system for name.old:
name.new Interpretation: Perform the interpretation semantics for name.old.
If compilation semantics are defined by the system for name.old:
name.new Compilation: Perform the compilation semantics for name.old.
If "TO name.old Run-time" semantics are defined for name.old:
TO name.new Run-time: Perform TO name.old Run-time semantics.
If name.old is defined via performing the execution semantics of DEFER:
IS name.new performs IS name.old
ACTION-OF name.new performs ACTION-OF name.old
Applying DEFER@ to the xt of name.new performs DEFER@ to the xt of name.old
Applying DEFER! to the xt of name.new performs DEFER! to the xt of name.old
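The deferred-word clauses above can be illustrated with a minimal sketch. It assumes a system whose SYNONYM behaves as the proposed wording describes; the names greet and hello are hypothetical and chosen only for this illustration.

```forth
defer greet
synonym hello greet                 \ hello is a synonym of the deferred word greet

:noname ( -- ) ." hi" ; is hello    \ under the proposal, equivalent to: is greet
greet                               \ performs the action that was set via hello

\ DEFER@ on either xt is expected to give the same action,
\ even on systems where the two xts themselves differ:
' hello defer@  ' greet defer@  = .
```

Note that the sketch compares the results of DEFER@ rather than the xts of hello and greet, since the proposal explicitly allows those xts to differ.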
Replies
proposal - Input values other than true and false
This code is meant as a reference implementation for bracket_IF and friends:
\ ----------------------------------------------------------------------
\ @file : bracket_IF.fs
\ ----------------------------------------------------------------------
\ Last change: KS 15.09.2023 19:45:47
\ @author: Klaus Schleisiek
\ @copyright: public domain
\ Traditionally, string comparison has been used to process [IF].
\ This version uses FIND instead.
\ Multiline comment \* ...
\ ... *\ has been added, because it is trivial.
\ Conditional clauses may be commented out using (, \, or \*
\ ----------------------------------------------------------------------
: ?EXIT ( flag -- ) postpone IF postpone EXIT postpone THEN ; immediate
: case? ( n1 n2 -- n1 ff | tf ) over = dup IF nip THEN ;
Defer [ELSE] ( -- ) immediate
: [IF] ( flag -- ) ?EXIT postpone [ELSE] ; immediate
: [THEN] ( -- ) ; immediate
: [NOTIF] ( flag -- ) 0= postpone [IF] ; immediate
: [IFDEF] ( <name> -- ) postpone [DEFINED] postpone [IF] ; immediate
: [IFUNDEF] ( <name> -- ) postpone [DEFINED] postpone [NOTIF] ; immediate
\ ----------------------------------------------------------------------
\ NEXT-WORD returns the xt of a word in the search order.
\ Words, which are not found, will be skipped.
\ 0 will be returned when the end of file is reached.
\ ----------------------------------------------------------------------
: next-word ( -- xt | 0 )
   BEGIN
      BEGIN BL word dup c@ WHILE find ?EXIT drop REPEAT
      drop refill 0=
   UNTIL 0 ;
: *\ ( -- ) ; immediate \ end of multi-line comment
: \* ( -- ) BEGIN next-word dup 0= swap ['] *\ = or UNTIL ; immediate
Variable Nestlevel 0 Nestlevel ! \ nesting level counter
: nest ( -- ) 1 Nestlevel +! ;
: unnest ( -- ) Nestlevel @ 1 - 0 max Nestlevel ! ; \ don't decrement below zero
: [if]-decode ( xt -- flag )
   ['] [IF] case? IF nest false EXIT THEN
   ['] [NOTIF] case? IF nest false EXIT THEN
   ['] [IFDEF] case? IF nest false EXIT THEN
   ['] [IFUNDEF] case? IF nest false EXIT THEN
   ['] [ELSE] case? IF Nestlevel @ 0= EXIT THEN
   ['] [THEN] case? IF Nestlevel @ 0= unnest EXIT THEN
   ['] \ case? IF postpone \ false EXIT THEN \ needed to be able to e.g. comment out [THEN]
   ['] ( case? IF postpone ( false EXIT THEN \ needed to be able to e.g. comment out [THEN]
   ['] \* case? IF postpone \* false EXIT THEN \ needed to be able to e.g. comment out [THEN]
   0= abort" [THEN] missing" \ end-of-file reached?
   \ all other xt's are ignored
   false ;
:noname ( -- ) BEGIN next-word [if]-decode UNTIL ; IS [ELSE]
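A short usage sketch for the words above, restricted to standard [IF], [ELSE] and [THEN] so that it should behave the same with this reference implementation and with a conventional one. It exercises the nesting counter: the inner [ELSE] and [THEN] must be matched to the inner [IF] while the outer [ELSE] clause is skipped.

```forth
\ Nested conditionals: the outer clause is taken, the inner one falls through to [ELSE].
true [IF]
   false [IF]
      1            \ skipped
   [ELSE]
      2            \ interpreted; leaves 2 on the data stack
   [THEN]
[ELSE]
   3               \ skipped
[THEN]
.                  \ prints 2
```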
An example of using translators for two different purposes:
\ use "tt-lit" and "tt-2lit" just to call these token translators:
: tt-3lit ( 3*x -- 3*x | )
>r tt-2lit r> tt-lit
;
: recognize-forth-lexeme ( sd -- i*x tt ) forth-recognizer execute ;
\ use "tt-xt" to analyze a token type:
: recognize-tick ( sd -- xt tt.xt | 0 )
"'" match-head 0= if 2drop 0 exit then ( sd2 ) \ the input lexeme without the leading tick
['] recognize-forth-lexeme execute-balance2 ( i*x tt|0 n.data-stack n.float-stack )
2>r dup ['] tt-xt = if 2rdrop exit then drop 2r> fndrop ndrop 0
;
In this implementation of recognize-tick (not tested), the phrase 'foo::bar::baz will work correctly and return the xt of the word baz in the wordlist bar in the wordlist foo, when recognize-pqname for the syntax "::" (example) is a part of forth-recognizer.
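The helper match-head is used above but not defined in this message. A minimal sketch, with the stack effect inferred from the usage in recognize-tick (on failure the original string is left for 2drop), might look as follows; it is an assumption, not the author's definition.

```forth
\ Hypothetical helper, inferred from its use in recognize-tick.
\ If the string (c-addr1 u1) starts with (c-addr2 u2), return the remainder and true;
\ otherwise return the original string and false.
: match-head ( c-addr1 u1 c-addr2 u2 -- c-addr3 u3 true | c-addr1 u1 false )
   2>r                                       ( c-addr1 u1 ) ( R: c-addr2 u2 )
   dup r@ u< if 2r> 2drop false exit then    \ shorter than the head: no match
   over r@ 2r@ compare                       \ compare the first u2 characters
   if 2r> 2drop false exit then              \ they differ: no match
   r> /string  r> drop  true ;               \ strip the head from the front

: demo ( -- )  s" 'foo" s" '" match-head if type else 2drop then ;
demo   \ prints foo
```

The sketch uses COMPARE and /STRING from the String word set and 2>R, 2R@, 2R> from Core extensions.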
To implement this, we make a nested call of the Forth recognizer for another lexeme and then analyze the returned type. If the returned type is not appropriate, we drop the token (from the data stack, and from the floating-point stack, if any). So we need to be sure that calling recognize-forth-lexeme never causes any side effect (other than on the stacks), even when recognizing succeeds.
NB: when recognize-tick is a part of the current forth-recognizer, executing recognize-forth-lexeme on some inputs will produce an indirectly recursive call of recognize-forth-lexeme (as intended).
Tough question: The string recognizer has a side effect, which is not good. Moving that side effect to the translator causes other problems, because TRANSLATE-STRING no longer has the corresponding string on the stack, but needs to parse it later.
It's perfectly acceptable for a translator to parse the input buffer and/or read the input stream. Some token translators will even make nested calls of the Forth text interpreter and can throw exceptions.
The problem that a part of the string can be in the input buffer (or even in the input stream) is solved by introducing two translators for strings: one (e.g. tt-slit) accepts the full string from the stack, and another (e.g. tt-slit-parsing) accepts the starting part from the stack and the tail from the input buffer (or input stream). The string recognizer returns one or the other depending on whether a lexeme is a complete string or only the start of a string.
I published a reference implementation in 2019, and now updated it for the current proposal.
A string recognizer can be as follows:
: quot ( -- sd.quot ) s\" \"" ;
: recognize-string ( sd.lexeme -- sd tt.slit|tt.slit-parsing | 0 )
quot match-head 0= if 2drop 0 exit then
quot match-tail if ['] tt-slit exit then
2dup quot contains if 2drop 0 exit then \ fail if '"' is found in the middle of the string
['] tt-slit-parsing
;
The code which I have simply looks like this:
['] translate-string of json-string! endof
Returning something half-done isn't a good idea and makes maintaining this code difficult, as all other possible results (like ints, floats or such) are fully converted into something useful at this stage.
Thinking a bit more about that, I found:
- The thing you want to nest is the translator for names
- As I said, names should be first, numbers second and the rest third
- We have nestable recognizer sequences
So one solution would be to put all recognizers that return nts+translate-nt (or variants of that, e.g. locals have a variant of translate-nt that differs for postpone) in one recognizer stack, which has a name and can be called without calling the entire recognizer stack. These recognizers now have a predictable effect and no side effect. Since you can't tick locals, you still have to check for translate-nt, but that's ok. You don't have to go through all the weird other recognizers.
In Gforth, .recognizers can now handle and display nested recognizers, and if you split this up like that, it would output:
.recognizers ~names ( ~nt ( Forth Forth Root ) ~scope ) ~numbers ( ~num ~float ) ~others ( ~string ~to ~dtick ~tick ~body ~complex ~env ~meta )
The ~ is there to abbreviate recognize- (or rec- now).
This also makes it easier to add recognizers where they belong, e.g. when you add the scope recognizer, you just push it to the end of the names recognizer stack. If you add the floating point recognizer, the complex recognizer (both are numbers), or the hex floating point recognizer for exact notation of floating point constants, you just push them to the back of the numbers recognizer stack, and they get ahead of the others. I like this solution.
The other solution is what Gforth does: There's a ?REC-NT which does the nesting, the checking for translate-nt, and the cleaning up of the side effects (stacks and >IN). There is the possibility to make this more generic, e.g. create a word TRY-RECOGNIZE which gets an xt, passes the recognition result to it, and if that returns false, everything is cleaned up and false is returned, otherwise whatever that xt left (including the flag) is returned.
The cleaning up is already cumbersome, because a variable number of values can be returned on both the data and floating-point stacks, and when in addition to that >IN can also change, it's just a little bit more hassle.
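To make the clean-up step concrete, here is a minimal sketch for the data stack only. The names ndrop and execute-balance are hypothetical helpers written for this illustration (they are not Gforth's ?REC-NT); the floating-point stack and >IN are deliberately ignored.

```forth
\ Hypothetical helpers; only the data stack is tracked.
: ndrop ( x*n n -- )  0 ?do drop loop ;   \ n must not be negative

\ Execute xt and report how many data-stack items it added
\ (the result is negative if xt removed more items than it left).
: execute-balance ( i*x xt -- j*x n )
   depth 1- >r        \ remember the depth below xt
   execute
   depth r> - ;       \ n = new depth - old depth

: two-results ( -- 1 2 )  1 2 ;
10  ' two-results execute-balance ndrop  .   \ prints 10
```

A complete version would need the same bookkeeping with FDEPTH for the floating-point stack, which is exactly the extra burden the paragraph above describes.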
One correction. I wrote:
If we want to reflect this idea, we can use the acronym tt, which stands for both: token translator and token type.
It should be read as:
If we want to reflect this idea, we can use the acronym tt, which stands for both: "translate token" (verb) and "token type" (noun).
Data type symbol
To specify formal requirements, we have to introduce a new data type for token translators, which is a subtype of xt. And the abbreviation tt is a good candidate for this data type symbol.
If we have the data type tt => xt|0 and the symbol sd for the string data type, the naming convention along with the stack diagram for a recognizer can be expressed as:
RECOGNIZE-{lexeme-type-symbol} ( sd.lexeme -- i*x tt ) ( F: -- j*r )
@BerndPaysan writes:
Returning something half-done isn't a good idea and makes maintaining this code difficult, as all other possible results (like ints, floats or such) are fully converted into something useful at this stage.
This would be a valid argument if it were possible to return something useful from a recognizer in all use cases but single-line string literals. But it's impossible.
For example, a recognizer for multi-line string literals cannot parse the full string literal without refilling the source (see my PoC implementation). Should we also restore the input source state to isolate side effects of recognizers?
And it still isn't enough. A recognizer for curly-based markup like foo{ any forth code bar{ nested code }bar ... }foo cannot return something useful, since a useful thing in this case is a created definition or just a side effect of appending some semantics to the current definition. Should we also restore the state of the dictionary?
I think it's obvious: isolation of all possible side effects of recognizers is not fruitful.
Yes, some recognizers return objects that are not useful by themselves, but they still return information about what a given lexeme means, and it's an acceptable price for the absence of side effects in all recognizers.
Also, we separate concerns into things that do have side effects (token translators) and things that don't have side effects (recognizers). It's a very useful separation.
@BerndPaysan writes:
The code which I have simply looks like this:
['] translate-string of json-string! endof
A straightforward solution is to handle each token type of string literals separately. Probably, I would write it as follows:
'tt-slit of json-string! endof
'tt-slit-parsing of parse-slit-end json-string! endof
'tt-slit-ml of parse-slit-ml json-string! endof
(I would use a recognizer for a leading tick, and name translators in the form tt-{token-type-symbol}.)
Or I would factor a helper word as follows:
: ?prepare-tt-slit ( i*x tt -- i*x tt | sd.transient tt.slit )
  case
    'tt-slit of 'tt-slit endof
    'tt-slit-parsing of parse-slit-end 'tt-slit endof
    'tt-slit-ml of parse-slit-ml 'tt-slit endof
    dup \ default: keep tt on the stack, since endcase drops the top item
  endcase
;
: eval-json ( .. tag -- )
?prepare-tt-slit case
...
'tt-slit of json-string! endof
...
endcase
;
Multiple entry points for the Forth recognizer
@BerndPaysan writes:
This also makes it easier to add recognizers where they belong, e.g. when you add the scope recognizer, you just push it to the end of the names recognizer stack. If you add the floating point recognizer, the complex recognizer (both are numbers), or the hex floating point recognizer for exact notation of floating point constants, you just push them to the back of the numbers recognizer stack, and they get ahead of the others. I like this solution.
Yes, I have also considered such a solution. It's a convenient way to implement the default Forth recognizer.
But requiring the Forth recognizer to always conform to this particular structure of recognizer sequences, and even to always be the same instance of this structure, is too restrictive.
And otherwise you don't know the id of the actual names recognizer sequence (and don't even know whether such a sequence exists), so you cannot check a lexeme against only this sequence (I mean, in an implementation of recognize-tick).
Filter recognizer results
Bernd, your word try-recognize is a good factor for filtering results, regardless of side effects (beyond the stacks). With recognizers that have no side effects, it can also be implemented in a portable way.
If this word filters for a single token type, it's better to pass the corresponding tt directly (instead of xt.filter).
If this word allows filtering for multiple token types (I assume this variant), it should not drop tt from the stack.
Also, to be more useful, this word should not be bound to the current Forth recognizer only. Then this word can be named apply-recognizer-filter ( sd.lexeme xt.recognizer xt.filter -- i*x tt | 0 ).
A usage example:
: recognize-forth-name ( sd.lexeme -- nt tt.nt | 0 )
forth-recognizer [: dup 'tt-nt = ;] apply-recognizer-filter
;
: find-forth-name ( sd.lexeme -- nt | 0 )
forth-recognizer [: dup 'tt-nt = ;] apply-recognizer-filter if exit then 0
;
: find-forth-name? ( sd.lexeme -- nt true | false )
forth-recognizer [: dup 'tt-nt = ;] apply-recognizer-filter 0<>
;
: recognize-tick ( sd.lexeme -- xt tt.xt | 0 )
"'" match-head 0= if 2drop 0 exit then ( sd2 ) \ the input lexeme without the leading tick
forth-recognizer [: dup 'tt-xt = ;] apply-recognizer-filter
;
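The word apply-recognizer-filter is only specified by its stack diagram in this message. The following is a minimal sketch of one way it could be written, assuming that xt.filter has the stack effect ( tt -- tt flag ) as in the quotations above, and that the recognizer touches only the data stack. All demo words (tt-a, rec-a, only-a, never) are hypothetical placeholders.

```forth
\ Hypothetical sketch; the floating-point stack is not handled.
: ndrop ( x*n n -- )  0 ?do drop loop ;

: apply-recognizer-filter ( sd.lexeme xt.recognizer xt.filter -- i*x tt | 0 )
   >r  depth 3 - >r             ( sd.lexeme xt.recognizer ) ( R: xt.filter n0 )
   execute                      ( i*x tt | 0 )
   dup 0= if 2r> 2drop exit then   \ not recognized: return 0
   2r> >r                       ( i*x tt xt.filter ) ( R: n0 )
   execute                      ( i*x tt flag )
   if r> drop exit then         \ accepted by the filter: return i*x tt
   drop                         ( i*x )
   depth r> - ndrop  0 ;        \ rejected: discard i*x and return 0

\ Usage with placeholder words:
: tt-a ( -- ) ;                                 \ stands in for a token translator
: rec-a ( sd -- x tt )  2drop 42 ['] tt-a ;     \ a recognizer that always succeeds
: only-a ( tt -- tt flag )  dup ['] tt-a = ;    \ filter: accept tt-a only
: never ( tt -- tt flag )  false ;              \ filter: reject everything
: demo1 ( -- 42 tt )  s" x" ['] rec-a ['] only-a apply-recognizer-filter ;
: demo2 ( -- 0 )      s" x" ['] rec-a ['] never  apply-recognizer-filter ;
```

The sketch records the stack depth below the lexeme (n0) before calling the recognizer, so that a rejected result of any size can be discarded afterwards.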
The cleaning up is already cumbersome, because a variable number of values can be returned on both data and floating point stack, and when in addition to that also >IN can change, it's just a little bit more hassle.
Yes, but, as I showed, >in is not enough. Also, it's better to avoid such special cases in general.
proposal - Exclude zero from the data types that are identifiers
In the 2022 meeting, there was lots of discussion, especially about the validity of file-ids with the value 0. When asked, none of the participants could name a system that has a problem with disallowing 0 as an address, and none claimed that he had never used 0 as an impossible address.
This proposal satisfies the formality criteria and is therefore promoted to formal. Please promote it to CfV when you think that you are not going to change it anymore (proposals in CfV state must not be revised).
This wording change has been accepted with vote #31 10Y:0:1A (started at the 2022 meeting).
This proposal satisfies the formality criteria and is therefore promoted to formal. Please promote it to CfV when you think that you are not going to change it anymore (proposals in CfV state must not be revised).
This is a wording change, not a substantive change, so I don't think a CfV makes much sense. The committee voted (#32) in a vote starting at 2022-09-17, and the result is 8Y:1N:1A. This is usually considered to be enough for a consensus. Unfortunately, we did not discuss this vote at the 2023 meeting, so should we move it to Accepted, or should we discuss it at the next meeting? If the latter, please put it on the Agenda.
This was accepted with vote #30 10Y:0:1A after the 2022 meeting.
This was accepted with vote #29 10Y:0N:1A after the 2022 meeting.
Author
Ruv
Change Log
(the latest at the top)
- 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
- 2022-08-13 Initial version
Preceding history
(the latest at the top)
2022-08-12 Revert rewording the term "execution token" (proposal).
2021-09-08 Reword the term "execution token" (the accepted version).
2020-09-03 An attempt to solve the problem in NAME>INTERPRET by changing the meaning of "execution token": Reword the term "execution token"
2020-02-20 Pointing out a problem in NAME>INTERPRET wording
Problem
By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.
Solution
Actually, an execution token can identify other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.
It is so because for any execution token there exists at least one named or unnamed Forth definition the execution semantics of which are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but incidentally these semantics can be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. They need not be connected to the same Forth definition. Also, consequently, it's impossible that an execution token identifies some compilation semantics, or some interpretation semantics, but doesn't identify the equivalent execution semantics.
To solve the initial problem we can state these basics explicitly in a normative part.
Example
: foo postpone if ;
:noname postpone if ; ( xt )
The execution semantics of foo are equivalent to the compilation semantics for if.
At the same time, a Forth system may provide system-dependent execution semantics of if that are not equivalent to the execution semantics of foo.
The xt that is left on the stack identifies the execution semantics of an anonymous Forth definition, and these execution semantics are equivalent to the compilation semantics for if.
Typical use
"xt identifies the compilation semantics for the word FOO"
- It means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
- At the same time, the execution semantics of the word FOO can be different from the execution semantics identified by this xt.
"the execution token for the word BAR"
- It means that this execution token identifies the execution semantics of the word BAR.
"The execution semantics identified by xt are equivalent to the interpretation semantics for the word BAZ."
- This seems pretty clear.
Wrong use
Actually, the standard contains only one place where the "execution token" notion is used ambiguously in a normative part: the glossary entry for FIND. It says that FIND returns the execution token for a word, but in some cases in dual-xt systems this token cannot identify the execution semantics of this word.
In another glossary entry, the one for NAME>INTERPRET, the language is just slightly non-normative, since it uses the form "xt represents" instead of the form "xt identifies".
These entries have not only this but also some other problems, so they should be corrected anyway, and my proposals for them are in progress.
Proposal
Add to the beginning of the section 3.1.3.5 Execution tokens the following paragraphs:
For any execution token there exists at least one Forth definition (named or unnamed) the execution semantics of which are identified by this execution token.
The execution semantics identified by an execution token can be equivalent to the interpretation semantics or compilation semantics for some word, or to some run-time semantics. In such a case this execution token also identifies the corresponding interpretation semantics, compilation semantics, or run-time semantics.
Unless otherwise indicated, the execution token for a named Forth definition identifies the execution semantics for this definition.
Author
Ruv
Change Log
- 2020-02-20 Initial comment for NAME>INTERPRET
- 2023-09-14 Make this proposal more formal
- 2023-09-19 Add rationale, better wording, fix some typos
Problem
Currently the specification for name>interpret says that the returned "xt represents the interpretation semantics of the word nt".
But actually, in some cases a Forth system cannot provide an xt that performs the defined interpretation semantics for the corresponding word regardless of STATE.
In particular, this happens when a word like s" or to is implemented as a STATE-dependent immediate word. Technically, it is possible to return a correct xt according to the current specification (e.g. by generating the corresponding definition on the fly), but it can be too burdensome.
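For readers unfamiliar with the technique, here is a minimal sketch of a STATE-dependent immediate word. The name my-s" is hypothetical and the definition is not taken from any particular system; it only illustrates why the xt of such a word performs the interpretation semantics in interpretation state only.

```forth
\ Hypothetical word, similar in spirit to S" .
\ Interpretation: ( "ccc<quote>" -- c-addr u )   Compilation: ( "ccc<quote>" -- )
: my-s"
   [char] " parse                      \ the string still lives in the input buffer
   state @ if postpone sliteral then   \ when compiling, append it to the definition
; immediate

: greeting ( -- c-addr u )  my-s" hello" ;
greeting type          \ prints hello
my-s" world" type      \ prints world
```

If the xt of my-s" is executed in compilation state, it compiles a string literal instead of returning the string, so it cannot serve as an xt that performs the interpretation semantics regardless of STATE.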
Another minor problem is that it's not clear what the term represent means. According to the language of the standard, xt identifies some semantics.
Rationale
It should be clear from the specification how, having an nt, to perform the very behavior that a Forth system performs when the Forth text interpreter encounters, in interpretation state, the name of a word identified by this nt.
Namely, to perform this behavior, must the xt returned from name>interpret be executed only in interpretation state, or may it also be executed in compilation state?
Solution
The specification for name>interpret can be adjusted to solve the mentioned problem.
There are two options:
- Allow returning 0 if the system cannot return an xt that identifies the interpretation semantics for the word identified by nt (see also the clarification re execution tokens).
  - Probably, in this case we have to introduce a word like name> (experimental in Forth-83), which returns an xt that identifies the execution semantics of the word identified by nt, since otherwise a user-defined Forth text interpreter is impossible without a correct find.
- Allow returning a state-dependent xt, which performs the interpretation semantics for the word in interpretation state only.
In this proposal I stick with the second option.
If anyone prefers the first option or has other objections, please feel free to share your ideas in a comment.
Proposal
Replace the following phrase in the section 15.6.2.1909.20 NAME>INTERPRET:
xt represents the interpretation semantics of the word nt. If nt has no interpretation semantics, NAME>INTERPRET returns 0.
by the following phrase:
xt identifies the execution semantics of the word identified by nt. When this xt is executed in interpretation state, the interpretation semantics for the word are performed. If a Forth system does not provide such an execution token for the word, NAME>INTERPRET returns 0.
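Under the proposed wording, a program that wants to perform the interpretation semantics of a word from its nt would execute the returned xt in interpretation state. A minimal sketch follows; the name interpret-nt is hypothetical, and FIND-NAME is assumed to be available (it is not part of Forth-2012, but is provided by some systems).

```forth
\ Perform the interpretation semantics of the word identified by nt.
\ Meant to be called in interpretation state only, as the proposed wording requires.
: interpret-nt ( i*x nt -- j*x )
   name>interpret              ( i*x xt | 0 )
   dup 0= if -14 throw then    \ -14: "interpreting a compile-only word"
   execute ;

2 3  s" +" find-name interpret-nt  .   \ prints 5
```

The check for 0 covers the case where the system does not provide such an execution token for the word.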
The original contributor states that 2C@ and 2C! are for 16-bit access and this is not quite correct. They are for pairs of bytes in memory (or registers on some embedded systems). FORTH, Inc. has supplied 2C@ and 2C! for at least the past 30 years with our cross compilers for embedded systems. I just did a quick search of about 20 embedded Forth applications I have here on my computer and I found 180 instances of 2C@ and 132 instances of 2C!. Here's a simple example:
2 BUFFER: LBAT \ Current and previous battery status
\ Update battery status, keeping previous
: !LBAT ( x -- ) LBAT C@ SWAP LBAT 2C! ;
\ Return true if battery status has changed
: ?LBAT ( -- flag ) LBAT 2C@ <> ;
However, 2C@ and 2C! are easily built from Standard words, so we've never proposed they be standardized.
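As an illustration of that last remark, here is a minimal sketch of how the two words could be built from standard words. The byte order is an assumption made by analogy with 2@ and 2! (the top stack item lives at the lower address); the definitions actually shipped by FORTH, Inc. may differ.

```forth
\ Assumed layout, by analogy with 2@ and 2! :
\ char2 is stored at c-addr, char1 at the next character address.
: 2C@ ( c-addr -- char1 char2 )  dup char+ c@  swap c@ ;
: 2C! ( char1 char2 c-addr -- )  swap over c!  char+ c! ;
```

With this layout, !LBAT in the example above stores the new status in the first byte and moves the previous status to the second byte.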
Author
Ruv
Change Log
- 2020-02-20 Initial comment for NAME>INTERPRET
- 2023-09-14 Make this proposal more formal
- 2023-09-19 Add rationale, better wording, fix some typos
- 2023-09-21 Add rationale re system-defined semantics, and declare an ambiguous condition
Problem
Currently the specification for name>interpret says that the returned "xt represents the interpretation semantics of the word nt".
But actually, in some cases a Forth system cannot provide an xt that performs the defined interpretation semantics for the corresponding word regardless of STATE.
In particular, this happens when a word like s" or to is implemented as a STATE-dependent immediate word. Technically, it is possible to return a correct xt according to the current specification (e.g. by generating the corresponding definition on the fly), but it can be too burdensome.
Another minor problem is that it's not clear what the term represent means. According to the language of the standard, xt identifies some semantics.
Rationale
How to perform interpretation semantics
It should be clear from the specification how, having an nt, to perform the very behavior that a Forth system performs when the Forth text interpreter encounters, in interpretation state, the name of a word identified by this nt.
Namely, to perform this behavior, must the xt returned from name>interpret be executed only in interpretation state, or may it also be executed in compilation state?
System-defined semantics
If the standard does not define interpretation semantics for a word, a Forth system may provide system-defined interpretation semantics for the word (see A.3.4.3.2).
The same is true for execution semantics: if the standard does not define them for a word, a Forth system may provide system-defined execution semantics for the word. But, due to 6.1.0070, performing these execution semantics in interpretation state must always be equivalent to performing the interpretation semantics for the word (regardless of whether they are standard-defined or system-defined).
Solution
The specification for name>interpret can be adjusted to solve the mentioned problem.
There are two options:
- Allow returning 0 if the system cannot return an xt that identifies the interpretation semantics for the word identified by nt (see also the clarification re execution tokens).
  - Probably, in this case we have to introduce a word like name> (experimental in Forth-83), which returns an xt that identifies the execution semantics of the word identified by nt, since otherwise a user-defined Forth text interpreter is impossible without a correct find.
- Allow returning a state-dependent xt, which performs the interpretation semantics for the word in interpretation state only.
In this proposal I stick to the second option.
If anyone prefers the first option or has other objections, please feel free to share your ideas in a comment.
Proposal
Replace the following paragraph in the section 15.6.2.1909.20 NAME>INTERPRET:
xt represents the interpretation semantics of the word nt. If nt has no interpretation semantics, NAME>INTERPRET returns 0.
by the following paragraphs:
xt identifies the execution semantics of the word identified by nt. When this xt is executed in interpretation state, the interpretation semantics for the word are performed. If a Forth system does not provide such an execution token for the word, NAME>INTERPRET returns 0.
An ambiguous condition exists in any of the following cases:
- interpretation semantics for the word are not defined by this standard and xt is executed;
- execution semantics of the word are not defined by this standard and xt is executed in compilation state.
proposal - Tighten the specification of SYNONYM (version 1)
SwiftForth implemented SYNONYM starting in version 3.11.0 (23-Feb-2021) and updated with improvements based on a contribution from comp.lang.forth discussion sent to us by Anton Ertl in version 3.11.2 (22-Jun-2021). As of version 3.12.0 (21-Sep-2023), the same xt is returned by ' for both the original word and its synonym so their behaviors should be identical in all respects.
proposal - Tighten the specification of SYNONYM (version 1)
Author
Ruv
Change Log
- 2020-02-20 Initial comment for NAME>INTERPRET
- 2023-09-14 Make this proposal more formal
- 2023-09-19 Add rationale, better wording, fix some typos
- 2023-09-21 Add rationale re system-defined semantics, and declare an ambiguous condition
- 2023-09-22 Require xt if interpretation semantics for the word are defined by the standard
Problem
Currently the specification for name>interpret says that the returned "xt represents the interpretation semantics of the word nt".
But actually, in some cases a Forth system cannot provide an xt that performs the defined interpretation semantics for the corresponding word regardless of STATE.
In particular, this happens when a word like s" or to is implemented as a STATE-dependent immediate word. Technically, it is possible to return a correct xt according to the current specification (e.g. by generating the corresponding definition on the fly), but it can be too burdensome.
Another minor problem is that it's not clear what the term represent means. According to the language of the standard, xt identifies some semantics.
Rationale
How to perform interpretation semantics
It should be clear from the specification how, having an nt, to perform the very behavior that a Forth system performs when the Forth text interpreter encounters, in interpretation state, the name of a word identified by this nt.
Namely, to perform this behavior, must the xt returned from name>interpret
be executed only in interpretation state, or may it also be executed in compilation state?
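As a sketch of how a program would use the answer, the helper below performs the interpretation behavior for an nt under the assumption that the returned xt is only meaningful in interpretation state. The name interpret-name is hypothetical, and FIND-NAME is assumed to be available (it is provided by some systems, such as Gforth, but is not part of Forth-2012).

```forth
\ Hypothetical helper: perform the interpretation semantics of the word
\ identified by nt. Assumes STATE is 0 when it runs.
: interpret-name ( i*x nt -- j*x )
  name>interpret            \ -- xt | 0
  dup 0= -14 and throw      \ 0 means no xt: throw -14 (interpreting a compile-only word)
  execute ;

\ Usage, assuming FIND-NAME ( c-addr u -- nt | 0 ) is available
5 s" dup" find-name interpret-name  = .   \ prints -1
```

If the xt could also be executed in compilation state with the same effect, the assumption about STATE in this helper would be unnecessary.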
System-defined semantics
If the standard does not define interpretation semantics for a word, a Forth system may provide system-defined interpretation semantics for the word (see A.3.4.3.2).
The same is true for execution semantics: if the standard does not define them for a word, a Forth system may provide system-defined execution semantics for the word. But, due to 6.1.0070, performing these execution semantics in interpretation state must always be equivalent to performing the interpretation semantics for the word (regardless of whether they are standard-defined or system-defined).
Connection with Tick
If we want to allow ticking any word for which interpretation semantics are defined by the standard, name>interpret
cannot return 0 for these words.
Forth systems in which name>interpret
returns 0 for such a word are unknown to the author.
Solution
The specification for name>interpret
can be adjusted to solve the mentioned problem.
There are two options:
- Allow returning 0 if the system cannot return an xt that identifies the interpretation semantics for the word identified by nt (see also the clarification re execution tokens).
  - Probably, in this case we have to introduce a word like name> (experimental in Forth-83), which returns an xt that identifies the execution semantics of the word identified by nt, since otherwise a user-defined Forth text interpreter is impossible without a correct find.
- Allow returning a state-dependent xt, which performs the interpretation semantics for the word in interpretation state only.
In this proposal I stick to the second option.
If anyone prefers the first option or has other objections, please feel free to share your ideas in a comment.
Proposal
Replace the following paragraph in the section 15.6.2.1909.20 NAME>INTERPRET:
xt represents the interpretation semantics of the word nt. If nt has no interpretation semantics,
NAME>INTERPRET
returns 0.
by the following paragraphs:
xt identifies the execution semantics of the word identified by nt. When this xt is executed in interpretation state, the interpretation semantics for the word are performed.
If and only if interpretation semantics for the word are not defined by this standard and the Forth system does not provide the execution token for the word,
NAME>INTERPRET
returns 0.
An ambiguous condition exists in any of the following conditions:
- interpretation semantics for the word are not defined by this standard and xt is executed;
- execution semantics of the word are not defined by this standard and xt is executed in compilation state;
Probably, creating a synonym should not be allowed for a local variable (if it's too difficult to implement in some systems). Then, the corresponding ambiguous condition should be declared:
An ambiguous condition exists if name.old is a local variable.
If we can't assume that compile,
is equivalent to lit, ['] execute compile,
, then instead of the phrase:
xt is executed
we should use the phrase:
the execution semantics identified by xt are performed
I want to stick to the short wording, and assume this equivalency.
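The assumed equivalence can be illustrated with a small sketch. All names defined here are hypothetical, and lit, is written as POSTPONE LITERAL, which is an assumption about what lit, stands for.

```forth
\ Hypothetical helper: compile xt indirectly, as  lit, ['] execute compile,
\ ( lit, is written here as POSTPONE LITERAL )
: compile-via-execute ( xt -- )
  postpone literal  ['] execute compile, ;

\ Two immediate words that compile DUP, directly and indirectly
: [dup-direct]   ( -- ) ['] dup compile, ; immediate
: [dup-indirect] ( -- ) ['] dup compile-via-execute ; immediate

: twice-a ( x -- x x ) [dup-direct] ;
: twice-b ( x -- x x ) [dup-indirect] ;

7 twice-a +  7 twice-b +  = .   \ prints -1
```

Both definitions behave the same at run time, which is the sense in which "xt is executed" can stand for "the execution semantics identified by xt are performed".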
Change Log
- 2020-09-23 Initial version
- 2020-09-23 Factor out Problem section, add a clause re the same data-field address, better wording, better formatting, fix typos
Problem
Some additional problems
According to 2.2.3 Parsed-text notation, "name" has the specified meaning when it's mentioned literally as "name" only. So, newname and oldname are not normative.
- We can use indexing as for data type symbols in stack diagrams, and when an index is not a number, it can be concatenated via a dot.
The phrase "For both strings skip leading space delimiters" is unsound, since the operation to "skip leading space delimiters" is not defined for strings, but for the input buffer only (see 3.4.1.1 Delimiters).
Rationale
Instead of mentioning SYNONYM
in many other glossary entries, it's better to use another wording, which does not require that.
In some cases, it's better to specify a glossary entry in a more general way, rather than mentioning SYNONYM
in it (see the proposal below).
If the standard does not define some semantics, we should ensure that system-defined semantics are the same for synonyms.
If the standard defines some semantics, system-defined semantics shall be equivalent to them, so it's enough to mention system-defined semantics.
In some plausible Forth system implementations the execution token for a word
is identical to the name token for this word,
and the execution tokens for the different words are always different. So
SYNONYM
should not require the same execution token (if any) for name.new and name.old.
The part about deferred words doesn't seem formal enough.
Examples
A synonym of the word CREATE
creates a word with a data field address, because it has the same execution semantics as CREATE
:
synonym mycreate create
mycreate foo 123 ,
A synonym of a word, which is created via performing the execution semantics of CREATE
, returns the same data field address due to the same execution semantics as the original word:
synonym bar foo
bar foo = . \ prints -1
Consequently, the execution token for a synonym is associated with the same data field address (if any) as the xt of the original word:
' bar >body ' foo >body = . \ prints -1
It is also true for a synonym of a synonym:
synonym baz bar
' baz >body ' bar >body = . \ prints -1
A synonym of a word created via performing the execution semantics of VALUE
returns the same value due to the same execution semantics as the original word:
1 value x
synonym y x
x y = . \ prints -1
Applying TO
to a synonym of a word, which is created via performing the execution semantics of VALUE
, changes the value assigned to the original word due to the same "TO name Run-time" semantics:
2 to y
x y = . \ prints -1
Proposal (draft)
Re data-field address
In
6.1.0550 >BODY
,
6.1.1250 DOES>
,
Replace the phrase:
defined via
CREATE
with the phrase:
defined via performing
CREATE
execution semantics
Replace the phrase:
defined with
CREATE
or a user-defined word that calls CREATE
.
with the phrase:
defined via performing
CREATE
execution semantics
Re deferred words
In
6.2.1725 IS
,
6.2.0698 ACTION-OF
,
6.2.1177 DEFER@
,
6.2.1175 DEFER!
,
Replace the phrase:
defined by
DEFER
with the phrase:
defined via performing
DEFER
execution semantics
Re SYNONYM
Replace the whole normative part of the glossary entry 15.6.2.2264 SYNONYM
by the following paragraphs:
( "<spaces>name.new" "<spaces>name.old" -- )
Skip leading space delimiters, parse name.new delimited by a space, skip leading space delimiters, parse name.old delimited by a space.
Find name.old. Create a definition for name.new with the semantics defined below.
name.new and name.old may be identical. The execution token for name.new may differ from the execution token for name.old.
An ambiguous condition exists in any of the following conditions:
- name.old is not found;
- name.old is for a local variable;
- IMMEDIATE is performed when the definition for name.new is the most recent definition.
All the ambiguous conditions that exist when using name.old, also exist when using name.new.
If execution semantics are defined by the system for name.old:
name.new Execution: Perform the execution semantics of name.old.
If interpretation semantics are defined by the system for name.old:
name.new Interpretation: Perform the interpretation semantics for name.old.
If compilation semantics are defined by the system for name.old:
name.new Compilation: Perform the compilation semantics for name.old.
If "TO name.old Run-time" semantics are defined by the system for name.old:
TO name.new Run-time: Perform "TO name.old Run-time" semantics.
If the data-field address is associated by the system with the xt for name.old:
The data-field address associated with the xt for name.new is the same as the data-field address associated with the xt for name.old.
If name.old is defined via performing the execution semantics of DEFER
, or is a synonym of such a word:
- "IS name.new": perform "IS name.old"
- "ACTION-OF name.new": perform "ACTION-OF name.old"
- Applying DEFER@ to the xt of name.new: perform DEFER@ to the xt of name.old
- Applying DEFER! to the xt of name.new: perform DEFER! to the xt of name.old
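The clauses about deferred words could be exercised as in the sketch below. The names greet, hello and hi are hypothetical, and the results in the comments are what the proposed wording would require, not a claim about any existing system.

```forth
defer greet
synonym hello greet

: hi ( -- ) ." hi" ;

' hi is hello                 \ proposed: equivalent to  ' hi is greet
action-of greet ' hi = .      \ -1 on a system that follows the proposed wording
' hello defer@ ' hi = .       \ likewise -1 under the proposed wording
```

Note that the sketch never compares the xt of hello with the xt of greet, since the proposal allows them to differ.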