Digest #233 2023-09-23

Contributions

[309] 2023-09-22 17:06:27 ruv wrote:

comment - A better approach for SYNONYM wording

Rationale

Instead of mentioning SYNONYM in many other glossary entries, it's better to use another wording, which does not require that.

In some cases, it's better to specify a glossary entry in a more general manner, than mention SYNONYM in it (see the proposal below).

According to 2.2.3 Parsed-text notation, "name" has the specified meaning only when it's mentioned literally as "name". So, newname and oldname are not normative. Probably, we can use indexing as for data type symbols in stack diagrams. When an index is not a number, it is concatenated via a dot.

We should ensure that system-defined semantics are the same for synonyms.

If the standard defines some semantics, system-defined semantics shall be equivalent to them, so it's enough to mention system-defined semantics.

In some plausible Forth system implementations the execution token for a word is identical to the name token for this word, and the execution tokens for the different words are always different. So SYNONYM should not require the same execution token (if any) for name.new and name.old.

The part about deferred words is not formal enough.

Examples

A synonym of the word CREATE creates a word with the data field address due to the same execution semantics as of CREATE:

synonym mycreate create
mycreate foo 123 ,

A synonym of a word created via performing the execution semantics of CREATE returns the same data field address due to the same execution semantics as the original word:

synonym bar foo
bar foo = . \ prints -1

Consequently, the execution token for a synonym is associated with the same data field address (if any) as the xt of the original word:

' bar >body ' foo >body = . \ prints -1

A synonym of a word created via performing the execution semantics of VALUE returns the same value due to the same execution semantics as the original word:

1 value x
synonym y x
x y = . \ prints -1

Applying TO to a synonym of a word created via performing the execution semantics of VALUE changes the value assigned to the original word due to the same TO name Run-time semantics:

2 to y
x y = . \ prints -1

Proposal (draft)

Re data-field address

In 6.1.0550 >BODY, 6.1.1250 DOES>,

Replace the phrase:

defined via CREATE

by the phrase:

defined via performing CREATE execution semantics

Replace the phrase:

defined with CREATE or a user-defined word that calls CREATE.

by the phrase:

defined via performing CREATE execution semantics

Re deferred words

In 6.2.1725 IS, 6.2.0698 ACTION-OF, 6.2.1177 DEFER@, 6.2.1175 DEFER!,

Replace the phrase:

defined by DEFER

by the phrase:

defined via performing DEFER execution semantics

Re SYNONYM

Replace the whole normative part of the glossary entry 15.6.2.2264 SYNONYM by the following paragraphs:

( "<spaces>name.new" "<spaces>name.old" -- )

Skip leading space delimiters, parse name.new delimited by a space, skip leading space delimiters, parse name.old delimited by a space.

Find name.old. Create a definition for name.new with the semantics defined below; name.new may be the same as name.old.

The execution token for name.new may differ from the execution token for name.old.

An ambiguous condition exists if name.old is not found or IMMEDIATE is performed when name.new is the most recent definition.

All the ambiguous conditions that exist when using name.old, also exist when using name.new.

If execution semantics are defined by the system for name.old:

name.new Execution: Perform the execution semantics of name.old.

If interpretation semantics are defined by the system for name.old:

name.new Interpretation: Perform the interpretation semantics for name.old.

If compilation semantics are defined by the system for name.old:

name.new Compilation: Perform the compilation semantics for name.old.

If "TO name.old Run-time" semantics are defined for name.old:

TO name.new Run-time: Perform TO name.old Run-time semantics.

If name.old is defined via performing the execution semantics of DEFER:

IS name.new

performs IS name.old

ACTION-OF name.new

performs ACTION-OF name.old

Applying DEFER@ to the xt of name.new

performs DEFER@ to the xt of name.old

Applying DEFER! to the xt of name.new

performs DEFER! to the xt of name.old
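
The deferred-word clauses above can be illustrated with a short sketch. It assumes a system that implements SYNONYM as proposed here; the names greet, hello, hi and bye are hypothetical, and the results in the comments are what the proposed wording requires rather than output observed on a particular system.

```forth
defer greet
synonym hello greet          \ hello is name.new, greet is name.old

: hi  ( -- ) ." hi" ;        \ hypothetical actions
: bye ( -- ) ." bye" ;

' hi is hello                \ proposed: same effect as  ' hi is greet
action-of greet ' hi = .     \ prints -1 under the proposed wording

' bye ' hello defer!         \ proposed: same effect as  ' bye ' greet defer!
' greet defer@ ' bye = .     \ prints -1 under the proposed wording
```

Note that the sketch never compares ' hello with ' greet, since the proposal allows the two execution tokens to differ.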

Replies

[r1083] 2023-09-15 17:58:21 Klaus_Schleisiek replies:

proposal - Input values other than true and false

This code is meant as a reference implementation for bracket_IF and friends:

\ ----------------------------------------------------------------------
\ @file : bracket_IF.fs
\ ----------------------------------------------------------------------
\ Last change: KS 15.09.2023 19:45:47
\ @author: Klaus Schleisiek
\ @copyright: public domain
\ Traditionally, string comparison has been used to process [IF].
\ This version uses FIND instead.
\ Multiline comment \* ...
\ ... *\ has been added, because it is trivial.
\ Conditional clauses may be commented out using (, \, or \*
\ ----------------------------------------------------------------------

: ?EXIT ( flag -- ) postpone IF postpone EXIT postpone THEN ; immediate
: case? ( n1 n2 -- n1 ff | tf ) over = dup IF nip THEN ;

Defer [ELSE] ( -- ) immediate

: [IF] ( flag -- ) ?EXIT postpone [ELSE] ; immediate
: [THEN] ( -- ) ; immediate
: [NOTIF] ( flag -- ) 0= postpone [IF] ; immediate
: [IFDEF] ( <name> -- ) postpone [DEFINED] postpone [IF] ; immediate
: [IFUNDEF] ( <name> -- ) postpone [DEFINED] postpone [NOTIF] ; immediate

\ ----------------------------------------------------------------------
\ NEXT-WORD returns the xt of a word in the search order.
\ Words, which are not found, will be skipped.
\ 0 will be returned when the end of file is reached.
\ ----------------------------------------------------------------------
: next-word ( -- xt | 0 )
   BEGIN BEGIN BL word dup c@
         WHILE find ?EXIT drop
         REPEAT drop refill 0=
   UNTIL 0
;
: *\ ( -- ) ; immediate \ end of multi-line comment
: \* ( -- ) BEGIN next-word dup 0= swap ['] *\ = or UNTIL ; immediate

Variable Nestlevel 0 Nestlevel ! \ nesting level counter

: nest ( -- ) 1 Nestlevel +! ;
: unnest ( -- ) Nestlevel @ 1 - 0 max Nestlevel ! ; \ don't decrement below zero

: [if]-decode ( xt -- flag )
   ['] [IF] case? IF nest false EXIT THEN
   ['] [NOTIF] case? IF nest false EXIT THEN
   ['] [IFDEF] case? IF nest false EXIT THEN
   ['] [IFUNDEF] case? IF nest false EXIT THEN
   ['] [ELSE] case? IF Nestlevel @ 0= EXIT THEN
   ['] [THEN] case? IF Nestlevel @ 0= unnest EXIT THEN
   ['] \ case? IF postpone \ false EXIT THEN \ needed to be able to e.g. comment out [THEN]
   ['] ( case? IF postpone ( false EXIT THEN \ needed to be able to e.g. comment out [THEN]
   ['] \* case? IF postpone \* false EXIT THEN \ needed to be able to e.g. comment out [THEN]
   0= abort" [THEN] missing" \ end-of-file reached?
   \ all other xt's are ignored
   false
;
:noname ( -- ) BEGIN next-word [if]-decode UNTIL ; IS [ELSE]


[r1085] 2023-09-16 01:08:02 ruv replies:

proposal - minimalistic core API for recognizers

An example of using translators for two different purposes:

\ use "tt-lit" and "tt-2lit" just to call these token translators:

: tt-3lit ( 3*x -- 3*x | )
  >r tt-2lit  r> tt-lit
;

: recognize-forth-lexeme ( sd -- i*x tt ) forth-recognizer execute ;


\ use "tt-xt" to analyze a token type:

: recognize-tick ( sd -- xt tt.xt | 0 )
  "'" match-head 0= if 2drop 0 exit then  ( sd2 ) \ the input lexeme without the leading tick
  ['] recognize-forth-lexeme execute-balance2 ( i*x tt|0 n.data-stack n.float-stack )
  2>r dup ['] tt-xt = if 2rdrop exit then drop 2r> fndrop ndrop 0
;

In this implementation of recognize-tick (not tested), the phrase 'foo::bar::baz will work correctly and return the xt of the word baz in the wordlist bar in the wordlist foo, provided that recognize-pqname for the syntax "::" (an example) is part of forth-recognizer.

To implement this, we make a nested call of the Forth recognizer for another lexeme and then analyze the returned type. If the returned type is not appropriate, we drop the token (from the data stack and, if applicable, from the floating-point stack). So we need to be sure that calling recognize-forth-lexeme never causes any side effect (other than on the stacks), even when recognition succeeds.

NB: when recognize-tick is part of the current forth-recognizer, executing recognize-forth-lexeme on some inputs will produce an indirectly recursive call of recognize-forth-lexeme (as intended).


[r1086] 2023-09-16 02:12:54 ruv replies:

proposal - minimalistic core API for recognizers

Tough question: The string recognizer has a side effect, which is not good. Moving that side effect to the translator is causing other problems, because TRANSLATE-STRING no longer has the corresponding string on the stack, but needs parsing it later.

  1. It's perfectly acceptable for a translator to parse the input buffer and/or read the input stream. Some token translators will even make nested calls of the Forth text interpreter and can throw exceptions.

  2. The problem that part of the string may still be in the input buffer (or even in the input stream) is solved by introducing two translators for strings: one accepts the full string from the stack (e.g. tt-slit), and the other (e.g. tt-slit-parsing) accepts the starting part from the stack and the tail from the input buffer (or input stream). The string recognizer returns one or the other depending on whether a lexeme is a complete string or only the start of a string.

I published a reference implementation in 2019, and now updated it for the current proposal.

A string recognizer can be as follows:

: quot ( -- sd.quot ) s\" \"" ;

: recognize-string ( sd.lexeme -- sd tt.slit|tt.slit-parsing | 0 )
  quot match-head 0= if 2drop 0 exit then quot match-tail if ['] tt-slit exit then
  2dup quot contains if 2drop 0 exit then \ fail if '"' is found in the middle of the string
  ['] tt-slit-parsing
;
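
The recognizer above relies on the helper words match-head, match-tail and contains, which are not defined in this message. A minimal sketch of how they could be written with standard words (COMPARE, SEARCH and /STRING from the String word set, PICK from Core extensions) is given below. The stack effects are inferred from how the words are used above and are an assumption; the reference implementation may define them differently.

```forth
\ Assumed stack effects, inferred from the usage in recognize-string.

\ If the string starts with sd.head, remove the head and return true.
: match-head ( c-addr u c-addr.head u.head -- c-addr2 u2 true | c-addr u false )
  2over 2over 2swap 2 pick min compare 0=
  if nip /string true else 2drop false then ;

\ If the string ends with sd.tail, remove the tail and return true.
: match-tail ( c-addr u c-addr.tail u.tail -- c-addr u2 true | c-addr u false )
  2 pick over u< if 2drop false exit then   \ string is shorter than the tail
  dup >r                                    ( c-addr u c-addr.tail u.tail ) ( R: u.tail )
  2over dup r@ - /string                    \ take the last u.tail characters
  compare 0=                                ( c-addr u flag )
  r> swap if - true else drop false then ;

\ True if sd.sub occurs anywhere in the string.
: contains ( c-addr u c-addr.sub u.sub -- flag )
  search nip nip ;
```

With these definitions, a lexeme such as "abc" loses its leading quote in match-head and its trailing quote in match-tail, so tt-slit receives the bare string abc.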

[r1087] 2023-09-16 21:32:13 BerndPaysan replies:

proposal - minimalistic core API for recognizers

The code which I have simply looks like this:

['] translate-string of  json-string!           endof

Returning something half-done isn't a good idea and makes maintaining this code difficult, as all other possible results (like ints, floats or such) are fully converted into something useful at this stage.

Thinking a bit more about that, I found:

  1. The thing you want to nest is the translator for names
  2. As I said, names should be first, numbers second and the rest third
  3. We have nestable recognizer sequences

So one solution would be to put all recognizers that return nts+translate-nt (or variants of that, e.g. locals have a variant of translate-nt that differs for postpone) in one recognizer stack, which has a name, and can be called without calling the entire recognizer stack. These recognizers have now a predictable effect, and no side effect. Since you can't tick locals, you still have to check for translate-nt, but that's ok. You don't have to go through all weird other recognizers.

In Gforth, .recognizers now can handle and display nested recognizers, and if you split this up like that, it would output:

.recognizers  ~names ( ~nt ( Forth Forth Root ) ~scope ) ~numbers ( ~num ~float ) ~others ( ~string ~to ~dtick ~tick ~body ~complex ~env ~meta )

The ~ is there to abbreviate recognize- (or rec- now).

This also makes it easier to add recognizers where they belong, e.g. when you add the scope recognizer, you just push it to the end of the names recognizer stack. If you add the floating point recognizer, the complex recognizer (both are numbers), or the hex floating point recognizer for exact notation of floating point constants, you just push them to the back of the numbers recognizer stack, and they get ahead of the others. I like this solution.

The other solution is what Gforth does: There's a ?REC-NT which does the nesting, the checking for translate-nt, and the cleaning up of the side effects (stacks and >IN). There is the possibility to make this more generic, e.g. create a word TRY-RECOGNIZE which gets an xt, passes that to the result, and if that returns false, everything is cleaned up and false is returned, otherwise whatever that xt left (including the flag) is returned.

The cleaning up is already cumbersome, because a variable number of values can be returned on both data and floating point stack, and when in addition to that also >IN can change, it's just a little bit more hassle.


[r1088] 2023-09-17 16:31:14 ruv replies:

proposal - minimalistic core API for recognizers

One correction. I wrote:

If we want to reflect this idea, we can use the acronym tt, which stands for both: token translator and token type.

It should be read as:

If we want to reflect this idea, we can use the acronym tt, which stands for both: "translate token" (verb) and "token type" (noun).

Data type symbol

To specify formal requirements, we have to introduce a new data type for token translators, which is a subtype of xt. And the abbreviation tt is a good candidate for this data type symbol.

If we have the data type tt => xt|0, and the symbol sd for the string data type, the naming convention along with the stack diagram for a recognizer can be expressed as:

RECOGNIZE-{lexeme-type-symbol} ( sd.lexeme -- i*x tt ) ( F: -- j*r )


[r1089] 2023-09-17 23:29:13 ruv replies:

proposal - minimalistic core API for recognizers

@BerndPaysan writes:

Returning something half-done isn't a good idea and makes maintaining this code difficult, as all other possible results (like ints, floats or such) are fully converted into something useful at this stage.

This would be a valid argument if it were possible to return something useful from a recognizer in all use cases but single-line string literals. But it's impossible.

For example, a recognizer for multi-line string literals cannot parse the full string literal without refilling the source (see my PoC implementation). Should we also restore the input source state to isolate side effects of recognizers?

And it still isn't enough. A recognizer for curly-based markup like foo{ any forth code bar{ nested code }bar ... }foo cannot return something useful, since a useful thing in this case is a created definition or just a side effect of appending some semantics to the current definition. Should we also restore the state of the dictionary?

I think, it's obvious — isolation of all possible side effects of recognizers is not fruitful.

Yes, some recognizers return objects that are not useful by themselves, but they still return information about what a given lexeme means, and it's an acceptable price for the absence of side effects in all recognizers.

Also, we separate concerns into things that do have side effects (token translators) and things that don't have side effects (recognizers). It's a very useful separation.


[r1090] 2023-09-17 23:52:08 ruv replies:

proposal - minimalistic core API for recognizers

@BerndPaysan writes:

The code which I have simply looks like this:

['] translate-string of  json-string!           endof

A straightforward solution is to handle each token type of string literals separately. Probably, I would write it as follows:

  'tt-slit           of                  json-string!    endof
  'tt-slit-parsing   of  parse-slit-end  json-string!    endof
  'tt-slit-ml        of  parse-slit-ml   json-string!    endof

(I would use a recognizer for a leading tick, and naming of translators in the form tt-{token-type-symbol})

Or I would factor a helper word as follows:

: ?prepare-tt-slit ( i*x tt -- i*x tt | sd.transient tt.slit )
  case
    'tt-slit           of                  'tt-slit endof
    'tt-slit-parsing   of  parse-slit-end  'tt-slit endof
    'tt-slit-ml        of  parse-slit-ml   'tt-slit endof
  endcase
;

: eval-json ( .. tag -- )
  ?prepare-tt-slit case
    ...
    'tt-slit           of                  json-string!    endof
   ...
  endcase
;

[r1091] 2023-09-18 01:21:10 ruv replies:

proposal - minimalistic core API for recognizers

Multiple entry points for the Forth recognizer

@BerndPaysan writes:

This also makes it easier to add recognizers where they belong, e.g. when you add the scope recognizer, you just push them to the end of the names recognizer stack. If you add the floating point recognizer, the complex recognizer (both are numbers), or the hex floating point recognizer for exact notation of floating point constants, you just push them to the back of the numbers recognizer stack, and they get ahead of the others. I like this solution.

Yes, I also consider such a solution. It's a convenient solution to implement the default Forth recognizer.

But requiring the Forth recognizer to always conform to this particular structure of recognizer sequences, and even to always be the same instance of this structure, is too restrictive.

And otherwise you don't know the id of the actual names recognizer sequence (and even don't know whether such a sequence exists), and so you cannot check a lexeme against only this sequence (I mean, in implementation of recognize-tick).

Filter recognizer results

Bernd, your word try-recognize is a good factor for filtering results, regardless of side effects (beyond stacks). With recognizers that have no side effects, it can also be implemented in a portable way.

If this word filters for a single token type, it's better to pass a corresponding tt directly (instead of xt.filter).

If this word allows filtering for multiple token types (I assume this variant), it should not drop tt from the stack.

Also, to be more useful, this word should not be bound to the current Forth recognizer only. Then, this word can be called as

apply-recognizer-filter ( sd.lexeme xt.recognizer xt.filter -- i*x tt | 0 )

A usage example:

: recognize-forth-name ( sd.lexeme -- nt tt.nt | 0 )
  forth-recognizer [: dup 'tt-nt = ;] apply-recognizer-filter
;

: find-forth-name ( sd.lexeme -- nt | 0 )
  forth-recognizer [: dup 'tt-nt = ;] apply-recognizer-filter  if exit then 0
;

: find-forth-name? ( sd.lexeme -- nt true | false )
  forth-recognizer [: dup 'tt-nt = ;] apply-recognizer-filter  0<>
;

: recognize-tick ( sd.lexeme -- xt tt.xt | 0 )
  "'" match-head 0= if 2drop 0 exit then  ( sd2 ) \ the input lexeme without the leading tick
  forth-recognizer [: dup 'tt-xt = ;] apply-recognizer-filter
;
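
For illustration, apply-recognizer-filter itself could be sketched in standard Forth as follows, assuming that recognizers have no side effects other than on the data and floating-point stacks. The stack effect of the filter, ( tt -- tt flag ), is taken from the quotations in the usage example; restoring the stacks via DEPTH and FDEPTH is an assumption of this sketch, not part of the proposal.

```forth
\ A sketch only: discard everything the recognizer returned
\ when the filter rejects the token type.
: apply-recognizer-filter ( c-addr u xt.rec xt.filter -- i*x tt | 0 )
  >r >r                         ( c-addr u ) ( R: xt.filter xt.rec )
  depth 2 - fdepth              ( c-addr u d0 f0 ) \ stack depths without the lexeme
  r> -rot 2>r                   ( c-addr u xt.rec ) ( R: xt.filter d0 f0 )
  execute                       ( i*x tt | 0 )
  dup 0= if 2r> 2drop r> drop exit then          \ not recognized
  2r> r> -rot 2>r               ( i*x tt xt.filter ) ( R: d0 f0 )
  execute                       ( i*x tt flag )
  if 2r> 2drop exit then                         \ accepted: keep i*x tt
  drop 2r>                      ( i*x d0 f0 )    \ rejected: drop tt
  begin fdepth over > while fdrop repeat drop    \ restore the float stack
  >r begin depth r@ > while drop repeat r> drop  \ restore the data stack
  0 ;
```

The sketch cannot undo an item that a recognizer consumed from below the lexeme, which is one more reason to require well-behaved recognizers.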

The cleaning up is already cumbersome, because a variable number of values can be returned on both data and floating point stack, and when in addition to that also >IN can change, it's just a little bit more hassle.

Yes, but, as I show, >in is not enough. Also, it's better to avoid such special cases in general.


[r1092] 2023-09-18 06:25:26 AntonErtl replies:

proposal - Exclude zero from the data types that are identifiers

In the 2022 meeting, there was lots of discussion, especially about the validity of file-ids with the value 0. When asked, none of the participants could name a system that has a problem with disallowing 0 as address, and none claimed that he had never used 0 as impossible address.

This proposal satisfies the formality criteria and is therefore promoted to formal. Please promote it to CfV when you think that you are not going to change it anymore (proposals in CfV state must not be revised).


[r1093] 2023-09-18 18:31:44 AntonErtl replies:

proposal - Better wording for "Glossary notation"

This wording change has been accepted with vote #31 10Y:0:1A (started at the 2022 meeting).


[r1094] 2023-09-18 18:41:57 AntonErtl replies:

proposal - Tick and undefined execution semantics

This proposal satisfies the formality criteria and is therefore promoted to formal. Please promote it to CfV when you think that you are not going to change it anymore (proposals in CfV state must not be revised).


[r1095] 2023-09-19 05:42:36 AntonErtl replies:

proposal - Better wording for "data field" term

This is a wording change, not a substantive change, so I don't think a CfV makes much sense. The committee voted (#32) in a vote starting at 2022-09-17, and the result is 8Y:1N:1A. This is usually considered to be enough for a consensus. Unfortunately, we did not discuss this vote at the 2023 meeting, so should we move it to Accepted, or should we discuss it at the next meeting? If the latter, please put it on the Agenda.


[r1096] 2023-09-19 05:46:07 AntonErtl replies:

proposal - Formatting: spaces in data type symbols

This was accepted with vote #30 10Y:0:1A after the 2022 meeting.


[r1097] 2023-09-19 06:01:13 AntonErtl replies:

proposal - Executing compilation semantics

This was accepted with vote #29 10Y:0N:1A after the 2022 meeting.


[r1098] 2023-09-19 15:06:17 ruv replies:

proposal - Clarification for execution token

Author

Ruv

Change Log

(the latest at the top)

  • 2022-09-19 explicitly allow a short formula, describe what it means, better wording, fix some typos
  • 2022-08-13 Initial version

Preceding history

(the latest at the top)

Problem

By the definition of the term "execution token" in Forth-94 and Forth-2012, it's a value that identifies execution semantics. Can such a value identify other behavior, e.g. some interpretation semantics or compilation semantics? It's unclear at first glance.

Solution

Actually, an execution token can identify other semantics too, but only if they are equivalent to the execution semantics that this token also identifies.

It is so because for any execution token there exists at least one named or unnamed Forth definition the execution semantics of which are identified by this execution token. So, in any case, an execution token always identifies some execution semantics, but these semantics can happen to be equivalent to some interpretation semantics, or some compilation semantics, and then it identifies them too. They need not be connected to the same Forth definition. Also, consequently, it's impossible for an execution token to identify some compilation semantics, or some interpretation semantics, without identifying the equivalent execution semantics.

To solve the initial problem we can state these basics explicitly in a normative part.

Example

: foo postpone if ;
:noname postpone if ; ( xt )

The execution semantics of foo are equivalent to the compilation semantics for if.

At the same time, a Forth system may provide system-dependent execution semantics of if that are not equivalent to the execution semantics of foo.

xt, which is left on the stack, identifies the execution semantics of an anonymous Forth definition, and these execution semantics are equivalent to the compilation semantics for if.

Typical use

  • "xt identifies the compilation semantics for the word FOO"

    • It means that the execution token xt identifies the execution semantics which are equivalent to the compilation semantics for the word FOO.
    • At the same time, the execution semantics of the word FOO can be different from the execution semantics identified by this xt.
  • "the execution token for the word BAR"

    • It means that this execution token identifies the execution semantics of the word BAR.
  • The execution semantics identified by xt are equivalent to the interpretation semantics for the word BAZ.

    • This seems pretty clear.

Wrong use

Actually, the standard contains only one place where the "execution token" notion is used ambiguously in a normative part: the glossary entry for FIND. It says that FIND returns the execution token for a word, but in some cases in dual-xt systems this token cannot identify the execution semantics of this word.

In another glossary entry, for NAME>INTERPRET, the language is just slightly non-normative, since it uses the form "xt represents" instead of the form "xt identifies".

These entries have not only this but also some other problems, so they should be corrected anyway, and my proposals for them are in progress.

Proposal

Add the following paragraphs to the beginning of the section 3.1.3.5 Execution tokens:

For any execution token there exists at least one Forth definition (named or unnamed) the execution semantics of which are identified by this execution token.

The execution semantics identified by an execution token can be equivalent to the interpretation semantics or compilation semantics for some word, or to some run-time semantics. In such a case this execution token also identifies the corresponding interpretation semantics, compilation semantics, or run-time semantics.

Unless otherwise indicated, the execution token for a named Forth definition identifies the execution semantics for this definition.


[r1099] 2023-09-19 15:49:52 ruv replies:

proposal - NAME>INTERPRET wording

Author

Ruv

Change Log

  • 2020-02-20 Initial comment for NAME>INTERPRET
  • 2023-09-14 Make this proposal more formal
  • 2023-09-19 Add rationale, better wording, fix some typos

Problem

Currently the specification for name>interpret says that returned "xt represents the interpretation semantics of the word nt".

But actually, in some cases a Forth system cannot provide an xt that performs the defined interpretation semantics for the corresponding word regardless of STATE.

This happens, in particular, when a word like s" or to is implemented as a STATE-dependent immediate word. Technically, it is possible to return a correct xt according to the current specification (e.g. via generation of the corresponding definition on the fly), but it can be too burdensome.

Another minor problem is that it's not clear what the term represent means. According to the language of the standard, xt identifies some semantics.

Rationale

It should be clear from the specification how, having an nt, to perform the very behavior that a Forth system performs when the Forth text interpreter encounters, in interpretation state, the name of a word identified by this nt.

Namely, to perform this behavior, must the xt returned from name>interpret be executed only in interpretation state, or may it also be executed in compilation state?

Solution

The specification for name>interpret can be adjusted to solve the mentioned problem. There are two options:

  1. Allow returning 0 if the system cannot return an xt that identifies the interpretation semantics for the word identified by nt (see also the clarification re execution tokens).

    • Probably, in this case we have to introduce a word like name> (experimental in Forth-83), which returns an xt that identifies the execution semantics of the word identified by nt, since otherwise a user-defined Forth text interpreter is impossible without a correct find.
  2. Allow returning a state-dependent xt, which performs the interpretation semantics for the word in interpretation state only.

In this proposal I stick with the second option.

If anyone prefers the first option or has other objections, please feel free to share your ideas in a comment.

Proposal

Replace the following phrase in the section 15.6.2.1909.20 NAME>INTERPRET:

xt represents the interpretation semantics of the word nt. If nt has no interpretation semantics, NAME>INTERPRET returns 0.

by the following phrase:

xt identifies the execution semantics of the word identified by nt. When this xt is executed in interpretation state, the interpretation semantics for the word are performed. If a Forth system does not provide such execution token for the word, NAME>INTERPRET returns 0.
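
Under this wording, a program that wants to perform the interpretation semantics of a word given its nt has to execute the returned xt in interpretation state. A minimal sketch of such a helper follows; the names execute-interpreting and perform-interpretation are hypothetical, and throw code -14 ("interpreting a compile-only word") is used here only as a plausible choice for the zero case.

```forth
\ Execute xt in interpretation state, then restore the previous state.
: execute-interpreting ( i*x xt -- j*x )
  state @ >r  postpone [      \ remember STATE, enter interpretation state
  catch                       \ run xt, trapping any exception
  r> if ] then                \ re-enter compilation state if it was active
  throw ;                     \ re-raise the exception, if any

\ Perform the interpretation semantics for the word identified by nt.
: perform-interpretation ( i*x nt -- j*x )
  name>interpret dup 0= if -14 throw then
  execute-interpreting ;
```

STATE is changed only via [ and ], since a standard program may not alter its contents directly.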


[r1100] 2023-09-19 20:52:38 LeonWagner replies:

proposal - 2C! and 2C@

The original contributor states that 2C@ and 2C! are for 16-bit access and this is not quite correct. They are for pairs of bytes in memory (or registers on some embedded systems). FORTH, Inc. has supplied 2C@ and 2C! for at least the past 30 years with our cross compilers for embedded systems. I just did a quick search of about 20 embedded Forth applications I have here on my computer and I found 180 instances of 2C@ and 132 instances of 2C!. Here's a simple example:

2 BUFFER: LBAT   \ Current and previous battery status

\ Update battery status, keeping previous
: !LBAT ( x -- )   LBAT C@ SWAP LBAT 2C! ;   

\ Return true if battery status has changed
: ?LBAT ( -- flag )   LBAT 2C@ <> ;   

However, 2C@ and 2C! are easily built from Standard words, so we've never proposed they be standardized.
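
As a rough illustration of that last point, the two words might be defined as below. This is a sketch and not FORTH, Inc.'s implementation: it assumes that one character occupies one byte and that the order mirrors 2@ and 2!, i.e. the top stack item lives at the lower address, which is the ordering under which the !LBAT example above keeps the current status in the first byte.

```forth
\ Assumed order, by analogy with 2@ and 2!:
\ c2 is at c-addr, c1 is at the next character address.
: 2C@ ( c-addr -- c1 c2 )  dup char+ c@ swap c@ ;
: 2C! ( c1 c2 c-addr -- )  swap over c! char+ c! ;

create pair 2 chars allot   \ hypothetical two-byte buffer
1 2 pair 2c!
pair c@ .                   \ prints 2
pair char+ c@ .             \ prints 1
pair 2c@ . .                \ prints 2 1
```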


[r1101] 2023-09-21 14:15:59 ruv replies:

proposal - NAME>INTERPRET wording

Author

Ruv

Change Log

  • 2020-02-20 Initial comment for NAME>INTERPRET
  • 2023-09-14 Make this proposal more formal
  • 2023-09-19 Add rationale, better wording, fix some typos
  • 2023-09-21 Add rationale re system-defined semantics, and declare an ambiguous condition

Problem

Currently the specification for name>interpret says that returned "xt represents the interpretation semantics of the word nt".

But actually, in some cases a Forth system cannot provide an xt that performs the defined interpretation semantics for the corresponding word regardless of STATE.

This happens, in particular, when a word like s" or to is implemented as a STATE-dependent immediate word. Technically, it is possible to return a correct xt according to the current specification (e.g. via generation of the corresponding definition on the fly), but it can be too burdensome.

Another minor problem is that it's not clear what the term represent means. According to the language of the standard, xt identifies some semantics.

Rationale

How to perform interpretation semantics

It should be clear from the specification how, having an nt, to perform the very behavior that a Forth system performs when the Forth text interpreter encounters, in interpretation state, the name of a word identified by this nt.

Namely, to perform this behavior, must the xt returned from name>interpret be executed only in interpretation state, or may it also be executed in compilation state?

System-defined semantics

If the standard does not define interpretation semantics for a word, a Forth system may provide system-defined interpretation semantics for the word (see A.3.4.3.2).

The same is true for execution semantics: if the standard does not define them for a word, a Forth system may provide system-defined execution semantics for the word. But, due to 6.1.0070, performing these execution semantics in interpretation state must always be equivalent to performing interpretation semantics for the word (regardless of whether they are standard-defined or system-defined).

Solution

The specification for name>interpret can be adjusted to solve the mentioned problem. There are two options:

  1. Allow returning 0 if the system cannot return an xt that identifies the interpretation semantics for the word identified by nt (see also the clarification re execution tokens).

    • Probably, in this case we have to introduce a word like name> (experimental in Forth-83), which returns an xt that identifies the execution semantics of the word identified by nt, since otherwise a user-defined Forth text interpreter is impossible without a correct find.
  2. Allow returning a state-dependent xt, which performs the interpretation semantics for the word in interpretation state only.

In this proposal I stick to the second option.

If anyone prefers the first option or has other objections, please feel free to share your ideas in a comment.

Proposal

Replace the following paragraph in the section 15.6.2.1909.20 NAME>INTERPRET:

xt represents the interpretation semantics of the word nt. If nt has no interpretation semantics, NAME>INTERPRET returns 0.

by the following paragraphs:

xt identifies the execution semantics of the word identified by nt. When this xt is executed in interpretation state, the interpretation semantics for the word are performed. If a Forth system does not provide such execution token for the word, NAME>INTERPRET returns 0.

An ambiguous condition exists in any of the following conditions:

  • interpretation semantics for the word are not defined by this standard and xt is executed;
  • execution semantics of the word are not defined by this standard and xt is executed in compilation state;

[r1102] 2023-09-21 20:56:49 LeonWagner replies:

proposal - Tighten the specification of SYNONYM (version 1)

SwiftForth has implemented SYNONYM since version 3.11.0 (23-Feb-2021). Version 3.11.2 (22-Jun-2021) added improvements based on a contribution from a comp.lang.forth discussion, sent to us by Anton Ertl. As of version 3.12.0 (21-Sep-2023), the same xt is returned by ' for both the original word and its synonym, so their behaviors should be identical in all respects.


[r1103] 2023-09-21 20:58:52 LeonWagner replies:

proposal - Tighten the specification of SYNONYM (version 1)


[r1104] 2023-09-22 14:13:39 ruv replies:

proposal - NAME>INTERPRET wording

Author

Ruv

Change Log

  • 2020-02-20 Initial comment for NAME>INTERPRET
  • 2023-09-14 Make this proposal more formal
  • 2023-09-19 Add rationale, better wording, fix some typos
  • 2023-09-21 Add rationale re system-defined semantics, and declare an ambiguous condition
  • 2023-09-22 Require xt if interpretation semantics for the word are defined by the standard

Problem

Currently the specification for name>interpret says that returned "xt represents the interpretation semantics of the word nt".

But actually, in some cases a Forth system cannot provide an xt that performs the defined interpretation semantics for the corresponding word regardless of STATE.

This is the case in particular when a word like s" or to is implemented as a STATE-dependent immediate word. Technically, it is possible to return a correct xt according to the current specification (e.g. by generating the corresponding definition on the fly), but this can be too burdensome.
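
To illustrate the kind of implementation meant here, the following is a minimal sketch of a STATE-dependent immediate word. The name my-s" is hypothetical and is not taken from any particular system; the sketch assumes SLITERAL from the String word set. Executing the xt of such a word in compilation state performs its compilation semantics rather than its interpretation semantics.

```forth
\ Hypothetical word my-s" : a STATE-dependent immediate word
\ with a single xt for both interpretation and compilation.
: my-s" ( "ccc<quote>" -- c-addr u | )
  [char] " parse          \ the string is still in the input buffer
  state @ if              \ compilation state:
    postpone sliteral     \   compile the string as a literal
  then                    \ interpretation state: leave c-addr u
; immediate
```

In interpretation state the returned string is transient, since it lives in the input buffer.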

Another minor problem is that it's not clear what the term "represents" means. In the language of the standard, an xt identifies some semantics.

Rationale

How to perform interpretation semantics

It should be clear from the specification how, having an nt, to perform the very behavior that a Forth system performs when the Forth text interpreter encounters, in interpretation state, the name of a word identified by this nt.

Namely, to perform this behavior, must the xt returned from name>interpret be executed only in interpretation state, or may it also be executed in compilation state?
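
As a sketch of what such a program fragment looks like, a user-defined text interpreter might perform the interpretation semantics of a word as follows. The name interpret-nt is hypothetical; throw code -14 is the standard code for "interpreting a compile-only word". Whether the EXECUTE here is only valid in interpretation state is exactly the open question.

```forth
\ Hypothetical helper: perform the interpretation semantics
\ of the word identified by nt.
: interpret-nt ( i*x nt -- j*x )
  name>interpret                 \ ( xt | 0 )
  ?dup 0= if -14 throw then      \ no xt is available for this word
  execute ;                      \ assumed to run in interpretation state
```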

System-defined semantics

If the standard does not define interpretation semantics for a word, a Forth system may provide system-defined interpretation semantics for the word (see A.3.4.3.2).

The same is true for execution semantics: if the standard does not define them for a word, a Forth system may provide system-defined execution semantics for the word. But, due to 6.1.0070, performing these execution semantics in interpretation state must always be equivalent to performing the interpretation semantics for the word (regardless of whether they are standard-defined or system-defined).

Connection with Tick

If we want to be able to tick any word for which interpretation semantics are defined by the standard, name>interpret cannot return 0 for these words.

The author is not aware of any Forth system in which name>interpret returns 0 for such a word.
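
The connection can be made concrete with a sketch of Tick defined in terms of name>interpret. The name tick is hypothetical, and the sketch assumes a word find-name ( c-addr u -- nt | 0 ), which some systems (e.g. Gforth) provide but which is not part of Forth-2012. If name>interpret returned 0 for a word with standard-defined interpretation semantics, this definition could not return an xt for it.

```forth
\ Hypothetical definition of Tick via name>interpret.
\ Assumption: find-name ( c-addr u -- nt | 0 ) is available.
: tick ( "<spaces>name" -- xt )
  parse-name find-name
  dup 0= if -13 throw then      \ -13: undefined word
  name>interpret
  dup 0= if -14 throw then ;    \ no xt is available for this word
```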

Solution

The specification for name>interpret can be adjusted to solve the problem described above. There are two options:

  1. Allow name>interpret to return 0 if the system cannot return an xt that identifies the interpretation semantics for the word identified by nt (see also the clarification re execution tokens).

    • In this case we would probably have to introduce a word like name> (experimental in Forth-83), which returns an xt that identifies the execution semantics of the word identified by nt, since otherwise a user-defined Forth text interpreter is impossible without a correct find.
  2. Allow name>interpret to return a state-dependent xt, which performs the interpretation semantics for the word in interpretation state only.

In this proposal I stick to the second option.
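
With the second option, a program that needs the interpretation semantics regardless of the current state can switch to interpretation state around the call. The following is a minimal sketch under that assumption; the name execute-interpreting is hypothetical, and CATCH and THROW from the Exception word set are assumed to be available.

```forth
\ Hypothetical helper: execute xt in interpretation state,
\ then restore the state that was in effect on entry.
: execute-interpreting ( i*x xt -- j*x )
  state @ if
    postpone [     \ enter interpretation state
    catch          \ execute xt, trapping any exception
    ]              \ return to compilation state
    throw          \ re-raise the exception, if there was one
  else
    execute        \ already in interpretation state
  then ;
```

The state is changed only via [ and ], since a standard program may not alter STATE directly.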

If anyone prefers the first option or has other objections, please feel free to share your ideas in a comment.

Proposal

Replace the following paragraph in the section 15.6.2.1909.20 NAME>INTERPRET:

xt represents the interpretation semantics of the word nt. If nt has no interpretation semantics, NAME>INTERPRET returns 0.

by the following paragraphs:

xt identifies the execution semantics of the word identified by nt. When this xt is executed in interpretation state, the interpretation semantics for the word are performed.

If and only if interpretation semantics for the word are not defined by this standard and the Forth system does not provide the execution token for the word, NAME>INTERPRET returns 0.

An ambiguous condition exists in any of the following cases:

  • interpretation semantics for the word are not defined by this standard and xt is executed;
  • execution semantics of the word are not defined by this standard and xt is executed in compilation state;

[r1105] 2023-09-22 17:19:10 ruv replies:

comment - A better approach for SYNONYM wording

Probably, creating a synonym should not be allowed for a local variable (if it's too difficult to implement in some systems). Then, the corresponding ambiguous condition should be declared:

An ambiguous condition exists if name.old is a local variable.


[r1106] 2023-09-22 18:00:09 ruv replies:

proposal - NAME>INTERPRET wording

If we can't assume that compile, is equivalent to lit, ['] execute compile,, then instead of the phrase:

xt is executed

we should use the phrase:

the execution semantics identified by xt are performed

I want to stick to the short wording, and assume this equivalency.
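
The assumed equivalency can be sketched as a definition that has the same effect as compile, but is built from a literal and EXECUTE. The name compile-via-execute is hypothetical; "postpone literal" stands in for the lit, used in the phrase above.

```forth
\ Hypothetical word: append to the current definition code that
\ pushes xt and then executes it.
: compile-via-execute ( xt -- )
  postpone literal      \ same effect as "lit,"
  postpone execute ;    \ same effect as "['] execute compile,"
```

For example, an immediate word defined as ": [dup] ['] dup compile-via-execute ; immediate" would then compile the behavior of dup into the current definition.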


[r1107] 2023-09-22 22:31:29 ruv replies:

comment - A better approach for SYNONYM wording

Change Log

  • 2020-09-23 Initial version
  • 2020-09-23 Factor out Problem section, add a clause re the same data-field address, better wording, better formatting, fix typos

Problem

Some additional problems

  • According to 2.2.3 Parsed-text notation, "name" has the specified meaning when it's mentioned literally as "name" only. So, newname and oldname are not normative.

    • We can use indexing as for data type symbols in stack diagrams, and when an index is not a number, it can be concatenated via a dot.
  • The phrase "For both strings skip leading space delimiters" is unsound, since the operation to "skip leading space delimiters" is not defined for strings, but for the input buffer only (see 3.4.1.1 Delimiters).

Rationale

Instead of mentioning SYNONYM in many other glossary entries, it's better to use another wording, which does not require that.

In some cases, it's better to specify a glossary entry in a more general way, rather than mentioning SYNONYM in it (see the proposal below).

If the standard does not define some semantics, we should ensure that system-defined semantics are the same for synonyms.

If the standard defines some semantics, system-defined semantics shall be equivalent to them, so it's enough to mention system-defined semantics.

In some plausible Forth system implementations the execution token for a word is identical to the name token for this word, and the execution tokens for different words are always different. So SYNONYM should not require the same execution token (if any) for name.new and name.old.

The part about deferred words doesn't seem formal enough.

Examples

A synonym of the word CREATE creates a word with a data field address, because it has the same execution semantics as CREATE:

synonym mycreate create
mycreate foo 123 ,

A synonym of a word created via performing the execution semantics of CREATE returns the same data field address, because it has the same execution semantics as the original word:

synonym bar foo
bar foo = . \ prints -1

Consequently, the execution token for a synonym is associated with the same data field address (if any) as the xt of the original word:

' bar >body ' foo >body = . \ prints -1

The same is true for a synonym of a synonym:

synonym baz bar
' baz >body ' bar >body = . \ prints -1

A synonym of a word created via performing the execution semantics of VALUE returns the same value due to the same execution semantics as the original word:

1 value x
synonym y x
x y = . \ prints -1

Applying TO to a synonym of a word created via performing the execution semantics of VALUE changes the value assigned to the original word, because both have the same "TO name Run-time" semantics:

2 to y
x y = . \ prints -1

Proposal (draft)

Re data-field address

In 6.1.0550 >BODY, 6.1.1250 DOES>,

Replace the phrase:

defined via CREATE

with the phrase:

defined via performing CREATE execution semantics

Replace the phrase:

defined with CREATE or a user-defined word that calls CREATE.

with the phrase:

defined via performing CREATE execution semantics
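
The difference between the two phrases can be seen in a sketch where the execution semantics of CREATE are performed without CREATE appearing by name at the point of definition. The names make-cell and counter are hypothetical. Under the proposed wording, >BODY is clearly applicable to counter.

```forth
\ Hypothetical defining word: performs the execution semantics
\ of CREATE through its xt.
: make-cell ( "<spaces>name" -- )
  ['] create execute    \ parses the name and creates the definition
  0 , ;                 \ reserve and initialize one cell

make-cell counter
' counter >body counter = .   \ prints -1
```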

Re deferred words

In 6.2.1725 IS, 6.2.0698 ACTION-OF, 6.2.1177 DEFER@, 6.2.1175 DEFER!,

Replace the phrase:

defined by DEFER

with the phrase:

defined via performing DEFER execution semantics

Re SYNONYM

Replace the whole normative part of the glossary entry 15.6.2.2264 SYNONYM by the following paragraphs:

( "<spaces>name.new" "<spaces>name.old" -- )

Skip leading space delimiters, parse name.new delimited by a space, skip leading space delimiters, parse name.old delimited by a space.

Find name.old. Create a definition for name.new with the semantics defined below.

name.new and name.old may be identical. The execution token for name.new may differ from the execution token for name.old.

An ambiguous condition exists in any of the following cases:

  • name.old is not found;
  • name.old is for a local variable;
  • IMMEDIATE is performed when the definition for name.new is the most recent definition.

All the ambiguous conditions that exist when using name.old, also exist when using name.new.

If execution semantics are defined by the system for name.old:

name.new Execution: Perform the execution semantics of name.old.

If interpretation semantics are defined by the system for name.old:

name.new Interpretation: Perform the interpretation semantics for name.old.

If compilation semantics are defined by the system for name.old:

name.new Compilation: Perform the compilation semantics for name.old.

If "TO name.old Run-time" semantics are defined by the system for name.old:

TO name.new Run-time: Perform "TO name.old Run-time" semantics.

If the data-field address is associated by the system with the xt for name.old:

The data-field address associated with the xt for name.new is the same as the data-field address associated with the xt for name.old.

If name.old is defined via performing the execution semantics of DEFER, or is a synonym of such a word:

  • "IS name.new": perform "IS name.old"
  • "ACTION-OF name.new": perform "ACTION-OF name.old"
  • Applying DEFER@ to the xt of name.new: apply DEFER@ to the xt of name.old
  • Applying DEFER! to the xt of name.new: apply DEFER! to the xt of name.old
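
The clauses for deferred words can be illustrated in the same manner as the examples above. This is a sketch of the behavior the proposed wording implies, not a report of what existing systems do; the names greet and hello are hypothetical.

```forth
defer greet
synonym hello greet

\ "IS hello" performs "IS greet", so greet gets the new action.
:noname ( -- n ) 42 ; is hello

\ Each line is expected to print -1 under the proposed wording.
greet 42 = .
action-of hello action-of greet = .
' hello defer@ ' greet defer@ = .
```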