Digest #274 2024-08-06

Contributions

[355] 2024-08-05 11:19:09 ruv wrote:

requestClarification - Temporary removing system-compilation items

These data types denote zero or more items on the control-flow stack (see 3.2.3.2). The possible presence of such items on the data stack means that any items already there shall be unavailable to a program until the control-flow-stack items are consumed.

In the following definition I temporarily consume the system-compilation items from the data stack using n>r, to access the item x already there:

: const ( x "name" -- )
  depth >r          ( x )                             ( R: n.depth )
  :                 ( x colon-sys )                   ( R: n.depth )
  depth r> -        ( x colon-sys n.colon-sys-size )  ( R: )
  n>r               ( x )                             ( R: i*x n.colon-sys-size )
  postpone literal  (   )                             ( R: i*x n.colon-sys-size )
  nr>               ( colon-sys n.colon-sys-size )    ( R: )
  drop              ( colon-sys )
  postpone ;        (   )
;

I think it's compliant with 3.1.5.1, because the item x is really "unavailable to a program until the control-flow-stack items are consumed" by n>r.

But in the message r1281 Anton says that it is non-standard.

It's unclear why a standard program is not allowed to temporarily remove system-compilation items from the data stack.

Does "consume" have a special meaning other than "remove from the stack"? For example, this term is informally used in phrases like "Forth words consume their arguments".

If this approach is non-standard, what changes to the wording of section 3.1.5.1 could, in theory, make it standard?

BTW, I have seen many cases where this approach is used to access the xt that :noname ( xt colon-sys ) leaves on the data stack, under colon-sys.

Typical usage:

: :foo
  depth >r  :noname  depth r> - 1- roll ( colon-sys xt )
  ... \ consume xt
;
:foo ... ;

A real example: gforth/minos2/md-viewer.fs

: md-char: ( "char" -- )
    depth >r :noname depth r> - 1- roll md-char ;

Replies

[r1278] 2024-08-05 02:18:50 ruv replies:

requestClarification - Unspecified ambiguous condition in /STRING

The rationale says: "/STRING is used to remove or add characters relative to the current position in the character string. Positive values of n will exclude characters from the string while negative values of n will include characters to the left of the string."

The specification says that the resulting character string has length "u1 minus n characters".

It's impossible to remove any character from a string whose length is zero. Therefore, if n is greater than u1, the operation cannot be interpreted as removing characters from the string, and the result cannot be interpreted as a character string, even though the specification says that it is one.

So I think the case where n is greater than u1 should be described explicitly to eliminate this confusion.

For example:

Note: if n is greater than u1, u2 is the result of wraparound on underflow, and c-addr2 u2 does not represent a character string.
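
To illustrate the case in question, here is a minimal sketch of a guarded variant. The name /string-clamped is hypothetical and not part of any standard. It limits n to at most u1, so the result always describes a character string (possibly an empty one), while a negative n still behaves as in /STRING. It assumes that u1 is small enough to be a non-negative signed number.

```forth
\ Hypothetical helper, not a standard word.
\ Like /STRING, but n is clamped to at most u1, so u2 never wraps around.
\ Assumes u1 is small enough to be a non-negative signed number.
: /string-clamped ( c-addr1 u1 n -- c-addr2 u2 )
  over min    \ n' = min(n, u1)
  /string ;   \ c-addr2 = c-addr1 + n' chars, u2 = u1 - n'

: demo ( -- )
  s" abc" 1 /string-clamped type cr   \ one character removed from the left
  s" abc" 5 /string-clamped type cr   \ n > u1: the result is an empty string
;
```

With plain /STRING the second call in demo would compute u2 as 3 minus 5, which is exactly the wraparound case that the proposed note describes.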

[r1279] 2024-08-05 03:28:07 ruv replies:

requestClarification - definion and use of colon-sys and nest-sys

I couldn't find any rule or requirement relating to colon-sys or nest-sys, which makes sense since it is implementation dependent, but then why mention them?

They are mentioned to loosen the requirements on systems and to tighten them on programs: namely, to allow Forth systems to use the data stack and the return stack in certain cases, and to prohibit programs from directly accessing their own data on a stack (if any) in these cases.

For example, due to nest-sys you know why it's incorrect to pass parameters via the return stack like this:

\ incorrect example
: foo r> . ;
: bar 123 >r foo ;

And due to colon-sys you know that this definition is incorrect:

\ incorrect example
: const ( x "name" -- )
  :  postpone literal  postpone ;
;

And this is correct:

: const ( x "name" -- )
  >r  :  r> postpone literal  postpone ;
;
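
As a usage sketch for the definition above (the name answer and the value 42 are only illustrative, they do not come from the original discussion):

```forth
: const ( x "name" -- )
  >r  :  r> postpone literal  postpone ;
;

42 const answer   \ behaves like  : answer 42 ;
answer .          \ prints the value that was on the stack when const ran
```

The item x is moved to the return stack before : runs, so the program never touches the data stack underneath a possible colon-sys.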

A program can also use depth before and after a colon-sys is placed on the data stack to calculate its size, and then access data on the data stack via roll or pick, or between n>r and nr>. For example:

: const ( x "name" -- )
  depth >r  :  depth r> - n>r  postpone literal  nr> drop postpone ;
;

Also, due to the system-compilation types colon-sys and orig you know why the following program is incorrect:

: foo if ;

Because "An ambiguous condition exists if an incorrectly typed data object is encountered", and in this case:

\ stack diagrams for compilation-time
: foo  ( colon-sys )
  if ( colon-sys orig )
  \ expected top value by ";"  ( colon-sys )
  \ actual top value ( orig )
; ( colon-sys -- )

[r1280] 2024-08-05 04:56:59 ruv replies:

requestClarification - definion and use of colon-sys and nest-sys

But reading carefully I think C: really means control-flow stack

Yes, it's described in 2.2.2 Stack notation.

there's no Compilation section because word colon has no compilation semantics.

No. If the "Compilation" section is absent from a glossary entry, the word has default compilation semantics, see 3.4.3.3 Compilation semantics.

I think it should be interpretation semantics

Formally, it's the execution semantics for : (Colon). When there is only one section, the label "Execution" is omitted (see 3.4.3.1).

And the interpretation semantics for : (Colon) are the same as the execution semantics (because the "Interpretation" section is absent, and the execution semantics do not depend on state). See also 3.4.3.2.

you should see the initiation semantics in the definition list when you perform a see on the defined word,

That is not required; it may show just the source code, see 15.6.1.2194 SEE.

Example

Let's look at an example:

: foo 123 . ;

Formally, : (Colon) does the following steps in this case:

Skip leading space delimiters. Parse "foo" delimited by a space.
Start compilation for the new definition foo (so foo is now the current definition).
Enter compilation state and produce colon-sys.
Append the Initiation semantics (specified in 6.1.0450 :) to the execution semantics of foo.

NB: The section "Initiation" is not a part of execution semantics of : (Colon).

It does not matter what the Forth system actually does, as long as a standard program cannot detect the difference. So the order of these steps may be different, and some steps may be omitted.

Usually, the Initiation semantics are not actually appended into foo, but are performed as a part of execute and appended by compile, (as a part of an internal call instruction or the address interpreter), when they are applied to the xt of foo. A standard program cannot detect this difference, so it does not matter.

Formally, ; (Semicolon) does the following steps in this case:

Append the Run-time semantics specified in 6.1.0460 ; to the execution semantics of foo.
End compilation of foo (so foo is not the current definition anymore).
Place foo into the compilation word list (16.2).
Enter interpretation state, consuming colon-sys.
Align the data-space pointer.

The order of these steps does not matter as long as a standard program cannot detect that.
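
One part of these steps that a standard program can observe is the change of state. The following is a minimal sketch; the names seen-state and note-state are hypothetical and used only for illustration.

```forth
variable seen-state   \ hypothetical name, for illustration only

\ Immediate word: records STATE at the moment the text interpreter meets it.
: note-state ( -- ) state @ seen-state ! ; immediate

: foo ( -- n ) note-state 123 ;   \ note-state runs while foo is being compiled
seen-state @ 0<> .                \ true flag: : had entered compilation state

note-state                        \ now executed in interpretation state
seen-state @ 0= .                 \ true flag: ; had returned to interpretation state
```

Nothing in this sketch depends on the order in which the system performs the other steps, which is consistent with the point that only detectable behaviour matters.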

[r1281] 2024-08-05 08:31:13 AntonErtl replies:

requestClarification - definion and use of colon-sys and nest-sys

Colon-sys is described in more detail in 3.1.5.1:

The implementation-dependent data generated upon beginning to compile a definition and consumed at its close is represented by the symbol colon-sys throughout this standard.

Nest-sys is described in more detail in 3.1.5.2:

The implementation-dependent data generated upon beginning to execute a definition and consumed upon exiting it is represented by the symbol nest-sys throughout this standard.

Furthermore, both are used in the definitions of :, ;, and does>, and nest-sys is also used in the definition of exit; for nest-sys, some of the uses in ; and exit clarify its role.

Colon-sys is mentioned so that standard systems are allowed to push something on the data stack when starting to compile a colon definition (and standard programs must be written to deal with this situation), and also to specify that every executed : must be closed with a ;, possibly with a does> in between.

Nest-sys is also mentioned to allow standard systems to push something at run-time on the return stack and to prevent standard programs from accessing return stack entries from before the call. In addition it is needed to describe where exit and the run-time semantics of ; return to.

As described in 3.1.5.1, colon-sys is on the control-flow stack, which may be the same as the data stack. Note also that 3.1.5.1 says:

The possible presence of such items on the data stack means that any items already there shall be unavailable to a program until the control-flow-stack items are consumed.

Which means that ruv's third definition of const is non-standard; his second definition is standard and simpler anyway. So, yes, a standard program always has to assume that : puts a colon-sys on the data stack even if it can determine with depth that on this particular system the colon-sys has 0 items on the data stack.

As described in 3.1.5.2, nest-sys is on the return stack.

System types like colon-sys and nest-sys are not first-class types. E.g., there is no way to store them to buffers in memory.

Exactly, the standard defines colon-sys and nest-sys in order to allow standard systems to push something to the data stack (colon-sys) and return stack (nest-sys), so that standard programs must deal with this possibility. They are also there to specify the closing of colon definitions (colon-sys) and nesting of calls (nest-sys).

The stack comment for the execution semantics of : could actually be specified as
```
( input-stream: "<spaces>name" -- ; control-flow stack: -- colon-sys )
```
where "control-flow stack" is usually written as "C". Concerning the input stream, the standard document never produces a separate part of the stack effect description for that, but always lets it run with some other stack effect description, this time the control-flow stack effect.

Yes, when you text-interpret :, its interpretation semantics are executed, which for : is the same as the execution semantics, which is the part that you are citing.

The part about appending the initiation semantics means, in a classical indirect-threaded implementation, that you store the address of the machine-code routine docol in the CFA of the new word, and docol then performs the initiation semantics.

Various other implementation techniques inline the initiation semantics into the caller in some or all cases. E.g., Gforth inlines it into the caller when you compile, the colon definition, but still uses docol when you execute the colon definition. Many native-code systems on IA-32 and AMD64 CPUs always call (or inline) all kinds of words, both in compile,d code and with execute, so they always inline all of the initiation semantics in the caller. I have not seen the code that native-code compilers for RISC architectures produce, but I expect that they put a part of the initiation semantics (the part that stores the return address on the return stack) at the start of the native code of the colon definition.

The initiation semantics is the first part of the execution semantics (and therefore also of the interpretation semantics) of a colon definition. The newly defined colon definition also has compilation semantics (by default, to append the execution semantics to the definition current at the time when the compilation semantics are performed). : itself also has default compilation semantics.
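
The default compilation semantics mentioned above can be sketched as follows. The names bar, baz and compile-foo are hypothetical; compile-foo just spells out with compile, what the text interpreter does by default when it meets foo in compilation state.

```forth
: foo ( -- n ) 123 ;

\ Default compilation semantics: meeting foo while compiling bar
\ appends foo's execution semantics to bar.
: bar ( -- n ) foo ;

\ The same effect spelled out with COMPILE, in a hypothetical immediate helper.
: compile-foo ( -- ) ['] foo compile, ; immediate
: baz ( -- n ) compile-foo ;

bar .  baz .   \ both execute foo
```

Whether the system compiles a call to foo or inlines it, including its initiation semantics, is not visible to bar or baz.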

I think we answered all the questions, so I am closing this request. If you think that anything is unclear, please reopen it.