Proposal: S( "Request for Discussion" (revised 2018-08-16)
This proposal has been moved into this section. Its former address was: /standard/file/Sq
This page is dedicated to discussing this specific proposal
UlrichHoffmann [65] S( "Request for Discussion" (revised 2018-08-16) Proposal 2018-08-17 16:27:53
S( "Request for Discussion"
Change History
2018-08-16 Improve with additional explanation, rewording
2018-07-09 First version
Problem
There have been extensive discussions about correct implementations (cf. [Ertl98], [Pelc17], [clf18]) of Forth words that have both
- an interpretation semantics and
- a compilation semantics that is different from adding the word's execution semantics to the current definition (sometimes called non-default compilation semantics, NDCS).
The Forth-94 word S" is an example of this, as it has defined compilation semantics (and explicitly undefined interpretation semantics) in the CORE word set and defined interpretation semantics in the FILE word set (cf. [Forth94]).
Words like this can have the so-called copy&paste property: program phrases that have been tested in interpretation mode can be copied unchanged into definitions and continue to work there in a seemingly identical way. Many developers find this attractive.
The desire to have words with diverging compilation semantics and interpretation semantics led many system implementors to state-smart immediate words (which inspect the variable STATE to distinguish between compilation and interpretation), but these have the drawback that they fail unexpectedly in corner cases [Ertl98].
Identifying whether or not a word is state-smart typically requires studying its implementation or documentation. Problems arise when these state-smart words are used as building blocks for more sophisticated words, e.g. by means of POSTPONE-ing them. The distinction between interpret and compile state is then also postponed and happens at inappropriate times.
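The corner case can be made concrete with a small sketch. The words smart-char, ok-a, compile-char and bad-a below are hypothetical names invented for this illustration; they do not come from any of the cited systems. smart-char is a deliberately state-smart combination of CHAR and [CHAR]:

```forth
\ smart-char (hypothetical): acts like CHAR when interpreting and
\ like [CHAR] when compiling, by inspecting STATE at run time.
: smart-char ( "name" -- char | )
  CHAR  STATE @ IF POSTPONE LITERAL THEN ; IMMEDIATE

smart-char A .                     \ interpreting: pushes the code of A, then prints it
: ok-a ( -- char ) smart-char A ;  \ compiling: the code of A is compiled as a literal
ok-a .                             \ prints the same code

\ A building block meant to perform the compilation semantics of smart-char.
: compile-char ( "name" -- ) POSTPONE smart-char ;

\ compile-char runs between [ and ], that is in interpret state, so smart-char
\ takes its interpretation branch: the code of A lands on the stack at compile
\ time (dropped here) and nothing is compiled into bad-a.
: bad-a ( -- ) [ compile-char A DROP ] ;
bad-a                              \ leaves nothing on the stack
```

The author of compile-char presumably expected bad-a to push the code of A, as ok-a does. The STATE check was deferred together with the word, which is the kind of failure described in [Ertl98].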
There are ways to structure the outer interpreter so that words with non-default compilation semantics can be implemented without failures in corner cases. Systems such as the development version of gforth and VFX version 5 deal with this issue (cf. [Pelc17], [clf18]).
Looking at the Forth-94 standard, only a few words actually have non-default compilation semantics, namely the already mentioned S" and TO. Forth-2012 adds others.
In general Forth-94 and Forth-2012 define execution semantics for normal words. When the word is interpreted this execution semantics is performed, when it is compiled its execution semantics is appended to the current definition (the standard compilation semantics). For most compiling words the standards explicitly leave the interpretation semantics undefined and define only a compilation semantics. This compilation semantics might include appending an additionally defined runtime semantics to the current definition. Colloquially these words are called compile-only. The standards also explicitly define immediate words: Words that have identical interpretation and compilation semantics: both perform the word's execution semantics. Finally there are special words with non-default compilation semantics (NDCS) where interpretation semantics and compilation semantics diverge.
The following table summarizes this situation:
word kind | interpretation semantics | compilation semantics | example | comment |
---|---|---|---|---|
normal | perform execution semantics | compile execution semantics | DUP | normal :-definitions |
immediate | perform execution semantics | perform execution semantics | .( | IMMEDIATE definitions |
compile-only | undefined | perform execution semantics | IF | interpretation semantics undefined |
NDCS | perform execution semantics for interpretation | perform execution semantics for compilation | S" | divergent interpretation, compilation |
S" and TO seem to be very special: for other words such as ' (tick), CHAR, or .( (dot-paren), the standards define similar compiling words, namely ['] (bracket-tick), [CHAR] (bracket-char), or ." (dot-quote), for use in compiling mode inside definitions.
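As an illustration of these pairs, the following sketch uses each interpretive word next to its compiling counterpart. The definition names code-of-a, xt-of-dup and greet are hypothetical and chosen only for this example:

```forth
CHAR A .                             \ interpreting: CHAR parses A and pushes its code
: code-of-a ( -- char ) [CHAR] A ;   \ compiling: [CHAR] compiles the code as a literal
code-of-a .                          \ prints the same code as the first line

' DUP                                \ interpreting: ' pushes the execution token of DUP
: xt-of-dup ( -- xt ) ['] DUP ;      \ compiling: ['] compiles that token as a literal
xt-of-dup = .                        \ both tokens are equal, so a true flag is printed

.( hello)                            \ interpreting: .( displays the text at once
: greet ( -- ) ." hello" ;           \ compiling: ." displays the text when greet runs
greet
```

In each pair the two words carry different names, so neither needs to inspect STATE.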
Some argue that complicating the outer interpreter or dictionary design is not beneficial, especially in small, memory-restricted systems. They either live with the failures of state-smart words in corner cases or avoid implementing an interpretive S" to stay standard compliant.
Solution
For small systems it might be reasonable and simpler not to implement an S" with non-default compilation semantics as defined in the FILE word set, but to define two differently named words that capture the compilation semantics and the interpretation semantics of FILE S" respectively.
Possible naming choices could be
- S" (s-quote) for interpretation and [S"] (bracket-s-quote-bracket) for compilation, used in the form S" hello" and : xxx ... [S"] hello" ... ;
- or S" (s-quote) for interpretation and [S" (bracket-s-quote) for compilation, used in the form S" hello" and : xxx ... [S" hello"] ... ;
neither of which is really appealing.
Instead, similar to ." for string output in definitions and .( for string output outside definitions, we propose to keep CORE S" with its behaviour and to standardize S( with the interpretation semantics of FILE S" but using ) (right parenthesis) as delimiter.
Proposal
Add the word S( to the CORE Extension Word Set (CORE EXT):
S( "s-paren" CORE EXT
Interpretation: Perform the execution semantics given below.
Compilation: Perform the execution semantics given below.
Execution: ( "ccc<paren>" -- c-addr u )
Parse ccc delimited by ) (right parenthesis). Store the resulting string in a transient buffer c-addr u.
S( is an immediate word.
See: Parsing, S"
Typical Use
Typical use of S( would be interactive, when temporary strings are required, for example for use with INCLUDED:
S( s-paren.fs) INCLUDED
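A short interactive session might look like the sketch below. Since S( is not yet provided by existing systems, the sketch first defines a minimal single-buffer version so that it stands alone. The buffer name sbuf and the word greet are hypothetical, and the definition is only one possible implementation:

```forth
DECIMAL
\ Minimal single-buffer S( so that this example is self-contained.
CREATE sbuf 80 CHARS ALLOT            \ sbuf is a hypothetical name
: S( ( "ccc<paren>" -- c-addr u )
  [CHAR] ) PARSE 80 MIN >R  sbuf R@ MOVE  sbuf R> ; IMMEDIATE

S( hello world) TYPE CR               \ interpreting: display a temporary string
S( 1 2 + .) EVALUATE CR               \ interpreting: pass a temporary string to EVALUATE

\ Inside definitions CORE S" remains the word to use.
: greet ( -- ) S" hello world" TYPE ;
greet CR
```

The split mirrors the .( and ." pair: one name for interactive use and one name for use inside definitions.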
Remarks
As S( is proposed to be standardized in the CORE extension word set, no standard system is required to provide S(. However, if a system chooses to implement the S" compilation and interpretation semantics with two separately named words, it could choose the standard name S" and the (not yet standardized) name S( for this.
Defining the interpretation semantics explicitly in the glossary entry above is not strictly necessary as both Forth-94 and Forth-2012 state:
Unless otherwise specified in an “Interpretation:” section of the glossary entry, the interpretation semantics of a Forth definition are its execution semantics.
Reference implementation
With only a single string buffer, a minimal S( implementation could look like this:
CREATE buf DECIMAL 80 CHARS ALLOT
: S( ( "ccc<paren>" -- c-addr len )
  \ parse up to ), clip to the buffer size, copy into buf
  [CHAR] ) PARSE 80 MIN >R buf R@ MOVE buf R> ; IMMEDIATE
A more elaborate implementation using multiple string buffers in a circular fashion is:
DECIMAL 80 CHARS CONSTANT |buf|   \ size of one buffer
4 CONSTANT #bufs                  \ number of buffers
CREATE bufs |buf| #bufs * ALLOT
VARIABLE buf# 0 buf# !            \ index of the current buffer
: buf ( -- c-addr )
  bufs buf# @ |buf| * + ;
: bump ( -- )
  buf# @ 1+ #bufs MOD buf# ! ;
: str ( char "ccc<char>" -- c-addr u )
  bump buf SWAP PARSE |buf| MIN >R OVER R@ MOVE R> ;
: S( ( "ccc<paren>" -- c-addr len )
  [CHAR] ) str ; IMMEDIATE
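The practical difference between the two approaches shows up as soon as two transient strings are needed at the same time. The sketch below restates both variants under the hypothetical names S1( and S4( (with hypothetical helpers buf1, ring, ring# and next-buf) so that they can be loaded side by side. It assumes that COMPARE from the STRING word set is available:

```forth
DECIMAL
\ Variant 1: one shared buffer.
CREATE buf1 80 CHARS ALLOT
: S1( ( "ccc<paren>" -- c-addr u )
  [CHAR] ) PARSE 80 MIN >R  buf1 R@ MOVE  buf1 R> ; IMMEDIATE

\ Variant 2: four buffers used in rotation.
CREATE ring 4 80 * CHARS ALLOT
VARIABLE ring#  0 ring# !
: next-buf ( -- c-addr )  \ advance to the next buffer and return its address
  ring# @ 1+ 4 MOD DUP ring# !  80 CHARS * ring + ;
: S4( ( "ccc<paren>" -- c-addr u )
  next-buf [CHAR] ) PARSE 80 MIN >R  OVER R@ MOVE  R> ; IMMEDIATE

S1( abc) S1( abd) COMPARE .   \ 0: both results point into buf1, which now holds abd
S4( abc) S4( abd) COMPARE .   \ -1: the strings live in different buffers and abc sorts first
```

With a single buffer the second string overwrites the first, so the comparison wrongly reports equality. Rotating through several buffers keeps a few recent strings alive at the cost of a little more memory.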
Testing
The following tests ensure that S( pushes the desired c-addr u:
CREATE s 3 c, CHAR a c, CHAR b c, CHAR c c,
t{ 99 S( abc) SWAP DROP -> 99 3 }t
t{ 99 S( abc) s COUNT COMPARE -> 99 0 }t
Experience
S( is not yet defined in any of the contemporary systems such as
- gForth
- VFX
- PFE
- DXForth
- FLT
- SwiftForth
- Win32Forth
- noForth
- amForth
- camelForth
- ciForth
- mecrisp
so it has no common use. However, the name S( seems to be available in all these systems.
Discussion
The proposal avoids issues with the NDCS word S" by providing S( as an alternative notation for an interpretive S".
It is intended for resource-restricted standard systems that want to support interpretive strings but are not able to provide the FILE word set S".
S( is very simple to implement, so this proposal is about standardizing the name S( with the intended functionality rather than about a sophisticated feature.
One can argue for removing S" from the FILE word set; however, this is not proposed here. Forth systems that provide the FILE word set are hopefully capable of providing a complete and correct S" implementation.
References
[Forth94]: "American National Standard for Information Systems — Programming Languages — Forth", ANSI X3.215-1994
[clf18]: discussion about special words in comp.lang.forth, https://groups.google.com/forum/#!topic/comp.lang.forth/Gb9Hvj3Wm_Y%5B1-25%5D
[Ertl98]: "State-smartness - Why it is Evil and How to Exorcise it", Anton Ertl, euroForth 1998
[Pelc17]: "Special Words in Forth", Stephen Pelc, euroForth 2017
Author
Ulrich Hoffmann uho@xlerb.de