Digest #200 2022-09-09
Contributions
Forth Standards Meeting Draft (1) Agenda
14-15 Sept 2022 15:00-19:00 UTC
Online - for latest details see chat.forth-standard.org
See also: euro.theforth.net, forth-standard.org
Wednesday, 14th September (UTC)
- 14:30 Get together - Set up your gear and small talk
- 14:50 Call to order - get ready (please be online by now)
- 15:00 Session 3
- 17:00 Bio Break
- 17:15 Session 4
- 19:00 End of main session
- Workshops
Thursday, 15th September (UTC)
- Workshops
- 14:30 Get together - Set up your gear and small talk
- 14:50 Call to order - get ready (please be online by now)
- 15:00 Session 5
- 17:00 Bio Break
- 17:15 Session 6
- 19:00 End of Standards Meeting
Friday: euroForth conference
Agenda
2022-09-08
Participants
- Welcome
- Determine the persons present
- Meeting transcript
Review of Procedures
How we organize this meeting
Progress of current work
- Draft Document update (last draft is from 2019)
- Are we ready for a new Standard snapshot (Forth2023)?
- How can we speed up our work?
- How can we better serve the Forth community?
- How can we encourage Forthers to submit proposals?
Pending Topics include:
with some progress:
- recognizers
- multi-threaded multitasking
with little progress:
- memory access (16/32/64-Bit, RAM/ROM)
- reduce ambiguous conditions
What are additional topics for future standardisation?
Reports
- Chair
- Editor
- Technical
- Treasurer
Election/Confirmation of officers
If you would like to stand for election, please put your name forward and briefly introduce yourself.
We have to elect (by secret ballot) a Chair, Editor, Technical Officer and a Treasurer.
- Chair (currently Ulrich Hoffmann)
- Editor (currently Peter Knaggs)
- Technical (currently Gerald Wodni)
- Treasurer (currently Bernd Paysan)
Review of Proposals/Contributions
Proposals from forth-standard.org/proposals
Proposals in the state formal
- Specify that 0 THROW pops the 0 (https://forth-standard.org/proposals/specify-that-0-throw-pops-the-0#reply-794) 2022-02-19 19:04:45 - AntonErtl
Proposals in the state voting
- PLACE +PLACE (https://forth-standard.org/proposals/place-place#reply-745) 2021-09-08 21:15:27 - UlrichHoffmann
Proposals in the state informal
We have a lot of informal proposals with open status (moved to an appendix at the end for clarity).
We should discuss how to handle them best.
Contributions on forth-standard.org since last meeting
There have been a lot of contributions since the interim March meeting. Find them in the appendix.
Workshop Topics
Workshops are topics for discussion outside the formal meeting. We will collect topics on the fly during the meeting's discussions.
Consideration of proposals + CfV votes
- Which proposals should go for vote?
- Any topics for proposal in the pipeline?
Workshop reports
Let's collect the results of our workshops.
Matters arising
What's up?
Any other business
Something else?
Date of next meeting
When shall we three meet again? In thunder, lightning, or in rain?
Appendix to Review of Proposals/Contributions
Proposals in the state informal (most recent first)
Pronounciations (pronounciations #261) 2022-08-19 18:00:05 - AntonErtl
Exclude zero from the data types that are identifiers (exclude-zero-from-the-data-types-that-are-identifiers #252) 2022-08-13 23:24:52 - ruv
Clarification for execution token (clarification-for-execution-token #251) 2022-08-13 20:16:29 - ruv
Formatting: spaces in data type symbols (formatting-spaces-in-data-type-symbols #250) 2022-08-12 15:04:29 - ruv
Revert rewording the term "execution token" (revert-rewording-the-term-execution-token- #249) 2022-08-12 14:18:35 - ruv
Better wording for "Glossary notation" (better-wording-for-glossary-notation- #215) 2021-09-24 11:33:41 - ruv
Better wording for "data field" term (better-wording-for-data-field-term #214) 2021-09-14 08:55:49 - ruv
Tick and undefined execution semantics - 2 (tick-and-undefined-execution-semantics-2 #212) 2021-09-08 10:15:49 - StephenPelc
EMIT and non-ASCII values (emit-and-non-ascii-values #184) 2021-04-03 15:34:40 - AntonErtl
Tick and undefined execution semantics (tick-and-undefined-execution-semantics #163) 2020-10-29 00:28:43 - ruv
Common terminology for recognizers discurse and specifications (common-terminology-for-recognizers-discurse-and-specifications #161) 2020-09-07 13:56:43 - ruv
minimalistic core API for recognizers (https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-867)
An alternative to the RECOGNIZER proposal (https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-493) 2020-09-05 15:09:39 - AndrewHaley
Call for Vote - Ambiguous condition in 16.3.3 (https://forth-standard.org/proposals/call-for-vote-ambiguous-condition-in-16-3-3#reply-460) 2020-09-02 11:16:03 - StephenPelc
XML Forth Standard - migration from LaTeX to DocBook (xml-forth-standard-migration-from-latex-to-docbook #154) 2020-09-01 21:16:26 - GeraldWodni
Nestable Recognizer Sequences (nestable-recognizer-sequences #149) 2020-08-22 16:09:52 - AntonErtl
OPTIONAL IEEE 754 BINARY FLOATING-POINT WORD SET (https://forth-standard.org/proposals/optional-ieee-754-binary-floating-point-word-set#reply-420) 2020-08-24 23:38:37 - KrishnaMyneni
Recognizer (recognizer #142) 2020-07-20 20:36:30 - BerndPaysan
Same name token for different words (same-name-token-for-different-words #136)
Recognizer RfD rephrase 2020 (recognizer-rfd-rephrase-2020 #131)
NAME>INTERPRET wording (name-interpret-wording #129) 2020-02-20 09:55:14 - ruv
Clarify FIND, more classic approach (https://forth-standard.org/proposals/clarify-find-more-classic-approach#reply-682) 2019-10-08 11:01:25 - ruv
Remove the “rules of FIND” (https://forth-standard.org/proposals/remove-the-rules-of-find-#reply-465) 2019-09-12 09:09:51 - BerndPaysan
Case insensitivity (case-insensitivity #114) 2019-09-06 18:27:48 - AntonErtl
CS-DROP (revised 2019-08-22) (https://forth-standard.org/proposals/cs-drop-revised-2019-08-22-#reply-471) 2019-09-06 08:24:28 - UlrichHoffmann
Right-justified text output (right-justified-text-output #101) 2019-08-01 22:07:03 - mcondron
Executing compilation semantics (executing-compilation-semantics #94) 2019-07-12 04:16:14 - ruv
Revise Rationale of Buffer: (https://forth-standard.org/proposals/revise-rationale-of-buffer-#reply-247) 2019-07-06 15:45:25 - AntonErtl
F>R and FR> to support dynamically-scoped floating point variables (f-r-and-fr-to-support-dynamically-scoped-floating-point-variables #75) 2019-03-03 06:20:52 - kc5tja
Case sensitivity (case-sensitivity #73) 2018-11-03 13:15:53 - ruv
Revised Proposal Process (revised-proposal-process #71) 2018-09-21 06:49:42 - PeterKnaggs
Multi-Tasking Proposal (https://forth-standard.org/proposals/multi-tasking-proposal#reply-186) 2018-09-06 17:19:38 - AndrewHaley
CS-DROP (revised 2018-08-20) (https://forth-standard.org/proposals/cs-drop-revised-2018-08-20-#reply-302) 2018-08-20 20:22:25 - UlrichHoffmann
S( "Request for Discussion" (revised 2018-08-16) (s-request-for-discussion-revised-2018-08-16- #65) 2018-08-17 16:27:53 - UlrichHoffmann
Let us adopt the Gerry Jackson test suite as part of Forth 200x (let-us-adopt-the-gerry-jackson-test-suite-as-part-of-forth-200x #63) 2018-07-10 14:38:46 - StephenPelc
Tighten the specification of SYNONYM (version 1) (tighten-the-specification-of-synonym-version-1- #60) 2018-06-08 10:09:18 - GerryJackson
EXCEPTION LOCALs (exception-locals #36) 2017-10-28 07:04:49 - AndrewRead
BL rationale is wrong (bl-rationale-is-wrong #34) 2017-10-25 11:35:46 - AntonErtl
The value of STATE should be restored (the-value-of-state-should-be-restored #32) 2017-09-03 11:07:49 - AlexDyachenko
Core-ext S"; should reference File-ext S"; (core-ext-s-should-reference-file-ext-s- #29) 2017-04-16 08:03:17 - AntonErtl
Implementations requiring BOTH 32 bit single floats and 64 bit double floats. (implementations-requiring-both-32-bit-single-floats-and-64-bit-double-floats- #26) 2016-12-21 14:39:40 - zhtoor
Directory experiemental proposal (https://forth-standard.org/proposals/directory-experiemental-proposal#reply-59) 2016-12-12 15:42:57 - GeraldWodni
DEFER this not :-) (defer-this-not- #22) 2016-09-02 16:14:36 - enoch
WLSCOPE -- wordlists switching made easier (wlscope-wordlists-switching-made-easier #20) 2016-06-18 04:19:03 - enoch
Contributions on forth-standard.org since last meeting (most recent first)
Etymology of SYNONYM (tools, SYNONYM #267) 2022-09-07 21:15:16 - AntonErtl
Support several versions of the standard in parallel ( #266) 2022-09-07 11:41:37 - ruv
Bogus Test Case for SAVE-INPUT (core, SAVE-INPUT #265) 2022-09-06 14:36:57 - flaagel
Incorrect Test Pattern (file, SOURCE-ID #264) 2022-09-06 14:26:04 - flaagel
Test Proposal (test-proposal #263) 2022-08-28 19:24:27 - GeraldWodni
>NUMBER Test Patterns (core, toNUMBER #262) 2022-08-28 11:10:27 - flaagel
Pronounciations (pronounciations #261) 2022-08-19 18:00:05 - AntonErtl
Exception word set is not optional any more (exception #260) 2022-08-18 13:50:15 - ruv
Should QUIT propagate exceptions? (core, QUIT #259) 2022-08-18 12:09:37 - ruv
Pronounciation (xchar, PlusXDivSTRING #258) 2022-08-15 14:07:41 - AntonErtl
Pronounciation (xchar, MinusTRAILING-GARBAGE #257) 2022-08-15 14:04:40 - AntonErtl
Pronounciation (double, DUless #256) 2022-08-15 13:51:36 - AntonErtl
Pronounciation (float, FtoS #255) 2022-08-15 13:48:50 - AntonErtl
Pronounciation (float, StoF #254) 2022-08-15 13:29:09 - AntonErtl
Pronounciation (xchar, XSTRINGMinus #253) 2022-08-14 17:47:05 - AntonErtl
Exclude zero from the data types that are identifiers (exclude-zero-from-the-data-types-that-are-identifiers #252) 2022-08-13 23:24:52 - ruv
Clarification for execution token (clarification-for-execution-token #251) 2022-08-13 20:16:29 - ruv
Formatting: spaces in data type symbols (formatting-spaces-in-data-type-symbols #250) 2022-08-12 15:04:29 - ruv
Revert rewording the term "execution token" (revert-rewording-the-term-execution-token- #249) 2022-08-12 14:18:35 - ruv
Implementing COMPILE, via EXECUTE (core, COMPILEComma #248) 2022-08-12 10:21:25 - ruv
Better API for multitasking (multi-tasking-proposal #247) 2022-07-18 00:03:00 - ruv
Ambiguous conition for MARKER (core, MARKER #246) 2022-07-16 10:55:40 - ruv
:NONAME Primitives (core, ColonNONAME #245) 2022-07-05 16:22:37 - flaagel
Interactions with MARKER and KILL-TASK (multi-tasking-proposal #244) 2022-06-25 15:54:21 - kc5tja
Stack Sizes? (multi-tasking-proposal #243) 2022-06-25 15:38:51 - kc5tja
Round-robin vs Preemptive (multi-tasking-proposal #242) 2022-06-25 15:26:37 - kc5tja
Suggested reference implementation ROT (core, ROT #241) 2022-06-23 21:59:20 - poggingfish
Suggested reference implementation R@ (core, RFetch #240) 2022-06-20 17:40:33 - poggingfish
Suggested reference implementation 2* (core, TwoTimes #239) 2022-06-20 17:34:22 - poggingfish
Same execution token (usage #238) 2022-06-13 22:40:38 - ruv
3.4.5 conflicts with [: … ;] (usage #237) 2022-05-11 12:45:05 - AtH
Trigonmetric Functions in Forth (float, FSIN #236) 2022-04-11 17:49:49 - OldSpoon
F.3 Seems in Error (testsuite #235) 2022-04-08 17:45:25 - JimPeterson
Possible Reference Implementation (core, ALIGN #234) 2022-04-05 17:44:08 - JimPeterson
Possible Reference Implementation (core, MIN #233) 2022-04-05 14:43:43 - JimPeterson
Possible Reference Implementation (core, VARIABLE #232) 2022-04-05 14:05:53 - JimPeterson
Double> (core, MTimes #231) 2022-04-04 21:04:46 - AdrianMcMenamin
Question about final test (core, UMTimes #230) 2022-04-02 21:58:34 - AdrianMcMenamin
inconsistent naming (search, FORTH-WORDLIST #229) 2022-03-18 14:04:40 - LSchmidt
Accessing Remaining Data Stack? (locals #228) 2022-03-08 20:58:26 - JimPeterson
Contradiction With do-loops (locals #227) 2022-03-08 20:39:59 - JimPeterson
c-addr used in stack diagrams (core, Cq #226) 2022-03-06 21:06:16 - LSchmidt
Using a . suffix to specify a double (double, DZeroEqual #225) 2022-03-05 19:45:14 - flaagel
many tests appear to only assess interpretation semantics of test subjects (testsuite #224) 2022-02-27 21:23:05 - LSchmidt
chasing for dangling words referred to (testsuite #223) 2022-02-27 20:58:46 - LSchmidt
many tests appear to only assess interpretation semantics of test subjects (testsuite #222) 2022-02-27 18:43:57 - LSchmidt
I suggest to complete the test (core, POSTPONE #221) 2022-02-27 14:26:42 - LSchmidt
Replies
A recognized xt acts on the state passed to it on the stack
A proper term for "recognized xt" ("recognized execution token") should be chosen. "recognized xt" means "xt that is recognized", but we recognize lexemes, not execution tokens. This xt is just a result of recognizing a lexeme. And it should be named according to what it does, not according to who produces it.
There is no reason to pass state on the stack — we discussed that, and the reference implementation reflects that.
Anton is correct, this is a small part of a wider issue. I believe in small government and small standards. Let’s avoid unnecessary complexity.
As I have been asked where the current implementation lives (not finished): it is on GitHub at GeraldWodni/directories
The STATE discussion in the 2021 workshop concluded that words or xt executed should not depend on STATE. The reference implementation needs to be adjusted.
For the name of the result values we might want to have another round of bikeshedding. In particular with more native speakers. The current wording represents the last round of bikeshedding.
Author:
Bernd Paysan
Change Log:
- 2020-09-06 initial version
- 2020-09-08 taking ruv's approach and vocabulary at translators
- 2020-09-08 replace the remaining rectypes with translators
- 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
- 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
Problem:
The current recognizer proposal has received a number of criticisms. One is that its API is too big. So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions. The problem this proposal tries to solve is the same as with the original recognizer proposal. This proposal is therefore not a full proposal, but sketches some changes to the original proposal.
Solution:
Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that. Common extensions go to the RECOGNIZER EXT wordset.
Important changes to the original proposal:
- Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
- Make the recognizer sequence executable with the same effect as a recognizer
- Make the system's forth-recognizer a deferred word to allow plugging in new recognizer sequences
This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.
The core principle is still that the recognizer is not aware of state, and the returned translator is. If you have for some reason legacy code that looks like
: rec-nt ( addr u -- translator )
  here place here find dup IF
    0< state @ and IF compile, ELSE execute THEN ['] drop
  ELSE drop ['] notfound THEN ;
then you should factor the part starting with state @ out and return it as translator:
: word-translator ( xt flag -- )
  0< state @ and IF compile, ELSE execute THEN ;
: rec-word ( addr u -- ... translator )
  here place here find dup IF ['] word-translator
  ELSE drop ['] notfound THEN ;
Typical use
TBD
Proposal:
XY. The optional Recognizer Wordset
A recognizer takes the string of a lexeme and returns a recognized xt and additional data on the stack (no additional data for NOTFOUND):
REC-SOMETYPE ( addr len -- i*x recognized | NOTFOUND )
XY.3 Additional usage requirements
XY.3.1 Recognized
recognized: subtype of xt, and executes with the following stack effect:
RECOGNIZED-THING ( j*x i*x state -- k*x )
A recognized xt acts on the state passed to it on the stack
- 0 for interpretation
- -1 for compilation
- -2 for POSTPONE
i*x is the additional information provided by the recognizer; j*x and k*x are the stack inputs and outputs of interpreting, compiling or postponing the thing.
XY.6 Glossary
XY.6.1 Recognizer Words
FORTH-RECOGNIZE ( addr len -- i*x recognized-xt | NOTFOUND-xt ) RECOGNIZER
Takes a string and tries to recognize it, returning the recognized xt and additional information if successful, or NOTFOUND if not.
NOTFOUND ( -- ) RECOGNIZER
Performs -13 THROW. An ambiguous condition exists if the exception word set is not available.
XY.6.2 Recognizer Extension Words
SET-FORTH-RECOGNIZE ( xt -- ) RECOGNIZER EXT
Assign the recognizer xt to FORTH-RECOGNIZE.
Rationale:
FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using IS FORTH-RECOGNIZE.
GET-FORTH-RECOGNIZE ( -- xt ) RECOGNIZER EXT
Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.
Rationale:
FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the behavior instead of using ACTION-OF FORTH-RECOGNIZE.
REC-SEQUENCE: ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT
Create a named recognizer sequence under the name "name", which, when executed, tries to recognize strings starting with xtn and proceeding towards xt1 until successful.
SET-REC-SEQUENCE ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT
Set the recognizer sequence of xt-seq to xt1 .. xtn.
GET-REC-SEQUENCE ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT
Obtain the recognizer sequence xt-seq as xt1 .. xtn n.
RECOGNIZED: ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT
Create a recognized word under the name "name", which performs xt-int for state=0, xt-comp for state=-1 and xt-post for state=-2.
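As a rough sketch only: a defining word in the spirit of RECOGNIZED: that takes the state from the stack, as XY.3.1 describes, might look like the following. All demo-* names and recognized-demo are hypothetical, chosen to avoid clashing with existing system words.

```forth
\ Sketch, not normative. All demo-* names are hypothetical.
: demo-recognized: ( xt-int xt-comp xt-post "name" -- )
  create , , ,                     \ cell 0: xt-post, cell 1: xt-comp, cell 2: xt-int
  does> ( i*x state body -- j*x )
    swap 2 + cells + @ execute ;   \ state -2, -1, 0 selects cell 0, 1, 2

: demo-int  ( n -- n ) ;                              \ interpretation: leave n on the stack
: demo-comp ( n -- ) postpone literal ;               \ compilation: compile n as a literal
: demo-post ( n -- ) demo-comp postpone demo-comp ;   \ postpone: compile code that compiles n

' demo-int ' demo-comp ' demo-post demo-recognized: recognized-demo

5 0 recognized-demo .                              \ state 0: the number stays on the stack
: demo-seven ( -- n ) [ 7 -1 recognized-demo ] ;   \ state -1: 7 is compiled as a literal
demo-seven .
```

The sketch assumes the state value is always one of 0, -1 or -2; any other value indexes outside the three stored cells.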
Reference implementation:
This is a minimalistic core implementation for a recognizer-enabled system, that handles only words and single numbers without base prefix:
Defer forth-recognizer ( addr u -- i*x translator / notfound )
: interpret ( i*x -- j*x )
  BEGIN
    ?stack parse-name dup WHILE
    forth-recognizer execute
  REPEAT ;
: lit, ( n -- ) postpone literal ;
: notfound ( state -- ) -13 throw ;
: recognized-nt ( nt state -- )
  case
    0 of name>interpret execute endof
    -1 of name>compile execute endof
    -2 of name>compile swap lit, compile, endof
    nip \ do nothing if state is unknown; possible error handling goes here
  endcase ;
: recognized-num ( n state -- )
  case
    -1 of lit, endof
    -2 of lit, postpone lit, endof
  endcase ;
: rec-nt ( addr u -- nt nt-translator / notfound )
  forth-wordlist find-name-in dup IF ['] recognized-nt ELSE drop ['] notfound THEN ;
: rec-num ( addr u -- n num-translator / notfound )
  0. 2swap >number 0= IF 2drop ['] recognized-num ELSE 2drop drop ['] notfound THEN ;
: minimal-recognizer ( addr u -- nt nt-translator / n num-translator / notfound )
  2>r 2r@ rec-nt dup ['] notfound = IF drop 2r@ rec-num THEN 2rdrop ;
' minimal-recognizer is forth-recognizer
Extensions reference implementation:
: set-forth-recognize ( xt -- )
  is forth-recognize ;
: get-forth-recognize ( -- xt )
  action-of forth-recognize ;
: recognized: ( xt-interpret xt-compile xt-postpone "name" -- )
  create , , ,
  does> state @ 2 + cells + @ execute ;
Stacks TBD.
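Since the sequence part is still to be done, here is one possible sketch of a word along the lines of REC-SEQUENCE:, following the glossary description (try xtn first, then proceed towards xt1). It is an illustration under assumptions, not the proposal's implementation: all demo-* names are hypothetical, and COMPARE from the String word set is assumed to be available for the two toy recognizers.

```forth
\ Sketch, not normative. All demo-* names are hypothetical.
: demo-notfound ( -- ) -13 throw ;

\ Try n recognizers stored from a-addr upwards; stop at the first success.
: demo-try-seq ( addr u a-addr n -- i*x xt )
  2swap 2>r                          \ a n              R: addr u
  begin dup while                    \ n > 0: more recognizers to try
    over @ -rot 2>r                  \ xt               R: addr u a n
    2r> 2r@ 2swap 2>r                \ xt addr u        R: addr u a n
    rot execute                      \ i*x xt'
    dup ['] demo-notfound <> if
      2r> 2drop 2r> 2drop exit       \ success: clean up the return stack
    then
    drop 2r> swap cell+ swap 1-      \ a+cell n-1       R: addr u
  repeat
  2drop 2r> 2drop ['] demo-notfound ;

: demo-rec-sequence: ( xt1 .. xtn n "name" -- )
  create dup , 0 ?do , loop          \ body: n, xtn, .. , xt1
  does> ( addr u body -- i*x xt )
    dup cell+ swap @ demo-try-seq ;

\ Two toy recognizers that each accept exactly one lexeme.
: demo-tr ( x state -- x ) drop ;    \ placeholder recognized xt
: demo-rec-a ( addr u -- 1 xt | notfound-xt )
  s" a" compare 0= if 1 ['] demo-tr else ['] demo-notfound then ;
: demo-rec-b ( addr u -- 2 xt | notfound-xt )
  s" b" compare 0= if 2 ['] demo-tr else ['] demo-notfound then ;

' demo-rec-a ' demo-rec-b 2 demo-rec-sequence: demo-seq
```

Everything the loop needs is kept on the return stack while a recognizer runs, because the number of cells a successful recognizer returns (i*x) is not known in advance.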
Testing
TBD
I agree with this proposal.
One argument to have "than" (i.e., to prefer "less-than" over "less" for the less-than sign <) is that "less" can be a literal part of a word name (e.g. show-less), but there is little chance that a word name will contain "less-than" literally.
Then probably all the corresponding cases should contain the "than" part.
The STATE discussion in the 2021 workshop concluded that words or xt executed should not depend on STATE.
I see the following in the report by @ulli on 2021-09-08:
Given the two variations to handle STATE (either in RECOGNIZER:'s DOES> part or in INTERPRET), yesterday's participants favoured to have the single occurrence of STATE in INTERPRET. Further investigation and model implementations will show whether one or the other is beneficial.
So it implies further investigation and model implementations.
Could someone provide a rationale in favor of passing state (better to say "mode") via the stack?
My rationale against mode on the stack is the following:
- It makes combination of token translators cumbersome. E.g. a definition
: tt-3lit ( 3*x -- 3*x | ) >r tt-2lit r> tt-lit ;
becomes far more complex.
- In most cases a program doesn't need to execute a token translator in a mode that is different from the current mode (counter examples are welcome, except postpone).
- The current mode is already held by the system anyway.
- (most importantly) It introduces unnecessary coupling between the Forth text interpreter loop and the Recognizer API. This loop does not need to know anything about modes and STATE at all. If we are replacing the system's lexeme translator (along with the system's set of token translators), we should be able to replace it along with the system's STATE (and the set of the system's modes) too. Moreover, a token translator can technically ignore the passed value and use its own set of modes. And even such a simpler mode-beyond-stack API can be implemented over one that passes mode via the stack.
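One way to read the first point, as an illustration only: the sketch below writes tt-3lit once with the mode taken from STATE and once with the mode passed on the stack. The definitions of tt-lit and tt-2lit are assumed shapes, not quoted from any proposal, and the -m names are hypothetical. Only modes 0 and -1 are handled.

```forth
\ Sketch, not normative. tt-lit and tt-2lit are assumed shapes.
\ Variant 1: the translators read the current mode from STATE.
: tt-lit  ( x -- x | )          state @ if postpone literal then ;
: tt-2lit ( x1 x2 -- x1 x2 | )  >r tt-lit r> tt-lit ;
: tt-3lit ( 3*x -- 3*x | )      >r tt-2lit r> tt-lit ;

\ Variant 2: the mode travels on the stack (0 interpret, nonzero compile).
: tt-lit-m  ( x mode -- x | )   if postpone literal then ;
: tt-2lit-m ( x1 x2 mode -- x1 x2 | )
  swap >r dup >r tt-lit-m       \ handle x1          R: x2 mode
  r> r> swap tt-lit-m ;         \ then x2 with the same mode
: tt-3lit-m ( x1 x2 x3 mode -- x1 x2 x3 | )
  swap >r dup >r tt-2lit-m      \ handle x1 x2       R: x3 mode
  r> r> swap tt-lit-m ;         \ then x3 with the same mode
```

In the second variant every combinator has to duplicate the mode and thread it past its operands, which is the extra stack juggling the first point refers to.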
On the other hand, I don't think that including (mentioning) STATE in a new API is a good choice. STATE returns a read-only address, and it's provided for backward compatibility only. So a better method instead of STATE is required anyway.
Actually, the system's token translators are the only ones that depend on the system's set of modes. In most cases user-defined token translators are defined via the system's token translators (which should be standardized) and they need to know nothing about the system's set of modes, or about STATE at all. At the same time, a user is able to define their own set of recognizers and token translators that don't depend on the system's set of modes, but introduce their own set of modes.
So, the specification for the Recognizer API should mention neither STATE nor a set of magic values like {0, -1, -2}.
Concerning your mode -2: I believe the standard word postpone doesn't need its own mode. But in postponing mode, if any, string literals (like s" foo bar") and comments should be properly treated.
- It introduces unnecessary coupling between the Forth text interpreter loop and the Recognizer API.
See a block-based illustration of this idea in my Gist.