6.2.1950 OF CORE EXT

Interpretation:

Interpretation semantics for this word are undefined.

Compilation:

( C: -- of-sys )

Put of-sys onto the control flow stack. Append the run-time semantics given below to the current definition. The semantics are incomplete until resolved by a consumer of of-sys such as ENDOF.

Run-time:

( x1 x2 -- | x1 )

If the two values on the stack are not equal, discard the top value and continue execution at the location specified by the consumer of of-sys, e.g., following the next ENDOF. Otherwise, discard both values and continue execution in line.

See:

Rationale:

Typical use:
: X ...
   CASE
   test1 OF ... ENDOF
   testn OF ... ENDOF
   ... ( default )
   ENDCASE ...
;
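
A concrete instance of the pattern above may help show what the run-time semantics leave on the stack. This is only a sketch: the word name SCORE and the values are invented for illustration. On a match, OF discards both the selector and the test value, so the clause body starts with neither on the stack. In the default clause the selector is still on top, and ENDCASE drops it.

```forth
: SCORE ( n -- points )
   CASE
      1 OF 10 ENDOF    \ n = 1: both 1s are discarded, 10 is left
      2 OF 20 ENDOF    \ n = 2: likewise, 20 is left
      0 SWAP           \ default: tuck a 0 under n, which ENDCASE then drops
   ENDCASE
;
```

For example, 1 SCORE leaves 10, while 3 SCORE falls through both OF clauses and leaves 0.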

Testing:

Contributions

EricBlake [392] May OF modify the enclosing case-sys on the control stack? (Request for clarification, 2025-08-01 21:33:04)

The stack diagram for OF compilation, ( C: -- of-sys ), implies that it can only add an of-sys, even though presumably the intent is that OF should never be called outside of a CASE/ENDCASE structure and thus there should always be a case-sys somewhere higher on the control stack (whether or not that is also part of the data stack). On the other hand, ENDOF is clear that the case-sys must be adjacent to the of-sys (at least in terms of the control stack), and that it can be modified, since it specifies ( C: case-sys1 of-sys -- case-sys2 ).

And yet the standard also includes a sample implementation of OF that does modify the enclosing case-sys: https://forth-standard.org/standard/rationale#paragraph.A.3.2.3.2 gives : OF ( #of -- orig #of+1 / x -- ). Viewed in context, that reference clearly implements case-sys as a variable-length structure consisting of zero or more orig and one u (where the representation may in fact be split, with u on the data stack and the array of orig on the control stack). If OF is permitted to modify the case-sys, then one could view that implementation as having an of-sys that occupies zero cells. The entire reason the reference implementation modifies case-sys in OF is that it is careful to use >R and R> to move #of out of the way before doing any work that manipulates the control stack, in case the POSTPONE ELSE of ENDOF is using 1 CS-ROLL, and regardless of whether the control stack shares the data stack.
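
The counting technique described above can be sketched roughly as follows. This is a reconstruction from the description in this contribution (a count #of kept on top of the pending orig items and parked on the return stack while the control-flow stack is manipulated), not a verbatim copy of A.3.2.3.2, so consult the rationale for the authoritative text. Note that loading it shadows the system's own CASE, OF, ENDOF, and ENDCASE.

```forth
\ Sketch only: case-sys is modelled as orig*n plus a count n (#of).

: CASE ( C: -- #of )  0 ; IMMEDIATE   \ no pending orig yet

: OF ( C: #of -- orig #of+1 ) ( run-time: x1 x2 -- | x1 )
   1+ >R                       \ count this OF, then park the count
   POSTPONE OVER POSTPONE =    \ run-time: compare, keeping x1
   POSTPONE IF                 \ leaves an orig for ENDOF to resolve
   POSTPONE DROP               \ run-time: on a match, discard x1
   R>                          \ bring the count back on top
; IMMEDIATE

: ENDOF ( C: orig1 #of -- orig2 #of )
   >R POSTPONE ELSE R>         \ park the count while ELSE works on orig1
; IMMEDIATE

: ENDCASE ( C: orig*n #of -- ) ( run-time: x -- )
   POSTPONE DROP               \ run-time: discard the selector
   0 ?DO POSTPONE THEN LOOP    \ resolve one orig per OF ... ENDOF pair
; IMMEDIATE
```

In this sketch OF itself increments the count, which is exactly the modification of case-sys that the ( C: -- of-sys ) diagram does not describe.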

See also the enlightening discussion on whether LEAVE may modify its enclosing do-sys, https://forth-standard.org/standard/core/LEAVE#contribution-185, to which the committee replied that it is not compliant for LEAVE to modify the control stack (because there the standard mandates ( -- ) rather than permitting ( C: do-sys1 i*x -- do-sys2 i*x )), even though such an implementation would be unlikely to trip up any production program.

I'm left scratching my head as to how to portably implement an OF that only appends an of-sys, and then an ENDOF that merges the of-sys into the adjacent case-sys, without requiring carnal knowledge of whether the control stack shares the data stack. Trying to use CS-PICK and CS-ROLL, even with the addition of CS-DROP https://forth-standard.org/proposals/cs-drop-revised-2019-08-22-#reply-1370, doesn't help; those are specified to operate on dest and orig, with no statement of how they behave when a case-sys or of-sys is in the mix. That said, on a simple system where orig and dest are one cell each, and where the control stack resides on the data stack, a solution with the same variable-length case-sys (namely, orig*n n) is trivial: just move the +1 out of OF and into ENDOF, and at that point you might as well rely on SWAP instead of >R and R>:

: OF ( C: -- of-sys ) ( runtime: x1 x2 -- | x1 )
  POSTPONE OVER POSTPONE = ( runtime: x1 flag )
  POSTPONE IF ( C: of-sys ) ( runtime: x1 ) \ where of-sys and orig are identical
  POSTPONE DROP ( runtime: -- )
; IMMEDIATE
: ENDOF ( C: case-sys1 of-sys -- case-sys2 )
  POSTPONE ELSE ( C: orig*n n orig1 -- orig*n n orig2 ) \ convert of-sys to orig2, where case-sys is a variable-length orig*n n
  SWAP ( C: orig*n orig2 n ) \ abuse carnal knowledge of relation between n and orig2
  1+ ( C: orig*[n+1] n+1 ) \ case-sys2 is now one cell larger than case-sys1
; IMMEDIATE

AntonErtl

At some point I came to the conclusion that the technique outlined in A.3.2.3.2 does not comply with standard requirements, but at the moment I don't see why; i.e., I don't see how to write a standard program where the implementation in A.3.2.3.2 would not behave (in ways that a standard program can determine) as specified.

In any case, I have implemented a slightly different technique in compat/caseext.fs; this implementation should work on any standard system where control-flow stack items consume at least one cell on the data stack. This is the implementation used by Gforth. As an additional benefit, you also get ?of, contof, and next-case, which allow case to be used for arbitrary conditions and for loops.
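
As an illustration of using case for arbitrary conditions and loops, here is a small sketch. It assumes that ?of and contof are available (a recent Gforth, or compat/caseext.fs loaded) and that they behave as I understand them: ?of takes a flag instead of comparing against the selector, and contof continues at the enclosing case instead of leaving it. The word name CASE-GCD is invented for this example, and both inputs are assumed to be positive.

```forth
\ Assumes ?OF and CONTOF exist (recent Gforth or compat/caseext.fs).
: CASE-GCD ( n1 n2 -- n )      \ n1 and n2 assumed positive
   CASE
      2DUP > ?OF TUCK - CONTOF   \ n1 > n2: continue with n2 and n1-n2
      2DUP < ?OF OVER - CONTOF   \ n1 < n2: continue with n1 and n2-n1
   ENDCASE                       \ n1 = n2: drop one copy, leaving the result
;
```

Here ENDCASE's usual drop is put to use: when the loop ends, the two values are equal and one of them is discarded.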

AntonErtl

Thinking about it again: Only endof can consume an of-sys, and endof takes case-sys1 of-sys as input, so at of there has to be a case-sys on top. A standard program cannot tell whether of modified that case-sys, so it seems to me that doing it in the A.3.2.3.2 way is standard-compliant (i.e., no standard program will fail on such an implementation).

ruv

I would change the stack diagram for of to ( C: case-sys1 -- case-sys2 of-sys ). This stack diagram would formally allow the system to throw an exception if case-sys1 is absent from the top of the control-flow stack, and it would require a program to restore case-sys1 if it is temporarily removed (when the data stack is used for the control-flow stack items).

A possible option: ( C: case-sys -- case-sys of-sys ), assuming that two occurrences of the same data type symbol without an index in a stack diagram do not necessarily denote the same actual value in the corresponding positions of the stack.

AntonErtl

An occurrence of the same name (whether it contains a number or not) before and after the -- means that the value is the same (e.g., dup, swap). So if we change the stack effect of of, it should be changed to ( case-sys1 -- case-sys2 of-sys ).

ruv

> An occurrence of the same name (whether it contains a number or not) before and after the -- means that the value is the same

This is not formally specified in the Stack notation, it only says: "«before» represents the stack-parameter data types before execution of the definition and «after» represents them after execution".

> (e.g., dup, swap).

In all such cases, including dup, 2dup, etc, the fact that two actual parameters in the stack at different positions or at different times are identical formally follows only from the prose, not from the stack diagram.

I think that indices on data type symbols in stack diagrams are used only to avoid ambiguity when referencing an anonymous formal stack parameter in the prose. They are not needed if there is no ambiguity, and they do not have any formal meaning.
