Digest #145 2021-04-22
Contributions
comment - How to avoid default compilation semantics in the specification for [COMPILE]
The specification for [COMPILE] is based on the notion of default compilation semantics (this notion implicitly follows from 3.4.3.3 Compilation semantics).
For example:
[COMPILE] IF is equivalent to POSTPONE IF, since IF has non-default compilation semantics.
[COMPILE] EXIT is equivalent to EXIT, since EXIT has default compilation semantics.
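The two cases above can be sketched in code. This is a minimal illustration, assuming a system that still provides [COMPILE] and follows the rule just described; the names IF-A, IF-B, PICK-A, PICK-B, EARLY-A and EARLY-B are hypothetical.

```forth
\ IF has non-default compilation semantics,
\ so [COMPILE] IF acts like POSTPONE IF.
: IF-A ( C: -- orig ) [COMPILE] IF ; IMMEDIATE
: IF-B ( C: -- orig ) POSTPONE IF ; IMMEDIATE

: PICK-A ( flag -- n ) IF-A 1 ELSE 2 THEN ;
: PICK-B ( flag -- n ) IF-B 1 ELSE 2 THEN ;

\ EXIT has default compilation semantics,
\ so [COMPILE] EXIT acts like a plain EXIT.
: EARLY-A ( -- n ) 1 [COMPILE] EXIT DROP 2 ;
: EARLY-B ( -- n ) 1 EXIT DROP 2 ;
```

Under these assumptions PICK-A and PICK-B should behave identically, and so should EARLY-A and EARLY-B. A system in which EXIT is an immediate word needs a special case in [COMPILE] for EARLY-A to work this way.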
A side note: it looks like a user still has to know an implementation detail of a word to correctly apply (or not apply) [COMPILE] to it. Before Forth-94 a user had to know whether a word is immediate or not, and since Forth-94 a user has to know whether a word has default or non-default compilation semantics.
We have a number of words that have default compilation semantics and undefined interpretation semantics in their glossary entries (i.e. specifications). At least for some of them we would want to reword the specification in this regard (see also a comment).
For example, if we rename the "Execution" section to "Run-Time" and introduce a "Compilation" section in the glossary entry for EXIT, then this word formally becomes a word with non-default compilation semantics. The semantics of the phrase [COMPILE] EXIT then change.
How can we solve this problem with [COMPILE]?
Allow applying [COMPILE] only to words that have default compilation semantics?
Or perhaps it is time to destandardize [COMPILE] altogether?
NB: if EXIT is implemented as an immediate word in a Forth system, then it actually has non-default compilation semantics in that system, and [COMPILE] EXIT will produce incorrect code unless [COMPILE] is specially tweaked for such cases.
Replies
I didn't implement my DO/LOOP to store limits/counters on the return stack
In the case of your do-loop, a system-specific solution is to keep the loop control stack pointer (or the number of loop control parameters, updated by the do-loop run-time) in the exception frame, and to restore the pointer (or unloop the parameters) in THROW.
Certainly, any similar user-made constructs have identical issues [...] How else would user code that acquires global resources properly interoperate in the presence of THROW?
A proper way for programs to free resources is to use CATCH (see an example in A.9.6.1.2275).
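As a minimal sketch of this pattern (not the example from the standard), the word below treats BASE as the "resource" to be restored. The names with-decimal and boom are hypothetical.

```forth
\ Run xt with BASE set to decimal and restore the previous BASE
\ afterwards, whether xt returns normally or throws.
: with-decimal ( i*x xt -- j*x )
  base @ >r          \ save the current base on the return stack
  decimal catch      \ run xt; catch returns 0 or the throw code
  r> base !          \ clean up in both cases
  throw ;            \ re-throw a non-zero code, do nothing for 0

\ A word that always fails, used only to exercise the error path.
: boom ( -- ) 1 throw ;
```

The same shape (acquire, CATCH, release, THROW) should apply to any resource that must be released before the exception propagates further.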
Actually, even your do-loop can be implemented in a portable way using quotations:
: do c{ [: do }c ; immediate
: ?do c{ [: ?do }c ; immediate
variable no-exit \ it should be a user variable in case of multitasking
: execute-loop-safely ( xt -- flag-to-exit ) catch no-exit @ no-exit off swap dup if (unloop) then throw 0= ;
: (finalize-loop) c{ no-exit on ;] execute-loop-safely if exit then }c ; immediate
: loop c{ loop (finalize-loop) }c ; immediate
: +loop c{ +loop (finalize-loop) }c ; immediate
Here I use the c{ ... }c notation (construct) to compile code fragments; in this case it is equivalent to the ]] ... [[ notation, that is, it applies postpone to each contained word.
The variable no-exit is used to properly handle the phrase unloop exit. So the word
: test 10 0 do i . i 4 > if unloop exit then loop ." fail " ;
is compiled as
: test 10 0 [: _do i . i 4 > if unloop exit then _loop no-exit on ;] execute-loop-safely if exit then ." fail " ;
where _do and _loop are the original versions of these words.
@StephenPelc: thanks for the insightful reply! The documents on Cross Compilation are very useful! The following example from XCapp5 clearly shows the potential - and the different machine semantics - of a ROMable:
: PRINTS ( n -- )
CDATA \ Select code section.
CREATE , \ New definition with value n.
IDATA \ Restore default iData section.
DOES> ( -- ) \ Target execution behavior.
@ . ; \ Fetch value and display it.
A self-contained ROMable Forth system is, of course, quite a bit less complicated than capturing the pitfalls of the mixed host-target semantics of an XC scenario. In any case, much of the data space problem is shared. Memory sections (managed partitions of memory such as CDATA, IDATA or UDATA) can serve as a reminder that embedded machines can harbor more complexity - it's certainly also a valuable concept.
@MitraArdron, thanks for sharing your approach. Indicating the target memory with vCREATE and vALLOT is akin to the memory types. The programmer can easily state the intent of a memory allocation. The flexibility of the XC approach above is very nice, though.
@AntonErtl: thanks for the analysis of the standard with respect to the question raised! Right now the machine architecture of a Forth VM is, I assume, in the "von Neumann family". A unified data space for code and data somehow implies that data is mutable but µCs certainly blur the distinction between "Harvard" and "von Neumann" architectures. Memory protection blurs the lines even more.
What I had in mind is implementing the standard Core word set for a self-contained µC based Forth system, so that packages such as the CRC-8 package, or other code that doesn't depend on an OS, can be used without changing them first (instead of "clone and own"). The CRC-8 code is a good example of a "non mutable" use case for CREATE.
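A minimal sketch of what such a "non mutable" use of CREATE looks like: a table that is filled once at definition time and only read afterwards, so it could in principle live in ROM. The table contents and the names squares and square are made up for illustration and are not taken from the CRC-8 package.

```forth
\ A constant lookup table: initialized with C, while it is defined,
\ never written to at run time.
create squares  0 c,  1 c,  4 c,  9 c,  16 c,  25 c,

\ Read-only access to the table.
: square ( n -- n*n ) squares + c@ ;

3 square .   \ table lookup for index 3
```

Code of this kind never stores into the data space reserved after CREATE, which is what distinguishes it from the CREATE array 8 ALLOT style of use.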
Provided that the "stage for target memory" can be set by the integrator before "WANTing" a package (e.g. XC style) then code that's "embedded friendly" should work. A coding guideline and package tags might be a solution.
@JimPeterson: I understand that the issue is not clear-cut - maybe that's due to my limited understanding of the "standard jargon" (please bear with me), or because of my focus on certain types of "embedded systems".
The example you wrote, 10 ALLOT <other_stuff> 10 ALLOT, shows that there is no simple solution to mixing mutable and non-mutable code. I'm not suggesting that a magic solution is possible, something that works for everybody without breaking anything. What I'm looking for is some clarity with regard to the standard (alignment of assumptions about machine architectures).
@ruv: thanks for the references!
3.3.3.2 makes it clear that memory contiguousness takes precedence over implied mutability (I assume that most programmers will expect CREATE array 8 ALLOT to produce a mutable (but uninitialized) array).
3.3.3.3 is a clear warning that VARIABLE array 6 ALLOT is not the safe harbor that I had hoped it would be. That makes "clearly non-standard" solutions like @MitraArdron's vCREATE and vALLOT or @StephenPelc's CDATA, IDATA and UDATA much more attractive.
My solution has been to use a global target defining word (RAM, or NVM for ROM space) that controls the behavior of VARIABLE and ALLOT. This works pretty well, unless someone uses CREATE (which will produce a mutable array only in RAM mode - but that can maybe be fixed by generalizing the solution from VARIABLE).
@MitchBradley: thanks for sharing your approach! I also first thought of something as radical as what you describe (and Dr. Ting had proposed something similar when he presented the code I now work with). Using BUFFER: and a set of constants instead of VARIABLE is possible but it requires a special programming style which wouldn't be a good fit for "regular" targets.
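One possible reading of that style, as a rough sketch assuming a Forth-2012 system that provides BUFFER:. The names state, counter-offset, mode-offset, counter and mode are hypothetical and do not come from the approach described in the thread.

```forth
\ A single uninitialized, mutable region holds all variable-like state.
2 cells buffer: state

\ Constants name the offsets inside the buffer instead of VARIABLEs.
0 cells constant counter-offset
1 cells constant mode-offset

\ Accessors turn an offset into an address in the buffer.
: counter ( -- addr ) state counter-offset + ;
: mode    ( -- addr ) state mode-offset + ;

0 counter !            \ the buffer must be initialized explicitly
1 counter +!
```

Every mutable cell has to be declared through the buffer and initialized by the program, which is the extra discipline referred to above.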
comment - How to avoid default compilation semantics in the specification for [COMPILE]
Typo: it should read "Allow applying [COMPILE] only to words that have non-default compilation semantics".
Yet another problem with [COMPILE].
Usually, a user can redefine almost any word in a Forth system.
For example, to implement a profiler, EXIT can be redefined as an immediate word. But after that [COMPILE] EXIT works in a non-standard way.
So it is also necessary to redefine [COMPILE], and either tweak its handling of some words ad hoc, or just raise an error for any word passed as its argument (to avoid silently generating incorrect code with the original [COMPILE]).
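A minimal sketch of such a redefinition, to make the problem concrete. The names exit-count, count-exit and demo are hypothetical, and the "profiler" only counts explicit exits.

```forth
variable exit-count  0 exit-count !

: count-exit ( -- ) 1 exit-count +! ;

\ Redefine EXIT as an immediate word: it compiles the counting code
\ followed by the original EXIT (the new EXIT is not yet visible
\ inside its own definition).
: EXIT ( -- ) postpone count-exit postpone exit ; immediate

: demo ( flag -- ) if exit then ;

\ After this point, a definition such as
\   : early [COMPILE] EXIT ;
\ no longer means "exit from early": an unmodified [COMPILE] is likely
\ to treat the new immediate EXIT like POSTPONE EXIT would.
```

Ordinary uses such as demo keep working as expected. Only the [COMPILE] EXIT phrase changes meaning, which is why [COMPILE] would have to be redefined alongside EXIT.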