Digest #145 2021-04-22

Contributions

[192] 2021-04-21 09:46:05 ruv wrote:

comment - How to avoid default compilation semantics in the specification for [COMPILE]

The specification for [COMPILE] is based on the notion of default compilation semantics (this notion implicitly follows from 3.4.3.3 Compilation semantics).

For example:

  • [COMPILE] IF is equivalent to POSTPONE IF, since IF has non-default compilation semantics.
  • [COMPILE] EXIT is equivalent to EXIT, since EXIT has default compilation semantics.
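A minimal sketch of the two cases above, assuming a system in which EXIT really has default compilation semantics (see the NB below for systems where it does not). The names MY-IF, SHOW-TRUE and LEAVE-NOW are made up for illustration.

```forth
\ IF has non-default compilation semantics, so [COMPILE] IF appends
\ the compilation semantics of IF to MY-IF, exactly as POSTPONE IF would.
: MY-IF  [COMPILE] IF ; IMMEDIATE

\ MY-IF can now be used inside a definition in place of IF.
: SHOW-TRUE  ( flag -- )  MY-IF ." true" THEN ;

\ EXIT has default compilation semantics (assumed here), so
\ [COMPILE] EXIT compiles EXIT into LEAVE-NOW as if [COMPILE] were absent.
: LEAVE-NOW  ( -- )  [COMPILE] EXIT  ." not reached" ;
```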

A side note: it looks like a user still has to know an implementation detail of a word to correctly apply (or not apply) [COMPILE] to it. Before Forth-94 a user had to know whether a word is immediate or not, and since Forth-94 a user has to know whether a word has default or non-default compilation semantics.

We have a number of words that have default compilation semantics and undefined interpretation semantics in their glossary entries (i.e. specifications). At least for some of them we would want to reword the specification in this regard (see also a comment).

For example, if we rename the "Execution" section to "Run-Time" and introduce a "Compilation" section in the glossary entry for EXIT, then this word formally becomes a word with non-default compilation semantics, and the semantics of the phrase [COMPILE] EXIT changes.

How can we solve this problem with [COMPILE]?

Allow [COMPILE] to be applied only to words that have default compilation semantics?

Or perhaps it is time to destandardize [COMPILE] altogether?


NB: if EXIT is implemented as an immediate word in a Forth system, then it actually has non-default compilation semantics in that system, and [COMPILE] EXIT will produce incorrect code unless [COMPILE] is specially tweaked for such cases.

Replies

[r668] 2021-04-20 14:20:32 ruv replies:

comment - Throwing past DO/LOOP

"I didn't implement my DO/LOOP to store limits/counters on the return stack"

In the case of your do-loop, a system-specific solution is to keep in the exception frame the pointer of the loop control stack (or the number of loop control parameters that are updated by do-loop run-time), and restore the pointer (or unloop the parameters) in THROW.

"Certainly, any similar user-made constructs have identical issues [...] How else would user code that acquires global resources properly interoperate in the presence of THROW?"

A proper way for programs to free resources is to use CATCH (see an example in A.9.6.1.2275).
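As a rough illustration of that pattern (not the example from A.9.6.1.2275), a wrapper can release a resource on both the normal and the exceptional path and then re-raise. The words held, acquire, release and with-resource are hypothetical stand-ins for any acquire/release pair.

```forth
\ Hypothetical resource: a counter of how many times it is currently held.
variable held   0 held !
: acquire  ( -- )   1 held +! ;
: release  ( -- )  -1 held +! ;

: with-resource  ( i*x xt -- j*x )
  acquire
  catch      \ run xt; leaves 0 on normal return, the throw code otherwise
  release    \ executed on both paths; does not touch the data stack
  throw ;    \ re-raise the exception, if any (0 throw does nothing)
```

Whether xt returns normally or throws, held is back at its previous value once with-resource has finished or propagated the exception.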

Actually, even your do-loop can be implemented in a portable way using quotations:

: do c{ [: do }c ; immediate
: ?do c{ [: ?do }c ; immediate
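variable no-exit \ should be a user variable in a multitasking system
: execute-loop-safely ( xt -- flag-to-exit )
  catch                      \ run the loop body; 0 if nothing was thrown
  no-exit @ no-exit off      \ fetch and reset the "finished normally" flag
  swap dup if (unloop) then  \ on an exception, discard the loop parameters
  throw                      \ re-raise the exception, if any
  0= ;                       \ true if the body left via "unloop exit"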
: (finalize-loop) c{ no-exit on ;] execute-loop-safely if exit then }c ; immediate
: loop c{ loop (finalize-loop) }c ; immediate
: +loop c{ +loop (finalize-loop) }c ; immediate

Here I use the c{ ... }c notation (construct) to compile code fragments; in this case it is equivalent to the ]] ... [[ notation, that is, it applies postpone to each contained word.

The variable no-exit is used to properly handle the unloop exit phrase. So, a word

  : test 10 0 do i .  i 4 > if unloop exit then  loop ."  fail " ;

is compiled as

  : test 10 0 [: _do i .  i 4 > if unloop exit then  _loop  no-exit on ;] execute-loop-safely if exit then ."  fail " ;

Where _do and _loop are the original versions of these words.


[r669] 2021-04-20 21:06:38 TG9541 replies:

requestClarification - ALLOT in ROMable systems

@StephenPelc: thanks for the insightful reply! The documents on Cross Compilation are very useful! The following example from XCapp5 clearly shows the potential, and the different machine semantics, of a ROMable system:

: PRINTS ( n -- )
   CDATA   \ Select code section.
   CREATE ,   \ New definition with value n.
   IDATA   \ Restore default iData section.
   DOES> ( -- )   \ Target execution behavior.
         @ . ;   \ Fetch value and display it.

A self-contained ROMable Forth system is, of course, quite a bit less complicated than an XC scenario with its pitfalls of mixed host-target semantics. In any case, much of the data-space problem is shared. Memory sections (managed partitions of memory such as CDATA, IDATA or UDATA) can serve as a reminder that embedded machines can harbor more complexity, and they are certainly also a valuable concept.

@MitraArdron, thanks for sharing your approach. Indicating the target memory with vCREATE and vALLOT is akin to the memory types. The programmer can easily state the intent of a memory allocation. The flexibility of the XC approach above is very nice, though.

@AntonErtl: thanks for the analysis of the standard with respect to the question raised! Right now the machine architecture of a Forth VM is, I assume, in the "von Neumann family". A unified data space for code and data somehow implies that data is mutable but µCs certainly blur the distinction between "Harvard" and "von Neumann" architectures. Memory protection blurs the lines even more.

What I had in mind is implementing the standard Core word set for a self-contained µC-based Forth system so that packages such as the CRC-8 package, or other code that doesn't depend on an OS, can be used without first changing them (instead of "clone and own"). The CRC-8 code is a good example of a "non-mutable" use case for CREATE.

Provided that the "stage for target memory" can be set by the integrator before "WANTing" a package (e.g. XC style) then code that's "embedded friendly" should work. A coding guideline and package tags might be a solution.

@JimPeterson: I understand that the issue is not clear-cut - maybe that's due to my limited understanding of the "standard jargon" (please bear with me), or because of my focus on certain types of "embedded systems".

The example you wrote, 10 ALLOT <other_stuff> 10 ALLOT shows that there is no simple solution to mixing mutable and non-mutable code. I'm not suggesting that a magic solution is possible, something that works for everybody without breaking anything. What I'm looking for is some clarity with regard to the standard (alignment of assumptions about machine architectures).

@ruv: thanks for the references!

3.3.3.2 makes it clear that memory contiguousness takes precedence over implied mutability (I assume that most programmers will expect CREATE array 8 ALLOT to produce a mutable (but uninitialized) array).

3.3.3.3 is a clear warning that VARIABLE array 6 ALLOT is not the safe harbor that I had hoped it was. That makes "clearly non-standard" solutions like @MitraArdron's vCREATE and vALLOT or @StephenPelc's CDATA, IDATA and UDATA much more attractive.

My solution has been to use a global target defining word (RAM or NVM for ROM space) that controls the behavior of VARIABLE and ALLOT. This works pretty well, unless someone uses CREATE (which will produce a mutable array only in RAM mode - but that can maybe be fixed by generalizing the solution from VARIABLE).

@MitchBradley: thanks for sharing your approach! I also first thought of something as radical as what you describe (and Dr. Ting had proposed something similar when he presented the code I now work with). Using BUFFER: and a set of constants instead of VARIABLE is possible but it requires a special programming style which wouldn't be a good fit for "regular" targets.


[r670] 2021-04-21 14:38:53 ruv replies:

comment - How to avoid default compilation semantics in the specification for [COMPILE]

Typo correction: the question above should read "Allow [COMPILE] to be applied only to words that have non-default compilation semantics?"


Yet another problem with [COMPILE].

Usually, a user can redefine almost any word in a Forth system.

For example, to implement a profiler, EXIT can be redefined as an immediate word. But after that, [COMPILE] EXIT works in a non-standard way.

So [COMPILE] also needs to be redefined, either to tweak the handling of some words ad hoc, or simply to raise an error for any argument (so that the original [COMPILE] does not silently generate incorrect code).
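A minimal sketch of the second option, assuming a Forth-2012 system that provides PARSE-NAME. The choice of throw code -21 ("unsupported operation" in the standard's table of THROW codes) is only one possible convention.

```forth
\ Redefinition that rejects every use of [COMPILE] instead of
\ silently compiling wrong code for redefined words such as EXIT.
: [COMPILE]  ( "name" -- )
  PARSE-NAME 2DROP   \ consume the argument so it is not interpreted later
  -21 THROW          \ -21: unsupported operation
; IMMEDIATE
```

With this definition in effect, any later definition that contains [COMPILE] fails at compile time instead of silently producing wrong code.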