6.1.0450 : colon CORE

( C: "<spaces>name" -- colon-sys )

Skip leading space delimiters. Parse name delimited by a space. Create a definition for name, called a "colon definition". Enter compilation state and start the current definition, producing colon-sys. Append the initiation semantics given below to the current definition.

The execution semantics of name will be determined by the words compiled into the body of the definition. The current definition shall not be findable in the dictionary until it is ended (or until the execution of DOES> in some systems).

Initiation:

( i * x -- i * x ) ( R: -- nest-sys )

Save implementation-dependent information nest-sys about the calling definition. The stack effects i * x represent arguments to name.

name Execution:

( i * x -- j * x )

Execute the definition name. The stack effects i * x and j * x represent arguments to and results from name, respectively.

See:

Rationale:

Typical use: : name ... ;

In Forth 83, this word was specified to alter the search order. This specification is explicitly removed in this standard. We believe that in most cases this has no effect; however, systems that allow many search orders found the Forth-83 behavior of colon very undesirable.

Note that colon does not itself invoke the compiler. Colon sets compilation state so that later words in the parse area are compiled.
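A minimal sketch of this point, assuming a standard system. The name show-state is a hypothetical helper, not part of the standard. Because it is immediate, it executes while foo is being compiled and reports the state that colon has set:

```forth
\ show-state is a hypothetical helper; IMMEDIATE makes it execute
\ even in compilation state, so it can report STATE at that moment.
: show-state ( -- )
  state @ if ." compiling " else ." interpreting " then ; immediate

show-state               \ reports "interpreting"
: foo  show-state 123 ;  \ reports "compiling" while foo is being defined
show-state               \ reports "interpreting" again, since ; has run
foo .                    \ displays 123
```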

Testing:

T{ : NOP : POSTPONE ; ; -> }T
T{ NOP NOP1 NOP NOP2 -> }T
T{ NOP1 -> }T
T{ NOP2 -> }T

The following tests the dictionary search order:

T{ : GDX   123 ;    : GDX   GDX 234 ; -> }T
T{ GDX -> 123 234 }T
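
The test above depends on the current definition not being findable until it is ended: the GDX inside the second definition refers to the first GDX. As a small sketch of the practical consequence (fact is a hypothetical example name), a definition that needs to call itself uses RECURSE rather than its own name:

```forth
\ Writing the name fact inside its own body would search for an
\ earlier fact; RECURSE refers to the definition being compiled.
: fact ( n -- n! )
  dup 1 > if  dup 1- recurse *  else  drop 1  then ;

5 fact .   \ displays 120
```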

Contributions

ruv [128] Better wording for Colon (Proposal, 2020-02-06 02:14:04)

This contribution has been moved to the proposal section.

ruv

This reply has been moved to the proposal section.

AntonErtl

This reply has been moved to the proposal section.
Accepted

ruv [130] The parts of execution semantics and the calling definition (Request for clarification, 2020-02-21 15:32:19)

Headnote

The specification says: a) Append the initiation semantics given below to the current definition. b) The execution semantics of name will be determined by the words compiled into the body of the definition.

Question I

Is the initiation semantics part of the execution semantics?

It seems yes, but slightly vague.

Since 1) These initiation semantics are not a word compiled into the body. And 2) "the current definition" and "the execution semantics of the current definition" are not the same.

Re 2: COMPILE, appends execution semantics not just "to the current definition", but "to the execution semantics of the current definition". Perhaps it should be said somewhere that "appending semantics to the current definition" means appending these semantics to the execution semantics of the current definition. Or just use the same wording as in the specification for COMPILE,.

Question II

What definition is a calling definition?

The initiation semantics is to save information about the calling definition. But it seems the standard nowhere says how to call a definition, or what definition is the calling definition.

EXECUTE performs execution semantics. COMPILE, appends execution semantics. EXIT (and the code appended by Semicolon ;) returns control to the calling definition. But nobody calls the definition.

It looks like the reason for this issue is a gap between two different models (abstractions). We should find some bridge between these models, or use only one model.

NB: My interest is purely formal; it relates only to the wording of the specifications. From the implementation point of view, it's clear what definition is a calling definition. But the question is how to formulate it in the language of the Standard.

ruv

Just an idea

calling definition: the definition whose performing has been started most recently among those whose performing hasn't been ended.

to perform a definition: to perform, step by step, all parts of the execution semantics for this definition.

AntonErtl

Question I Re 1: The intent (and common practice) is certainly that the execution semantics includes pushing nest-sys (i.e., the return address) on the return stack. Otherwise EXECUTE could not work on a colon definition.

The wording suggests otherwise. Looks like a bug in the wording to me that should be fixed.

Question I Re 2 does not appear to be a question to me. Yes, one could refine the wording as suggested by you, but then, it does not appear to be unclear now.

Question II does not appear to be a question to me, either. The wording could certainly be improved, but I don't think your suggestions would make things clearer. I would look for inspiration in related work (in this case, probably other programming language specifications).
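
The point about EXECUTE can be made concrete with a small sketch (add3 and call-twice are hypothetical names). Whether the definition is reached through EXECUTE or through compiled code, the exit at its end needs a nest-sys to return through, so the initiation semantics have to be performed in both cases:

```forth
: add3 ( n -- n+3 ) 3 + ;

\ Reached via EXECUTE from the text interpreter:
4 ' add3 execute .        \ displays 7

\ Reached via compiled code and via EXECUTE inside another definition:
: call-twice ( n -- n+6 )  add3  ['] add3 execute ;
4 call-twice .            \ displays 10
```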


lmr [329] are colon-defs supposed to be compiled in data space? (Request for clarification, 2024-01-05 17:07:09)

This may be very basic, but: are colon-definitions supposed to be compiled in data space (addressable by @, ! etc)? Can the thread of compiled XTs (or whatever the implementation uses) for a definition reside in some other address space, visible only to the inner interpreter or the equivalent? I suppose the answer is obvious, since presumably an implementation could compile everything to a primitive.

AntonErtl

I don't find it written clearly enough in the document (Chapter 3 is what I looked at), that's why I leave this request open; in particular, the document says very little about code space and does not say that compile, allocates in code space or that the compilation semantics of various words allocate in code space.

Anyway, the intention is that the current definition and the stuff that is appended to it lives in code space, except the header which lives in name space (which, as has been noted, is in conflict with the usual meaning of "name space" in programming languages). Both name space and code space may be interleaved with each other and with data space, but they can also be separate. Traditionally, they are interleaved, but for a native-code system I recommend separating out at least the code space, because writing to data close to code causes serious performance penalties (I measured 400 cycles per ! at one point).

ruv

are colon-definitions supposed to be compiled in data space (addressable by @, ! etc)?

It is not supposed, but allowed. Under the hood, an implementation may allocate data space during compilation. 3.3.3.2 Contiguous regions says: «an implementation is free to allocate data space for use by code»

On the other hand, a program is only allowed to access the address units that are either explicitly allocated by the program, or provided by the system as variables, text-literal regions, input buffers, and other transient regions — see 3.3.3 Data space.

Also, the section 4.1.2 Ambiguous conditions says that an ambiguous condition exists if a program addresses a region not listed in 3.3.3 Data space.
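
As a rough illustration of that distinction (buf and foo are hypothetical names), a program may freely read and write a region it allocated itself, while reaching into the body of a colon definition is left commented out here because it is an ambiguous condition:

```forth
create buf  4 cells allot   \ region explicitly allocated by the program
42 buf !                    \ allowed: buf is in data space
buf @ .                     \ displays 42

: foo 123 ;
\ ' foo >body @             \ ambiguous: >BODY is only specified for
\                           \ words defined via CREATE
```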


kumaym [354] definition and use of colon-sys and nest-sys (Request for clarification, 2024-08-04 19:34:46)

This text asks for clarification of the intended use of colon-sys and nest-sys, taking into account the different steps or phases involved.

I will lay out my understanding, and more probably my misunderstanding, of this in order to learn and clarify. Maybe my confusion is related to the way the text is worded, or maybe to my lack of knowledge.

The questions I want to get clarified for colon-sys and nest-sys are basically:

  1. semantics and definition
  2. implementation
  3. need to be defined as a concept in the standard
  4. semantics in terms of compile time and execution time

You can follow the comments on the word literal about this issue, but it's not necessary for the following discussion.

  1. semantics and definition

The only place in the standard where I can find a kind of definition is 3.1 Data types, where it is said that both the colon-sys and nest-sys data types are implementation dependent, and colon-sys is defined as "definition compilation". The same goes for nest-sys, which is defined in the same place as "definition calls", which I find confusing since it doesn't even mention that it is a return address or a return from a call.

I couldn't find any rule or requirement concerning colon-sys or nest-sys, which makes sense since it is implementation dependent, but then why mention it?

What I supposed is that colon-sys is an abstract concept referring to a word definition, meaning more or less all the stuff needed to start a definition, leaving its real meaning to the implementation; similarly for nest-sys.

  2. implementation

Anyway, colon-sys is a real thing which is pushed on the data stack (or the control-flow stack, if any), so I think it should be precisely defined. In particular, this page says the colon word will "Enter compilation state and start the current definition, producing colon-sys"; "producing colon-sys" suggests to me there is something real. It could be said that colon-sys may be of zero length and thus produces nothing, this being implementation dependent; but then, if the size, structure and meaning of colon-sys are totally implementation defined, why even mention it in the standard?

There are at least two interesting questions about the colon-sys and nest-sys implementation:

  • where is it stored? i.e. is it possible to access the colon-sys object if needed, or does it only live temporarily on the stack? is it a first-class object?
  • since it is on the stack at word definition time, may a word definition assume the stack is empty, or does it have to assume there's always a colon-sys object, even if zero sized?

  3. need to be defined as a concept in the standard

The mission of the colon word is to create a new entry in the dictionary, filling all slots, and to start compiling each xt it encounters in the definition to the dictionary area pointed to by HERE. My understanding is that all those steps can be done without polluting the stack from the point of view of defining words. At the time the colon word starts compiling the words in the definition, the stack should be untouched, so words in the definition list can assume the stack is "clear" (empty, or with data pushed by the last word executed prior to executing the colon word). Finally, the word ; just has to compile EXIT and switch to interpretation state; again, it doesn't need anything on the stack to do its action.

So I think it's not necessary to define colon-sys or nest-sys in the standard, since it is completely implementation dependent and thus it should be a black box.

The only reason to define or use a colon-sys concept is as a flag to users to take into account that you have to assume the stack is "polluted" inside a defined word, and you shouldn't assume anything about the stack; in particular, you shouldn't assume it is empty or under your control. In fact, you should assume there's stuff on the stack when inside a definition, and everything needed on the stack would be pushed inside the definition.

  4. semantics in terms of compile time and execution time

In this page about word colon it is said:

( C: "<spaces>name" -- colon-sys ) Skip leading space delimiters. Parse name delimited by a space. Create a definition for name, called a "colon definition". Enter compilation state and start the current definition, producing colon-sys. Append the initiation semantics given below to the current definition. The execution semantics of name will be determined by the words compiled into the body of the definition. The current definition shall not be findable in the dictionary until it is ended (or until the execution of DOES> in some systems).

The use of C: confuses me; I understood it as talking about compilation semantics, the "C:" in the stack comment.

But reading carefully I think C: really means control-flow stack rather than compilation semantics, and there's no Compilation section because word colon has no compilation semantics.

I think it should be interpretation semantics because the behaviour described is run-time behaviour, what the word colon does when executed:

Skip leading space delimiters. Parse name delimited by a space. Create a definition for name, called a "colon definition". Enter compilation state and start the current definition, producing colon-sys.

Anyway, assuming interpretation semantics, my problem is with the sentence

Append the initiation semantics given below to the current definition.

What I understand by "append the initiation semantics to the current definition" is to compile the initiation semantics into the current definition; but if so, you should see the initiation semantics in the definition list when you perform a see on the defined word, since it is compiled there. Otherwise, if to append the initiation semantics really means to execute the initiation semantics, I think it is not worded quite clearly.

But surely I misunderstood it and really it's all about interpretation semantics, because the initiation semantics is written as:

Initiation: ( i * x -- i * x ) ( R: -- nest-sys ) Save implementation-dependent information nest-sys about the calling definition. The stack effects i * x represent arguments to name.

I understand initiation is an interpretation semantics, part of the interpretation semantics of the colon word; it is talking about the behaviour when you're going to execute a word in the definition list, in order to save the return address (the nest-sys structure).

Maybe everything would be clearer with a Compilation section stating that there's no compilation semantics, and thus everything in the page refers to interpretation semantics.

On a first reading I was assuming the initiation semantics is compiled into the defined word, because "append initiation semantics to current definition" sounds to me the same as "compile initiation semantics into current definition".

Under this assumption I was supposing nest-sys is about nested definitions, and since nested definitions of colon words are not allowed in Forth, I supposed it was related to quotations, since you "compile" a quotation in some way inside a current definition.

ruv

I couldn't find any rule or requirement concerning colon-sys or nest-sys, which makes sense since it is implementation dependent, but then why mention it?

They are mentioned to relax systems and tighten programs. Namely, to allow Forth systems to use the data stack and return stack in certain cases, and to restrict programs from directly accessing their data on a stack (if any) in these cases.

For example, due to nest-sys you know why it's incorrect to pass parameters via the return stack as:

\ incorrect example
: foo r> . ;
: bar 123 >r foo ;

And due to colon-sys you know that this definition is incorrect:

\ incorrect example
: const ( x "name" -- )
  :  postpone literal  postpone ;
;

And this is correct:

: const ( x "name" -- )
  >r  :  r> postpone literal  postpone ;
;

A program can also use depth before and after a colon-sys is placed on the data stack to calculate its size and access data on the data stack via roll, pick, and between n>r and nr>. For example:

: const ( x "name" -- )
  depth >r  :  depth r> - n>r  postpone literal  nr> drop postpone ;
;

Also, due to the system-compilation types colon-sys and orig you know why the following program is incorrect:

: foo if ;

Because "An ambiguous condition exists if an incorrectly typed data object is encountered", and in this case:

\ stack diagrams for compilation-time
: foo  ( colon-sys )
  if ( colon-sys orig )
  \ expected top value by ";"  ( colon-sys )
  \ actual top value ( orig )
; ( colon-sys -- )
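
As a usage sketch for the second (correct) const definition above, repeated here so the fragment is self-contained (answer is a hypothetical name):

```forth
: const ( x "name" -- )
  >r  :  r> postpone literal  postpone ;
;

42 const answer   \ behaves like  : answer 42 ;
answer .          \ displays 42
```

The value is parked on the return stack only while colon runs, so the definition never touches anything underneath a colon-sys that may be sitting on the data stack.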

ruv

But reading carefully I think C: really means control-flow stack

Yes, it's described in 2.2.2 Stack notation.

there's no Compilation section because word colon has no compilation semantics.

No. If "Compilation" section is absent in a glossary entry, the word has default compilation semantics, see 3.4.3.3 Compilation semantics.

I think it should be interpretation semantics

Formally, it's execution semantics for : (Colon). When it's only one section, the label "Execution" is omitted (see 3.4.3.1).

And the interpretation semantics for : (Colon) are the same as the execution semantics (because "Interpretation" section is absent, and the execution semantics does not depend on state). See also 3.4.3.2.

you should see the initiation semantics in the definition list when you perform a see on the defined word,

It is not necessary, it may show just the source code, see 15.6.1.2194 SEE.


Example

Let's look at an example:

: foo 123 . ;

Formally, : (Colon) does the following steps in this case:

  1. Skip leading space delimiters. Parse "foo" delimited by a space.
  2. Start compilation for the new definition foo (so foo is now the current definition).
  3. Enter compilation state and produce colon-sys.
  4. Append the Initiation semantics (specified in 6.1.0450 :) to the execution semantics of foo.

NB: The section "Initiation" is not a part of execution semantics of : (Colon).

It does not matter what the Forth system actually does, as long as a standard program cannot detect the difference. So the order of these steps may be different, and some steps may be omitted.

Usually, the Initiation semantics are not actually appended into foo, but are performed as a part of execute and appended by compile, (as a part of an internal call instruction or the address interpreter), when they are applied to the xt of foo. A standard program cannot detect this difference, so it does not matter.

Formally, ; (Semicolon) does the following steps in this case:

  1. Append the Run-time semantics specified in 6.1.0460 ; to the execution semantics of foo.
  2. End compilation of foo (so foo is not the current definition anymore).
  3. Place foo into the compilation word list (16.2).
  4. Enter interpretation state, consuming colon-sys.
  5. Align the data-space pointer.

The order of these steps does not matter as long as a standard program cannot detect that.
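
A small sketch of the remark about compile, above, assuming a standard system ([three], a and b are hypothetical names). Naming a colon definition inside another definition and appending its xt with COMPILE, give the same observable behaviour, wherever the system actually places the initiation semantics:

```forth
: three ( -- 3 ) 3 ;

\ [three] is a hypothetical immediate helper: while another definition
\ is being compiled, it appends the execution semantics of three.
: [three] ( -- )  ['] three compile, ; immediate

: a ( -- 3 )  three ;     \ default compilation semantics of three
: b ( -- 3 )  [three] ;   \ the same effect, spelled out with COMPILE,

a . b .                   \ displays 3 3
```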

AntonErtl

  1. Colon-sys is described in more detail in 3.1.5.1:

    The implementation-dependent data generated upon beginning to compile a definition and consumed at its close is represented by the symbol colon-sys throughout this standard.

    Nest-sys is described in more detail in 3.1.5.2:

    The implementation-dependent data generated upon beginning to execute a definition and consumed upon exiting it is represented by the symbol nest-sys throughout this standard.

    Furthermore, both are used in the definitions of :, ;, does> and nest-sys is used in the definition of exit, and for nest-sys some of the uses in ; and exit clarify their role.

  2. Colon-sys is mentioned so that standard systems are allowed to push something on the data stack when starting to compile a colon definition (and standard programs must be written to deal with this situation), and also to specify that every executed : must be closed with a ;, possibly with a does> in between.

    Nest-sys is also mentioned to allow standard systems to push something at run-time on the return stack and to prevent standard programs from accessing return stack entries from before the call. In addition it is needed to describe where exit and the run-time semantics of ; return to.

    As described in 3.1.5.1, colon-sys is on the control-flow stack, which may be the same as the data stack. Note also that 3.1.5.1 says:

    The possible presence of such items on the data stack means that any items already there shall be unavailable to a program until the control-flow-stack items are consumed.

    Which means that ruv's third definition of const is non-standard; his second definition is standard and simpler anyway. So, yes, a standard program always has to assume that : puts a colon-sys on the data stack even if it can determine with depth that on this particular system the colon-sys has 0 items on the data stack.

    As described in 3.1.5.2, nest-sys is on the return stack.

    System types like colon-sys and nest-sys are not first-class types. E.g., there is no way to store them to buffers in memory.

  3. Exactly, the standard defines colon-sys and nest-sys in order to allow standard systems to push something to the data stack (colon-sys) and return stack (nest-sys), so that standard programs must deal with this possibility. They are also there to specify the closing of colon definitions (colon-sys) and nesting of calls (nest-sys).

  4. The stack comment for the execution semantics of : could actually be specified as

    ( input-stream: "<spaces>name" -- ; control-flow stack: -- colon-sys )

    where "control-flow stack" is usually written as "C". Concerning the input stream, the standard document never produces a separate part of the stack effect description for that, but always lets it run with some other stack effect description, this time the control-flow stack effect.

    Yes, when you text-interpret :, its interpretation semantics are executed, which for : is the same as the execution semantics, which is the part that you are citing.

    The part about appending the initiation semantics means, in a classical indirect-threaded implementation, that you store the address of the machine-code routine docol in the CFA of the new word, and docol then performs the initiation semantics.

    Various other implementation techniques inline the initiation semantics into the caller in some or all cases. E.g., Gforth inlines it into the caller when you compile, the colon definition, but still uses docol when you execute the colon definition. Many native-code systems on IA-32 and AMD64 CPUs always call (or inline) all kinds of words, both in compile,d code and with execute, so they always inline all of the initiation semantics in the caller. I have not seen the code that native-code compilers for RISC architectures produce, but I expect that they put a part of the initiation semantics (the part that stores the return address on the return stack) at the start of the native code of the colon definition.

    The initiation semantics is the first part of the execution semantics (and therefore also of the interpretation semantics) of a colon definition. The newly defined colon definition also has compilation semantics (by default, to append the execution semantics to the definition current at the time when the compilation semantics are performed). : itself also has default compilation semantics.

I think we answered all the questions, so I am closing this request. If you think that anything is unclear, please reopen it.

Closed