2 Terms, notation, and references

The phrase "See:" is used throughout this standard to direct the reader to other sections of the standard that have a direct bearing on the current section.

In this standard, "shall" states a requirement on a system or program; conversely, "shall not" is a prohibition; "need not" means "is not required to"; "should" describes a recommendation of the standard; and "may", depending on context, means "is allowed to" or "might happen".

Throughout the standard, typefaces are used in the following manner:

This proportional serif typeface is used for text, with italic used for symbols and the first appearance of new terms;
A bold proportional sans-serif typeface is used for headings;
A bold monospaced serif typeface is used for Forth-language text.

2.1 Definitions of terms

Terms defined in this section are used generally throughout this standard. Additional terms specific to individual word sets are defined in those word sets. Other terms are defined at their first appearance, indicated by italic type. Terms not defined in this standard are to be construed according to the Dictionary for Information Systems, ANSI X3.172-1990.

address unit:: Depending on context, either 1) the units into which a Forth address space is divided for the purposes of locating data objects such as characters and variables; 2) the physical memory storage elements corresponding to those units; 3) the contents of such a memory storage element; or 4) the units in which the length of a region of memory is expressed.
aligned:: Divisible by a type-dependent power of 2 (typically used as "<type>-aligned address" or "<type>-aligned value").
aligned address:: The address of a memory location at which a character, cell, cell pair, or double-cell integer can be accessed.
ambiguous condition:: A circumstance for which this standard does not prescribe a specific behavior. See section 4.1.2 Ambiguous conditions for a list of such circumstances and 3.4.4 Possible actions on an ambiguous condition.
cell:: The primary unit of information in the architecture of a Forth system.
cell pair:: Two cells that are treated as a single unit.
character:: Depending on context, either 1) a storage unit capable of holding a character; or 2) a member of a character set.
character-aligned address:: The address of a memory location at which a character can be accessed.
character string:: Data space that is associated with a sequence of consecutive character-aligned addresses. Character strings usually contain text. Unless otherwise indicated, the term "string" means "character string".
code space:: The logical area of the dictionary in which word semantics are implemented.
compile:: To transform source code into dictionary definitions.
compilation semantics:: The behavior of a Forth definition when its name is encountered by the text interpreter in compilation state.
counted string:: A data structure consisting of one character containing a length followed by zero or more contiguous data characters. Normally, counted strings contain text.
cross compiler:: A system that compiles a program for later execution in an environment that may be physically and logically different from the compiling environment. In a cross compiler, the term "host" applies to the compiling environment, and the term "target" applies to the run-time environment.
current definition:: The definition whose compilation has been started but not yet ended.
data field:: The data space associated with a word defined via CREATE.
data space:: The logical area of the dictionary that can be accessed.
data-space pointer:: The address of the next available data space location, i.e., the value returned by HERE.
data stack:: A stack that may be used for passing parameters between definitions. When there is no possibility of confusion, the data stack is referred to as "the stack". Contrast with return stack.
data type:: An identifier for the set of values that a data object may have.
defining word:: A Forth word that creates a new definition when executed.
definition:: A Forth execution procedure compiled into the dictionary.
dictionary:: An extensible structure that contains definitions and associated data space.
display:: To send one or more characters to the user output device.
environmental dependencies:: A program's implicit assumptions about a Forth system's implementation options or underlying hardware. For example, a program that assumes a cell size greater than 16 bits is said to have an environmental dependency.
execution semantics:: The behavior of a Forth definition when it is executed.
execution token:: A value that identifies the execution semantics of a definition.
find:: To search the dictionary for a definition name matching a given string.
immediate word:: A Forth word whose compilation semantics are to perform its execution semantics.
implementation defined:: Denotes system behaviors or features that must be provided and documented by a system but whose further details are not prescribed by this standard.
implementation dependent:: Denotes system behaviors or features that must be provided by a system but whose further details are not prescribed by this standard.
initiation semantics:: Describes the behavior at the start of some word definitions (those of words defined with :, :NONAME, CREATE DOES>). Other parts of the specification of these defining words (and nothing else) refer to initiation semantics.
input buffer:: A region of memory containing the sequence of characters from the input source that is currently accessible to a program.
input source:: The device, file, block, or other entity that supplies characters to refill the input buffer.
input source specification:: A set of information describing a particular state of the input source, input buffer, and parse area. This information is sufficient, when saved and restored properly, to enable the nesting of parsing operations on the same or different input sources.
interpretation semantics:: The behavior of a Forth definition when its name is encountered by the text interpreter in interpretation state.
keyboard event:: A value received by the system denoting a user action at the user input device. The term "keyboard" in this document does not exclude other types of user input devices.
line:: A sequence of characters followed by an actual or implied line terminator.
name space:: The logical area of the dictionary in which definition names are stored.
number:: In this standard, "number" used without other qualification means "integer". Similarly, "double number" means "double-cell integer".
parse:: To select and exclude a character string from the parse area using a specified set of delimiting characters, called delimiters.
parse area:: The portion of the input buffer that has not yet been parsed, and is thus available to the system for subsequent processing by the text interpreter and other parsing operations.
pictured-numeric output:: A number display format in which the number is converted using Forth words that resemble a symbolic "picture" of the desired output.
program:: A complete specification of execution to achieve a specific function (application task) expressed in Forth source code form.
receive:: To obtain characters from the user input device.
return stack:: A stack that may be used for program execution nesting, do-loop execution, temporary storage, and other purposes.
standard word:: A named Forth procedure, formally specified in this standard.
user input device:: The input device currently selected as the source of received data, typically a keyboard.
user output device:: The output device currently selected as the destination of display data.
variable:: A named region of data space located and accessed by its memory address.
word:: Depending on context, either 1) the name of a Forth definition; or 2) a parsed sequence of non-space characters, which could be the name of a Forth definition.
word list:: A list of associated Forth definition names that may be examined during a dictionary search.
word set:: A set of Forth definitions grouped together in this standard under a name indicating some shared aspect, typically their common functional area.
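
Several of the terms above can be seen together in a short session. The following is an illustrative sketch only and not part of the normative text. The names CONSTANT-LIKE, ANSWER, SHOW-COUNTED, and .CENTS are hypothetical; the sketch assumes that BASE is decimal and that the system provides C" from the Core extensions word set.

```forth
\ Defining word: CONSTANT-LIKE creates a new definition each time it executes.
: CONSTANT-LIKE ( x "<spaces>name" -- )
   CREATE ,            \ store x in the data field of the new word
   DOES> ( -- x ) @    \ executing the new word fetches x from its data field
;
42 CONSTANT-LIKE ANSWER

\ Execution token: ' returns a value identifying ANSWER's execution semantics.
' ANSWER EXECUTE .     \ displays 42

\ Counted string versus character string: COUNT converts c-addr to c-addr u.
: SHOW-COUNTED ( -- )  C" Forth" COUNT TYPE ;
SHOW-COUNTED           \ displays Forth

\ Pictured-numeric output: the words between <# and #> "picture" the result.
: .CENTS ( u -- )  0 <# # # [CHAR] . HOLD #S #> TYPE ;
1234 .CENTS            \ displays 12.34
```

Note that .CENTS builds the output from right to left: the two low-order digits are converted first, then the decimal point is held, then the remaining digits.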

2.2 Notation

2.2.1 Numeric notation

Unless otherwise stated, all references to numbers apply to signed single-cell integers. The inclusive range of values is shown as {from ... to}. The allowable range for the contents of an address is shown in double braces, particularly for the contents of variables, e.g., BASE {{2 ... 36}}.
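
As an illustration of the double-brace notation, the contents of BASE are expected to stay within {{2 ... 36}}. The sketch below is an assumption-laden example, not normative text: the word .IN-BASE is hypothetical, and its radix argument is assumed to lie in {2 ... 36}.

```forth
\ Display n in radix u, then restore the previous contents of BASE.
: .IN-BASE ( n u -- )   \ u is assumed to lie in {2 ... 36}
   BASE @ >R            \ save the current radix on the return stack
   BASE !  .            \ switch radix and display n
   R> BASE !            \ restore the saved radix
;
DECIMAL
255 16 .IN-BASE         \ displays FF
255 2 .IN-BASE          \ displays 11111111
```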

2.2.2 Stack notation

Stack parameters input to and output from a definition are described using the notation:

( stack-id: before -- after )

where stack-id specifies which stack is being described, before represents the stack-parameter data types before execution of the definition and after represents them after execution. The symbols used in before and after are shown in table 3.1.

The control-flow-stack stack-id is "C:", the data-stack stack-id is "S:", and the return-stack stack-id is "R:". When there is no confusion, the data-stack stack-id may be omitted.

When there are alternate after representations, they are described by "after₁ | after₂". The top of the stack is to the right. Only those stack items required for or provided by execution of the definition are shown.
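
The conventions above might be applied as follows. This is a minimal sketch for illustration; the word names SQUARED, NONZERO-DUP, and NIP-LIKE are hypothetical and do not appear in this standard.

```forth
\ Only the data stack is affected, so the "S:" stack-id is omitted.
: SQUARED ( n1 -- n2 )  DUP * ;

\ Alternate "after" representations are separated by "|".
: NONZERO-DUP ( x -- 0 | x x )  DUP IF DUP THEN ;

\ The top of the stack is to the right: x2 is the top item on entry.
\ The comments show the effect of each step on both stacks.
: NIP-LIKE ( x1 x2 -- x2 )
   >R      \ ( S: x1 x2 -- x1 ) ( R: -- x2 )
   DROP    \ ( S: x1 -- )
   R>      \ ( S: -- x2 )       ( R: x2 -- )
;
```

Because NIP-LIKE removes from the return stack everything it placed there, its overall description needs no "R:" part.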

2.2.3 Parsed-text notation

If, in addition to using stack parameters, a definition parses text, that text is specified by an abbreviation from table 2.1, shown surrounded by double-quotes and placed between the before parameters and the "--" separator in the first stack described, e.g.,

( S: before "parsed-text-abbreviation" -- after )

Table 2.1: Parsed text abbreviations


Abbreviation	Description

<char>	the delimiting character marking the end of the string being parsed
<chars>	zero or more consecutive occurrences of the character <char>
<space>	a delimiting space character
<spaces>	zero or more consecutive occurrences of the character <space>
<quote>	a delimiting double quote
<paren>	a delimiting right parenthesis
<eol>	an implied delimiter marking the end of a line
ccc	a parsed sequence of arbitrary characters, excluding the delimiter character
name	a token delimited by space, equivalent to ccc<space> or ccc<eol>
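
Two of the abbreviations in table 2.1 might be used as sketched below. The example is illustrative only: ECHO-NAME and ECHO-TO-PAREN are hypothetical names, and the sketch assumes a system that provides PARSE and PARSE-NAME from the Core extensions word set.

```forth
\ "<spaces>name": skip leading spaces, then parse a space-delimited token.
: ECHO-NAME ( "<spaces>name" -- )  PARSE-NAME TYPE ;
ECHO-NAME hello            \ displays hello

\ "ccc<paren>": parse arbitrary characters up to a delimiting right parenthesis.
: ECHO-TO-PAREN ( "ccc<paren>" -- )  [CHAR] ) PARSE TYPE ;
ECHO-TO-PAREN two words)   \ displays two words
```

In both cases the parsed text sits between the before parameters (none here) and the "--" separator, and the delimiter itself is not part of the string that is displayed.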

2.2.4 Glossary notation

The glossary entries for each word set are listed in the standard ASCII collating sequence. Each glossary entry specifies a Forth word and consists of two parts: an index line and the semantic description of the definition.

2.2.4.1 Glossary index line

The index line is a single-line entry containing, from left to right:

Section number, the last four digits of which assign a unique sequential number to all words included in this standard;
DEFINITION-NAME in upper-case, mono-spaced, bold-face letters;
Natural-language pronunciation in quotes if it differs from English;
Word-set designator from table 2.2 (the designation for extensions word sets includes "EXT");
Extension designator in sans-serif font under the word-set designator, for words that have been added to the standard via the named extension.

Table 2.2: Word set designators


Word set	Designator

Core word set	CORE
Block word set	BLOCK
Double-Number word set	DOUBLE
Exception word set	EXCEPTION
Facility word set	FACILITY
File-Access word set	FILE
Floating-Point word set	FLOATING
Locals word set	LOCALS
Memory-Allocation word set	MEMORY
Programming-Tools word set	TOOLS
Search-Order word set	SEARCH
String-Handling word set	STRING
Extended-Character word set	XCHAR

2.2.4.2 Glossary semantic description

The first paragraph of the semantic description contains a stack notation for each stack affected by execution of the word. The remaining paragraphs contain a text description of the semantics. See 3.4.3 Semantics.

2.2.5 BNF notation

The following notation is used to define the syntax of some elements within the document:

Each component of the element is defined with a rule consisting of the name of the component (italicized in angle-brackets, e.g., <decdigit>), the characters := and a concatenation of tokens and metacharacters;
Tokens may be literal characters (in bold face, e.g., E) or rule names in angle brackets (e.g., <decdigit>);
The metacharacter * is used to specify zero or more occurrences of the preceding token (e.g., <decdigit>*);
Tokens enclosed with [ and ] are optional (e.g., [-]);
Vertical bars separate choices from a list of tokens enclosed with braces (e.g., { 0 | 1 }).

See: 3.4.1.3 Text interpreter input number conversion, 12.3.7 Text interpreter input number conversion, 12.6.1.0558 >FLOAT, 12.6.2.1613 FS., 13.6.2.2550 {:.

2.3 References

The following national and international standards are referenced in this standard:

ISO/IEC 15145:1997 Information technology. Programming languages. FORTH;
ANSI X3.215-1994 Programming Languages – Forth;
ANSI X3.172-1990 Dictionary for Information Systems, (2.1 Definitions of terms);
ANSI X3.4-1974 American Standard Code for Information Interchange (ASCII), (3.1.2.1 Graphic characters);
ISO 646-1983 ISO 7-bit coded character set for information interchange, International Reference Version (IRV) (3.1.2.1 Graphic characters);
ANSI/IEEE 754-1985 Floating-point Standard, (12.2.1 Definition of terms).

Contributions

ruv [82] Terminology and wording regarding "dictionary" (Request for clarification) 2019-06-04 08:26:12

1.

Have a look at the definitions of the terms code space, data space, and name space: they are all logical areas of the dictionary. But dictionary is defined as:

dictionary: An extensible structure that contains definitions and associated data space.

What is the rationale for mentioning only the data space, while the code space and name space are omitted? It looks like an inconsistency.

2.

Have a look at the following term definitions:

name space: The logical area of the dictionary in which definition names are stored.

find: to search the dictionary for a definition name matching a given string.

What is the rationale for searching for a "definition name" in the dictionary rather than in the name space?

It seems that referring to the name space would be more precise in such cases. On the other hand, in practice the term "name space" is often used as a hypernym of the term "word list". Perhaps this should be taken into account in future revisions, and it would make sense to introduce the term "header space" for the logical area in which definition names and meta-information are stored.

AntonErtl [r220] 2019-06-05 21:22:51

My guess is that code space and name space are not really visible in the standard (no way to allocate them or access them), while data space is managed by standard words: ALLOT , @ ! etc.
That's a good point: "name space" has another meaning that's used by the Forth and wider programming community, and this standard does not use "name space" much (and I don't remember seeing it used with this meaning anywhere else; more common is to talk about separated headers etc.) So, yes, "header space" instead of "name space" would be clearer.


ruv [89] What is a standard program (Comment) 2019-07-03 11:42:27

The term "standard program" is often mentioned in discussions, and it is used in the specification of the word TO. It seems sensible to add a definition for this term.

A possible variant:

standard programme: a programme that is expressed in the standard words only and does not contain the ambiguous conditions (unspecified behavior).

ruv [r234] 2019-07-03 12:32:11

hot fix: read program in place of programme.

AntonErtl [r241] 2019-07-05 09:10:58

"Standard Program" is defined in Section 5.2.1. There have been no problems with it not being in the definition of terms since 1994, so I doubt that including it will help much.

ruv [r244] 2019-07-05 12:47:36

I see now, thanks for the reference. I expected to find it in (2.1), though.

It should be noted that (5.2.1) defines Standard Program in the large, as an end-user application. But we have used this term just for fragments of code. A more correct usage in that case would be "can this code fragment be part of a standard program?", which seems too verbose. Perhaps the term "standard code" would be more appropriate.


TKurtBond [340] Data space seems to be used inconsistently when comparing the definitions of "character string" and "data space" (Request for clarification) 2024-05-29 20:11:10

The definition of "character string" says "Data space that is associated with a sequence of consecutive character-aligned addresses." However, the definition of "data space" says "The logical area of the dictionary that can be accessed." Character strings don't have to be part of the dictionary: for instance, when they are created using ALLOCATE, contradicting the definition of "data space".

Is "data space" supposed to be any arbitrary memory, or just the memory in the dictionary? If the latter, is there a better term to use in the definition of "character string"?

AntonErtl [r1223] 2024-05-30 11:49:16

These definitions were written for the standard without any extension wordsets. However, in that case the memory allocation wordset should redefine "data space" (or "character string"). Data space has come up several times as being confusing, so maybe a better definition is in order.

Concerning "character string", another issue is the strings returned by the run-time semantics of s", s\", sliteral, and likewise for the counted string returned by the run-time semantics of c". Typical systems will return strings that reside in what is arguably code space (or arguably data space). Maybe we should define "character string" in a way that makes such arguing unnecessary.

ruv [r1224] 2024-05-30 17:31:47

The definition of "character string" says "Data space that is associated with a sequence of consecutive character-aligned addresses."

I think it should say "A data space region that is associated with a sequence of consecutive character-aligned addresses." It's the same problem that was fixed for the "data field" term.

However, the definition of "data space" says "The logical area of the dictionary that can be accessed." Character strings don't have to be part of the dictionary: for instance, when they are created using ALLOCATE, contradicting the definition of "data space".

I agree that some definitions can be made better.

But conceptually, why not? Data space is not a single memory region, but a set of memory regions. See my explanation in the post "Data space notion in Forth Standard".


EricBlake [382] Wording on ccc prohibits \" inside S\" (Request for clarification) 2025-07-02 03:31:11

Table 2.1 in 2.2.3 Parsed-text notation states:

 ccc        a parsed sequence of arbitrary characters, excluding the delimiter character

Meanwhile, S\" states that it parses: ( "ccc<quote>" -- )

By a strict reading of the former, a "ccc<quote>" cannot include a " character in the ccc portion of the word, and yet the specification of the latter requires \" to be a supported escape sequence. A related question affects whether ( and .( can have nested () in comments: https://forth-standard.org/standard/core/p#contribution-380

Maybe all that is needed is a normative clarification that some definitions document context-dependent scenarios under which a given character does or does not act as a delimiter character. Then, in the S\" page, document that an odd number of \ followed by " is one such context where " is not a delimiter; and if nested comments are allowed in (, then document that for implementations that allow nested comments, ")" is not a delimiter if it balances out a "(" present earlier in the ccc text.

ruv [r1451] 2025-07-02 12:09:00

By a strict reading of the former, a "ccc<quote>" cannot include a " character in the ccc portion of the word

Yes.

Probably, the glossary entry 6.2.2266 S\" should introduce and use another notation instead of ccc. Or use a more complicated definition for the delimiter character.

achowe [r1452] 2025-07-11 13:18:22

Forth 200x Draft 19.1 section 6.2.2266 S\" Translation Rules allow \" as an escape sequence to embed a literal " in a string. I expect the description here extends the definition of ccc to account for escaped characters.

achowe [r1453] 2025-07-11 13:19:49

Quick test case

t{ S\" 123 S\" 432\" TYPE " EVALUATE -> 123 }t
