16 The optional Search-Order word set

16.1 Introduction

16.2 Additional terms and notation

compilation word list:
The word list into which new definition names are placed.

search order:
A list of word lists specifying the order in which the dictionary will be searched.

16.3 Additional usage requirements

16.3.1 Data types

Word list identifiers are implementation-dependent single-cell values that identify word lists.

Append table 16.1 to table 3.1.

Table 16.1: Data types

Symbol   Data type               Size on stack
wid      word list identifiers   1 cell

See: 3.1 Data types, 3.4.2 Finding definition names, 3.4 The Forth text interpreter.

16.3.2 Environmental queries

Append table 16.2 to table 3.4.

See: 3.2.6 Environmental queries.

Table 16.2: Environmental Query Strings

String      Value data type   Constant?   Meaning
WORDLISTS   n                 yes         maximum number of word lists usable in the search order

16.3.3 Finding definition names

When searching a word list for a definition name, the system shall search each word list from its last definition to its first. The search may encompass only a single word list, as with SEARCH-WORDLIST, or all the word lists in the search order, as with the text interpreter and FIND.

Changing the search order shall only affect the subsequent finding of definition names in the dictionary. A system with the Search-Order word set shall allow at least eight word lists in the search order.
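As a minimal sketch of the two kinds of search described above: the names demo-wl and greet are hypothetical, S" in interpretation state assumes the File word set is present, and [IF] assumes the Programming-Tools extensions.

```forth
\ Hypothetical names: demo-wl, greet.
WORDLIST CONSTANT demo-wl              \ a fresh, empty word list

GET-CURRENT                            ( old-wid )
demo-wl SET-CURRENT                    \ new names now go into demo-wl
: greet ( -- n ) 42 ;
SET-CURRENT                            \ restore the compilation word list

\ Search a single word list. greet is not immediate, so the flag is -1.
S" greet" demo-wl SEARCH-WORDLIST      ( xt -1 | 0 )
[IF] EXECUTE . [THEN]

\ Search the whole search order: put demo-wl in front, then remove it again.
GET-ORDER  demo-wl SWAP 1+  SET-ORDER  \ demo-wl is now searched first
greet .
GET-ORDER  NIP 1-  SET-ORDER           \ drop the first word list again
```

Changing the search order here only affects how later names are found; the definition of greet itself stays in demo-wl throughout.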

An ambiguous condition exists if a program changes the compilation word list during the compilation of a definition or before modification of the behavior of the most recently compiled definition with ;CODE, DOES>, or IMMEDIATE.

A program that requires more than eight word lists in the search order has an environmental dependency.
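A program with such a dependency could check the limit at load time. The following is only a sketch: the constant needed-wordlists and the word enough-wordlists? are hypothetical names, and [IF] assumes the Programming-Tools extensions.

```forth
\ Hypothetical requirement: this program wants 12 word lists in the search order.
12 CONSTANT needed-wordlists

: enough-wordlists? ( -- flag )
   S" WORDLISTS" ENVIRONMENT?          ( n true | false )
   IF    needed-wordlists < 0=         \ true when n >= needed-wordlists
   ELSE  needed-wordlists 9 <          \ unknown query: only eight are guaranteed
   THEN ;

enough-wordlists? 0= [IF]
   .( Search order too small for this program ) CR
[THEN]
```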

See: 3.4.2 Finding definition names.

16.3.4 Contiguous regions

The regions of data space produced by the operations described in 3.3.3.2 Contiguous regions may be non-contiguous if WORDLIST is executed between allocations.
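A short sketch of what this means for a program that builds a table with CREATE and the comma word. The names table, table-wl, table2 and table2-wl are hypothetical.

```forth
\ Risky: WORDLIST is executed between two allocations.
CREATE table  1 , 2 ,         \ these two cells are contiguous
WORDLIST                      ( wid )  \ may itself consume data space
3 ,                           \ not guaranteed to follow the 2 directly
CONSTANT table-wl

\ Safer: complete the region first, then create the word list.
CREATE table2  1 , 2 , 3 ,
WORDLIST CONSTANT table2-wl
```

With the second form, table2 2 CELLS + is guaranteed to address the cell holding 3; with the first form a program cannot rely on that.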

16.4 Additional documentation requirements

16.4.1 System documentation

16.4.1.1 Implementation-defined options

16.4.1.2 Ambiguous conditions

16.4.1.3 Other system documentation

  • no additional requirements.

16.4.2 Program documentation

16.4.2.1 Environmental dependencies

16.4.2.2 Other program documentation

  • no additional requirements.

16.5 Compliance and labeling

16.5.1 Forth-2012 systems

The phrase "Providing the Search-Order word set" shall be appended to the label of any Standard System that provides all of the Search-Order word set.

The phrase "Providing name(s) from the Search-Order Extensions word set" shall be appended to the label of any Standard System that provides portions of the Search-Order Extensions word set.

The phrase "Providing the Search-Order Extensions word set" shall be appended to the label of any Standard System that provides all of the Search-Order and Search-Order Extensions word sets.

16.5.2 Forth-2012 programs

The phrase "Requiring the Search-Order word set" shall be appended to the label of Standard Programs that require the system to provide the Search-Order word set.

The phrase "Requiring name(s) from the Search-Order Extensions word set" shall be appended to the label of Standard Programs that require the system to provide portions of the Search-Order Extensions word set.

The phrase "Requiring the Search-Order Extensions word set" shall be appended to the label of Standard Programs that require the system to provide all of the Search-Order and Search-Order Extensions word sets.

16.6 Glossary

16.6.1 Search-Order words

16.6.2 Search-Order extension words

Contributions

enoch 2016-06-18 04:19:03 [Proposal] WLSCOPE -- wordlists switching made easier

Proposal

It is proposed that Forth 200x compliant implementations would be required to offer a deferred word, wlscope (addr len -- addr' len' wid), whose initial implementation is get-current. All the mechanisms that create new dictionary entries would be required to pass the new name ( addr len ) through wlscope to obtain the destination wordlist ( wid ); the name itself may also be altered, hence ( addr' len' ).

Problem

Make it easier to manage the placement of newly created words on different wordlists. All large software projects carefully regulate their name spaces. Open Firmware, for example, often separates "public" words from "private" words by grouping them as follows:

only forth also hidden also definitions
...
also forth definitions
...

With wlscope we would like to take this one step further and allow wordlist switches to be driven by the new name itself, in a way that is completely under the programmer's control: a chain of user-defined wlscope functions analyzes (and perhaps modifies) the newly created name and determines the wordlist to which it is added.

Typical use

The following shows two wlscope "rules" and how to chain them.

Put helper (factorization) words onto a _local wordlist

\ Wordlist scope that puts words with underscore (_) prefix
\ on a _local wordlist.

:noname  ( addr len -- addr' len' wid )
   2dup
   1 >  if                          \ name length check
      s" _" tuck icompare  if       \ name prefix check
         _local exit
      then
   else
      drop
   then
   [ ' wlscope defer@ ]l execute
; is wlscope

Grow a special use gui wordlist

vocabulary gui 

:noname  ( addr len -- addr' len' wid )
   2dup
   4 >  if                          \ name length check
      s" gui_" tuck icompare  if    \ name prefix check
         swap 4 + swap 4 -          \ remove gui_ prefix
         gui exit
      then
   else
      drop
   then

   [ ' wlscope defer@ ]l execute
; is wlscope

Thus,

: _helper ... ;

would automatically add _helper to the _local wordlist while

: gui_init-gl ... ;

would add init-gl to the gui wordlist.

Reference Implementation

amforth-shadow

VM asm code at: core/create/docreate.asm and elsewhere.

Experience

amforth and amforth-shadow

BerndPaysan 2016-06-19 13:26:04

The reference implementation should be clearer than pointing to amForth's assembler listings. The actual definition of WLSCOPE itself is trivial:

DEFER WLSCOPE
' GET-CURRENT IS WLSCOPE \ default

Integrating WLSCOPE into a system is system-specific: the GET-CURRENT in a word like HEADER needs to be replaced with WLSCOPE at a place where it can still modify the name. A simple dictionary might work like this:

Variable last
: string, ( addr u -- ) dup c, here over allot swap move ;
: header ( "name" -- ) \ create header with link field and name
  parse-name align here last ! wlscope 1 or , string, align ;
: reveal ( -- ) \ add last definition to linked list
  last @ dup @ dup 1 and IF
       -2 and 2dup @ swap ! !
  ELSE   2drop  THEN ;

ruv 2016-06-19 13:52:09

It seems that this typical use is applicable only in the special case where you don't need to use words from the target wordlist inside the definition (e.g. when : gui_init-gl ... ; doesn't use words from gui in its body). Moreover, it only makes sense when definitions for several wordlists are interleaved. In my experience such a use case is very rare.

Chaining via deferred word is awkward — usually we need to revert back any special behavior to limit its scope.

ruv 2016-06-19 14:08:36

enoch 2016-06-20 03:50:08

  • Indeed, a reference implementation of wlscope in Gforth would be better than AmForth's cryptic VM ASM.

  • In defense of wlscope rule chaining via static defers: please give me a simpler idea. KISS.

  • Another wlscope example rule, which I left out because of its weirdness, turns the Forth variable step=100 into the equivalent of the C declaration static int step = 100; P.S. In AmForth, value is EEPROM based...

AntonErtl 2016-06-21 10:24:01

A simpler interface (but slightly more complex implementation) is

add-wlscope ( xt -- ) \ where xt has the stack effect ( c-addr1 u1 -- c-addr2 u2 wid|c-addr1 u1 0 )

where 0 is returned if the xt does not match. The chaining is done by the system, until a match is found. The final wlscope is GET-CURRENT.

With this interface, if we later find the need to do it, we can blow it up to a full stack (like the search order), without needing to rewrite all the words written for this interface.
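One way the system-side chaining could look, as a rough sketch only: rules are kept in a linked list in data space, the most recently added rule is tried first, and a wid of 0 is assumed never to be valid. The names rule-list, local-wl and local-rule are hypothetical; only add-wlscope and its stack effect come from the comment above.

```forth
\ Hypothetical sketch; rule-list, local-wl and local-rule are illustrative names.
VARIABLE rule-list   0 rule-list !          \ newest rule node, or 0

: add-wlscope ( xt -- )
   ALIGN HERE  rule-list @ ,  SWAP ,        \ node: link cell, then xt cell
   rule-list ! ;

: wlscope ( c-addr1 u1 -- c-addr2 u2 wid )
   rule-list @
   BEGIN  DUP WHILE                         ( c-addr u node )
      >R  R@ CELL+ @ EXECUTE                ( c-addr' u' wid|0 )
      ?DUP IF  R> DROP EXIT  THEN           \ first matching rule wins
      R> @                                  \ otherwise try the next rule
   REPEAT
   DROP GET-CURRENT ;                       \ no rule matched

\ Example rule: names starting with an underscore go to local-wl.
WORDLIST CONSTANT local-wl
: local-rule ( c-addr1 u1 -- c-addr2 u2 wid | c-addr1 u1 0 )
   DUP 1 > IF  OVER C@ [CHAR] _ = IF  local-wl EXIT  THEN  THEN  0 ;
' local-rule add-wlscope
```

Because each rule only reports "match" or "no match", the list walker can later be replaced by a richer structure without touching the rules themselves.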

Another, more pedestrian way (but not requiring extra system work) to deal with deactivation of wlscopes is to have a flag variable for each wlscope, and the wlscope is only active if the flag is true. This would have to be hand-coded in every wlscope word.
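A sketch of that hand-coded convention, using the rule stack effect given above. The names gui-wl, gui-scope-on and gui-rule are hypothetical, and COMPARE and /STRING assume the String word set.

```forth
\ Hypothetical names: gui-wl, gui-scope-on, gui-rule.
WORDLIST CONSTANT gui-wl
VARIABLE gui-scope-on   TRUE gui-scope-on !

: gui-rule ( c-addr1 u1 -- c-addr2 u2 wid | c-addr1 u1 0 )
   gui-scope-on @ 0= IF  0 EXIT  THEN       \ switched off: never match
   DUP 4 > IF
      OVER 4 S" gui_" COMPARE 0= IF         \ does the name start with gui_ ?
         4 /STRING  gui-wl EXIT             \ strip the prefix, select gui-wl
      THEN
   THEN  0 ;
```

Deactivating the rule is then FALSE gui-scope-on ! and every rule author has to remember to include the first line.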

Your two wordlist examples could be generalized to a generic "dot-parser" wlscope. I.e., if the input string is "FOO.BAR", it checks whether FOO exists, and if so, there's a match: FOO is executed (it should return a wid), and the rule returns "BAR" as c-addr2 u2 together with the wid. If we have such a generic wlscope, it may be ok to always keep it active.

ruv 2016-06-21 17:44:50

Minor clarification: I meant that chaining via a deferred word is awkward not in itself, but as an API when you need to revert the chaining. It can still be used confidently under the hood.

Even a generic wlscope cannot always be kept active. What if you need some library that defines and uses both a FOO vocabulary and a FOO.BAR word (and this library does not know anything about the wlscope mechanism)?

Yes, an active-flag variable for each wlscope is a possible solution. But I think that we should prefer easy API usage to easy API implementation.

Simple API

push-wlscope ( xt -- )
drop-wlscope ( -- )

can be implemented cleanly using a linked list in the dictionary, with drop-wlscope simply leaking the last element.
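A minimal sketch of such a linked list, assuming each node is a link cell followed by an xt cell. The variable wlscope-stack is a hypothetical name.

```forth
\ Hypothetical sketch; wlscope-stack is an illustrative name.
VARIABLE wlscope-stack   0 wlscope-stack !   \ address of the top node, or 0

: push-wlscope ( xt -- )
   ALIGN HERE  wlscope-stack @ ,  SWAP ,     \ node: link cell, then xt cell
   wlscope-stack ! ;

: drop-wlscope ( -- )
   wlscope-stack @ ?DUP IF                   \ nothing to do on an empty stack
      @ wlscope-stack !                      \ unlink; the node's space is not reclaimed
   THEN ;
```

The leak is two cells per pushed rule, which is the price for not having to track whether other data was allotted after the node.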

On the other hand, the proposed wlscope mechanism is just a partial solution. What is really needed is a straightforward postfix API to create vocabulary entries. With such an API there is no need for the wlscope mechanism: you can just define your own alternative for the colon word : in your limited scope.

AntonErtl 2016-06-22 12:39:33

I also prefer easy usage over easy implementation. In particular, the active-variable convention has a social problem: If someone does not follow the convention and fails to make his wlscope deactivatable, it's often not him, but the users of his wlscope (or even further downstream) who suffer the consequences. So I also prefer a proper API; I just wanted to point out this other option.

enoch 2016-06-22 17:49:42

  • Perhaps an obvious comment: in a multi-developer project each contributor can have his own wlscope rule chain while still using the simple defer-based chaining. Thus, I strongly recommend to Keep It Sweet and Simple :-)
  • Semi-related: once there is a convenient wordlist switching method, the committee may consider recommending the use of a dedicated wordlist for each library/task. For example, _local to collect words which have no global interest or those that are just a result of factorization. OpenGL to collect words related to...

BerndPaysan 2016-09-08 14:22:36

The committee discussed this proposal and we came up with a one-liner that provides the same one-shot functionality without changes to the internals of header creation. It only requires, as a matter of quality of implementation, that a : definition goes into the vocabulary that is current at the time of : rather than at the time of ; (which all the implementations done by the committee members do).

: in ( "voc" "defining-word" -- )
  get-current >r also ' execute definitions previous ' execute r> set-current ;

Use would be

in gui : init-gl ( .. -- .. ) ... ;
in gui variable foo
in gui defer bar

We perceive this solution to be more elegant (quite portable one-liner with existing words instead of changing some internals), but nonetheless thank you for pointing out the need for this functionality. Of course, it requires quality implementations for : to work, so we encourage people to implement their : without relying on the ambiguous condition.

mtrute 2016-09-11 17:04:33

Why did you choose vocabularies? The Forth Standard does not contain them; it specifies wordlists instead. A better and more standard-like interface would be

IN ( wid "action" -- )

with wid being a wordlist identifier as returned by WORDLIST.

: IN get-current >r set-current ' execute r> set-current ;

mtrute 2016-09-24 17:44:05

Why do you use vocabularies? They are not part of the standard and they can do harmful things to the environment. I'd suggest using word lists instead. They are standard and they are much simpler to use (and define).

  : IN ( wid "action" -- ) 
     GET-CURRENT >R SET-CURRENT ' EXECUTE R> SET-CURRENT ;

Use it as follows

WORDLIST CONSTANT gui
IN gui : init-gl ( .. -- .. ) ... ;
IN gui VARIABLE foo
IN gui DEFER bar

The constant for the word list can be wrapped as a vocabulary easily. Many Forths already have tools to show the content of a wordlist by its id (in amforth it's called show-wordlist), so there is no need for vocabularies, at least not at the standard level.

ruv 2016-12-10 10:25:25

mtrute, it seems that you meant

WORDLIST CONSTANT gui
gui IN : init-gl ( .. -- .. ) ... ;
gui IN VARIABLE foo
gui IN DEFER bar

  1. For CONSTANT it is quite awkward: 10 gui IN CONSTANT ten
  2. In some Forth systems the CURRENT wordlist should not be changed while a word is being defined, or it should be the same at : and at ;

AntonErtl 2016-12-10 11:26:27

Which Forth systems do not put the word in the wordlist that is current during ":" if the current wordlist is changed before ";"?

It seems to me that WLSCOPE requires at least as much effort to work as the requirement above. Therefore, standardizing that requirement is preferable to standardizing WLSCOPE.

JennyBrien 2016-12-10 17:12:39

I used something like this in my JenX XML parser (EuroForth 2001) by defining CREATION and DEF: as versions of CREATE and : respectively that took a wid as a parameter.

It wasn't strictly necessary, but it did mean I could easily control where particular types of words were defined without explicitly having to use SET-CURRENT.

Example:

Wordlist ENTITY?

: CENTITY entity? creation c, DOES> ... ;

All words defined with Centity go on the Entity wordlist.

Like IN, DEF: doesn't work if ; reads the Current wordlist, so I also would like that behaviour to be standardised.
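For illustration, words with this interface could be written as below. This is a hypothetical reconstruction from the description above, not the JenX source. It relies on the behaviour discussed in this thread: changing the compilation word list during a definition, or before a later DOES>, is an ambiguous condition under 16.3.3, so the sketch only works on systems that remember the word list in effect when the name was created.

```forth
\ Hypothetical reconstruction; not the JenX source.
: creation ( wid "name" -- )          \ CREATE into the given word list
   GET-CURRENT >R  SET-CURRENT  CREATE  R> SET-CURRENT ;

: def: ( wid "name" -- colon-sys )    \ start a colon definition in wid
   GET-CURRENT >R  SET-CURRENT  :  R> SET-CURRENT ;
```

Usage would look like some-wid def: name ... ; where some-wid stands for any word list identifier.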

mtrute 2016-12-12 18:12:30

@ruv: my IN is a parsing word (uses tick), so I think, my example is valid. But that is not the point I am after. What I find strange is the use of vocabularies when we have wordlists at hand that are simpler to use and to understand.

ruv 2016-12-13 11:56:28

@AntonErtl, in SP-Forth (up to 4.21 at least) the current wordlist at ; must be the same as at :, since the SMUDGE word (an alternative to REVEAL) reveals (unhides) the last word in the current wordlist only. So if the current wordlist was changed, the defined word will not be revealed (even though it is defined in the proper wordlist). Also, when multiple "storages" (data+code spaces, "sections" in your paper) are used, word lists are bound to sections, and the current section changes when the current word list changes (if they belong to different sections). By the way, temporary word lists (which reside in temporary sections) have been used in the locals implementation from the very first version in 2001 :).

@mtrute, I even tested your definition of IN — your example doesn't work. Regarding word lists -- I prefer them over vocabularies too.
