Proposal: [20] WLSCOPE -- wordlists switching made easier

Informal

This proposal has been moved into this section. Its former address was: /standard/search

This page is dedicated to discussing this specific proposal


enoch: [20] WLSCOPE -- wordlists switching made easier (Proposal, 2016-06-18 04:19:03)

Proposal

It is proposed that Forth 200x compliant implementations would be required to offer a deferred word, wlscope (addr len -- addr' len' wid), whose initial implementation is get-current. All the mechanisms that create new dictionary entries would be required to pass the new name ( addr len ) through wlscope to obtain the destination wordlist ( wid ); the name itself may also be altered, hence ( addr' len' ).

Problem

The goal is to make it easier to control which wordlist each newly created word goes into. All large software projects carefully regulate their name spaces. Open Firmware, for example, often separates the "public" words from the "private" words by grouping them as follows:

only forth also hidden also definitions
...
also forth definitions
...

With wlscope we would like to take this one step further and let wordlist switches be driven by the new name itself, in a way that is completely under the programmer's control: a chain of user-defined wlscope functions analyzes, and perhaps modifies, the newly created name and determines which wordlist it is added to.

Typical use

The following shows two wlscope "rules" and how to chain them.

Put helper (factorization) words onto a _local wordlist

\ Wordlist scope that puts words with underscore (_) prefix
\ on a _local wordlist.

:noname  ( addr len -- addr' len' wid )
   2dup
   1 >  if                          \ name length check
      s" _" tuck icompare  if       \ name prefix check
         _local exit
      then
   else
      drop
   then
   [ ' wlscope defer@ ]l execute
; is wlscope

Grow a special use gui wordlist

vocabulary gui

:noname  ( addr len -- addr' len' wid )
   2dup
   4 >  if                          \ name length check
      s" gui_" tuck icompare  if    \ name prefix check
         swap 4 + swap 4 -          \ remove gui_ prefix
         gui exit
      then
   else
      drop
   then
   [ ' wlscope defer@ ]l execute
; is wlscope

Thus,

: _helper ... ;

would automatically add _helper to the _local wordlist while

: gui_init-gl ... ;

would add init-gl to the gui wordlist.

Reference Implementation

amforth-shadow

VM asm code at: core/create/docreate.asm and elsewhere.

Experience

amforth and amforth-shadow

BerndPaysan

The reference implementation should be clearer than pointing to amForth's assembler listings. The actual definition of WLSCOPE itself is trivial:

DEFER WLSCOPE
' GET-CURRENT IS WLSCOPE \ default

Integrating WLSCOPE into a system is system-specific: the GET-CURRENT in a word like HEADER needs to be replaced with WLSCOPE at a place where it can still modify the name. A simple dictionary might work like this:

Variable last
: string, ( addr u -- ) dup c, here over allot swap move ;
: header ( "name" -- ) \ create header with link field and name
  parse-name align here last ! wlscope 1 or , string, align ;
: reveal ( -- ) \ add last definition to linked list
  last @ dup @ dup 1 and IF
       -2 and 2dup @ swap ! !
  ELSE   2drop  THEN ;

ruv

It seems that this typical use is applicable only in the special case where you don't need to use words from the target wordlist inside the definition (e.g. when : gui_init-gl ... ; doesn't use any words from gui in its body). Moreover, it only makes sense when definitions from several wordlists are interleaved. In my experience such a use case is very rare.

Chaining via a deferred word is awkward: usually we need to revert any special behavior to limit its scope.


enoch

  • Indeed, a reference implementation of wlscope in Gforth would be better than AmForth*'s cryptic VM ASM.

  • In defense of wlscope rule chaining via static defer-s I say: please give me a simpler idea. KISS.

  • Another example wlscope rule, which I left out because of its weirdness, turns Forth variable step=100 into the equivalent of the C language static int step = 100;. P.S. In AmForth* value is EEPROM based...

AntonErtl

A simpler interface (but slightly more complex implementation) is

add-wlscope ( xt -- ) \ where xt has the stack effect ( c-addr1 u1 -- c-addr2 u2 wid|c-addr1 u1 0 )

where 0 is returned if the xt does not match. The chaining is done by the system, until a match is found. The final wlscope is GET-CURRENT.

With this interface, if we later find the need to do it, we can blow it up to a full stack (like the search order), without needing to rewrite all the words written for this interface.
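The system-side chaining could look roughly like the sketch below. It is not taken from any implementation: the table wlscopes, its capacity max-wlscopes, the counter #wlscopes and the word resolve-wlscope are hypothetical names, and trying the most recently added xt first is an assumption. The gui-wl wordlist and gui-rule exist only for illustration.

```forth
\ Sketch only: a fixed-size table of rule xts, newest tried first.
8 constant max-wlscopes                  \ hypothetical capacity
create wlscopes  max-wlscopes cells allot
variable #wlscopes  0 #wlscopes !

: add-wlscope ( xt -- )
   #wlscopes @ max-wlscopes < 0= abort" too many wlscopes"
   wlscopes #wlscopes @ cells + !
   1 #wlscopes +! ;

\ Each rule: ( c-addr1 u1 -- c-addr2 u2 wid | c-addr1 u1 0 )
: resolve-wlscope ( c-addr1 u1 -- c-addr2 u2 wid )
   #wlscopes @ begin dup while
      1- >r                              \ index of the next rule to try
      wlscopes r@ cells + @ execute      ( c-addr u wid|0 )
      ?dup if r> drop exit then          \ first match wins
      r>
   repeat drop
   get-current ;                         \ no rule matched: final wlscope

\ Hypothetical rule: names starting with gui_ go to gui-wl, prefix removed.
wordlist constant gui-wl
: gui-rule ( c-addr1 u1 -- c-addr2 u2 wid | c-addr1 u1 0 )
   dup 4 > if
      over 4 s" gui_" compare 0= if  4 /string gui-wl exit  then
   then 0 ;
' gui-rule add-wlscope
```

A header-creating word would then call resolve-wlscope at the point where it would otherwise call GET-CURRENT.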

Another, more pedestrian way (but not requiring extra system work) to deal with deactivation of wlscopes is to have a flag variable for each wlscope, and the wlscope is only active if the flag is true. This would have to be hand-coded in every wlscope word.
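A hand-coded flag of this kind might look as follows. This is a sketch under assumptions: local-wl, local-rule-active and local-rule are hypothetical names, and the rule uses the "return 0 when not matching" convention of the add-wlscope interface described above.

```forth
\ Sketch only: a wlscope rule that can be switched off through its own flag.
wordlist constant local-wl               \ hypothetical target wordlist
variable local-rule-active  true local-rule-active !

: local-rule ( c-addr u -- c-addr u wid | c-addr u 0 )
   local-rule-active @ 0= if 0 exit then \ deactivated: never matches
   dup 1 > if
      over c@ [char] _ = if local-wl exit then
   then 0 ;
```

Setting local-rule-active to false deactivates the rule without touching the chain it is installed in.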

Your two wordlist examples could be generalized to a generic "dot-parser" wlscope. I.e., if the input string is "FOO.BAR", it checks whether FOO exists; if so, there's a match: FOO is executed (it should return a wid), and the wlscope returns "BAR" as c-addr2 u2 together with the wid. If we have such a generic wlscope, it may be ok to always keep it active.

ruv

Minor clarification: I meant that chaining via a deferred word is awkward not in itself, but as an API when you need to revert the chaining. It can still be used confidently under the hood.

Even a generic wlscope cannot always be kept active. What if you need some library that defines and uses both a FOO vocabulary and a FOO.BAR word (and this library knows nothing about the wlscope mechanism)?

Yes, an active-flag variable for each wlscope is a possible solution. But I think that we should prefer easy API usage to easy API implementation.

Simple API

push-wlscope ( xt -- )
drop-wlscope ( -- )

can be cleanly implemented using a linked list in the dictionary, with drop-wlscope leaking the last element.
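A minimal sketch of such a linked list, assuming each pushed xt has the stack effect ( c-addr1 u1 -- c-addr2 u2 wid | c-addr1 u1 0 ) suggested earlier for add-wlscope. The names wlscope-top and walk-wlscopes are hypothetical; only push-wlscope and drop-wlscope come from the API above.

```forth
\ Sketch only: nodes are laid down in the dictionary as  link , xt ,
variable wlscope-top  0 wlscope-top !    \ newest node, or 0 when empty

: push-wlscope ( xt -- )
   align here  wlscope-top @ ,  swap ,   \ new node points at the old top
   wlscope-top ! ;

: drop-wlscope ( -- )
   wlscope-top @ ?dup if @ wlscope-top ! then ;  \ unlink; node space leaks

: walk-wlscopes ( c-addr1 u1 -- c-addr2 u2 wid )
   wlscope-top @ begin dup while
      >r  r@ cell+ @ execute             ( c-addr u wid|0 )
      ?dup if r> drop exit then          \ first match wins
      r> @                               \ follow the link to the older rule
   repeat drop
   get-current ;                         \ no rule matched
```

Because drop-wlscope only moves the head pointer, it works no matter what has been compiled since the matching push-wlscope; the cost is the two leaked cells per node.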

On the other hand, the proposed wlscope mechanism is only a partial solution. What is really needed is a straightforward postfix API for creating vocabulary entries. With such an API there is no need for the wlscope mechanism: you can just define your own alternative to the colon word : in your limited scope.

AntonErtl

I also prefer easy usage over easy implementation. In particular, the active-variable convention has a social problem: If someone does not follow the convention and fails to make his wlscope deactivatable, it's often not him, but the users of his wlscope (or even further downstream) who suffer the consequences. So I also prefer a proper API; I just wanted to point out this other option.

enoch

  • Perhaps an obvious comment: in a multi-developer project each contributor can have his own wlscope rule chain while still using the simple defer-based chaining. Thus, I strongly recommend to Keep It Sweet and Simple :-)
  • Semi-related: once there is a convenient wordlist switching method, the committee may consider recommending the use of a dedicated wordlist for each library/task. For example, _local to collect words which have no global interest or those that are just a result of factorization. OpenGL to collect words related to...

BerndPaysan

The committee discussed this proposal and we came up with a one-liner that provides the same one-shot functionality without changes to the internals of header creation. It only requires, as a matter of implementation quality, that : definitions go into the vocabulary that is current at the time of : rather than at the time of ; (which all quality implementations done by the committee members do).

: in ( "voc" "defining-word" -- )
  get-current >r also ' execute definitions previous ' execute r> set-current ;

Use would be

in gui : init-gl ( .. -- .. ) ... ;
in gui variable foo
in gui defer bar

We perceive this solution to be more elegant (quite portable one-liner with existing words instead of changing some internals), but nonetheless thank you for pointing out the need for this functionality. Of course, it requires quality implementations for : to work, so we encourage people to implement their : without relying on the ambiguous condition.

mtrute

Why did you choose vocabularies? The Forth Standard does not contain them; it specifies wordlists instead. A better and more standard-like interface would be

IN ( wid "action" -- )

with wid being a wordlist identifier as returned by WORDLIST.

: IN get-current >r set-current ' execute r> set-current ;

mtrute

Why do you use vocabularies? They are not part of the standard and they can do harmful things to the environment. I'd suggest using word lists instead. They are standard and they are much simpler to use (and define).

  : IN ( wid "action" -- ) 
     GET-CURRENT >R SET-CURRENT ' EXECUTE R> SET-CURRENT ;

Use it as follows

WORDLIST CONSTANT gui
IN gui : init-gl ( .. -- .. ) ... ;
IN gui VARIABLE foo
IN gui DEFER bar

The constant for the word list can easily be wrapped as a vocabulary. Many Forths already have tools to show the content of a wordlist by its id (in amforth it's called show-wordlist), so there is no need for vocabularies, at least not at the standard level.

ruv

mtrute, it seems that you meant

WORDLIST CONSTANT gui
gui IN : init-gl ( .. -- .. ) ... ;
gui IN VARIABLE foo
gui IN DEFER bar

  1. For CONSTANT it is quite awkward: 10 gui IN CONSTANT ten
  2. In some Forth systems the CURRENT wordlist should not be changed while a word is being defined, or it should be the same at : and at ;

AntonErtl

Which Forth systems do not put the word in the wordlist that was current during ":" if the current wordlist is changed before ";"?

It seems to me that WLSCOPE requires at least as much effort to work as the requirement above. Therefore, standardizing that requirement is preferable to standardizing WLSCOPE.

JennyBrien

I used something like this in my JenX XML parser (EuroForth 2001) by defining CREATION and DEF: as versions of CREATE and : respectively that took a wid as a parameter.

It wasn't strictly necessary, but it did mean I could easily control where particular types of words were defined without explicitly having to use SET-CURRENT.

Example:

Wordlist ENTITY?

: CENTITY entity? creation c, DOES> ... ;

All words defined with Centity go on the Entity wordlist.

Like IN, DEF: doesn't work if ; reads the Current wordlist, so I also would like that behaviour to be standardised.
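The original JenX definitions are not shown here, so the following is only a guess at how wid-taking versions of CREATE and : could be written with standard words. The entity-wl wordlist and the names centity, ent-amp and ent-lt are hypothetical.

```forth
\ Sketch only: CREATE and : variants that take the target wordlist as a parameter.
: creation ( wid "name" -- )
   get-current >r  set-current  create  r> set-current ;

\ Relies on the definition going into the wordlist that is current at
\ the time of : (it breaks on systems where ; reads the current wordlist).
: def: ( wid "name" -- colon-sys )
   get-current >r  set-current  :  r> set-current ;

\ Hypothetical usage
wordlist constant entity-wl
: centity ( c "name" -- )  entity-wl creation c,  does> ( -- c ) c@ ;
38 centity ent-amp
entity-wl def: ent-lt ( -- c ) 60 ;
```

Both words restore the previous compilation wordlist immediately, so ordinary definitions that follow are unaffected.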

mtrute

@ruv: my IN is a parsing word (uses tick), so I think, my example is valid. But that is not the point I am after. What I find strange is the use of vocabularies when we have wordlists at hand that are simpler to use and to understand.

ruv

@AntonErtl, In SP-Forth (up to 4.21 at least) the current wordlist at ; should be the same as at :, since the SMUDGE word (an alternative to REVEAL) reveals (unhides) the last word in the current wordlist only. So, if the current wordlist was changed, the defined word will not be revealed (even though it is defined in the proper wordlist). Also, when multiple "storages" (data+code spaces, "sections" in your paper) are used, word lists are bound to sections, and the current section is changed when the current word list is changed (if they belong to different sections). By the way, temporary word lists (which reside in temporary sections) have been used in the locals implementation since the very first version in 2001 :).

@mtrute, I even tested your definition of IN: your example doesn't work. Regarding word lists, I prefer them over vocabularies too.

JimPeterson

An alternate proposal:

Add the word DEFINE ( xt c-addr u -- ) as a deferred word through which all words are added to wordlists (e.g., via CREATE, :, BUFFER:, CONSTANT, VARIABLE, VALUE, SYNONYM, etc.). Calling DEFINE adds the word specified via ( c-addr u ) to the current wordlist, with execution token xt, and sets it as the most recently-defined word (so that IMMEDIATE may affect it). While xt need not be a valid execution token, an ambiguous condition would exist if it was not valid and the wordlist in question wound up in the search order. Later, successful calls to SEARCH-WORDLIST should return the provided xt.

In this manner, the system's own wordlist mechanic can be leveraged to provide the user with arbitrary named storage capability, storing user data as xt. What's more, redefining what DEFINE is, via IS or DEFER! (while saving the original somewhere) would allow for the simple creation of facilities like WLSCOPE, etc.

We might also want a DELETE-WORDLIST ( wid -- ), to clean up later.

ruv

Add the word DEFINE ( xt c-addr u -- )

Sure, such a word should be standardized; many systems have it under names like aka, alias, naming, enroll-name, etc.

as a deferred word

A deferred word is a bad choice for an API: a deferred word cannot be redefined, because the system's words to, action-of, defer@, defer! are not applicable to the redefined word. Therefore, a separate getter and setter should be used in APIs.
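As an illustration of the getter/setter style, here is a sketch with hypothetical names (definer-xt, get-definer, set-definer, with-definer); none of them is proposed in this thread. The hook is kept in a private cell, and a helper installs a replacement only for the duration of one call.

```forth
\ Sketch only: the hook lives in a private cell and is reached through
\ an ordinary getter and setter, so both can be redefined or wrapped.
variable definer-xt                      \ hypothetical storage for the hook
: get-definer ( -- xt )  definer-xt @ ;
: set-definer ( xt -- )  definer-xt ! ;

\ Install xt-hook only while xt-body runs; restore the old hook even on THROW.
: with-definer ( i*x xt-hook xt-body -- j*x )
   get-definer >r  swap set-definer
   catch  r> set-definer  throw ;
```

Since get-definer and set-definer are plain colon definitions, a program can shadow them with its own versions that call the originals, which is not possible with IS or DEFER! on a deferred word.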

the system's own wordlist mechanic can be leveraged to provide the user with arbitrary named storage capability, storing user data as xt.

Why not expose such a mechanic under its own general API, one that is not connected to execution tokens and immediacy? It could even be implemented as a portable library.

We might also want a DELETE-WORDLIST ( wid -- ), to clean up later.

I think, this functionality cannot be implemented in most systems. In some cases marker can be used instead.
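For the cases where MARKER is enough, the pattern could look like this sketch. The names discard-scratch, scratch-wl and scratch-word are hypothetical, and it assumes the wordlist is neither in the search order nor the compilation wordlist when the marker is executed.

```forth
\ Sketch only: create a throwaway wordlist after a marker, then discard both.
marker discard-scratch
wordlist constant scratch-wl
get-current  scratch-wl set-current      \ remember the old compilation wordlist
: scratch-word ( -- n ) 42 ;             \ goes into scratch-wl
set-current                              \ restore the compilation wordlist
discard-scratch                          \ removes scratch-wl and everything after the marker
```

This only works in last-in, first-out order: everything defined after the marker goes away with the wordlist.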
