6.2.2020 PARSE-NAME CORE EXT

( "<spaces>name<space>" -- c-addr u )

Skip leading space delimiters. Parse name delimited by a space.

c-addr is the address of the selected string within the input buffer and u is its length in characters. If the parse area is empty or contains only white space, the resulting string has length zero.
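
As a minimal usage sketch, the two words below call PARSE-NAME from inside a colon definition. The names name-length and show-name are hypothetical helpers invented for this illustration and are not part of the standard. The returned string lives inside the input buffer, so a program that needs it after the line has been processed would have to copy it first.

```forth
\ Hypothetical helper words, not part of the standard.
: name-length ( "name" -- u )
   PARSE-NAME NIP ;          \ keep only the length u

: show-name ( "name" -- )
   PARSE-NAME                \ c-addr u, pointing into the input buffer
   DUP 0= IF  2DROP ." (empty)"  EXIT  THEN
   DUP . ." chars: " TYPE ;

show-name    hello           \ prints: 5 chars: hello
```

The leading spaces before hello are skipped, and the trailing comment is left for the text interpreter because parsing stops at the end of the name.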

Implementation:

: isspace? ( c -- f )
   BL 1+ U< ;   \ true for every character code up to and including BL

: isnotspace? ( c -- f )
   isspace? 0= ;

: xt-skip ( addr1 n1 xt -- addr2 n2 )
   \ skip all characters satisfying xt ( c -- f )
   >R
   BEGIN
     DUP
   WHILE
     OVER C@ R@ EXECUTE
   WHILE
     1 /STRING
   REPEAT THEN
   R> DROP ;

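: parse-name ( "name" -- c-addr u )
   SOURCE >IN @ /STRING              \ unparsed remainder of the input buffer
   ['] isspace? xt-skip OVER >R      \ skip leading delimiters, save start of name
   ['] isnotspace? xt-skip ( end-word restlen r: start-word )
   2DUP 1 MIN + SOURCE DROP - >IN !  \ advance >IN past the name and at most one delimiter
   DROP R> TUCK - ;                  \ c-addr = start-word, u = end-word - start-word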
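
The `2DUP 1 MIN +` step means that >IN ends up just past the name and at most one trailing delimiter. The sketch below makes that visible on a system whose PARSE-NAME behaves like the implementation above (an assumption, as is the helper name after-name): PARSE does not skip leading delimiters, so any extra space after the name shows up in its result.

```forth
\ Hypothetical demo word: discard a name, then return the text up to ")".
: after-name ( "name ccc<paren>" -- c-addr u )
   PARSE-NAME 2DROP          \ consumes the name and one trailing delimiter
   [CHAR] ) PARSE ;          \ rest of the parse area up to the next ")"

\ Two spaces separate abc and def; only the first one is consumed,
\ so the string returned by PARSE is " def" (length 4).
after-name abc  def) TYPE
```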

Testing:

T{ PARSE-NAME abcd S" abcd" S= -> <TRUE> }T
T{ PARSE-NAME   abcde   S" abcde" S= -> <TRUE> }T

\ test empty parse area
T{ PARSE-NAME 
   NIP -> 0 }T    \ empty line
T{ PARSE-NAME    
   NIP -> 0 }T    \ line with white space

T{ : parse-name-test ( "name1" "name2" -- n ) 
   PARSE-NAME PARSE-NAME S= ; -> }T

T{ parse-name-test abcd abcd -> <TRUE> }T
T{ parse-name-test  abcd   abcd   -> <TRUE> }T
T{ parse-name-test abcde abcdf -> <FALSE> }T
T{ parse-name-test abcdf abcde -> <FALSE> }T
T{ parse-name-test abcde abcde 
    -> <TRUE> }T

T{ parse-name-test abcde abcde  
    -> <TRUE> }T    \ line with white space

Contributions

AntonErtl [35] Space delimiters and white space (Comment, 2017-10-26 16:45:24)

For the meaning of space delimiters, see 3.4.1.1 and 11.3.5. The same meaning is intended for "white space".


ruv [92] Etymology and naming convention issue (Comment, 2019-07-09 17:08:46)

In the definition names NAME>STRING, NAME>INTERPRET, NAME>COMPILE and FIND-NAME (in a proposal), the "NAME" part means (connotes) a name token. The same holds for >NAME and NAME> from the Forth-83 standard (where it was a name field).

The PARSE-NAME definition has nothing to do with name tokens. Here the "NAME" part means something else: a sequence of arbitrary characters delimited by spaces.

This difference in meaning exposes an inconsistency in the naming convention. This inconsistency can be solved in the following ways:

a. Use another word in place of "NAME" in this definition name, for example "LEXEME" (e.g. PARSE-LEXEME).

b. Use another word in place of "NAME" in the names in the first group, for example "NT" (e.g. FIND-NT).

AntonErtl

PARSE-NAME was originally proposed as PARSE-WORD, but that name has conflicting behaviour in some widely-used systems, so PARSE-NAME was chosen.

While consistency in naming conventions is desirable, and it would be nice if no name existed that might be considered to belong to a different group of words than it actually belongs to, that seems too high a standard to satisfy in all cases. And in particular, I don't think the benefits of this principle are worth the cost that renaming existing standard words would entail.

ruv

When PARSE-NAME was chosen (2005), the definitions from the first group (2012) had not yet been proposed. So this inconsistency did not exist at that time (and I had no objection to the name PARSE-NAME either). It appeared much later.

Perhaps in the future some kind of namespaces will be introduced in Forth to distinguish the groups of words. But for now we should choose names more carefully. I suppose that inconsistent naming confuses newcomers and gives the impression of a badly designed programming language. So I would like to make the Forth standard better.

I agree that we can't just rename existing standard words. But we can add synonyms with better names and keep the old variants for backward compatibility.
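
One way the synonym idea could be tried out locally is with SYNONYM from the optional Programming-Tools extension word set, assuming the system provides it. PARSE-LEXEME is only the hypothetical name from option (a) above, not a standard word, and lexeme-length is an illustrative helper.

```forth
\ Assumes the optional TOOLS EXT word SYNONYM is available.
\ PARSE-LEXEME is a hypothetical alias, not a standard word.
SYNONYM PARSE-LEXEME PARSE-NAME

: lexeme-length ( "lexeme" -- u )
   PARSE-LEXEME NIP ;        \ behaves exactly like PARSE-NAME NIP

lexeme-length abcd .         \ prints 4
```

The old name keeps working unchanged, which is the backward-compatibility property the comment asks for.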
