Proposal: [344] Support for single line comments during `evaluate`

Informal

ruv [344] Support for single line comments during `evaluate` (Proposal) 2024-06-17 20:47:11

Author

Ruv

Change Log

2024-06-17 Initial version

Problem

Sometimes evaluate has to be applied to a string that contains multiple text lines (text fragments separated by a line terminator sequence) and single-line comments (i.e., the word "\"). In this case the word "\" skips all the remaining text up to the end of the string. The desired behavior is that, if the parse area contains a line terminator, "\" skips the text only up to (and including) the nearest line terminator.
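A minimal sketch of the problem, assuming a system that provides s\" (the name two-lines is hypothetical and used only for illustration):

```forth
\ A two-line source string: a commented first line, then a second line.
: two-lines ( -- c-addr u )
  s\" 1 \\ first line\n 2 " ;

\ Evaluate it and show what is left on the data stack.
two-lines evaluate  .s
```

Under the current wording of 6.2.2535, "\" discards the rest of the parse area, so the 2 on the second line is never interpreted. Under the proposed wording, both 1 and 2 would be left on the stack. Some systems may already behave the second way.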

Previous work

Some known discussions/posts on this problem:

2021-11-22 New Line characters in a string passed to EVALUATE
2022-10-16 Portable line-oriented parsing

Solution

Change the glossary entry 6.2.2535 \ (in CORE EXT) to include the functionality from 7.6.2.2535 \, so that this word works as expected regardless of the kind of input source. Namely, include this functionality:

parse and discard the portion of the parse area corresponding to the remainder of the current line.

The behavior will not change when the input source is a file, since in this case the input buffer contains only a single line.

Proposal

Remove the glossary entry 7.6.2.2535 \ (in BLOCK EXT)

In the glossary entry 6.2.2535 \ (in CORE EXT), replace the text description for the Execution semantics with the following:

Parse and discard the portion of the parse area corresponding to the remainder of the current line. \ is an immediate word.

Reference implementation

A portable implementation (redefinition) of the word \ follows.

: evaluation ( -- flag )
  \ Return a flag: is the input source a string being evaluated.
  [defined] blk [if] blk @ 0<> if false exit then [then]
  source-id -1 =
;
: source-following ( -- sd )
  \ Return the parse area (a string).
  \ NB: the returned string may contain a line-terminator sequence in any position.
  source >in @ /string
;
: skip-source-line ( -- )
  \ Discard a part of the parse area that belongs to the current line.
  evaluation 0= if ['] \ execute exit then
  source-following  over >r  s\" \n"  dup >r  search  if drop r@ then  +  r> drop
  r> -  >in +!
;
: \ ( -- )
  skip-source-line
; immediate

Testing

t{ s\" 1 \\ \n drop 0 " evaluate -> 0 }t
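A few additional cases could be checked in the same style. This is only a sketch: it assumes the redefined \ from the reference implementation is loaded together with the t{ ... }t test harness, and it is not part of the original proposal. Each comment is separated from the line terminator by ordinary text, so the cases do not depend on how a space-delimited parse treats control characters.

```forth
\ Two commented lines followed by an uncommented one.
t{ s\" 1 \\ a\n 2 \\ b\n 3 " evaluate -> 1 2 3 }t

\ A comment on the last line, with no line terminator after it.
t{ s\" 1 \\ no terminator" evaluate -> 1 }t

\ A string that consists of a single comment line only.
t{ s\" \\ only a comment\n" evaluate -> }t
```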

AntonErtl [r1238] 2024-06-18 07:17:32

What about other parsing words, e.g., s", or user-defined parsing words?

If you want to support multi-line evaluate, wouldn't it be better to extend evaluate to present only the first line to parsing, and then, after refill, the next line, and so on? Then \ should work automatically as intended.

ruv [r1240] 2024-06-18 08:15:46

What about other parsing words, e.g., s", or user-defined parsing words?

This can be solved by adding: "When parsing from a text string using a space delimiter, control characters shall be treated the same as the space character" (i.e., the same as for a file).

So parse-name will skip a line terminator in a string being evaluated too. (It seems that most systems already behave this way.)

s" will be able to parse multiple text lines in a string being evaluated (if " is not found in the current text line), but I don't see any problem with that.

If you want to support multi-line evaluate, wouldn't it be better to extend evaluate to present only the first line to parsing, and then, after refill, the next line, and so on? Then \ should work automatically as intended.

My idea is that a program should not depend on refill, i.e., on whether the parse area contains a single text line or multiple text lines.
Changing \ is almost portable, i.e., it can be implemented via a polyfill (a portable module).
This approach is slightly more efficient, since the text does not need to be broken into lines before it is fed to the text interpreter.
It involves fewer internal states and is easier to implement (compared to supporting refill when evaluating a string).
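For comparison, the line-by-line alternative can be roughly approximated in user code by splitting the string and evaluating each line separately. This is only a sketch under stated assumptions, not the refill-based mechanism itself: the names newline, split-line and evaluate-lines are hypothetical, s\" \n" is assumed to match the line terminator used in the string, and the evaluated text must leave the return stack balanced.

```forth
: newline ( -- c-addr u )  s\" \n" ;

: split-line ( c-addr1 u1 -- c-addr2 u2 c-addr1 u3 )
  \ Split off the first line (c-addr1 u3); c-addr2 u2 is the rest of the text
  \ after the line terminator (empty if there is no terminator).
  2dup newline search if        ( c-addr1 u1 c-addr3 u4 )
    dup >r                      \ remember the length from the terminator on
    newline nip /string         ( c-addr1 u1 c-addr2 u2 )
    2swap r> -                  ( c-addr2 u2 c-addr1 u3 )
  else                          ( c-addr1 u1 c-addr1 u1 )
    + 0 2swap                   ( c-addr2 0 c-addr1 u1 )
  then ;

: evaluate-lines ( i*x c-addr u -- j*x )
  \ Evaluate the text one line at a time, so the standard \ only ever
  \ sees a single line in the parse area.
  begin dup while
    split-line 2swap 2>r evaluate 2r>
  repeat 2drop ;
```

With this approach a parsing word such as s" cannot span several lines, and the text is scanned for terminators before it reaches the text interpreter, which is the extra work mentioned above.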
