Digest #109 2020-08-12

Contributions

[146] 2020-08-11 21:39:27 StephenPelc wrote:

proposal - 2020 Forth Standards meeting agenda

2020 Forth Standards Meeting

1-3 Sept 2020, Online

We expect to be using BigBlueButton or Zoom or Webex or some such. If you want to rant or rave about online meeting tools, it's outside the scope of this document. For reasonable discussion, contact Stephen Pelc, stephen@mpeforth.com

Schedule

The standards meeting will be Monday-Thursday from 2:30 pm to 6:30 pm CEST with a short bio-break at 4:30. An alternative would be to do Tuesday to Thursday and Sunday afternoon. The latter solution fits with at least one committee member who doesn't do Mondays!

Participants

Review of Procedures

Covid consequences
Brexit consequences
Payment for services/licences

Reports

Chair
Editor
Technical
Treasurer

Review of Proposals and Activities

Recognisers

Stay as experimental proposal?

Separate POSTPONE action?

Impact of dot parser on POSTPONE?

Multi-tasking from APH
Ambiguous condition and IMMEDIATE

TC answer by Bernd, 2019-09-12 15:19:24

Move from RUV, any further action?

CS-DROP from UH

say orig and dest must be same size

Go to vote?

Case insensitivity

ASCII case insensitivity only.

Go to vote?

Remove the “rules of FIND” (BP)

Locals word set?

Go to vote?

Reference implementation of SYNONYM (AE, RUV)

Broken reference implementation.

New reference implementation.

VOCABULARY (UH)
Unfindable definitions (RUV)
Case sensitivity in [IF] and friends.
FIND
FIND-NAME
License (JK, RUV)
String, REPLACES (RUV)

Error if macro does not exist during compilation?

Why RECURSE is needed (BI) Pick a TC answer.
Input values other than true and false [IF]

Pick flag as z/nz, vote, TC response

sample implementation that can also be interpreted (MAX)

Adopt RUV's response as TC answer.

Better wording for Colon (RUV)
NAME>INTERPRET wording (RUV)
The parts of execution semantics and the calling definition (RUV)
Recognizer RfD rephrase 2020 (UH)

Move to recogniser workshop

"(" typo in a testcase (RUV)

Assign to editor

Wording: declare undefined interpretation semantics for locals (RUV)

Remove ambiguous conditions

Word set of S>D word (RUV) Leave as is?
Same name token for different words (RUV)
Recognizer for locals (RUV)
There is error in testing SM/REM (MB)

Pass to editor

Defer Implementation (Tolich)
Recogniser (BP)

Move to recogniser workshop.

Does wording imply that if you SYNONYM a word with the same name (JN)
What happens when parse reaches the end of the parse area? (JN)
TEST instead of TEAT in F.1 para 2 (JN)

Pass to editor

Workshop Topics

Workshops are topics for discussion outside the formal meeting.

Future Document Format
Stack comments

stack comments should be parseable

Stack naming S: D: F: N: R:

stack effect notation

stack effect conventions

Test suites

Philosophy

J Hayes sequencing

G Jackson suite

Workshop reports

Consideration of proposals + CfV votes

Matters arising

Any other business

Date of next meeting

Replies

[r396] 2020-08-10 18:22:48 AntonErtl replies:

requestClarification - What happens when parse reaches the end of the parse area and the parse delimiter was not found?

I would love to point out chapter and verse for every issue, but at the volume of questions you are asking, this would be a full-time job. So below, I will just point out my understanding of the issues, and leave it to you to read the appropriate sections for details. Relevant sections are "3.4.1 Parsing", "3.3.3.5 Input buffers" (and the related block and file sections mentioned there), "11.6.1.1718 INCLUDED" (and similar words), "6.1.1360 EVALUATE" (and its blocks version) and "7.6.1.1790 LOAD".

1. Normal parsing does not REFILL. You need to REFILL explicitly; words that REFILL internally explicitly mention it (e.g., "11.6.1.0080 (")
2. Yes, reaching the end of the parse area (i.e., line in case of INCLUDED) ends parsing unless otherwise mentioned.
3. EVALUATE has specific effects (such as setting SOURCE-ID to -1) that are inappropriate during, e.g., INCLUDED. But you can certainly have a common factor between EVALUATE and INCLUDED for processing an input buffer.
4. Loading the whole file into RAM is probably in the spirit (and if the wording does not allow it, it probably should), but you then have to treat each line as a separate input buffer.
5. "Why is the standard dictating implementation instead of behavior?" It describes how a standard system behaves. If a standard program cannot detect the difference, you can implement it differently. In the present case, could a standard program write to the file during INCLUDED? That would be detectable, but as mentioned in 4, I think it should not be.
6. The way you describe your implementation, I expect it is not compliant. But one would have to look at the implementation, and then consider whether one can write a standard program that detects the difference. As a first step, you could run Gerry Jackson's test suite.
7. "But I am wondering what specific standard behaviors will break by doing it this way?" Every program that uses "0 parse" to get the rest of the line will not behave as intended. Every program that uses REFILL to switch to the next line will not behave as intended. Possibly more breakage.
8. "If not, is it possible to get the standard changed to allow this?" Very unlikely, because breaking existing programs (and such programs exist) is a no-no.
9. "But, is this really the intent of the standard?" Yes. Mitch Bradley made it clear. And if you read the references mentioned above, it is not just the intent, but also the wording of the standard.
10. Multi-line "(" is standard. See "11.6.1.0080 (". Multi-line ." S" and C" are not.
11. "can PARSE and all the words using it be changed to say they only go to the end of the line if the terminator is not found?" The standard already says that in 3.4.1. But if you want, you can propose changing the wording of the standard. I don't think it would help, though; it would just inflate the standard, making it harder to find the relevant text for other issues. We have this site for clarifying issues.
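To make the points about "0 parse" and explicit REFILL concrete, here is a minimal sketch of the kind of program that relies on line-at-a-time parse areas. The names rest-of-line and skip-next-line are made up for this illustration, and the code assumes that character 0 never occurs in the source text.

```forth
\ Return whatever is left of the current parse area (for INCLUDED,
\ the rest of the current line). Assumption: character 0 does not
\ occur in the source, so PARSE never finds its delimiter and stops
\ at the end of the parse area.
: rest-of-line ( "ccc<eol>" -- c-addr u )
  0 parse ;

\ Discard the rest of this line and the whole of the next line.
\ REFILL returns false when the input source is exhausted.
: skip-next-line ( -- )
  rest-of-line 2drop        \ drop what is left of the current line
  refill if                 \ explicitly fetch the next line
    rest-of-line 2drop      \ and drop it as well
  then ;
```

On a system where the parse area spans the whole file instead of one line, rest-of-line would swallow everything up to the end of the file, which is the breakage described above.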

[r397] 2020-08-11 06:29:33 JamesNorris replies:

requestClarification - What happens when parse reaches the end of the parse area and the parse delimiter was not found?

Normal parsing does not REFILL. You need to REFILL explicitly; words that REFILL internally explicitly mention it (e.g., "11.6.1.0080 (")

I'll have to unfix " then... I made it only one line.

Yes, reaching the end of the parse area (i.e., line in case of INCLUDED) ends parsing unless otherwise mentioned.

The definition of PARSE above needs to say what is returned when the end of the parse area is reached and the terminator was not found.

EVALUATE has specific effects (such as setting SOURCE-ID to -1) that are inappropriate during, e.g., INCLUDED. But you can certainly have a common factor between EVALUATE and INCLUDED for processing an input buffer.

It was my understanding from reading the standard that implementing SOURCE-ID was optional.
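For context, SOURCE-ID is a Core extension word whose meaning is extended by the File word set. On a system that provides it, the value mentioned in the quoted point can be observed with a sketch like the following; .source is a hypothetical name, and block input is ignored here.

```forth
\ Report the current input source as seen through SOURCE-ID.
\  0  = user input device
\ -1  = string (the input source set up by EVALUATE)
\ any other value = fileid of a text file (File word set)
\ Assumption: block input (BLK non-zero) is not distinguished.
: .source ( -- )
  source-id
  dup 0=   if drop ." user input device" exit then
      -1 = if      ." string (EVALUATE)" exit then
  ." text file" ;
```

Executing .source from inside an evaluated string should report the string case, while executing it from an included file should report a text file.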

Loading the whole file into RAM is probably in the spirit (and if the wording does not allow it, it probably should), but you then have to treat each line as a separate input buffer.

It does not. Why would I have to treat each line as a separate input buffer when I can instead consider line terminators to be white space when parsing words, and treat line terminators as delimiters when parsing for the end of the line in a line comment? This way I avoid unnecessary copying.

"Why is the standard dictating implementation instead of behavior?" It describes how a standard system behaves. If a standard program cannot detect the difference, you can implement it differently. In the present case, could a standard program write to the file during INCLUDED? That would be detectable, but as mentioned in 4, I think it should not be.

My reply to number 4 illustrates my question. There is more than one way to do a line comment, or parse words and achieve the same result.

The way you describe your implementation, I expect it is not compliant. But one would have to look at the implementation, and then consider whether one can write a standard program that detects the difference. As a first step, you could run Gerry Jackson's test suite.

I would like to use the test suite, but as far as I can tell the repository on GitHub is missing a license, which means it is copyrighted and not available to the public. https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/licensing-a-repository

"But I am wondering what specific standard behaviors will break by doing it this way?" Every program that uses "0 parse" to get the rest of the line will not behave as intended. Every program that uses REFILL to switch to the next line will not behave as intended. Possibly more breakage.

I fixed PARSE to stop at the end of the line by treating a line terminator as the end of the parse area for PARSE. I'm still loading the entire file into a buffer and then parsing the whole buffer in one go for PARSE-NAME. I just have to treat line terminators as white space delimiters for PARSE-NAME.
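The line-terminator handling described here could look roughly like the sketch below, which finds the end of the current line inside an in-memory file image without copying anything. It is not the poster's actual code: line-length and next-line are hypothetical names, and lines are assumed to end with a single line feed (character 10).

```forth
10 constant lf-char   \ assumption: a line ends with one line feed

\ Length of the first line in the region c-addr u, terminator excluded.
\ If no line feed is found, the whole region counts as one line.
: line-length ( c-addr u -- u' )
  dup >r                         \ keep u for the "not found" case
  0 ?do
    dup i chars + c@ lf-char = if
      drop i unloop r> drop exit \ found: the index is the line length
    then
  loop
  drop r> ;

\ Split off the first line. Returns the remaining text (after the
\ terminator) and the line itself; both point into the original region.
: next-line ( c-addr u -- c-addr2 u2 c-addr u3 )
  2dup line-length >r            ( c-addr u )  ( R: u3 )
  over swap                      ( c-addr c-addr u )
  r@ 1+ over min /string         ( c-addr c-addr2 u2 )
  rot r> ;                       ( c-addr2 u2 c-addr u3 )
```

The pair c-addr u3 returned by next-line is a slice of the file image, so an implementation could use it as the current parse area while the rest of the file stays where it is.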

"If not, is it possible to get the standard changed to allow this?" Very unlikely, because breaking existing programs (and such programs exist) is a no-no.

Even if no existing programs or behaviors are broken? Such as doing a line comment in a different way that still has the interpreter skipping to the end of the line?

Multi-line "(" is standard. See "11.6.1.0080 (". Multi-line ." S" and C" are not.

I'll have to unfix that one too...

"can PARSE and all the words using it be changed to say they only go to the end of the line if the terminator is not found?" The standard already says that in 3.4.1. But if you want, you can propose changing the wording of the standard. I don't think it would help, though; it would just inflate the standard, making it harder to find the relevant text for other issues. We have this site for clarifying issues.

I missed this on first reading it. I'm guessing you are referring to this phrase:

3.4.1 Unless otherwise noted, the number of characters parsed may be from zero to the implementation-defined maximum length of a counted string.

It doesn't exactly say it's one line.... And PARSE doesn't return a counted string so... not sure why this is in there. Had to look up 'counted string' in the definition of terms which says:

counted string: A data structure consisting of one character containing a length followed by zero or more contiguous data characters. Normally, counted strings contain text.

And 'one character' is implementation defined. It's usually a byte but some use something larger to hold characters. So technically you are correct but I'm not sure of the reason for this rule.
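As a small illustration of the definition quoted above, the following sketch contrasts a counted string with the c-addr u form that PARSE returns. The names greeting and .max-counted are made up for this example.

```forth
\ C" compiles a counted string; at run time it returns the address
\ of the length character.
: greeting ( -- c-addr ) c" hello" ;

greeting c@ .          \ the length character: displays 5
greeting count type    \ COUNT yields c-addr u: displays hello

\ Query the implementation-defined maximum length of a counted string.
: .max-counted ( -- )
  s" /COUNTED-STRING" environment? if u. else ." not available" then ;
```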

[r398] 2020-08-11 12:30:40 JamesNorris replies:

requestClarification - What happens when parse reaches the end of the parse area and the parse delimiter was not found?

"The definition of PARSE above needs to say what is returned when the end of the parse area is reached and the terminator was not found."

I found this in the extended description of parsing:

"Otherwise, the string continues up to and including the last character in the parse area, and the number in >IN is changed to the length of the input buffer, thus emptying the parse area."

So it does say what to do if the end of the parse area is reached and the delimiter was not found. But it wouldn't hurt to have it up there with the short description of PARSE. Something like "If no end delimiter remains in the parse area, the parsed string equals the entire parse area." I'm not going to go so far as to make a proposal since it is already in the extended description.
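The quoted rule can be checked interactively with a sketch along these lines; show-parse is a hypothetical name.

```forth
\ Parse up to a closing parenthesis. If none remains in the parse
\ area, PARSE returns everything up to the end of the parse area.
: show-parse ( "ccc<paren>" -- )
  [char] ) parse                 ( c-addr u )
  ." parsed: [" type ." ]"
  ."  >IN: " >in @ u.
  ."  buffer length: " source nip u. ;
```

Typing show-parse followed by text without a closing parenthesis as the last thing on a line should display the rest of that line, and the two numbers should be equal, since >IN is set to the length of the input buffer.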