Interpretation semantics for this word are undefined.


( "ccc<quote>" -- )
Parse ccc delimited by " (double-quote), using the translation rules below. Append the run-time semantics given below to the current definition.

Translation rules:

Characters are processed one at a time and appended to the compiled string. If the character is a '\' character, it is processed by parsing and substituting one or more characters as follows, where the character after the backslash is case sensitive:

\a BEL (alert, ASCII 7)
\b BS (backspace, ASCII 8)
\e ESC (escape, ASCII 27)
\f FF (form feed, ASCII 12)
\l LF (line feed, ASCII 10)
\m CR/LF pair (ASCII 13, 10)
\n newline (implementation dependent, e.g., CR/LF, CR, LF, LF/CR)
\q double-quote (ASCII 34)
\r CR (carriage return, ASCII 13)
\t HT (horizontal tab, ASCII 9)
\v VT (vertical tab, ASCII 11)
\z NUL (no character, ASCII 0)
\" double-quote (ASCII 34)
\x<hexdigit><hexdigit> The resulting character is the conversion of these two hexadecimal digits. An ambiguous condition exists if \x is not followed by two hexadecimal characters.
\\ backslash itself (ASCII 92)

An ambiguous condition exists if a \ is placed before any character other than those defined here.
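
As an illustration of the translation rules above, here is a minimal sketch. The word names .ALERT and ESC-LEN are made up for this example, and the strings are arbitrary; it only assumes a system that provides S\".

```forth
\ S\" is used inside definitions only here: the Core-ext version
\ defines no interpretation semantics.
: .ALERT   ( -- )    S\" \aWarning:\tlow battery\x21\n" TYPE ;

\ \x41 is the letter A, \z is NUL, \\ is a backslash: three characters.
: ESC-LEN  ( -- u )  S\" \x41\z\\" NIP ;
```

.ALERT sends BEL, the text with an embedded tab, an exclamation mark (hex 21) and the system's newline sequence. ESC-LEN leaves the length of the compiled string, which is 3 under the rules above.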


( -- c-addr u )
Return c-addr and u describing a string consisting of the translation of the characters ccc. A program shall not alter the returned string.



AntonErtl, 2017-04-16 08:03:17, Proposal: Core-ext S\" should reference File-ext S\"

The references don't link to File-ext S\". At least one user thought that the standard does not define interpretation semantics for S\" at all.

Do we actually need Core-ext S\" at all? Isn't File-ext S\" enough? Are there any systems that implement the Core-Ext version and not the File-ext version?

BerndPaysan, 2017-04-16 21:54:19

Actually, the file-ext version doesn't make much sense. S" is in file-ext because, for the file words, you need strings in interpretation mode. S\", however, is for strings with special characters, and file systems often forbid these special characters.

IMHO, the core ext S\" should have interpretation semantics, and the file-ext one should go. Minimalistic systems which don't want the burden of interpretation semantics in S" will certainly not implement the rather complicated parser for S\".

gnuarm, 2017-04-17 13:40:27

I can see S\" being useful in embedded systems to construct strings for comms. Since embedded systems often don't have file systems it would be awkward to include the file words just to get a version of S\". To me the question is why it needs to be in File-Ext?
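
To make the comms use case concrete, here is a small sketch. Everything in it is hypothetical: SEND-STRING stands in for whatever system-specific word transmits a string (here it simply calls TYPE), and the modem command and the STX/ETX framing are invented for illustration.

```forth
\ Hypothetical output word; a real system would write to a UART or similar.
: SEND-STRING  ( c-addr u -- )  TYPE ;

\ A command terminated by a carriage return.
: MODEM-RESET  ( -- )  S\" ATZ\r" SEND-STRING ;

\ A frame wrapped in STX (hex 02) and ETX (hex 03): 1 + 4 + 1 = 6 characters.
: FRAME-LEN    ( -- u )  S\" \x02PING\x03" NIP ;
```

None of this needs the file words; it only needs the compile-time behaviour described for the Core-ext version.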

I thought the lack of interpretation semantics in Core-Ext was so the string could be stored in non-volatile memory, but I don't know if that holds water. I'm also unclear on the requirement to not modify the string. Is that related to allowing the string to be compiled into non-volatile memory? If so, why is that still imposed on the File-Ext version, which has added interpretation semantics?

I guess I'm not clear on the reasons for these two restrictions.

AntonErtl, 2017-04-17 14:43:17

You can include interpretive S\" in your system even if you don't have file words. You may or may not be able to claim that the system has file-ext S\", but who cares about that anyway. But sure, putting interpretive S\" in file-ext is questionable. Should we move it elsewhere? Not sure. Disadvantage: Links to http://forth-standard.org/standard/file/Seq would break. Is the advantage big enough to justify that?

The reason for the core-ext and file-ext S\" is probably because S" also has this split.

The reason why a program must not change the string returned by compiled S\" is the same in both versions: the c-addr may point to compiled code. Changing the string may change the string that is returned the next time that code is executed; or it may not; or it may cause an exception (if the string is protected by an MMU); or it may not even have an effect on the string that is returned this time (if the string is in ROM). Interestingly, that restriction is specified only for the compiled string: it's in the run-time semantics, which are not exercised by the interpretation semantics.
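
A program that does need to modify such a string can copy it first. The following is a minimal sketch of that idiom, not something taken from the standard: BUF, PROMPT and PROMPT-COPY are names invented here, and the buffer size of 32 characters is an arbitrary choice that is large enough for this string.

```forth
CREATE BUF 32 CHARS ALLOT   \ writable scratch buffer (arbitrary size)

\ The returned string is read-only: it may live in code space or ROM.
\ "ready" plus \m (CR/LF pair) is 7 characters.
: PROMPT       ( -- c-addr u )  S\" ready\m" ;

\ Copy the compiled string into BUF and return the writable copy.
: PROMPT-COPY  ( -- c-addr u )
   PROMPT  TUCK  BUF SWAP CHARS MOVE   \ stack after MOVE: u
   BUF SWAP ;                          \ stack: buf u
```

Changes made through the address returned by PROMPT-COPY affect only BUF, so the compiled string stays as it was regardless of where the system stores it.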

BerndPaysan, 2017-04-17 22:31:57

Links don't break if we use a "moved permanently" redirect for expired links; that's a problem with a technical solution.

We might want to introduce something like an "optional functionality" of a word, without having it in two places. The interpretation semantics of S" and S\" are optional. Note that S\" is in FILE EXT, so it's optional even if you have the file wordset. The file wordset would require the interpretation semantics of S" (no longer optional if you have the file wordset).

StephenPelc, 2017-04-19 16:35:24

IMHO the S" and S\" words should be moved out of the File word sets. It's just confusing and the conditions of 20 years ago no longer apply.

However, the decisions about interpretation and compilation semantics should surely be deferred until we have a notation for building separate semantics without invoking state-smart words. There are ways to do this that may even satisfy both Anton and Stephen, but they probably need to be locked into a room together at some time.

I am preparing a paper on the topic, but need more time before it can be published.


GerryJackson, 2018-01-30 08:13:52, Suggested reference implementation: Reference implementation for S\"

A reference implementation for S\" exists at http://www.forth200x.org/escaped-strings.html. Shouldn't this be included in Annex E?

Similarly, test cases for S\" should be in Annex F; the above link also has these.