Proposal: EMIT and non-ASCII values
Proposal, 2021-04-03 15:34:40
Author:
Anton Ertl
Change Log:
2021-04-03 Original proposal
Problem:
The first ideas for the xchar wordset had EMIT behave like (current) XEMIT. Then Stephen Pelc pointed out that EMIT is used in a number of programs for dealing with raw bytes, so we introduced XEMIT for dealing with extended characters. But the wording and stack effect of EMIT suggest that EMIT should deal with (possibly extended) characters rather than raw bytes. This reading is at odds with a number of implementations, and under it there is hardly any reason to have both EMIT and XEMIT.
Solution:
Define EMIT to deal with raw bytes.
I leave a corresponding proposal for KEY to interested parties.
Typical use: (Optional)
$c3 emit $a4 emit \ outputs ä on a UTF-8 system
Proposal:
Change the definition of EMIT into:
EMIT ( char -- )
Send char as raw byte to the user output device.
Rationale:
EMIT supports low-level communication of arbitrary contents, not limited to specific encodings; it corresponds to TYPEing one char/byte. To print multi-byte extended characters, the straightforward way is to use TYPE or XEMIT, but you can also print the individual bytes with multiple EMITs.
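The three routes mentioned above can be sketched side by side. This is only an illustration, assuming a UTF-8 system that provides the xchar wordset and implements EMIT as proposed; the names utf8-emit and a-uml are hypothetical and not part of the proposal. The helper shows how a code point could be turned into the individual bytes that raw-byte EMIT sends.

```forth
\ Hypothetical helper, not part of the proposal: encode the code point xc
\ as UTF-8 and send each resulting byte with raw-byte EMIT.
: utf8-emit ( xc -- )
  dup $80 u< if emit exit then              \ 1 byte:  0xxxxxxx
  dup $800 u< if
    dup 6 rshift $c0 or emit                \ 2 bytes: 110xxxxx
    $3f and $80 or emit exit                \          10xxxxxx
  then
  dup $10000 u< if
    dup 12 rshift $e0 or emit               \ 3 bytes: 1110xxxx
    dup 6 rshift $3f and $80 or emit        \          10xxxxxx
    $3f and $80 or emit exit                \          10xxxxxx
  then
  dup 18 rshift $f0 or emit                 \ 4 bytes: 11110xxx
  dup 12 rshift $3f and $80 or emit         \          10xxxxxx
  dup 6 rshift $3f and $80 or emit          \          10xxxxxx
  $3f and $80 or emit ;                     \          10xxxxxx

create a-uml  $c3 c, $a4 c,   \ the two bytes of ä (U+00E4) in UTF-8

$e4 xemit            \ XEMIT takes the code point
a-uml 2 type         \ TYPE takes the encoded bytes as a string
$c3 emit $a4 emit    \ raw-byte EMIT, one byte per call
$e4 utf8-emit        \ the same two bytes, produced by the helper
```

Under the stated assumptions, each of the last four lines should send the same two bytes to the output device. On a system where EMIT is an alias of XEMIT, the third and fourth lines would behave differently.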
Reference implementation:
create emit-buf 1 chars allot       \ scratch buffer holding one char
: emit ( char -- )
  emit-buf c! emit-buf 1 type ;     \ store char, then TYPE it as a 1-char string
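The correspondence between EMIT and TYPE can also be read in the other direction. The following is a minimal sketch, assuming EMIT sends raw bytes as proposed; type-via-emit and euro-bytes are hypothetical names used only for illustration.

```forth
\ Hypothetical word: behaves like TYPE by sending each char with EMIT.
\ COUNT fetches the char at c-addr and leaves the next address.
: type-via-emit ( c-addr u -- )
  0 ?do count emit loop drop ;

create euro-bytes  $e2 c, $82 c, $ac c,   \ UTF-8 encoding of U+20AC

euro-bytes 3 type-via-emit   \ sends the three bytes one at a time
euro-bytes 3 type            \ sends the same three bytes as a string
```

With a raw-byte EMIT both calls should produce identical output. With an EMIT that treats its argument as an extended character, the first call would encode each byte separately.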
Existing practice:
Gforth, SwiftForth, and VFX implement EMIT as dealing with raw bytes (tested with the "typical use" above), but Peter Fälth's system implements EMIT as an alias of XEMIT, and iForth prints two funny characters. It is unclear whether any existing programs would be affected by the proposed change.
Testing:
This cannot be tested from a standard program, because there is no way to inspect the output of EMIT.