I don't really see how the implementation I propose is insufficient for your system, unless 0<> does not return either 0 or -1. I see that TRUE is declared to return "a single-cell value with all bits set", so maybe that doesn't translate to -1 in your system?
- In a twos-complement system, a cell with all bits set is -1 when interpreted as a signed number.
- In a ones-complement system, -1 is equal to
1 INVERT
(so it has most bits set, but the least significant cleared); the value with all bits set is -0 (except that the standard points out that the usual arithmetic operations like + and * should never result in a -0; you can only get it as a flag or via bit manipulations).
- In a sign-magnitude system, -1 is equal to
1 SIGN-BIT XOR
(so it has only 2 bits set); the value with all bits set is the negative of the maximum positive signed value; this representation also has a -0, but that one has only the sign bit set. Historical sign-magnitude machines and IEEE floating point use the most significant bit of storage as the sign bit, at least when converting the bit pattern to the corresponding unsigned value; but it is also possible to have sign-magnitude where the sign bit is adjacent to the least-significant bit.
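For concreteness, here is a small Python sketch of what bit pattern -1 gets in each of the three encodings. (This assumes a 16-bit cell for illustration; the helper names are mine, not anything from the standard.)

```python
CELL_BITS = 16
MASK = (1 << CELL_BITS) - 1
SIGN_BIT = 1 << (CELL_BITS - 1)  # MSB sign bit, as on historical machines

def twos_complement(n):
    # -n wraps modulo 2**CELL_BITS, so -1 becomes all bits set
    return n & MASK

def ones_complement(n):
    # negation is bitwise INVERT within the cell
    return n & MASK if n >= 0 else (~(-n)) & MASK

def sign_magnitude(n):
    # negation toggles the sign bit: the Forth-ish "n SIGN-BIT XOR"
    return n & MASK if n >= 0 else ((-n) ^ SIGN_BIT) & MASK

print(format(twos_complement(-1), '016b'))  # 1111111111111111
print(format(ones_complement(-1), '016b'))  # 1111111111111110
print(format(sign_magnitude(-1), '016b'))   # 1000000000000001
```

Note how only the twos-complement pattern matches the standard's canonical TRUE; the sign-magnitude pattern really does have just 2 bits set.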
In all three of those numeric encodings, the expressions 1 -1 *
and 1 NEGATE
will result in -1, but -1 is not usable as a canonical flag, since the number of bits set differs by encoding. Meanwhile, a cell with all bits set treated as an unsigned integer would be the maximum unsigned integer - but only if the system does not use the standard's escape clause of capping the maximum u value at the maximum n value rather than using the full cell.
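Going the other direction, the same all-bits-set pattern decodes to three different signed values. Again a Python sketch with an assumed 16-bit cell and MSB sign bit, names my own:

```python
CELL_BITS = 16
MASK = (1 << CELL_BITS) - 1
SIGN_BIT = 1 << (CELL_BITS - 1)

def decode_twos(bits):
    return bits - (1 << CELL_BITS) if bits & SIGN_BIT else bits

def decode_ones(bits):
    # a set sign bit means the value is the negation of its bitwise INVERT
    return -((~bits) & MASK) if bits & SIGN_BIT else bits

def decode_sign_mag(bits):
    return -(bits & ~SIGN_BIT) if bits & SIGN_BIT else bits

all_set = MASK
print(decode_twos(all_set))      # -1
print(decode_ones(all_set))      # 0, i.e. "negative zero"
print(decode_sign_mag(all_set))  # -32767, the negative of the maximum
```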
(And there's the historical mess that older Forth standards allowed systems where TRUE was equal to 1 rather than all bits set, matching some common hardware setups and the C language)
It is because a cell with all bits set is interpreted as three different values in the three different encodings that Forth declares it ambiguous behavior to perform math directly on a flag value; at most, you are guaranteed that you can perform bitwise logical operations on a flag value and an integer, yielding an integer or zero (that is, a b = 1+
is ambiguous, but a b = -1 XOR 1+
is well-defined). In fact, because all bits set in ones-complement is -0, the standard is careful to describe a non-canonical flag as a cell with at least one bit set (rather than a cell with a non-zero value). Of course, when the next version of Forth requires twos-complement, it should also get rid of the ambiguity of doing math on a flag value, and we could do an editorial cleanup of all the other places where the standard was dancing around concessions to ones-complement or sign-magnitude cell behavior.
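To make the ambiguity of adding 1 to a flag concrete, here is a Python sketch (16-bit cell assumed) of what a true flag followed by 1+ would yield under two of the encodings' interpretations of the all-bits-set pattern:

```python
CELL_BITS = 16
MASK = (1 << CELL_BITS) - 1

all_set = MASK  # the canonical TRUE pattern

# Twos-complement: the pattern means -1, so TRUE 1+ gives 0.
twos_value = all_set - (1 << CELL_BITS)
print(twos_value + 1)  # 0

# Ones-complement: the pattern means -0, numerically 0, so TRUE 1+ gives 1.
ones_value = -((~all_set) & MASK)
print(ones_value + 1)  # 1
```

Two conforming systems could thus hand back different answers for the same source, which is exactly why the standard calls it ambiguous.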
On my particular system, built atop a VM with no fixed cell size (and, for that matter, no native negative-number support), I have found it easier to encode positive vs. negative using a sign-magnitude representation with the least-significant bit as the sign bit: dispatch to the right variant of + or * based on the sign bit, then shift that bit away before using the VM's native math on the resulting unsigned values. But for other operations, like XOR, I'm finding it easier to declare that I convert a number between positive and negative as if by twos-complement modular-arithmetic rules (even if that's not what actually happens in the underlying bit representation). At which point, I'm finding it easier to declare by fiat that -1 behaves as if it has all bits set (even though in the VM representation it may have only 2 bits set).
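As a rough illustration of that encoding (my own Python sketch, not the actual VM code): the sign lives in the least-significant bit, the magnitude sits above it, and addition dispatches on the two sign bits before falling back to unsigned math on the magnitudes:

```python
def encode(n):
    # magnitude shifted left one; the least-significant bit is the sign
    return (abs(n) << 1) | (1 if n < 0 else 0)

def decode(bits):
    magnitude = bits >> 1
    return -magnitude if bits & 1 else magnitude

def add(a_bits, b_bits):
    a_neg, b_neg = a_bits & 1, b_bits & 1
    a_mag, b_mag = a_bits >> 1, b_bits >> 1
    if a_neg == b_neg:
        # same sign: add the magnitudes, keep the shared sign
        return ((a_mag + b_mag) << 1) | a_neg
    # differing signs: subtract the smaller magnitude, take the larger's sign
    if a_mag >= b_mag:
        return ((a_mag - b_mag) << 1) | a_neg
    return ((b_mag - a_mag) << 1) | b_neg

print(decode(add(encode(5), encode(-3))))  # 2
print(decode(add(encode(-5), encode(3))))  # -2
```

Like any sign-magnitude scheme this one can produce a -0 pattern (here, the lone bit 1), which decode happily treats as zero.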
Despite these points, I'm far more interested in the other question: should reference implementations proposed here focus on concisely conveying the intent of the word (like defining DNEGATE as 0. 2SWAP D-), or should they attempt to offer reasonable implementations that can be used to fill out a nascent system (such as one that does not circularly depend on D-, whose own reference implementation is DNEGATE D+)?
As a user, I prefer concise implementations that convey intent, even if that leads to circular references. As an implementer, I would prefer the extra leg up of having a topological ordering for building bigger pieces out of smaller building blocks, even if that leads to more verbosity or non-optimal implementations. I guess it boils down to deciding whether the standard is intended to be more user-friendly or more implementer-friendly; in my book, the balance swings in favor of users.