17.6.2.2255 SUBSTITUTE STRING EXT
Perform substitution on the string c-addr1 u1 placing the result at string c-addr2 u3, where u3 is the length of the resulting string. An error occurs if the resulting string will not fit into c-addr2 u2 or if c-addr2 is the same as c-addr1. The return value n is positive or 0 on success and indicates the number of substitutions made. A negative value for n indicates that an error occurred, leaving c-addr2 u3 undefined. Negative values of n are implementation defined except for values in table 9.1 THROW code assignments.
Substitution occurs left to right from the start of c-addr1 in one pass and is non-recursive.
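As an illustration of the error return described above, the following sketch converts a negative n into an exception. It assumes a system that also provides the Exception word set (CATCH and THROW); the names SUBSTITUTE-THROW, /tiny, tiny-buf, long-src and try-substitute are hypothetical and exist only for this example.

```forth
: SUBSTITUTE-THROW  ( c-addr1 u1 c-addr2 u2 -- c-addr2 u3 n )
   \ Like SUBSTITUTE, but a negative result is thrown rather than returned.
   SUBSTITUTE  DUP 0< IF  THROW  THEN ;

8 CONSTANT /tiny                  \ hypothetical, deliberately small size
/tiny CHARS BUFFER: tiny-buf      \ hypothetical destination buffer
: long-src  S" this text is longer than eight characters" ;

: try-substitute  ( -- )
   \ The result cannot fit into tiny-buf, so an error is expected.
   long-src tiny-buf /tiny  ['] SUBSTITUTE-THROW CATCH
   ?DUP IF
      ." SUBSTITUTE failed with code " .  2DROP 2DROP
   ELSE
      DROP TYPE
   THEN CR ;

try-substitute
```

Because the negative values are implementation defined apart from the THROW code assignments, the code displayed here may differ between systems.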
When text of a potential substitution name, surrounded by '%' (ASCII $25) delimiters, is encountered by SUBSTITUTE, the following occurs:
- If the name is null, a single delimiter character is passed
to the output, i.e., %% is replaced by %. The current number
of substitutions is not changed.
- If the text is a valid substitution name acceptable to
17.6.2.2141 REPLACES, the leading and trailing delimiter
characters and the enclosed substitution name are replaced by
the substitution text. The current number of substitutions
is incremented.
- If the text is not a valid substitution name, the name with
leading and trailing delimiters is passed unchanged to the
output. The current number of substitutions is not changed.
- Parsing of the input string resumes after the trailing delimiter.
If after processing any pairs of delimiters, the residue of the input string contains a single delimiter, the residue is passed unchanged to the output.
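The rules above can be exercised with a short sketch. It assumes a Forth 2012 system that provides SUBSTITUTE and REPLACES, and that no substitution named "nobody" has been defined. The names /demo-buf, demo-buf, demo-sub, "who" and case1 to case4 are hypothetical choices made for this example.

```forth
64 CONSTANT /demo-buf               \ hypothetical buffer size
/demo-buf CHARS BUFFER: demo-buf    \ hypothetical destination buffer

: demo-sub  ( c-addr u -- )
   \ Expand the string, then display the result and the substitution count.
   demo-buf /demo-buf SUBSTITUTE    ( c-addr2 u3 n )
   DUP 0< IF
      DROP 2DROP ." substitution failed"
   ELSE
      >R TYPE  ."  [count: " R> . ." ]"
   THEN CR ;

: "world"  S" world" ;
: "who"    S" who" ;
"world" "who" REPLACES              \ %who% now expands to world

: case1  S" hello %who%" ;    \ defined name: replaced, count 1
: case2  S" 100%% done" ;     \ %% yields a single %, count 0
: case3  S" a %nobody% b" ;   \ name assumed undefined: copied unchanged, count 0
: case4  S" 50% off" ;        \ lone delimiter: residue copied unchanged, count 0

case1 demo-sub  case2 demo-sub  case3 demo-sub  case4 demo-sub
```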
See:
Rationale:
Translation of a sentence or message from one language to another may result in changes to the displayed parameter order. For example, the Afrikaans translation of this sentence requires a different order:
The words SUBSTITUTE and REPLACES provide for this requirement by defining a text substitution facility. For example, we can provide an initial string in the form:
Your balance at %time% on %date% is %currencyvalue%.
The % character is used as the delimiter for substitution names. The text "currencyvalue", "date" and "time" are text substitutions, where the replacement text is defined by REPLACES:
The substitution name "date" is defined to be replaced with the string "10/Nov/2014" and "time" to be replaced with "02:52". Thus SUBSTITUTE would produce the string:
Your balance at 02:52 on 10/Nov/2014 is %currencyvalue%.
As the substitution name "currencyvalue" has not been defined, it is left unchanged in the resulting string.
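The example above might be reproduced with a sketch like the following. It assumes a Forth 2012 system providing SUBSTITUTE and REPLACES, and that "currencyvalue" has not been defined as a substitution. The word names ("date-text", "time-text", "date", "time", balance-msg, /msg-buf, msg-buf) are hypothetical and are not taken from the standard.

```forth
\ Replacement texts and substitution names, wrapped in colon definitions
\ so that the strings live in the dictionary rather than a transient buffer.
: "date-text"  S" 10/Nov/2014" ;
: "time-text"  S" 02:52" ;
: "date"       S" date" ;
: "time"       S" time" ;

"date-text" "date" REPLACES      \ %date% expands to 10/Nov/2014
"time-text" "time" REPLACES      \ %time% expands to 02:52

: balance-msg  S" Your balance at %time% on %date% is %currencyvalue%." ;

80 CONSTANT /msg-buf             \ hypothetical buffer size
/msg-buf CHARS BUFFER: msg-buf   \ hypothetical destination buffer

balance-msg msg-buf /msg-buf SUBSTITUTE  ( c-addr2 u3 n )
DROP TYPE CR                     \ discard the count and show the result
```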
The return value n is nonnegative on success and indicates the number of substitutions made. In the above example, this would be two. A negative value indicates that an error occurred. As substitution is not recursive, the return value could be used to provide a recursive substitution.
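One way the return value could drive a recursive substitution is sketched below: SUBSTITUTE is repeated, alternating between two buffers because source and destination must differ, until a pass reports no substitutions. This is only an illustration under stated assumptions. The names /rbuf, rbuf-a, rbuf-b, other-buf and SUBSTITUTE-ALL are hypothetical, the limit of 8 further passes is an arbitrary guard against names that expand to themselves, and the input string is assumed not to lie in either work buffer.

```forth
128 CONSTANT /rbuf                 \ hypothetical buffer size
/rbuf CHARS BUFFER: rbuf-a         \ two hypothetical work buffers
/rbuf CHARS BUFFER: rbuf-b

: other-buf  ( c-addr -- c-addr' )
   \ Given one work buffer, return the other.
   rbuf-a = IF  rbuf-b  ELSE  rbuf-a  THEN ;

: SUBSTITUTE-ALL  ( c-addr1 u1 -- c-addr2 u2 n )
   \ n is the result of the last pass: 0 when fully expanded,
   \ negative on error, positive if the pass limit was reached.
   rbuf-a /rbuf SUBSTITUTE         ( c-addr u n )
   8 0 DO
      DUP 1 < IF  LEAVE  THEN      \ stop on error or when nothing changed
      DROP  OVER other-buf /rbuf SUBSTITUTE
   LOOP ;

: "inner-text"  S" world" ;          : "inner"  S" inner" ;
: "outer-text"  S" hello %inner%" ;  : "outer"  S" outer" ;
"inner-text" "inner" REPLACES
"outer-text" "outer" REPLACES

: nested  S" say %outer%!" ;
nested SUBSTITUTE-ALL DROP TYPE CR
```

Repeated passes also reprocess any % produced from %% in an earlier pass, so this approach suits only text that does not rely on literal delimiter characters.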
Implementation of SUBSTITUTE may be considered as being equivalent to a wordlist which is searched. If the substitution name is found, the word is executed, returning a substitution string. Such words can be deferred or multiple wordlists can be used. The implementation techniques required are similar to those used by ENVIRONMENT?. There is no provision for changing the delimiter character, although a system may provide system-specific extensions.
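The wordlist view described above can be sketched as follows. This is not the mechanism any particular system uses; it assumes the Search-Order word set is available, and the names subst-wl, greeting, lookup-subst and show-subst are hypothetical. Here each substitution is an ordinary word that returns its replacement text as c-addr u.

```forth
WORDLIST CONSTANT subst-wl          \ hypothetical wordlist of substitution names

\ Add one hypothetical substitution, "greeting", to that wordlist.
GET-CURRENT  subst-wl SET-CURRENT
: greeting  ( -- c-addr u )  S" hello" ;
SET-CURRENT                         \ restore the previous compilation wordlist

: lookup-subst  ( c-addr u -- c-addr' u' true | false )
   \ Search only subst-wl; if the name is found, execute it for its text.
   subst-wl SEARCH-WORDLIST IF  EXECUTE TRUE  ELSE  FALSE  THEN ;

: show-subst  ( c-addr u -- )
   lookup-subst IF  TYPE  ELSE  ." (no such substitution)"  THEN CR ;

: "greeting"  S" greeting" ;
: "missing"   S" missing" ;
"greeting" show-subst
"missing"  show-subst
```

Because each substitution is a word, it could equally be a deferred word or compute its text at run time, which is the flexibility the paragraph above alludes to.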
Implementation:
[UNDEFINED] bounds [IF]
  : bounds \ addr len -- addr+len addr
    OVER + SWAP
  ;
[THEN]
[UNDEFINED] -rot [IF]
  : -rot \ a b c -- c a b
    ROT ROT
  ;
[THEN]
CHAR % CONSTANT delim
\ Character used as the substitution name delimiter.
string-max BUFFER: Name
\ Holds substitution name as a counted string.
VARIABLE DestLen
\ Maximum length of the destination buffer.
2VARIABLE Dest
\ Holds destination string current length and address.
VARIABLE SubstErr
\ Holds zero or an error code.
: addDest \ char --
\ Add the character to the destination string.
  Dest @ DestLen @ < IF
    Dest 2@ + C! 1 CHARS Dest +!
  ELSE
    DROP -1 SubstErr !
  THEN
;
: formName \ c-addr len -- c-addr' len'
\ Given a source string pointing at a leading delimiter, place the name string in the name buffer.
  1 /STRING 2DUP delim scan >R DROP \ find length of residue
  2DUP R> - DUP >R Name place \ save name in buffer
  R> 1 CHARS + /STRING \ step over name and trailing %
;
: >dest \ c-addr len --
\ Add a string to the output string.
  bounds ?DO
    I C@ addDest
  1 CHARS +LOOP
;
: processName \ -- flag
\ Process the last substitution name. Return true if found, 0 if not found.
  Name COUNT findSubst DUP >R IF
    EXECUTE COUNT >dest
  ELSE
    delim addDest Name COUNT >dest delim addDest
  THEN
  R>
;
: SUBSTITUTE \ src slen dest dlen -- dest dlen' n
\ Expand the source string using substitutions.
\ Note that this version is simplistic, performs no error checking,
\ and requires a global buffer and global variables.
  DestLen ! 0 Dest 2! 0 -rot \ -- 0 src slen
  0 SubstErr !
  BEGIN
    DUP 0 >
  WHILE
    OVER C@ delim <> IF \ character not %
      OVER C@ addDest 1 /STRING
    ELSE
      OVER 1 CHARS + C@ delim = IF \ %% for one output %
        delim addDest 2 /STRING \ add one % to output
      ELSE
        formName processName IF
          ROT 1+ -rot \ count substitutions
        THEN
      THEN
    THEN
  REPEAT
  2DROP Dest 2@ ROT SubstErr @ IF
    DROP SubstErr @
  THEN
;
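The listing above uses scan and place, which are common in Forth systems but are not standard words and are not defined in this section. A minimal sketch of both, written to match the stack effects implied by their use above, might look like this. The definitions are assumptions for illustration; string-max and findSubst are likewise left undefined here and are assumed to come from an accompanying implementation of REPLACES.

```forth
[UNDEFINED] place [IF]
: place  ( c-addr1 u c-addr2 -- )
   \ Store the string c-addr1 u as a counted string at c-addr2.
   \ The text is moved first and the count stored last.
   2DUP 2>R  CHAR+ SWAP CHARS MOVE  2R> C! ;
[THEN]

[UNDEFINED] scan [IF]
: scan  ( c-addr u char -- c-addr' u' )
   \ Step forward to the first occurrence of char.
   \ u' is 0 if char does not occur in the string.
   >R
   BEGIN  DUP WHILE  OVER C@ R@ <> WHILE  1 /STRING  REPEAT THEN
   R> DROP ;
[THEN]
```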
Testing:
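\ Destination buffer used by the tests below; defined here only if not already present
[UNDEFINED] subbuff [IF] 30 CHARS BUFFER: subbuff [THEN]
\ Define a few string constants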
: "hi" S" hi" ;
: "wld" S" wld" ;
: "hello" S" hello" ;
: "world" S" world" ;
\ Define a few test strings
: sub1 S" Start: %hi%,%wld%! :End" ; \ Original string
: sub2 S" Start: hello,world! :End" ; \ First target string
: sub3 S" Start: world,hello! :End" ; \ Second target string
\ Define the hi and wld substitutions
T{ "hello" "hi" REPLACES -> }T \ Replace "%hi%" with "hello"
T{ "world" "wld" REPLACES -> }T \ Replace "%wld%" with "world"
\ "%hi%,%wld%" changed to "hello,world"
T{ sub1 subbuff 30 SUBSTITUTE ROT ROT sub2 COMPARE -> 2 0 }T
\ Change the hi and wld substitutions
T{ "world" "hi" REPLACES -> }T
T{ "hello" "wld" REPLACES -> }T
\ Now "%hi%,%wld%" should be changed to "world,hello"
T{ sub1 subbuff 30 SUBSTITUTE ROT ROT sub3 COMPARE -> 2 0 }T
\ Where the substitution name is not defined
: sub4 S" aaa%bbb%ccc" ;
T{ sub4 subbuff 30 SUBSTITUTE ROT ROT sub4 COMPARE -> 0 0 }T
\ Finally the % character itself
: sub5 S" aaa%%bbb" ;
: sub6 S" aaa%bbb" ;
T{ sub5 subbuff 30 SUBSTITUTE ROT ROT sub6 COMPARE -> 0 0 }T
Contributions:
alextangent [30] Case sensitivity (Request for clarification, 2017-07-10 18:29:41)
Is %test% considered the same string as or different from %TEST%?