6 Glossary

6.1.0570 >NUMBER to-number CORE

( ud₁ c-addr₁ u₁ -- ud₂ c-addr₂ u₂ )

ud₂ is the unsigned result of converting the characters within the string specified by c-addr₁ u₁ into digits, using the number in BASE, and adding each into ud₁ after multiplying ud₁ by the number in BASE. Conversion continues left-to-right until a character that is not convertible, including any "+" or "-", is encountered or the string is entirely converted. c-addr₂ is the location of the first unconverted character or the first character past the end of the string if the string was entirely converted. u₂ is the number of unconverted characters in the string. An ambiguous condition exists if ud₂ overflows during the conversion.

See:

3.2.1.2 Digit conversion.
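
As a usage illustration (not part of the standard text), the following sketch shows a typical calling pattern: seed the accumulator with a double-cell zero, convert, then inspect what is left. The names PARSE-UNSIGNED and DEMO are hypothetical, and the sketch assumes BASE is ten when DEMO runs.

```forth
DECIMAL

\ Hypothetical wrapper: start from ud1 = 0 and convert as many
\ leading digits of the string as the current BASE allows.
: PARSE-UNSIGNED ( c-addr u -- ud c-addr' u' )
  0 0 2SWAP >NUMBER ;

\ Hypothetical demonstration word.
: DEMO ( -- )
  S" 123abc" PARSE-UNSIGNED   ( 123 0 c-addr' 3 )
  ." rest: " TYPE             \ unconverted tail, here abc
  ."  value: " <# #S #> TYPE  \ ud2 shown as an unsigned number
  CR ;
```

Run in decimal, DEMO should display rest: abc value: 123. A returned length of zero means the whole string was converted. Because a leading "-" or "+" is left unconverted, callers that accept signed input need to strip the sign themselves before calling >NUMBER.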

Testing:

CREATE GN-BUF 0 C,
: GN-STRING GN-BUF 1 ;
: GN-CONSUMED GN-BUF CHAR+ 0 ;
: GN' [CHAR] ' WORD CHAR+ C@ GN-BUF C! GN-STRING ;

T{ 0 0 GN' 0' >NUMBER ->         0 0 GN-CONSUMED }T
T{ 0 0 GN' 1' >NUMBER ->         1 0 GN-CONSUMED }T
T{ 1 0 GN' 1' >NUMBER -> BASE @ 1+ 0 GN-CONSUMED }T
\ FOLLOWING SHOULD FAIL TO CONVERT
T{ 0 0 GN' -' >NUMBER ->         0 0 GN-STRING   }T
T{ 0 0 GN' +' >NUMBER ->         0 0 GN-STRING   }T
T{ 0 0 GN' .' >NUMBER ->         0 0 GN-STRING   }T

: >NUMBER-BASED
BASE @ >R BASE ! >NUMBER R> BASE ! ;

T{ 0 0 GN' 2'       10 >NUMBER-BASED -> 2 0 GN-CONSUMED }T
T{ 0 0 GN' 2'        2 >NUMBER-BASED -> 0 0 GN-STRING   }T
T{ 0 0 GN' F'       10 >NUMBER-BASED -> F 0 GN-CONSUMED }T
T{ 0 0 GN' G'       10 >NUMBER-BASED -> 0 0 GN-STRING   }T
T{ 0 0 GN' G' MAX-BASE >NUMBER-BASED -> 10 0 GN-CONSUMED }T
T{ 0 0 GN' Z' MAX-BASE >NUMBER-BASED -> 23 0 GN-CONSUMED }T

: GN1 ( UD BASE -- UD' LEN )
   \ UD SHOULD EQUAL UD' AND LEN SHOULD BE ZERO.
   BASE @ >R BASE !
   <# #S #>
   0 0 2SWAP >NUMBER SWAP DROP    \ RETURN LENGTH ONLY
   R> BASE ! ;

T{        0   0        2 GN1 ->        0   0 0 }T
T{ MAX-UINT   0        2 GN1 -> MAX-UINT   0 0 }T
T{ MAX-UINT DUP        2 GN1 -> MAX-UINT DUP 0 }T
T{        0   0 MAX-BASE GN1 ->        0   0 0 }T
T{ MAX-UINT   0 MAX-BASE GN1 -> MAX-UINT   0 0 }T
T{ MAX-UINT DUP MAX-BASE GN1 -> MAX-UINT DUP 0 }T

Contributions

FrancoisLaagel [262] >NUMBER Test Patterns (Suggested Testcase) 2022-08-28 11:10:27

I am trying to implement >NUMBER on top of a working 79-STANDARD CONVERT implementation. What I have come up with so far is:

: >NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
  >R DUP >R    \ S: ud1 c-addr1 R: u1 c-addr1
  1- CONVERT   \ S: ud2 c-addr2 R: u1 c-addr1
  DUP R> -     \ S: ud2 c-addr2 c-addr2-c-addr1 R: u1
  R> SWAP - ;  \ S: ud2 c-addr2 u1-(c-addr2-c-addr1)

But I have a problem with one of the test cases:

T{ 0 0 GN' F' 10 >NUMBER-BASED -> F 0 GN-CONSUMED }T

My interpreter chokes on the F character right after -> because the current BASE is decimal. I think this test pattern is incorrect. Would anyone care to shed some light on this?

FrancoisLaagel [r860] 2022-08-29 09:47:45

Problem solved. core.fr is supposed to be run in HEX. I was running the >NUMBER test cases interactively after doubletest.fth had switched BASE to DECIMAL.
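
The effect described here can be reproduced in isolation. This sketch is an illustration, not part of the test suite: the text interpreter converts F using the BASE in force when the line is interpreted or compiled, so a literal written for hexadecimal is only recognized while BASE is sixteen. The name HEX-F is hypothetical.

```forth
HEX
: HEX-F ( -- u ) F ;   \ compiled while BASE is sixteen: the literal fifteen
DECIMAL
\ Interpreting a bare F at this point would typically be reported as an
\ undefined word (assuming no word named F exists), because F is not a
\ decimal digit.
HEX-F .                \ displays 15
```

A test line that relies on hexadecimal literals can be protected the same way, by saving BASE, executing HEX, running the line, and then restoring BASE.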

Closed


EricBlake [402] Possible reference implementation (Suggested reference implementation) 2025-08-17 00:00:26

The following uses a couple of words from the double and string sets; the alternative using only words in core is more verbose.

: >digit ( char -- d true | 0 ) \ "to-digit"
  \ convert char to a digit according to base followed by true, or false if out of range
  DUP [ '9' 1+ ] LITERAL <
  IF '0' - \ convert '0'-'9'
    DUP 0< IF DROP 0 EXIT THEN \ reject < '0'
  ELSE
    DUP 'a' < IF BL + THEN \ convert to lowercase, exploiting ASCII
    'a' -
    DUP 0< IF DROP 0 EXIT THEN \ reject non-letter < 'a'
    #10 + \ convert 'a'-'z'
  THEN
  DUP BASE @ < DUP 0= IF NIP THEN ( d true | false ) \ reject beyond base
;
: >NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 ) \ "to-number"
  2SWAP 2>R
  BEGIN ( c-addr u ) ( R: ud.accum )
    DUP WHILE \ character left to inspect
      OVER C@ >digit
    WHILE \ digit parsed within base
      2R> BASE @ 1 M*/ ( c-addr u d.digit ud.accum ) \ scale accum by base
      ROT M+ 2>R \ add current digit to accum
      1 /STRING ( c-addr1+1 u1-1 )
  REPEAT THEN
  2R> 2SWAP ( ud2 c-addr2 u2 )
;
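
For comparison, the scale-and-add step can be written with CORE words only. This is a sketch under stated assumptions, not the author's alternative: the name ACCUMULATE is hypothetical, the digit is assumed to be already validated against BASE, and overflow out of the high cell is silently discarded, which the specification leaves as an ambiguous condition.

```forth
\ Hypothetical helper: ud2 = ud1 * BASE + u, using CORE words only.
\ UM* is unsigned, so the full range of ud1 is handled.
: ACCUMULATE ( ud1 u -- ud2 )
  SWAP BASE @ UM* DROP   ( lo u hi*base )        \ scale high cell, discard overflow
  ROT BASE @ UM*         ( u hi*base p.lo p.hi ) \ p = lo * base as a double
  ROT + >R               ( u p.lo ) ( R: hi2 )   \ hi2 = p.hi + hi*base
  OVER +                 ( u sum )               \ sum = p.lo + u, may wrap
  DUP ROT U<             ( sum carry )           \ carry is true (-1) if sum wrapped
  R> SWAP - ;            ( sum hi2-carry )       \ subtracting -1 adds the carry
```

In the loop above, 2R> BASE @ 1 M*/ ROT M+ 2>R would then become 2R> ROT ACCUMULATE 2>R. The remaining non-CORE words (2>R, 2R> and /STRING) would still need replacing for a strictly CORE-only definition.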

EricBlake [r1498] 2025-08-17 00:39:27

The line DUP 'a' < IF BL + THEN (convert to lowercase, exploiting ASCII) can be shortened to BL OR.
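
A small sketch of why the shortcut works, assuming an ASCII character set: upper- and lowercase letters differ only in the bit whose value is BL (32), so setting that bit with OR maps A-Z onto a-z and leaves a-z unchanged. The name >LOWER is hypothetical. Non-letters can be altered too, which appears harmless in >digit only because the range checks that follow still reject them.

```forth
DECIMAL

\ Hypothetical helper, assuming ASCII: set the bit that separates
\ 'A'..'Z' (65..90) from 'a'..'z' (97..122).
: >LOWER ( char -- char' ) BL OR ;

CHAR A >LOWER .   \ 97, the code for a
CHAR z >LOWER .   \ 122, already lowercase, unchanged
CHAR [ >LOWER .   \ 123, a non-letter is altered as well
```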

EricBlake New Version: Possible reference implementation [r1499] 2025-08-17 15:40:24


The following uses a couple of words from the double and string sets; the alternative using only words in core is more verbose.

~~: >digit ( char -- d true | 0 ) \ "to-digit"~~
: >digit ( char -- +n true | 0 ) \ "to-digit"
  \ convert char to a digit according to base followed by true, or false if out of range
  DUP [ '9' 1+ ] LITERAL <
  IF '0' - \ convert '0'-'9'
    DUP 0< IF DROP 0 EXIT THEN \ reject < '0'
  ELSE
~~DUP 'a' < IF BL + THEN \ convert to lowercase, exploiting ASCII~~
    BL OR \ convert to lowercase, exploiting ASCII
    'a' -
    DUP 0< IF DROP 0 EXIT THEN \ reject non-letter < 'a'
    #10 + \ convert 'a'-'z'
  THEN
~~DUP BASE @ < DUP 0= IF NIP THEN ( d true | false ) \ reject beyond base~~
  DUP BASE @ < DUP 0= IF NIP THEN ( +n true | false ) \ reject beyond base
;
: >NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 ) \ "to-number"
  2SWAP 2>R
  BEGIN ( c-addr u ) ( R: ud.accum )
    DUP WHILE \ character left to inspect
      OVER C@ >digit
    WHILE \ digit parsed within base
~~2R> BASE @ 1 M*/ ( c-addr u d.digit ud.accum ) \ scale accum by base~~
      2R> BASE @ 1 M*/ ( c-addr u n.digit ud.accum ) \ scale accum by base
      ROT M+ 2>R \ add current digit to accum
      1 /STRING ( c-addr1+1 u1-1 )
  REPEAT THEN
  2R> 2SWAP ( ud2 c-addr2 u2 )
;
