6.1.0570 >NUMBER to-number CORE

( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )

ud2 is the unsigned result of converting the characters within the string specified by c-addr1 u1 into digits, using the number in BASE, and adding each into ud1 after multiplying ud1 by the number in BASE. Conversion continues left-to-right until a character that is not convertible, including any "+" or "-", is encountered or the string is entirely converted. c-addr2 is the location of the first unconverted character or the first character past the end of the string if the string was entirely converted. u2 is the number of unconverted characters in the string. An ambiguous condition exists if ud2 overflows during the conversion.
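The behaviour described above can be illustrated with a few small definitions. This is a minimal sketch, not part of the standard text: the DEMO- names are hypothetical, S" is used inside colon definitions so that only its compilation semantics are needed, and NIP discards c-addr2 so that the remaining results are easy to read.

```forth
DECIMAL

\ Hypothetical demo words; each returns ud2 (two cells) and u2.

\ Conversion stops at the first character that is not a digit in BASE.
\ Leaves 123 0 3: "123" is converted, the 3 characters "abc" remain.
: DEMO-PARTIAL ( -- ud2 u2 )   0 0 S" 123abc" >NUMBER NIP ;

\ A non-zero ud1 is scaled by BASE for every digit that is added.
\ Leaves 1234 0 0: ((12 * 10) + 3) * 10 + 4, string fully converted.
: DEMO-CONTINUE ( -- ud2 u2 )  12 0 S" 34" >NUMBER NIP ;

\ A leading sign is not convertible, so nothing is consumed.
\ Leaves 0 0 2: ud2 equals ud1 and both characters remain.
: DEMO-SIGN ( -- ud2 u2 )      0 0 S" -5" >NUMBER NIP ;
```

Because signs are rejected, a caller that wants signed input would have to inspect and skip a leading "-" itself before calling >NUMBER.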

Testing:

CREATE GN-BUF 0 C,
: GN-STRING GN-BUF 1 ;
: GN-CONSUMED GN-BUF CHAR+ 0 ;
: GN' [CHAR] ' WORD CHAR+ C@ GN-BUF C! GN-STRING ;

T{ 0 0 GN' 0' >NUMBER ->         0 0 GN-CONSUMED }T
T{ 0 0 GN' 1' >NUMBER ->         1 0 GN-CONSUMED }T
T{ 1 0 GN' 1' >NUMBER -> BASE @ 1+ 0 GN-CONSUMED }T
\ FOLLOWING SHOULD FAIL TO CONVERT
T{ 0 0 GN' -' >NUMBER ->         0 0 GN-STRING   }T
T{ 0 0 GN' +' >NUMBER ->         0 0 GN-STRING   }T
T{ 0 0 GN' .' >NUMBER ->         0 0 GN-STRING   }T

: >NUMBER-BASED
   BASE @ >R BASE ! >NUMBER R> BASE ! ;

T{ 0 0 GN' 2'       10 >NUMBER-BASED ->  2 0 GN-CONSUMED }T
T{ 0 0 GN' 2'        2 >NUMBER-BASED ->  0 0 GN-STRING   }T
T{ 0 0 GN' F'       10 >NUMBER-BASED ->  F 0 GN-CONSUMED }T
T{ 0 0 GN' G'       10 >NUMBER-BASED ->  0 0 GN-STRING   }T
T{ 0 0 GN' G' MAX-BASE >NUMBER-BASED -> 10 0 GN-CONSUMED }T
T{ 0 0 GN' Z' MAX-BASE >NUMBER-BASED -> 23 0 GN-CONSUMED }T

: GN1 ( UD BASE -- UD' LEN )
   \ UD SHOULD EQUAL UD' AND LEN SHOULD BE ZERO.
   BASE @ >R BASE !
   <# #S #>
   0 0 2SWAP >NUMBER SWAP DROP    \ RETURN LENGTH ONLY
   R> BASE ! ;

T{        0   0        2 GN1 ->        0   0 0 }T
T{ MAX-UINT   0        2 GN1 -> MAX-UINT   0 0 }T
T{ MAX-UINT DUP        2 GN1 -> MAX-UINT DUP 0 }T
T{        0   0 MAX-BASE GN1 ->        0   0 0 }T
T{ MAX-UINT   0 MAX-BASE GN1 -> MAX-UINT   0 0 }T
T{ MAX-UINT DUP MAX-BASE GN1 -> MAX-UINT DUP 0 }T

Contributions

FrancoisLaagel [262] >NUMBER Test Patterns (Suggested Testcase) 2022-08-28 11:10:27

I am trying to implement >NUMBER on top of a working 79-STANDARD CONVERT implementation. What I have come up with so far is:

: >NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
   >R DUP >R      \ S: ud1 c-addr1                      R: u1 c-addr1
   1- CONVERT     \ S: ud2 c-addr2                      R: u1 c-addr1
   DUP R> -       \ S: ud2 c-addr2 c-addr2-c-addr1      R: u1
   R> SWAP - ;    \ S: ud2 c-addr2 u1-(c-addr2-c-addr1)

But I have a problem with one of the test cases:

T{ 0 0 GN' F' 10 >NUMBER-BASED -> F 0 GN-CONSUMED }T

My interpreter chokes on the F character right after -> because the current BASE is decimal. I think this test pattern is incorrect. Would anyone care to shed some light on this?

FrancoisLaagel

Problem solved. core.fr is supposed to be run in HEX. I was running the >NUMBER test cases interactively after doubletest.fth had switched BASE to DECIMAL.
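The base dependence behind this can be reproduced in isolation. The following is a minimal sketch, assuming only a standard >NUMBER; TRY-F is a hypothetical name, and NIP discards c-addr2 so that only ud2 and u2 remain.

```forth
\ Hypothetical helper: try to convert the one-character string "F"
\ in whatever BASE is current, returning ud2 and u2.
: TRY-F ( -- ud2 u2 )  0 0 S" F" >NUMBER NIP ;

\ With BASE set to ten, "F" is not a digit:
\   DECIMAL TRY-F   leaves 0 0 1   (nothing converted, one character left)
\ With BASE set to sixteen, "F" is the digit fifteen:
\   HEX TRY-F       leaves fifteen 0 0   (string fully converted)
```

The same applies to the expected value to the right of ->: a bare F is parsed by the text interpreter in the current BASE, which is why the test only passes when the file is run in HEX.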

Closed

EricBlake [402] Possible reference implementation (Suggested reference implementation) 2025-08-17 00:00:26

The following uses a couple of words from the double and string sets; the alternative using only words in core is more verbose.

: >digit ( char -- d true | 0 ) \ "to-digit"
  \ convert char to a digit according to base followed by true, or false if out of range
  DUP [ '9' 1+ ] LITERAL <
  IF '0' - \ convert '0'-'9'
    DUP 0< IF DROP 0 EXIT THEN \ reject < '0'
  ELSE
    DUP 'a' < IF BL + THEN \ convert to lowercase, exploiting ASCII
    'a' -
    DUP 0< IF DROP 0 EXIT THEN \ reject non-letter < 'a'
    #10 + \ convert 'a'-'z'
  THEN
  DUP BASE @ < DUP 0= IF NIP THEN ( d true | false ) \ reject beyond base
;
: >NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 ) \ "to-number"
  2SWAP 2>R
  BEGIN ( c-addr u ) ( R: ud.accum )
    DUP WHILE \ character left to inspect
      OVER C@ >digit
    WHILE \ digit parsed within base
      2R> BASE @ 1 M*/ ( c-addr u d.digit ud.accum ) \ scale accum by base
      ROT M+ 2>R \ add current digit to accum
      1 /STRING ( c-addr1+1 u1-1 )
  REPEAT THEN
  2R> 2SWAP ( ud2 c-addr2 u2 )
;
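
As an illustration of the more verbose route mentioned at the top of this post, the scale-and-add step could be written without M*/ and M+ roughly as follows. This is only a sketch under stated assumptions, not the contributor's code: UD*BASE+ is a hypothetical name, and the carry handling assumes that a true flag is all bits set and that cells use two's-complement arithmetic.

```forth
\ Hypothetical helper (assumption, not from the post above):
\ ud2 = ud1 * BASE + u, using CORE words only.
\ Any overflow out of the high cell is silently discarded.
: UD*BASE+ ( ud1 u -- ud2 )
  >R                  ( lo hi )            ( R: u )
  BASE @ UM* DROP     ( lo hi*base )       \ keep low cell of hi*BASE
  SWAP BASE @ UM*     ( hi*base p.lo p.hi ) \ double-cell product lo*BASE
  ROT +               ( p.lo hi' )         \ combine both high-cell parts
  SWAP R> OVER +      ( hi' p.lo sum )     \ add the digit to the low cell
  DUP ROT U<          ( hi' sum carry )    \ true if that addition wrapped
  ROT SWAP -          ( sum hi'' )         \ subtracting true (-1) adds 1
;
```

With such a helper, the two lines in the loop that use M*/ and M+ would reduce to 2R> ROT UD*BASE+ 2>R.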

EricBlake

The line

    DUP 'a' < IF BL + THEN \ convert to lowercase, exploiting ASCII

can be shortened to BL OR.

EricBlake New Version: Possible reference implementation

The following uses a couple of words from the double and string sets; the alternative using only words in core is more verbose.

: >digit ( char -- +n true | 0 ) \ "to-digit"
  \ convert char to a digit according to base followed by true, or false if out of range
  DUP [ '9' 1+ ] LITERAL <
  IF '0' - \ convert '0'-'9'
    DUP 0< IF DROP 0 EXIT THEN \ reject < '0'
  ELSE
    BL OR \ convert to lowercase, exploiting ASCII
    'a' -
    DUP 0< IF DROP 0 EXIT THEN \ reject non-letter < 'a'
    #10 + \ convert 'a'-'z'
  THEN
  DUP BASE @ < DUP 0= IF NIP THEN ( +n true | false ) \ reject beyond base
;
: >NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 ) \ "to-number"
  2SWAP 2>R
  BEGIN ( c-addr u ) ( R: ud.accum )
    DUP WHILE \ character left to inspect
      OVER C@ >digit
    WHILE \ digit parsed within base
      2R> BASE @ 1 M*/ ( c-addr u n.digit ud.accum ) \ scale accum by base
      ROT M+ 2>R \ add current digit to accum
      1 /STRING ( c-addr1+1 u1-1 )
  REPEAT THEN
  2R> 2SWAP ( ud2 c-addr2 u2 )
;