Digest #15 2016-12-23
Contributions
proposal - Implementations requiring BOTH 32 bit single floats and 64 bit double floats.
The problem here is which kind of float words such as F* and F/ refer to: single (32-bit) or double (64-bit). Would it not be better to use separate naming conventions for the two precisions, such as G* G/ for double operations, C* C/ for single complex numbers and Z* Z/ for double complex numbers? Please advise.
Replies
Yes, the traverse-dir approach also has the advantage that you could pass the matching information, and either implement it on POSIX with fnmatch, or on DOS/Windows with the mask passed to FindFirstFile.
traverse-dir ( ix diraddr u1 patternaddr u2 xt -- kx ) with xt taking ( ix fileaddr u -- jx )
One question for a mkdir that automatically creates ancestors: should it remove the directories created in the previous steps if it fails at some step?
My old LAY-PATH implementation does not remove the intermediate directories on failure.
In any case, the operation is not atomic, and this should be mentioned.
Directory proposal
In order to write cross-platform and cross-system libraries, it is essential to have a means of traversing a system's file structure. This proposal is based on the only widely adopted implementation known to the authors, the one in Gforth.
Authors: Ulrich Hoffmann & Gerald Wodni
Add new Type wdirid (or the like) to section 3.x
Words for traversal:
open-dir ( c-addr u -- dirid ior )
Open the directory specified by c-addr, u and return dir-id for further access to it.
read-dir ( c-addr u1 dirid -- u2 flag ior )
Attempt to read the next entry from the directory specified by dir-id into the buffer of length u1 at address c-addr. If the attempt fails because there are no more entries, ior=0, flag=0, u2=0, and the buffer is unmodified. If the attempt fails for any other reason, return ior<>0. If the attempt succeeds, store the file name in the buffer at c-addr and return ior=0, flag=true and u2 equal to the length of the file name. If the file name is longer than u1, store the first u1 characters of the file name in the buffer and indicate "name too long" with ior, flag=true, and u2=u1.
close-dir ( dirid -- ior )
Close the directory specified by dir-id.
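The three words above can be combined into a simple listing loop. This is a minimal sketch, assuming the stack effects given in the proposal (which match the existing Gforth words); list-dir, name-buf and /name-buf are illustrative names and not part of the proposal. A name longer than the buffer is reported through ior and therefore ends the loop with a throw here.

```forth
\ Sketch: print every entry of a directory, one per line.
\ list-dir, name-buf and /name-buf are hypothetical helper names.

256 constant /name-buf
create name-buf /name-buf allot

: list-dir ( c-addr u -- )
  open-dir throw >r                        \ R: dirid
  begin
    name-buf /name-buf r@ read-dir throw   \ u2 flag
  while                                    \ flag=true: an entry was read
    name-buf swap type cr                  \ print the u2 characters stored
  repeat
  drop                                     \ discard u2=0 of the final read
  r> close-dir throw ;

s" ." list-dir
```

Note that the directory is not closed if one of the throws fires; a more careful version would wrap the loop in catch.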
traverse-dir ( ix c-addr u xt -- kx ) with xt taking ( ix c-addr-filename u-filename -- jx )
Possible alternative/addition to the three words above. Suggested by Bernd Paysan and Matthias Trute.
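To make the stack picture concrete, here is one way traverse-dir could be layered on open-dir, read-dir and close-dir. It is only a sketch under the stack effects stated above: it uses a fixed-size buffer (so an over-long name ends in a throw, unlike a native implementation that always provides a large enough buffer), and entry-buf, /entry-buf and count-entry are illustrative names.

```forth
\ Sketch: traverse-dir built on open-dir / read-dir / close-dir.
\ entry-buf, /entry-buf and count-entry are hypothetical names.

256 constant /entry-buf
create entry-buf /entry-buf allot

: traverse-dir ( ix c-addr u xt -- kx )
  >r open-dir throw r>              \ ix dirid xt
  swap >r >r                        \ ix          R: dirid xt
  begin
    entry-buf /entry-buf 2r@ drop   \ ix c-addr u1 dirid
    read-dir throw                  \ ix u2 flag
  while
    entry-buf swap r@ execute       \ xt: ( ix c-addr u -- jx )
  repeat
  drop                              \ discard u2=0 of the final read
  r> drop r> close-dir throw ;

\ Usage: count the entries of the current directory (ix is one cell).
: count-entry ( n c-addr u -- n+1 ) 2drop 1+ ;
0 s" ." ' count-entry traverse-dir .
```

Keeping dirid and xt on the return stack leaves the data stack free for ix, so xt can consume and produce any number of cells.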
dir? ( c-addr u -- flag ior )
Check if path is a directory.
make-dir ( c-addr u -- ior )
Create the directory c-addr u and all its parents. Remark: renamed mkdir-parents to mkdir, removed Unix-specific umask.
Words for paths:
Descriptions taken from the Node.js manual:
normalize-path ( c-addr-1 u1 -- c-addr-1 u-2 )
Normalize a string path, taking care of '..' and '.' parts. When multiple slashes are found, they're replaced by a single one; when the path contains a trailing slash, it is preserved.
basename-path ( c-addr-1 u1 -- c-addr-2 u-2 )
Return the last portion of a path. Similar to the Unix basename command.
dirname-path ( c-addr-1 u1 -- c-addr-2 u-2 )
Return the directory name of a path. Similar to the Unix dirname command.
extname-path ( c-addr-1 u1 -- c-addr-2 u-2 )
Return the extension of the path, from the last '.' to end of string in the last portion of the path. If there is no '.' in the last portion of the path or the first character of it is '.', then it returns an empty string.
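The rules for basename-path and extname-path can be sketched in a few lines. The code below assumes '/' as the only separator and does not strip a trailing slash (so "a/b/" yields an empty basename, unlike the Unix command); last-index is an illustrative helper, not part of the proposal.

```forth
\ Sketch: basename-path and extname-path for '/'-separated paths.
\ last-index is a hypothetical helper.

: last-index ( c-addr u char -- n )
  \ n = index just past the last occurrence of char, or 0 if absent
  >r
  begin dup while
    2dup + 1- c@ r@ = if nip r> drop exit then
    1-
  repeat
  nip r> drop ;

: basename-path ( c-addr1 u1 -- c-addr2 u2 )
  2dup [char] / last-index /string ;

: extname-path ( c-addr1 u1 -- c-addr2 u2 )
  basename-path
  2dup [char] . last-index            \ c-addr u n
  dup 2 < if drop + 0 exit then       \ no '.', or '.' is the first character
  1- /string ;                        \ keep the text from the last '.' onward

s" src/lib/paths.fs" basename-path type cr   \ prints paths.fs
s" src/lib/paths.fs" extname-path type cr    \ prints .fs
```

Both words return a substring of the input, so no allocation is needed.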
absolute-path? ( c-addr-1 u1 -- f )
Determines whether path is an absolute path. An absolute path will always resolve to the same location, regardless of the working directory.
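On a POSIX-style system the test reduces to looking at the first character. A minimal sketch under that assumption (DOS/Windows drive letters and UNC paths would need more cases):

```forth
\ Sketch: absolute-path? for POSIX-style paths only.
: absolute-path? ( c-addr u -- f )
  dup if drop c@ [char] / = else 2drop false then ;
```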
join-path ( c-addr-1 u1 c-addr2 u2 -- c-addr-3 u3 )
Join all arguments together and normalize the resulting path. Arguments must be strings. Use implicit allocation?
filename-match ( c-addr1 u1 c-addr2 u2 -- f )
Check if both paths match after expanding any '.' and '..'.
parent-dir ( c-addr1 u1 -- c-addr1 u2 )
move up one directory
parent-dir? ( c-addr1 u1 -- f )
check if there is a parent directory
: parent-dir? ( c-addr1 u1 -- f ) >r r@ parent-dir nip r> <> ;
@AntonErtl, in SP-Forth (up to 4.21 at least) the current wordlist on ; should be the same as on : since the SMUDGE word (an alternative to REVEAL) reveals (unhides) the last word in the current wordlist only. So, if the current wordlist was changed, the defined word will not be revealed (even though it is defined in the proper wordlist).
Also, when multiple "storages" (data+code spaces, "sections" in your paper) are used, word lists are bound to sections, and the current section changes when the current word list changes (if they belong to different sections). By the way, temporary word lists (which reside in temporary sections) have been used in the locals implementation since its first version in 2001 :).
@mtrute, I even tested your definition of IN: your example doesn't work. Regarding word lists, I prefer them over vocabularies too.
People, don't you think that GitHub/ForthHub is more convenient for such discussions?
Also, it seems to me that developing such proposals on GitHub (or perhaps in GitHub/ForthHub) is far more convenient than here. Version control offers too much not to use it.
@mtrute, @BernyPaysan: traverse-dir would certainly be comfortable if you want to iterate a single directory. But having open/read/close one can easily compare multiple directories and it would be trivial to implement traverse-dir.
@ruv: I agree with you that version control would add quite some convenience. However, proposals are normally developed in smaller groups and only presented to the public once they are pretty solid. At the standards meeting the committee decided to allow for experimental proposals, and Ulrich Hoffmann and I wanted to take it one step further and publish a very drafty version to collect some early feedback. Additionally, let me point out that forth-standard.org tries to centralize the committee's actions. That being said, I'll look into implementing some versioning features.
Thanks for pointing out the problems with a failing mkdir for parents!
@JennyBrien
For values the question is whether to A affects the value of both A and B or A alone. Similarly to B. If we want consistency with variables and constants then to A should change the value of both A and B and that is what I would expect.
I agree with that, but the question I was asking was rather:
I was responding to Anton's message and hadn't seen yours before I submitted mine. I think we were writing a reply at the same time!
Given Value A Synonym B A, is TO B guaranteed to work?
Yes I think so.
I think it would work naturally for a flag-setting TO but it may fail on some systems with a parsing TO,
The standard insists on a parsing to, see the specification for to, so we don't need to consider that.
As regards the rest of your reply I think I agree with much of it. My system, which has detached headers, simply provides the same xt for both B and A for synonym B A so anything that is done to A or B is done to both, whether by does>, to, is or defer!. And that seems reasonable to me although I wouldn't object to does> being forbidden.
Actually I've just thought of another issue that would break my system, I think, and that is if we have the sequence
create A
\ Possibly more definitions etc but no application of DOES> to A
: X does> ... ;
synonym B A X
i.e. A is not a create ... does> word. This may be a good reason for banning does>
@ruv I don't mind where the discussion is held. This site was set up specifically for the Forth 200X standard with a provision for comments, and so it is not unreasonable to have a discussion here.
@GeraldWodni To compare two directories, you either need to sort both (the answers of read-dir can be in any order the OS chooses; usually, the file names are returned in the order they are stored in the directory, which is very implementation specific), or take the filenames of one one-by-one and do a FILE-STATUS check in the other, and repeat the other way round, so no, read-dir doesn't help much here.
The TRAVERSE-DIR documentation needs some more words, e.g. about the lifetime of the provided buffer for xt: This is a one-shot buffer, and the data there lives only for one call of xt. On the other hand, it is always as large as needed, unlike the READ-DIR output, which can indicate that the result didn't fit.
That alone makes TRAVERSE-DIR much easier to use and harder to implement.
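Because the name buffer lives only for one call of xt, a callback that wants to keep a name has to copy it first. A minimal sketch of such a copy, with save-name and show-copy as illustrative names that are not part of the proposal:

```forth
\ Sketch: copy a name out of the one-shot buffer before keeping it.
\ save-name and show-copy are hypothetical helpers.

: save-name ( c-addr1 u -- c-addr2 u )
  dup allocate throw        \ c-addr1 u c-addr2
  swap 2dup 2>r             \ c-addr1 c-addr2 u   R: c-addr2 u
  move 2r> ;                \ c-addr2 u

\ Example callback for traverse-dir with empty ix and jx:
\ prints each name from its private copy, then releases the copy.
: show-copy ( c-addr u -- )
  save-name 2dup type cr drop free throw ;
```

A real callback would store the copied string somewhere (a list or array) instead of freeing it immediately.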
The standard insists on a parsing to, see the specification for to, so we don't need to consider that.
It insists that a parsing TO be allowed, so the name must always follow directly, without another word or line break in between as is possible with a non-parsing TO.
As regards the rest of your reply I think I agree with much of it. My system, which has detached headers, simply provides the same xt for both B and A for synonym B A so anything that is done to A or B is done to both, whether by does>, to, is or defer!. And that seems reasonable to me although I wouldn't object to does> being forbidden.
Actually I've just thought of another issue that would break my system, I think, and that is if we have the sequence
create A
\ Possibly more definitions etc but no application of DOES> to A
: X does> ... ;
synonym B A X
i.e. A is not a create ... does> word. This may be a good reason for banning does>
Either way, I think what happens here is that both A and B are set to do X. That is definitely a bad idea: the definition of a word (in this case A) should not change after it has been used - unless of course it's a Deferred word, where you state at the outset that that's your intention.
This case is covered. B has not been defined using CREATE, so the code is already non-standard.
proposal - Implementations requiring BOTH 32 bit single floats and 64 bit double floats.
The floats on the floating point stack are all one size, usually double. Only memory operations like SF@ and SF! convert to and from single.
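A short sketch of that conversion, assuming a system whose floating point stack holds values wider than 32 bits (sf-buf is an illustrative name): the value only loses precision when it passes through SF! and SF@.

```forth
\ Sketch: precision changes only at the SF! / SF@ memory boundary.
\ sf-buf is a hypothetical name for a one-element single-float buffer.
sfalign here 1 sfloats allot constant sf-buf

1e 3e f/            \ 1/3 at the precision of the FP stack
fdup sf-buf sf!     \ store a copy, rounded to 32-bit single format
sf-buf sf@          \ fetch it back, widened to the FP-stack format
f- f.               \ print the rounding error introduced by the store
```

On a system whose FP stack already holds 32-bit singles, the printed difference would be zero.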
proposal - Implementations requiring BOTH 32 bit single floats and 64 bit double floats.
My question concerned a Forth implementation which implements BOTH 32-bit single floats AND 64-bit double floats. How do you propose to name their respective words? For example, sf+ for single float+, df+ for double float+, and f+ may point to either by using SYNONYMS etc.
proposal - Implementations requiring BOTH 32 bit single floats and 64 bit double floats.
Yes, the convention you propose is what comes first to my mind. The only question is how you would name the corresponding @ and ! words.
I would recommend against having different FP sizes in a Forth system. It complicates matters and buys little to nothing on current hardware; the main benefit of smaller FP numbers is in memory and memory bandwidth requirements, and we have SF@ SF! for that already.
proposal - Implementations requiring BOTH 32 bit single floats and 64 bit double floats.
You already have df@, df!, sf@, sf!, f@ and f!, so there are no problems there. Let the f@ f! f+ ... family be the "default" float behavior, the sf@ sf! sf+ ... family be the single-float behavior, and the df@ df! df+ ... family be the double-float behavior. Moreover, sfstack?, dfstack? and fstack? would indicate which of these have a separate stack (keeping future 64-bit implementations in mind, where doubles could be kept on the main stack). The rationale for having both is analogous to single-cell vs. double-cell numbers, plus some practical requirements of implementing on modern cores such as the ARM Cortex-A9 with VFPv3, which have both options available in some cases; some vector/DSP operations (on graphics processors, for example) are also implemented as single floats. Since I am actually facing this problem in my implementation, I thought the standards community might help. Regards, and thanks for your responses.