There's also the realization that computer memory, except for a few
specialized Forth chips, is always made from RAM. So ideological
devotion to a pure stack VM seems to pass up perfectly good hardware
capabilities.
Gforth does support address-like locals if you want to use them.
With competent Forth compilers, the machine code is 1) the same when
using stack operations, when using the return stack, or when using
locals
If you want to use a language that is "ideologically devoted" to the
architecture, maybe you shouldn't use Forth at all - and stick with C.
I know there are situations when there are six values on the data stack
and four on the return stack which leave you with few other options. But
you can always use vanilla variables or an extra stack (which is trivial
to implement) to remedy that.
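The "extra stack" mentioned above can be sketched in a few lines of standard Forth. This is only an illustration, not code from any posting in this thread: the names AUX-SIZE, AUX-STACK, AUX-SP, >AUX, AUX> and AUX@ are made up, the depth of 16 cells is arbitrary, and there are no overflow or underflow checks.

```forth
\ Sketch of an auxiliary stack (all names are hypothetical).
16 constant aux-size                      \ capacity in cells, chosen arbitrarily
create aux-stack  aux-size cells allot    \ storage area
variable aux-sp   aux-stack aux-sp !      \ points at the next free cell

: >aux ( x -- )  aux-sp @ !  1 cells aux-sp +! ;     \ push x
: aux> ( -- x )  -1 cells aux-sp +!  aux-sp @ @ ;    \ pop x
: aux@ ( -- x )  aux-sp @ 1 cells - @ ;              \ copy the top item
```

For example, `10 20 >aux >aux` moves both values out of the way, and two `aux>` calls bring them back in their original order.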
Using Forth means being resourceful. Not to choose the most convenient
and lazy solution imaginable.
I don't see anything about C that is closer to the hardware than Forth
is, and I think that both languages are about equally '"ideologically devoted" to the architecture'. In particular, a C local variable is
no closer to a register (the most efficient hardware feature for
storing data) than a stack item or return stack item is, and register allocation of any of the three is similarly difficult...
[email protected] (Anton Ertl) writes:
With competent Forth compilers, the machine code is 1) the same when
using stack operations, when using the return stack, or when using
locals
"Competent Forth compilers" there describes what by Forth standards
would be called quite fancy optimizing compilers ("analytic compilers").
They are a significant technical feat and there aren't that many of
them. Traditionally Forth has been implemented as simple interpreters.
r 1->0 third 1->2 >l >l 1->1 dup 1->1 mov -$08[r14],r13 mov r15,$10[r10] >l 1->1 mov [r10],r13
2->1 add r14,$08 mov rax,rbp mov rbx,[r14] mov -$08[r10],r15 mov rax,[rbx] lea rbp,-$08[rbp] add r14,$08
In that case, a pure stack VM seems to ignore capabilities of the
underlying hardware. Particularly, the stack's memory actually
being RAM.
Doesn't PICK go back to the earliest days of Forth, as a way
to bypass the limitation?
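For comparison, here is how a copy of the third stack item can be obtained with PICK and with plain stack juggling. A sketch only: the word names are invented for this illustration and chosen so they do not clash with existing words.

```forth
\ Both words have the same stack effect; the names are hypothetical.
: third-pick   ( a b c -- a b c a )  2 pick ;            \ direct access by depth
: third-juggle ( a b c -- a b c a )  >r over r> swap ;   \ via the return stack
```

The PICK version reaches into the stack interior directly, while the second version has to park the top item on the return stack first.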
I believe early C compilers didn't attempt much if any register
allocation.
The
difference was that the C compiler generated straightforward assembly
code to access those variables even when they were in the stack
interior. You didn't have to use ROT or juggle stuff to the R stack to
get to the inner elements.
Forth for whatever reason
chose strict stack discipline (with some loopholes like PICK). I
understand wanting to stay with purity of a model, but a more
hardware-sympathetic model would have been "stack implemented in RAM".
So I still don't understand the benefit of the "pure abstract stack"
approach, other than for a few weird special CPUs.
[email protected] (Anton Ertl) writes:
I don't see anything about C that is closer to the hardware than Forth
is, and I think that both languages are about equally '"ideologically
devoted" to the architecture'. In particular, a C local variable is
no closer to a register (the most efficient hardware feature for
storing data) than a stack item or return stack item is, and register
allocation of any of the three is similarly difficult...
I believe early C compilers didn't attempt much if any register
allocation. You could say "register int x" to manually assign a
register to x if one was available. You were limited to 2 or 3 of those
on the PDP-11. Local variables in C otherwise lived in the stack. The
difference was that the C compiler generated straightforward assembly
code to access those variables even when they were in the stack
interior. You didn't have to use ROT or juggle stuff to the R stack to
get to the inner elements.
In assembler, you could also program in a stack-oriented style yet
straightforwardly access the inner elements. Forth for whatever reason
chose strict stack discipline (with some loopholes like PICK). I
understand wanting to stay with purity of a model, but a more
hardware-sympathetic model would have been "stack implemented in RAM".
So I still don't understand the benefit of the "pure abstract stack"
approach, other than for a few weird special CPUs.
             locals
         with    without  ratio
max      3.56us   2.69us  1.32
strcmp  83.20us  70.50us  1.18
- anton
String handling and move operation are the exception, because
they are both simpler and faster in low level.
Simpler is the argument (especially for i86).
Faster is the bonus.
Hans Bezemer <[email protected]> writes:
I don't see anything about C that is closer to the hardware than Forth
is, and I think that both languages are about equally '"ideologically devoted" to the architecture'. In particular, a C local variable is
no closer to a register (the most efficient hardware feature for
storing data) than a stack item or return stack item is, and register allocation of any of the three is similarly difficult (with big
differences in difficulty between solutions that provide some register allocation to those that are so reliable that you usually count on
them).
Using Forth means being resourceful. Not to choose the most convenient
and lazy solution imaginable.
According to <https://www.dictionary.com/browse/resourceful>:
|able to deal skillfully and promptly with new situations,
|difficulties, etc.
Forth systems that do not implement locals are not a new situation.
So do you mean to say that it is a difficulty?
But blaming the programmer for the system implementor's failings is a
tactic used widely by system implementors (in the C world as well as
in the Forth world).
(..) and they often find some arguments that appeal to
elitism (i.e., only the chosen ones can use this programming language
for the elite as it should be used, and the others should program in
Python or "should never have been allowed to touch a keyboard" (Ulrich Drepper)).
In any case, why should it be better to use an inconvenient solution
that requires more work rather than a convenient solution that
requires less work (i.e., is lazy)?
For me virtues in programming are to produce correct code, to produce
it quickly, the code should use the resources economically (which does
not mean that saving a few bytes on a machine with GBs of memory is
virtuous), and the code should be readable and maintainable.
[email protected] writes:
String handling and move operation are the exception, because
they are both simpler and faster in low level.
Simpler is the argument (especially for i86).
Faster is the bonus.
In other words, Forth without locals is not well suited for words
that have so much active data. That is also reflected in hardware
designed for Forth, which got additional registers like A or B (or
additional capabilities for the top of the return stack register R),
which make it simpler and faster to implement such words.
A definition of STRCMP in the paper is
: strcmp { addr1 u1 addr2 u2 -- n }
  addr1 addr2
  u1 u2 min 0
  ?do { s1 s2 }
    s1 c@ s2 c@ - ?dup
    if
      unloop exit
    then
    s1 char+ s2 char+
  loop
  2drop
  u1 u2 - ;
So in the loop we have a loop count (on the return stack), two cursors
(s1 and s2) into the compared strings, and within the loop body we additionally have the two characters, for a total of five live values,
three of which survive across iterations and are changed in every
iteration. One could implement it as
\ untested, and the following versions, too
: strcmp { addr1 u1 addr2 u2 -- n }
  u1 u2 min 0
  ?do
    addr1 i + c@ addr2 i + c@ - ?dup
    if
      unloop exit
    then
  loop
  u1 u2 - ;
where only one of the values changes in each iteration, but now the ?DO...LOOP cannot be replaced with a version that does not store a
second value but counts down (or up) to 0, so now we have a total of 6
live values, four of which survive across iterations, and one is
changed on every iteration.
One can reduce this by one value by keeping one of the cursors in the
loop counter:
: strcmp {: addr1 u1 addr2 u2 -- n :}
  addr2 addr1 - {: offset :}
  u1 u2 min addr1 + addr1 ?do
    i c@ i offset + c@ - ?dup
    if
      unloop exit
    then
  loop
  u1 u2 - ;
So now we have five live values in the body of the loop at the same
time, three of which live across iterations, and one of which changes
in each iteration. Keeping the loop parameters separate significantly lessens the load on the data stack.
Let's see if we can eliminate the local from the loop body:
: strcmp {: addr1 u1 addr2 u2 -- n :}
  addr2 addr1 - ( offset )
  u1 u2 min addr1 + addr1 ?do ( offset )
    i c@ over i + c@ - ?dup
    if
      nip unloop exit
    then
  loop
  drop u1 u2 - ;
That leaves stack purists with the task of eliminating the locals from
the prologue and epilogue of this word. Two items have to be stored
across the loop, or the difference could be computed speculatively and
only one item stored across the loop. And the computations before the
loop involve four values alive at the same time (fortunately addr2
does not live long). Let's see:
: strcmp ( addr1 u1 addr2 u2 -- n )
  rot 2dup - >r ( addr1 addr2 u2 u1 R: n1 )
  min -rot over - ( u12 addr1 offset R: n1 )
  swap rot bounds ( offset limit start R: n1 )
  ?do ( offset R: n1 loop-sys )
    i c@ over i + c@ - ?dup
    if
      nip unloop r> drop exit
    then
  loop
  drop r> negate ;
As can be seen by the many stack comments, the stack load here is more
than I can easily deal with.
Maybe a stack purist can improve on that. But can he improve it
enough to make it as easy to understand as any of the versions with
locals?
- anton
On Sat, 25 Apr 2026 10:22:16 GMT
[email protected] (Anton Ertl) wrote:
<SNIP>
I recently reviewed the string comparison for search-wordlist
and came up with the following
The string stored in the word header is already uppercased.
So string comparison will be case insensitive
: UC ( c -- c' ) \ uppercase char
dup $61 $7B within $20 and - ;
: NCOMP4 ( addr n addr' n' - f) \ 0 is match
dup >r
begin
rot = while \ str cstr
r> dup 1- >r
while \ str cstr
swap count uc \ cstr str' s1
rot count \ str' s1 cstr' c1
repeat
2drop r> drop 0 exit
then
2drop r> drop 1 ;
First iteration in the loop it does not compare chars but the length!
BR
Peter
On 25-04-2026 07:26, Anton Ertl wrote: [reinserted deleted, relevant context]
Hans Bezemer <[email protected]> writes:
If you want to use a language that is "ideologically devoted" to the
architecture, maybe you shouldn't use Forth at all - and stick with C.
I don't see anything about C that is closer to the hardware than Forth
is, and I think that both languages are about equally '"ideologically
devoted" to the architecture'. In particular, a C local variable is
no closer to a register (the most efficient hardware feature for
storing data) than a stack item or return stack item is, and register
allocation of any of the three is similarly difficult (with big
differences in difficulty between solutions that provide some register
allocation to those that are so reliable that you usually count on
them).
Well, you're actually shooting at Paul Rubin - not at me. Thank you! I
take all the help I can get!
(..) and they often find some arguments that appeal to
elitism (i.e., only the chosen ones can use this programming language
for the elite as it should be used, and the others should program in
Python or "should never have been allowed to touch a keyboard" (Ulrich
Drepper)).
It's your own pal Bernd that said: "A good programmer will write even
better code in Forth. A bad programmer will write abysmal code in Forth.
And I'm sorry to say - but most programmers are quite bad."
So, either you agree with him or we have an unfortunate departure of one
of the most foremost members of Gforth. Because this states - in no
uncertain words - that Forth programmers *ARE* elite.
It would be better to think deeply, find an original solution and learn.
Like Albert with his brilliant ;: word.
Hans Bezemer
[email protected] writes:
String handling and move operation are the exception, because
they are both simpler and faster in low level.
Simpler is the argument (especially for i86).
Faster is the bonus.
In other words, Forth without locals is not well suited for words
that have so much active data. That is also reflected in hardware
designed for Forth, which got additional registers like A or B (or
additional capabilities for the top of the return stack register R),
which make it simpler and faster to implement such words.
A definition of STRCMP in the paper is
: strcmp { addr1 u1 addr2 u2 -- n }
addr1 addr2
u1 u2 min 0
?do { s1 s2 }
s1 c@ s2 c@ - ?dup
if
unloop exit
then
s1 char+ s2 char+
loop
2drop
u1 u2 - ;
- anton
This one is about a third bigger than yours - if we disregard the "UC",
that is:
: comp
rot over - if drop 2drop true exit then
0 ?do
over i chars + c@ over i chars + c@ -
if drop drop unloop true exit then
loop drop drop false
;
In 4tH, it is even visually more compact:
: comp
rot over - if drop 2drop true ;then
0 ?do over i [] c@ over i [] c@ - if drop drop unloop true ;then loop
drop drop false
;
The extra length comes mainly from the three different possible exits:
- It's not the same size (first line);
- It's not the same content (exit within loop);
- It's the same thing (after loop).
I can't say I particularly like the use of "COUNT" here - because it
actually represents "C@+" - except for the first run. Neither am I very
happy with the BEGIN..WHILE..WHILE..REPEAT..THEN construct - but that's
not your fault ;-)
All that being said, I cannot deny it is a clever piece of code using
the full capabilities of the language, bravo!
Hans Bezemer
...
In the case of Forth and locals this tactic has not worked very well,
so even Forth, Inc. (who have been the most vocal among the commercial
Forth providers about their dislike of locals) have implemented
locals.
...
And traditionally Forth has been implemented without locals, for the
same reason: It takes less memory and, for the system implementor,
less work
In any case, when it comes to performance measurements on "simple interpreters" like the Gforth of 1994, Forth code with locals usually
turns out to be slower and consume more memory than Forth code using
(and trying to avoid) stack juggling.
... looking at the code for Gforth for 3DUP.3 compared to the others,
Gforth still uses more primitives ...
You seem to argue that the random-access aspect of locals provides a performance advantage on simple systems, but in most cases, code using
locals is at a performance disadvantage on such systems
(and traditionalists have often used that to argue against locals).
Keeping at least one stack item in a register leads to a smaller and
faster implementation, and is not more complex than keeping all the
stack memory in RAM.
A way to use RAM that is less frowned upon by Forth traditionalists is (global) variables. The fact that the use of global variables is
frowned upon in the wider programming community for various reasons
seems to pour oil into the fire of their elitism.
Hans Bezemer <[email protected]> writes:
We do have N>R (https://forth-standard.org/standard/tools/NtoR). So if
the whole problem is "there is no more room on the FP stack", there is
a way out.
That must be pretty new (it's not in gforth 0.7.3)
so I wonder how
helpful it really is.
In any case, it does not help with FP stack limitations at all,
because N>R transfers cells from the data stack to the return stack.
N>R was suggested as a way to implement horribleness #2 but it would
actually have to be FN>R or something like that.
Paul Rubin <[email protected]d> writes:
Hans Bezemer <[email protected]> writes:
We do have N>R (https://forth-standard.org/standard/tools/NtoR). So if
the whole problem is "there is no more room on the FP stack", there is
a way out.
That must be pretty new (it's not in gforth 0.7.3)
It was accepted into Forth-200x at the 2010 standards meeting.
so I wonder how
helpful it really is.
We have two uses in the Gforth sources. I.e., not particularly
useful.
In any case, it does not help with FP stack limitations at all,
because N>R transfers cells from the data stack to the return stack.
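To make the point concrete: N>R and NR> (Forth 2012 TOOLS EXT) move cells, and only cells, between the data stack and the return stack. A minimal sketch, assuming a system that provides both words; the word name STASH-DEMO is made up for this illustration.

```forth
\ STASH-DEMO is a hypothetical name; N>R and NR> are the standard words.
: stash-demo ( a b c -- a b c )
  3 n>r          \ a b c and the count 3 now live on the return stack
  \ ... work that needs a clear data stack would go here ...
  nr> drop ;     \ bring a b c back and drop the count
```

Nothing here touches floating-point values, so on a system with a separate FP stack this gives no relief for FP stack depth.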
My take on FP stack depth limitations in some systems is that you use
as much FP stack as you need, and a Forth system (like Gforth) where
you can make the FP stack as deep as available memory and address
space permit, and publish that. Maybe it will inspire the system implementors with shallow FP stacks to provide deep FP stacks, at
least optionally.
However, when I did something that required a deep FP stack (adding up
an array with pairwise addition <[email protected]>), I actually worked
around the limitations of systems that only provide a shallow FP
stack. But that was easy enough in that case.
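For readers unfamiliar with the technique, a recursive pairwise summation might look like the following. This is a sketch under assumptions, not the code from the referenced posting: the name FSUM-PAIRWISE is made up, and a separate FP stack is assumed. Each pending recursion level keeps one partial sum on the FP stack, so the depth needed grows roughly with the logarithm of the array length.

```forth
\ Hypothetical word: sums u floats starting at f-addr by splitting in halves.
: fsum-pairwise ( f-addr u -- ) ( F: -- r )
  dup 2 u< if                     \ base cases: 0 or 1 elements
    if  f@  else  drop 0e  then  exit
  then
  2dup 2/ recurse                 \ sum of the first u/2 elements
  dup 2/ tuck - >r                ( f-addr u/2 ) ( R: u-u/2 )
  floats +  r> recurse            \ sum of the remaining elements
  f+ ;
```

On a system with an 8-deep hardware FP stack, the recursion above would therefore only be safe for fairly short arrays.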
Concerning systems with FP stack limits, AFAIK VFX has FP packages
that support very deep stacks, including the SSE-based package that
used to be the default in VFX64 for a while.
iForth implements a deep stack: it uses the 387 stack within a
definition and stores the FP stack items that are on the 387 stack to
memory on calls, and if the FP stack would overflow from the
computations within a word. I think this is a good approach: Much FP computation time is spent in words that do not call other words, or at
least the FP stack items do not live across the calls. iForth seems
to overdo it, however, even code like
: bar
dup f@ cell+ dup f@ cell+ dup f@ cell+
dup f@ cell+ dup f@ cell+ dup f@ cell+
f+ f+ f+ f+ f+ ;
which uses only 6 FP stack items does not produce the obvious code,
but something significantly longer: It first performs 6 FLD
instructions corresponding to the 6 F@, then stores 4 FP items,
presumably on the memory FP stack, and only then starts the additions (interleaved with some other code).
- anton
[email protected] (Anton Ertl) writes:
In any case, it does not help with FP stack limitations at all,
because N>R transfers cells from the data stack to the return stack.
In the code I mentioned, I wasn't running out of FP stack space, but
rather, I didn't see how to write the function in any non-horrible way without using FP locals. Horrible ways included: 1) implementing a
separate FP stack in memory for intermediate values during the
recursion, or 2) using ugly hacks to stash FP values on the regular data stack.
N>R was suggested as a way to implement horribleness #2 but it would
actually have to be FN>R or something like that.
lxf uses the cpu FP stack. I think that is one of the worse decisions
I made for it. It will fail on all but the simplest complex fp math
operations. For lxf64 a priority was to have a separate in memory
FP stack. It has worked out very well!
BR
Peter
[email protected] (Anton Ertl) writes:
And traditionally Forth has been implemented without locals, for the
same reason: It takes less memory and, for the system implementor,
less work
A simple implementation of locals doesn't sound like that much work?
Mostly you need a runtime scheme to make sure the locals are cleaned up
in case of exceptions being thrown. If you're willing to ignore the
standard you don't need to complicate the text interpreter much. I
Hans Bezemer <[email protected]> writes:
On 25-04-2026 07:26, Anton Ertl wrote: [reinserted deleted, relevant context]
Hans Bezemer <[email protected]> writes:
If you want to use a language that is "ideologically devoted" to the
architecture, maybe you shouldn't use Forth at all - and stick with C.
I don't see anything about C that is closer to the hardware than Forth
is, and I think that both languages are about equally '"ideologically
devoted" to the architecture'. In particular, a C local variable is
no closer to a register (the most efficient hardware feature for
storing data) than a stack item or return stack item is, and register
allocation of any of the three is similarly difficult (with big
differences in difficulty between solutions that provide some register
allocation to those that are so reliable that you usually count on
them).
Well, you're actually shooting at Paul Rubin - not at me. Thank you! I
take all the help I can get!
Actually, this whole paragraph is a reaction on your statement, not
his. You deleted it for whatever reason, so I reinserted it.
Concerning Paul Rubin, just because he is wrong does not mean you are
right.
(..) and they often find some arguments that appeal to
elitism (i.e., only the chosen ones can use this programming language
for the elite as it should be used, and the others should program in
Python or "should never have been allowed to touch a keyboard" (Ulrich
Drepper)).
It's your own pal Bernd that said: "A good programmer will write even
better code in Forth. A bad programmer will write abysmal code in Forth.
And I'm sorry to say - but most programmers are quite bad."
So, either you agree with him or we have an unfortunate departure of one
of the most foremost members of Gforth. Because this states - in no
uncertain words - that Forth programmers *ARE* elite.
What departure? We disagree on a number of things.
And the issue is not whether Forth programmers or any other
programmers are elite, but that many programmers think that they are
elite (whether they are or aren't) and that the designers or advocates
of deficient programming systems make use of that to dupe them, along
the lines of: "You as elite programmers can cope with this deficiency
[of course they don't call it a deficiency], it's only subpar
programmers [more elaborate denigrations are common, see Ulrich
Drepper] who complain about it."
In the case of Forth and locals this tactic has not worked very well,
so even Forth, Inc. (who have been the most vocal among the commercial
Forth providers about their dislike of locals) have implemented
locals. But of course we see the echo of all of this still around
here.
In article <nnd$1196d1a5$0da70c85@6de98b5b6c1b0418>,
Hans Bezemer <[email protected]> wrote:
<SNIP>
It would be better to think deeply, find an original solution and learn.
Like Albert with his brilliant ;: word.
Chuck Moore invented and coined the ;: word.
I came up with CO which is similar, or maybe the same.
[email protected] (Anton Ertl) writes:
And traditionally Forth has been implemented without locals, for the
same reason: It takes less memory and, for the system implementor,
less work
A simple implementation of locals doesn't sound like that much work?
I've
imagined some alternate versions of COLON, e.g.
: foo ( ... ) ; \ regular colon, no locals
1: foo ( ... ) ; \ one local called A
2: foo ( ... ) ; \ two locals, A and B
...
4: foo ( ... ) ; \ four locals: A, B, C, D.
The slowdown doesn't surprise me but it's not that big a deal, compared
to the slowdown of using interpreted Forth instead of assembly language
in the first place.
... looking at the code for Gforth for 3DUP.3 compared to the others,
Gforth still uses more primitives ...
That's a lot of code in the expansion! I wonder how it will look in a
simple interpreter.
You seem to argue that the random-access aspect of locals provides a
performance advantage on simple systems, but in most cases, code using
locals is at a performance disadvantage on such systems
Well, if the slowdown is less than say 2x, I'd say the code cleanup
matters more, due to the traditional 90/10 rule (maybe now 99/1) of
where CPU cycles go. Code the hot spots for speed and the rest for
convenience.
Keeping at least one stack item in a register leads to a smaller and
faster implementation, and is not more complex than keeping all the
stack memory in RAM.
That's only with a fancy compiler AND a requirement of the application
code having statically determined stack effects. Traditional words like
?DUP would confuse this scheme amirite?
A way to use RAM that is less frowned upon by Forth traditionalists is
(global) variables. The fact that the use of global variables is
frowned upon in the wider programming community for various reasons
seems to pour oil into the fire of their elitism.
I see what you mean by that. But, whole-program C compilers do
something like register allocation to re-use those "global" cells when
sets of them won't be needed at the same time. The Forth approach would
need either a similar fancy compiler, or else require the programmer to
do an error-prone manual memory layout process, or else burn memory
unnecessarily for those cells whose usage doesn't overlap.
Paul Rubin <[email protected]d> writes:
[email protected] (Anton Ertl) writes:
And traditionally Forth has been implemented without locals, for the
same reason: It takes less memory and, for the system implementor,
less work
A simple implementation of locals doesn't sound like that much work?
Bernd Paysan wrote a simple locals implementation <https://cgit.git.savannah.gnu.org/cgit/gforth.git/tree/locals.fs>
that takes 84 SLOC:
On 26-04-2026 11:50, Anton Ertl wrote:
Bernd Paysan wrote a simple locals implementation
<https://cgit.git.savannah.gnu.org/cgit/gforth.git/tree/locals.fs>
that takes 84 SLOC:
With all respect to Bernd, but yeah - compare that to this 0.5 SLOC
implementation of local:
: local r> swap dup >r @ >r ;: r> r> ! ;
Paul Rubin <[email protected]d> writes:
...
I've
imagined some alternate versions of COLON, e.g.
: foo ( ... ) ; \ regular colon, no locals
1: foo ( ... ) ; \ one local called A
2: foo ( ... ) ; \ two locals, A and B
...
4: foo ( ... ) ; \ four locals: A, B, C, D.
If you cannot choose the names, locals lose a lot of their benefits in
making the code more understandable (OTOH, mathematicians have made do
with similar naming schemes for a long time). You might then just as
well work with >R >R >R >R and R@, R'@, 2 RPICK and 3 RPICK.
Hans Bezemer <[email protected]> writes:
On 26-04-2026 11:50, Anton Ertl wrote:
Bernd Paysan wrote a simple locals implementation
<https://cgit.git.savannah.gnu.org/cgit/gforth.git/tree/locals.fs>
that takes 84 SLOC:
With all respect to Bernd, but yeah - compare that to this 0.5 SLOC
implementation of local:
: local r> swap dup >r @ >r ;: r> r> ! ;
Let's see:
[~:167902] gforth-0.5.0
GForth 0.5.0, Copyright (C) 1995-2000 Free Software Foundation, Inc.
GForth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
warnings off include locals.fs ok
ok
: local r> swap dup >r @ >r ;: r> r> ! ;
*the terminal*:1: Undefined word
peter <[email protected]> writes:
I recently reviewed the string comparison for search-wordlist
and came up with the following
The string stored in the word header is already uppercased.
So string comparison will be case insensitive
: UC ( c -- c' ) \ uppercase char
dup $61 $7B within $20 and - ;
: NCOMP4 ( addr n addr' n' - f) \ 0 is match
dup >r
begin
rot = while \ str cstr
r> dup 1- >r
while \ str cstr
swap count uc \ cstr str' s1
rot count \ str' s1 cstr' c1
repeat
2drop r> drop 0 exit
then
2drop r> drop 1 ;
First iteration in the loop it does not compare chars but the length!
Clever, but, at least without comment, too clever.
This code, and, more clearly, Hans Bezemer's version demonstrate that
STR= is easier than COMPARE, STRCMP, or STR<, because you can deal
with the case of length difference right at the start, whereas the
latter words have to check the characters up to the end of the shorter
string first before dealing with the length. This shows the greatest
benefit in cases like
s" 0123456789abcdefg" s" 0123456789abcdefgh" strcmp
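The shortcut is easy to see in a sketch of an equality-only comparison. The word name STR-EQUAL? is made up for this illustration, and the code assumes one address unit per character; it is not one of the measured versions.

```forth
\ Hypothetical word: true if both strings have the same length and contents.
: str-equal? ( addr1 u1 addr2 u2 -- flag )
  rot over <> if                  \ lengths differ: decide without a loop
    drop 2drop false exit
  then                            ( addr1 addr2 u )
  0 ?do
    over i + c@  over i + c@  <> if
      2drop false unloop exit     \ first mismatching character
    then
  loop
  2drop true ;
```

For the pair of strings above (17 and 18 characters), the length test fails immediately and the loop is never entered, whereas an ordering comparison has to walk all 17 common characters first.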
As for STRCMP, I have measured the five versions shown in my earlier
posting (whole program posted below), with the bugs fixed, and the
?DUP IF replaced by DUP IF ... THEN DROP, because it produces better
code.
I have also included the following versions:
: strcmp { addr1 u1 addr2 u2 -- n }
  u1 u2 min 0
  ?do
    addr1 c@ addr2 c@ - ?dup
    if
      unloop exit
    then
    addr1 char+ TO addr1
    addr2 char+ TO addr2
  loop
  u1 u2 - ;
This comes from the '94 paper and is the version that uses TO instead
of defining new locals at every iteration. Paul Rubin will love the
code that current Gforth produces for "addr2 char+ TO addr2":
<strcmp+$E0> @local2 1->2
$7F337DA71BBA: mov 0x10(%rbp),%r15
<strcmp+$E8> char+ 2->2
$7F337DA71BBE: add $0x1,%r15
<strcmp+$F0> !local2 2->1
$7F337DA71BC2: mov %r15,0x10(%rbp)
The TO <local> code was not that efficient in earlier Gforth versions.
The other version I added is:
: strcmp ( addr1 u1 addr2 u2 -- n )
  rot 2dup 2>r min 0 ?do ( addr1 addr2 )
    over c@ over c@ - dup if
      nip nip 2rdrop unloop exit then
    drop
    char+ swap char+ swap
  loop
  2drop r> r> - ;
This is the STRCMP3 from <[email protected]>
and may be the locals-less version I compared against in the '94
paper.
I also included your version (without the UC call) and Hans Bezemer's version.
I benchmarked two Forth systems, gforth-fast and gforth-itc.
gforth-itc uses indirect-threaded code and should perform similarly to
the "simple interpreters" that Paul Rubin had in mind.
I ran three different benchmarks on these words, which performed the following a number of times:
s" 0123456789abcdefg" 2dup strcmp drop \ bench1
s" 0123456789abcdefg" s" 2123456789abcdefg" strcmp drop \ bench2
s" 0123456789abcdefg" s" 0123456789abcdefgh" strcmp drop \ bench3
In bench1 the strings are equal and everything has to be compared. In
bench2 the strings have the same length, but differ in the first char,
so the loop can terminate after the first char. In bench3 the strings
have different length, but all chars that both strings have are the
same. In the latter case versionpeter and versionbezemer have an
advantage because they only test for equality and can stop at the
length check.
The cycle numbers are per invocation of STRCMP, including benchmark overhead.
The benchmarks are run on a Ryzen 8700G (Zen4).
In addition to the cycles, I also show the bytes of the native code of
the whole word in gforth-fast on AMD64 (without the final jmp (2
Bytes)), and of the loop (including the code for the if...then).
    Bytes    |  cycles gforth-fast   |   cycles gforth-itc    |
 strcmp loop | bench1 bench2 bench3  | bench1  bench2  bench3 |
    262  127 |  109.5   16.6  109.4  | 1732.7   147.4  1724.5 | version0
    303  151 |  164.2   17.2  164.4  | 1714.1   170.4  1613.5 | version1
    257  122 |  105.3   17.4  105.1  | 1496.7   166.4  1493.0 | version2
    280  113 |   98.6   19.2   99.0  | 1230.1   194.4  1116.2 | version3
    267  118 |   91.2   17.9   91.2  | 1268.6   198.4  1269.0 | version4
    273  108 |   89.9   17.0   90.0  | 1136.0   178.4  1138.9 | version5
    261  128 |  121.1   14.6  118.5  | 1221.4   131.3  1213.3 | version6
    210  142 |  137.5   15.4    9.5  | 1244.4   155.3    78.3 | versionpeter
    260  119 |  107.8   16.4    9.8  | 1186.2   134.5    71.3 | versionbezemer
So the champion among the full-featured strcmps for bench1 and bench3
is version5, for bench2 version6. The str= variants are much faster
for bench3 (of course), but slower than several other versions for
bench1 and slower than version6 for bench2. The native code size is
smallest for version2 (among the full-featured strcmp
implementations), so the locals-less versions do not win everything.
So locals-less (version5 and version6) is somewhat faster on both
gforth-fast and gforth-itc.
lxf has a more efficient locals implementation. Let's see how it
fares. It does not support the usage in version1, so I leave that
out.
cycles lxf
bench1 bench2 bench3
79.9 12.0 79.9 version0
99.6 12.0 99.6 version2
98.8 14.1 98.1 version3
86.0 13.2 86.0 version4
84.1 12.6 84.2 version5
88.7 10.0 92.8 version6
98.3 10.0 6.0 versionpeter
72.1 9.5 6.0 versionbezemer
On lxf, version0 (with locals) is the fastest for bench1 and bench3,
and version6 is the fastest for bench2. Hans Bezemer's version wins
everything if we are only interested in str= functionality.
And here's the code (measurement scripts at the bottom):
----------------------------------------------------------
[defined] version0 [if]
: strcmp {: addr1 u1 addr2 u2 -- n :}
u1 u2 min 0
?do
addr1 c@ addr2 c@ - dup
if
unloop exit
then
drop
addr1 char+ TO addr1
addr2 char+ TO addr2
loop
u1 u2 - ;
[then]
[defined] version1 [if]
: strcmp {: addr1 u1 addr2 u2 -- n :}
addr1 addr2
u1 u2 min 0
?do {: s1 s2 :}
s1 c@ s2 c@ - dup
if
unloop exit
then
drop s1 char+ s2 char+
loop
2drop
u1 u2 - ;
[then]
[defined] version2 [if]
: strcmp {: addr1 u1 addr2 u2 -- n :}
u1 u2 min 0
?do
addr1 i + c@ addr2 i + c@ - dup
if
unloop exit
then
drop
loop
u1 u2 - ;
[then]
[defined] version3 [if]
: strcmp {: addr1 u1 addr2 u2 -- n :}
addr2 addr1 - {: offset :}
u1 u2 min addr1 + addr1 ?do
i c@ i offset + c@ - dup
if
unloop exit
then
drop
loop
u1 u2 - ;
[then]
[defined] version4 [if]
: strcmp {: addr1 u1 addr2 u2 -- n :}
addr2 addr1 - ( offset )
u1 u2 min addr1 + addr1 ?do ( offset )
dup i + c@ i c@ - dup
if
nip negate unloop exit
then
drop
loop
drop u1 u2 - ;
[then]
[defined] version5 [if]
: strcmp ( addr1 u1 addr2 u2 -- n )
rot 2dup - >r ( addr1 addr2 u1 u2 R: n1 )
min -rot over - ( u12 addr1 offset R: n1 )
swap rot bounds ( offset limit start R: n1 )
?do ( offset R: n1 loop-sys )
dup i + c@ i c@ - dup
if
nip negate unloop r> drop exit
then
drop
loop
drop r> negate ;
[then]
[defined] version6 [if]
[undefined] 2rdrop [if]
: 2rdrop postpone 2r> postpone 2drop ; immediate
[then]
: strcmp ( addr1 u1 addr2 u2 -- n )
rot 2dup 2>r min 0 ?do ( addr1 addr2 )
over c@ over c@ - dup if
nip nip 2rdrop unloop exit then
drop
char+ swap char+ swap
loop
2drop r> r> - ;
[then]
[defined] versionpeter [if]
\ from <[email protected]>
\ renamed and deleted the call to UC
: strcmp ( addr n addr' n' - f) \ 0 is match
dup >r
begin
rot = while \ str cstr
r> dup 1- >r
while \ str cstr
swap count \ cstr str' s1
rot count \ str' s1 cstr' c1
repeat
2drop r> drop 0 exit
then
2drop r> drop 1 ;
[then]
[defined] versionbezemer [if]
\ from <nnd$548d4f1b$1e104571@905dda44db1f54ae>
\ renamed
: strcmp
rot over - if drop 2drop true exit then
0 ?do
over i chars + c@ over i chars + c@ -
if drop drop unloop true exit then
loop drop drop false
;
[then]
[defined] t{ [if]
t{ s" abc" s" abc" strcmp -> 0 }t
t{ s" abc" s" abcd" strcmp -> -1 }t
t{ s" abc" s" abd" strcmp -> -1 }t
t{ s" abd" s" abc" strcmp -> 1 }t
t{ s" cbc" s" abc" strcmp -> 2 }t
t{ s" abc" s" adc" strcmp -> -2 }t
[then]
\ Benchmarks
[undefined] iterations [if]
100000000 constant iterations
[then]
: benchmark ( c-addr1 u1 c-addr2 u2 -- )
iterations 0 do
2over 2over strcmp drop
loop
2drop 2drop ;
: bench1
s" 0123456789abcdefg" 2dup benchmark ;
: bench2
s" 0123456789abcdefg" s" 2123456789abcdefg" benchmark ;
: bench3
s" 0123456789abcdefg" s" 0123456789abcdefgh" benchmark ;
0 [if]
# bash script for producing the cycles
IFS=":"
for i in 0 1 2 3 4 5 6 peter bezemer; do
for forthit in gforth-fast:100000000 gforth-itc:10000000; do
fields=($forthit); forth="${fields[0]}"; iterations="${fields[1]}"
for bench in 1 2 3; do
perf stat --log-fd 3 -x, -e cycles:u $forth -e "create version$i $iterations constant iterations" ~/forth/strcmp.4th -e "bench$bench bye" 3>&1 >/dev/null|
awk -F, '{printf "%6.1f ",$1/'$iterations'}'
done
done
echo version$i
done
IFS=":"
for i in 0 2 3 4 5 6 peter bezemer; do
forth=lxf; iterations=100000000
for bench in 1 2 3; do
perf stat --log-fd 3 -x, -e cycles:u $forth "create version$i $iterations constant iterations include $HOME/forth/strcmp.4th bench$bench bye" 3>&1 >/dev/null|
awk -F, '{printf "%6.1f ",$1/'$iterations'}'
done
echo version$i
done
[then]
--------------------------------------------------------------
- anton
On Sun, 26 Apr 2026 14:03:03 GMT...
[email protected] (Anton Ertl) wrote:
[...]
Anton, thanks for running all these tests.
I have now also run them on my Ryzen 9950X.
There is an error in version6 that I corrected:
2rdrop needs to come after unloop. On lxf64, which uses registers for
loop parameters, this is necessary!
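For reference, the correction described here can be sketched as follows. It is version6 from the posting above with the two words swapped in the early-exit path, so that UNLOOP discards the loop parameters first and 2RDROP then removes the two saved lengths. The fallback definition of 2RDROP is the one from the original posting.

```forth
[undefined] 2rdrop [if]
: 2rdrop postpone 2r> postpone 2drop ; immediate
[then]

: strcmp ( addr1 u1 addr2 u2 -- n )
  rot 2dup 2>r min 0 ?do ( addr1 addr2 )   \ R: u2 u1 loop-sys
    over c@ over c@ - dup if
      nip nip unloop 2rdrop exit then      \ loop-sys first, then u2 u1
    drop
    char+ swap char+ swap
  loop
  2drop r> r> - ;                          \ all chars equal: u1 - u2
```

On systems that keep the loop parameters on the return stack, both orders happen to remove the same number of cells, which is presumably why the original order went unnoticed there.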
I also needed to change the log-fd to 5 to get it to run.
The tests are run with Debian under WSL2.
Here are the results
lxf64
59.1 10.0 57.6 version0
48.1 10.0 48.4 version2
43.0 10.7 42.5 version4
42.2 9.1 42.2 version5
55.1 9.0 55.0 version6
65.7 8.0 6.0 versionpeter
32.8 9.0 4.2 versionbezemer
lxf
64.2 8.5 64.2 version0
112.3 10.2 90.1 version2
78.8 10.6 75.6 version4
88.1 9.4 88.2 version5
112.2 7.5 114.7 version6
71.0 8.2 7.4 versionpeter
50.9 8.3 4.3 versionbezemer
There is a significant impact in having loop parameters in registers!
version 2 and 6 are interesting for lxf. The full stat gives some more
info.
peter <[email protected]> writes:
On Sun, 26 Apr 2026 14:03:03 GMT
[email protected] (Anton Ertl) wrote:
[...]
And, to top it off, sf64 and vfx64, after correcting the bug in
version6 that you pointed out:
cycles sf-4.0.0-RC89 | cycles vfx64 5.43 |
bench1 bench2 bench3 | bench1 bench2 bench3 |
195.1 62.0 194.5 | 124.2 42.2 123.3 | version0
136.3 63.0 136.2 | 200.4 124.1 204.4 | version2
143.7 69.6 143.4 | 90.7 36.7 91.3 | version4
115.1 36.0 114.1 | 102.0 30.2 101.8 | version5
132.8 38.0 133.3 | 85.8 28.2 88.2 | version6
182.0 19.0 9.0 | 95.7 10.2 6.2 | versionpeter
224.9 40.2 8.0 | 63.2 29.2 6.2 | versionbezemer
Interesting performance variations.
Anton, thanks for running all these tests.
I have now also run them on my Ryzen 9950X.
There is an error in version 6 that i corrected.
2rdrop needs to be after unloop. On lxf64 that uses registers for
loop parameters this is necessary!
Thanks. In sf64 and vfx64 this change is necessary, too.
I needed also to change the log-fd to 5 to get it to run.
The tests are run with Debian under WSL2.
WSL2 supports performance counters. Great!
What happens with log-fd=3?
[...]
There is a significant impact in having loop parameters in registers!
version 2 and 6 are interesting for lxf. The full stat gives some more
info.
Not any info that I find helpful. But my guess is as follows: Keeping
the loop index in memory has reliably meant that counted loops take at
least 5 cycles per iteration. In recent processors (from this decade
or a little earlier), hardware can perform zero-cycle store-to-load
forwarding, but it is not reliable. So my guess is that in version2
and version6 we are seeing cases where this hardware optimization has
not worked. So, yes, keeping loop parameters that change in registers
is a good idea even on recent CPUs.
The differences between Zen4 and Zen5 on lxf are significant, but I
guess that if you take the average, you get the picture of small
progress that I see on various websites.
- anton
Hans Bezemer <[email protected]> writes:
On 26-04-2026 11:50, Anton Ertl wrote:
Bernd Paysan wrote a simple locals implementation
<https://cgit.git.savannah.gnu.org/cgit/gforth.git/tree/locals.fs>
that takes 84 SLOC:
With all respect to Bernd, but yeah - compare that to this 0.5 SLOC
implementation of local:
: local r> swap dup >r @ >r ;: r> r> ! ;
Let's see:
[~:167902] gforth-0.5.0
GForth 0.5.0, Copyright (C) 1995-2000 Free Software Foundation, Inc.
GForth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
warnings off include locals.fs ok
ok
: local r> swap dup >r @ >r ;: r> r> ! ;
*the terminal*:1: Undefined word
: local r> swap dup >r @ >r ;: r> r> ! ;
^^
Backtrace:
$F7B5A158 throw
$F7B6418C no.extensions
Although, admittedly, while Bernd Paysan's locals.fs loads, it does
not work AFAICT (I tried it on gforth-0.4 and gforth-0.5; it does not
load on gforth-0.6 and later). Apparently it had bitrotted between
the time when it was written in 1992 and gforth-0.4 in 1998.
- anton
On 26/04/2026 7:50 pm, Anton Ertl wrote:
Paul Rubin <[email protected]d> writes:
...
I've
imagined some alternate versions of COLON, e.g.
: foo ( ... ) ; \ regular colon, no locals
1: foo ( ... ) ; \ one local called A
2: foo (... ) ; \ two locals, A and B
...
4: foo (... ) ; \ four locals: A, B, C, D.
If you cannot choose the names, locals lose a lot of their benefits in
making the code more understandable (OTOH, mathematicians have made do
with similar naming schemes for a long time). You might then just as
well work with >R >R >R >R and R@, R'@, 2 RPICK and 3 RPICK.
That Julian Noble (among others) felt the need for FTRAN INTRAN etc
informs what scientists and academics really want - and it's a long way
from the 'stack based' locals offered by most forth systems. The latter
represent a concession to forth before a user has even begun to consider
identifiers.
To an outsider, forth locals do nothing to ameliorate what they see as
fundamentally broken about the language. ISTM if a forther has conceded
to use stack-based locals, he can certainly make choices about what form
identifiers take.
As a matter of fact, this thingy creates locals:
: ;: >r ; : local r> swap dup >r @ >r ;: r> r> ! ;
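A small usage sketch may help here. LOCAL saves the current contents of a variable and arranges for it to be restored when the calling word returns, so it behaves like a dynamically scoped variable rather than a named stack-frame local. The words X, INNER and OUTER are made up for illustration. The whole technique is non-standard: it assumes that return addresses are ordinary single cells on the return stack and that `>r ;` continues at the address just pushed.

```forth
: ;: ( addr -- ) >r ;        \ continue at addr; our caller's tail runs later
: local ( a-addr -- )
  r> swap dup >r @ >r        \ R: a-addr old-value; caller's return address on data stack
  ;:                         \ resume the caller of LOCAL
  r> r> ! ;                  \ runs when that caller exits: restore old-value

variable x                   \ hypothetical variable used as a "local"
: inner ( -- n ) x local  5 x !  x @ ;     \ x is 5 only inside INNER
: outer ( -- n1 n2 ) 1 x !  inner  x @ ;   \ x is back to 1 afterwards
outer . .                    \ if the assumptions above hold: 1 5
```

The restore code is entered through the return address that `;:` leaves behind, so it runs on every normal exit of INNER; an exception thrown inside INNER would bypass it.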
On 28/04/2026 13:34, Hans Bezemer wrote:
As a matter of fact, this thingy creates locals:
: ;: >r ; : local r> swap dup >r @ >r ;: r> r> ! ;
LOCAL can also be defined as:
: local r> over @ rot 2>r ;: 2r> ! ;
which I guess you won't like, but is a bit shorter. It also survives
your pre-processor conversion of 2>r to >r >r, similarly 2r>
On 29-04-2026 13:44, Gerry Jackson wrote:
[...]
I don't say you're wrong, but there is some logic to this madness:
1. In 4tH, "2>R" is the same as ">R >R". The compiler expands it like
that. So -- there is no advantage to do "2>R". Yes, you can do "2R@",
but not "R@". It won't be portable;
If you cannot choose the names... You might then just as well work with
>R >R >R >R and R@, R'@, 2 RPICK and 3 RPICK.
In the code you see the threaded code interspersed with the native
code. If you ignore the native code, you see what a simple
interpreter would see (if it had a locals implementation that produced
code similar to that of Gforth).
So it's "code cleanup", not making use of hardware facilities for
efficiency on simple interpreters, that you see as the benefit of
locals.
Even with multi-representation stack-caching as used since Gforth 0.7
(which does require more compiler smarts), no statically determined
stack effect is necessary, because the code generator returns to the canonical state on control-flow.
... we have user variables like BASE and HLD (in F83, HOLDPTR in
gforth). They are used across multiple words, and the fact that you
don't have to pass them and put them into a local has been touted as
an advantage over locals: Definitions that use global variables are
easier to factor.
You might then just as well work with >R >R >R >R and R@, R'@, 2 RPICK
and 3 RPICK.
...
Flashforth has a separate P stack which can be used for temporaries
within a word, but I don't remember how cleanup is handled, if at all.
It's a cpu register - not a stack. For re-entrancy old value must first
be pushed onto the cpu stack before loading the new. IIRC FF has a word
that combines those. Basically a variable.
[email protected] (Anton Ertl) writes:[...]
I wonder if gforth would get less code bloat if you added some
primitives for pushing more than one local. E.g. 2>L, 3>L, etc. would
push that many stack elements to LOCAL0, LOCAL1, LOCAL2. Then there
wouldn't be that big chunk of replicated code.
l >l 62 len= 4+ 26+ 3
l >l >l 9 len= 4+ 34+ 3
l >l >l >l 5 len= 4+ 42+ 3
l f>l 2 len= 4+ 42+ 3
l @local0 20 len= 4+ 11+ 3
l lit f@localn 1 len= 4+ 24+ 3
l 67 len= 4+ 18+ 3
l 10 len= 4+ 23+ 3
Even with multi-representation stack-caching as used since Gforth 0.7
(which does require more compiler smarts), no statically determined
stack effect is necessary, because the code generator returns to the
canonical state on control-flow.
I see, yeah, but that means stack juggling to get to the canonical
state.
... we have user variables like BASE and HLD (in F83, HOLDPTR in
gforth). They are used across multiple words, and the fact that you
don't have to pass them and put them into a local has been touted as
an advantage over locals: Definitions that use global variables are
easier to factor.
Urgggh...
[email protected] (Anton Ertl) writes:
You might then just as well work with >R >R >R >R and R@, R'@, 2 RPICK
and 3 RPICK.
But, now you have to avoid mixing that style with using the R stack for
temporaries, including stuff like loop indexes which sometimes go
there. And you have to clean up the R stack before returning, and maybe
arrange for that to happen in case of an exception.
In article <10r3nfo$33464$[email protected]>,
Gerry Jackson <[email protected]> wrote:
On 07/04/2026 12:35, [email protected] wrote:
A similar situation applies to "TO must scan". It turns out there
is no standard program that can detect this. It steers implementation
towards a scanning TO.
On the contrary, Ruvim posted some code that is a standard program and
which distinguishes between a parsing TO and one that sets a flag for a
following VALUE to act on.
I can't find the post but the gist of it was (I think):
1 value v1
2 value v2 immediate
: test to v2 v1 ;
Running test with
3 test
A parsing TO will set v2 to 3
A flagging TO will execute v2 during compilation of test because it is
immediate. So test will set v1 to 3 leaving v2 unchanged.
No it doesn't. It leaves garbage on the stack during compilation,
leading to mostly an error.
I leave it up to the reader whether this counts as a standard program.
I overlooked this clever example.
So I guess my VALUE is non compliant, proven by a contrived test.
It still makes no sense to forbid a flagging implementation.
(Also VALUE's don't make sense, anyway.)
Gerry
Groetjes Albert
On 7 Apr 2026 at 21:55:37 CEST, "Gerry Jackson" <[email protected]> wrote:
On 07/04/2026 12:35, [email protected] wrote:
A similar situation applies to "TO must scan". It turns out there
is no standard program that can detect this. It steers implementation
towards a scanning TO.
On the contrary, Ruvim posted some code that is a standard program and
which distinguishes between a parsing TO and one that sets a flag for a
following VALUE to act on.
VFX sets a flag for TO and friends and has done so for 30+ years. We have no intention of changing despite the cleverness of Ruvim's detection scheme. We take the "as if" position because
a) it simplifies implementation.
b) no user has complained.
Stephen
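For readers who have not seen the flag-setting approach, here is a minimal sketch of the general idea. It is not VFX's implementation; TO-FLAG, FTO and FVALUE are hypothetical names chosen so that the sketch does not clash with the system's own TO and VALUE.

```forth
variable to-flag   false to-flag !     \ raised by FTO, consumed by the next FVALUE word

: fto ( -- ) true to-flag ! ;          \ does not parse; only raises the flag

: fvalue ( x "name" -- )
  create ,
  does> ( -- x | x -- )
    to-flag @ if
      false to-flag !  !               \ flag raised: lower it and store
    else
      @                                \ otherwise behave like a VALUE
    then ;

1 fvalue v1
5 fto v1        \ interpreted: FTO runs, then V1 stores 5
v1 .            \ 5
: set-v1 ( x -- ) fto v1 ;             \ compiled: both words run at run time
7 set-v1  v1 .  \ 7
```

Because FTO never looks at the input stream, whatever follows it is handled by the text interpreter as usual. That is why an immediate word placed after it, as in the test quoted in this thread, behaves differently than it does with a parsing TO.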
On 2026-04-09 10:12, Stephen Pelc wrote:
[...]
As far as I can see, this approach does not simplify implementation to
any significant extent; instead, it limits the use cases. In VFX, it
also breaks `find` for `to` and other similar words. As a user, I would
complain.
To ensure that all operators in VfxForth parse the parse area for their
immediate argument, it suffices to modify the word `operator` as follows:
: take-name>xt ( "ccc" -- xt )
bl word find ?undef
;
: translate-xt ( any xt -- any )
state @ if compile, else execute then
;
: operator \ n -- ; define an operator
create
here swap , OperatorChain @ , OperatorChain !
immediate
does> @ OperatorType !
take-name>xt translate-xt
;
This also makes `find` work correctly for `to` and other similar words.
On 2026-04-08 10:34, [email protected] wrote:
In article <10r3nfo$33464$[email protected]>,
Gerry Jackson <[email protected]> wrote:
On 07/04/2026 12:35, [email protected] wrote:
A similar situation applies to "TO must scan". It turns out there
is no standard program that can detect this. It steers implementation
towards a scanning TO.
On the contrary, Ruvim posted some code that is a standard program and
which distinguishes between a parsing TO and one that sets a flag for a
following VALUE to act on.
I can't find the post but the gist of it was (I think):
1 value v1
2 value v2 immediate
: test to v2 v1 ;
Yes, something similar.
Running test with
3 test
A parsing TO will set v2 to 3
Yes, and this is specified by the standard.
A flagging TO will execute v2 during compilation of test because it is
immediate. So test will set v1 to 3 leaving v2 unchanged.
No it doesn't. It leaves garbage on the stack during compilation,
leading to mostly an error.
I leave it up to the reader whether this counts as a standard program.
The provided program conforms to the Forth-94 standard and later versions.
A system that fails that test does not conform to the standard with
respect to `to`.
Historically, there were two approaches to implement `to`: "parsing" and
"non-parsing" [1]. Forth-94 formally specified the "parsing" approach:
| ANS Forth explicitly requires that TO must parse,
| so that TO's effect will be predictable when
| it is used at the end of the parse area.
OTOH, it disallowed applying the words `postpone` and `[compile]` to
`to` [2]. Perhaps, this was done as a concession to implementations that
adhered to the "non-parsing" approach, to prevent behavior variations in
a standard program caused by deviations in implementations of `to`.
[1] <https://forthhub.github.io/forth-sf-net/standard/dpans/dpansa6.htm#A.6.2.2295>
[2] <https://forthhub.github.io/forth-sf-net/standard/dpans/dpans6.htm#6.2.2295>
Here is an example of a program that relies on a parsing `to`, but is
not compliant due to that very ambiguous condition regarding `postpone`.
Let's introduce a multiple assignment construct of the following form:
`1 2 3 to( a b c )`.
: ?comp ( -- ) state @ if exit then -14 throw ;
: equals ( sd2 sd1 -- flag )
dup 3 pick <> if 2drop 2drop false exit then
compare 0=
;
: source-offset ( -- u )
>in @
;
: set-source-offset ( u -- )
source nip over u< invert if >in ! exit then
-18 throw \ "parsed string overflow"
;
synonym take-lexeme-maybe parse-name
: take-lexeme ( "ccc" -- sd )
take-lexeme-maybe dup if exit then -16 throw
;
: to( ( "ccc<rparen>" -- ) \ " a b c )"
?comp source-offset ( u.offset )
take-lexeme s" )" equals if drop exit then
( u.offset ) recurse ( u.offset )
source-offset swap set-source-offset
postpone to
set-source-offset
; immediate
\ usage example
0 value a
0 value b
: init-foo ( -- ) 2 3 to( a b ) ;
init-foo a . b . \ it should print "2 3"
Do you know a Forth system in which `to` parses the parse area and in
which the definition for `to(` given above *does not* work?
How do you implement a construct `to( ... )` that works both in
interpretation state and in compilation state?
Also, we could replace `postpone to` with
`state @ if postpone to else ['] to execute then`
Interestingly, the Recognizer API does not help in implementing this
construct.
Ruvim <[email protected]> writes:
[...]
Do you know a Forth system in which `to` parses the parse area and in
which the definition for `to(` given above *does not* work?
Depending on what you mean by "work".
Anything that contains ?COMP is deficient by design.
How do you implement a construct `to( ... )` that works both in
interpretation state and in compilation state?
I don't. As for how someone else could do it, see below.
Also, we could replace `postpone to` with
`state @ if postpone to else ['] to execute then`
Also deficient by design.
Interestingly, the Recognizer API does not help in implementing this
construct.
One way would be to add an immediate word TO( that changes rec-forth
(the default recognizer sequence; damn renamings) to a special
recognizer. This special recognizer just pushes every string it
should recognize to a TO(-stack, except when it recognizes ")".
When it recognizes ")", it restores the original REC-FORTH, takes the
top string "<word>" off the TO(-stack, constructs a string "TO
<word>", and EVALUATEs it (you may try to take precautions such that
the right TO is found, but the standard gives us little to play with
here). Repeat until the TO(-stack is empty. Not even non-standard
POSTPONE TO is needed, and it also works with non-parsing TO
implementations. It does not work if the user has defined TO to mean
something else (the curse of EVALUATE).
However, instead of going for recognizers, you might play the same
trick by letting TO( parse the strings up to ")" and push them on the
TO(-stack. And that's simpler to implement, so yes, the recognizer API
does not help here.
But then recognizers are not designed for more than a word (the string
recognizer is already a stretch). So what Gforth has is a REC-TO that
recognizes "-><word>".
So here's the implementation (untested):
: to(
0 0 2>r begin
parse-name dup 0= abort" unfinished TO("
2dup 2>r
s" )" str= until
2r> 2drop \ get rid of ")"
begin
2r> dup while
[: "to " type ;] >string-execute evaluate
\ freeing the strings is left as exercise to the reader
repeat
2drop \ get rid of 0 0
; immediate
Gforth has an API for defining words with user-defined TO
<https://net2o.de/gforth/Words-with-user_002ddefined-TO-etc_002e.html>,
but there is currently no proper API for defining words that perform
the function of TO or one of its siblings (+TO etc), in particular
there is no API that would support a user-defined REC-TO or TO(. This
shows in using internal words like TO-SLOTS in REC-TO.
Given the large differences between TO implementations in systems, I
expect that we will have a hard time (as in: it won't happen)
standardizing TO-related APIs.
- anton
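For comparison, the parse-then-EVALUATE idea from this post can also be sketched with standard words only, using recursion (as in Ruvim's version) in place of an explicit TO(-stack. This is an illustration under stated assumptions, not a tested cross-system implementation: /TO-BUF, TO-BUF and EVAL-TO are names made up here, a name must fit into the scratch buffer, all names and the closing ")" must be on one line, and the EVALUATE caveat (a user-redefined TO) still applies.

```forth
80 constant /to-buf                     \ hypothetical scratch buffer size
create to-buf /to-buf chars allot       \ holds the line "to <name>"

: eval-to ( i*x c-addr u -- j*x )       \ build and evaluate "to <name>"
  dup 3 + /to-buf > abort" name too long for TO-BUF"
  s" to " to-buf swap cmove             ( c-addr u )
  tuck to-buf 3 chars + swap cmove      ( u )
  3 + to-buf swap evaluate ;

: to( ( i*x "name1 ... namen )" -- )
  parse-name dup 0= abort" unfinished TO("
  2dup s" )" compare 0= if 2drop exit then
  2>r recurse 2r>                       \ later names are handled first
  eval-to ; immediate

0 value a  0 value b
2 3 to( a b )   a . b .                 \ 2 3
: init-foo ( -- ) 4 5 to( a b ) ;
init-foo  a . b .                       \ 4 5
```

Each name is parked on the return stack during the recursion, so the data stack holds only the values to be assigned (interpretation state) or whatever the compiler keeps there (compilation state) when EVALUATE runs. Since TO( is immediate and EVALUATE respects STATE, the same definition assigns directly when interpreted and compiles the assignments inside a colon definition.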
Ruvim <[email protected]> writes:
Here is an example of a program that relies on a parsing `to`, but is
not compliant due to that very ambiguous condition regarding `postpone`. >>> Let's introduce a multiple assignment construct of the following form:
`1 2 3 to( a b c )`.
: ?comp ( -- ) state @ if exit then -14 throw ;
: equals ( sd2 sd1 -- flag )
dup 3 pick <> if 2drop 2drop false exit then
compare 0=
;
: source-offset ( -- u )
>in @
;
: set-source-offset ( u -- )
source nip over u< invert if >in ! exit then
-18 throw \ "parsed string overflow"
;
synonym take-lexeme-maybe parse-name
: take-lexeme ( "ccc" -- sd )
take-lexeme-maybe dup if exit then -16 throw
;
: to( ( "ccc<rparen>" -- ) \ " a b c )"
?comp source-offset ( u.offset )
take-lexeme s" )" equals if drop exit then
( u.offset ) recurse ( u.offset )
source-offset swap set-source-offset
postpone to
set-source-offset
; immediate
\ usage example
0 value a
0 value b
: init-foo ( -- ) 2 3 to( a b ) ;
init-foo a . b . \ it should print "2 3"
Do you know a Forth system in which `to` parses the parse area and in
which the definition for `to(` given above *does not* work?
Depending on what you mean by "work". Anything that contains ?COMP is deficient by design.
How do you implement a construct `to( ... )` that works both in
interpretation state and in compilation state?
I don't. As for how someone else could do it, see below.
Also, we could replace `postpone to` with
`state @ if postpone to else ['] to execute then`
Also deficient by design.
Interestingly, the Recognizer API does not help in implementing this
construct.
One way would be to add an immediate word TO( that changes rec-forth
(the default recognizer sequence; damn renamings) to a special
recognizer. This special recognizer just pushes every string it
should recognize to a TO(-stack, except when it recognizes ")".
When it recognizes ")", it restores the original REC-FORTH, takes the
top string "<word>" off the TO(-stack, constructs a string "TO
<word>", and EVALUATEs it (you may try to take precautions such that
the right TO is found, but the standard gives us little to play with
here). Repeat until the TO(-stack is empty. Not even non-standard
POSTPONE TO is needed, and it also works with non-parsing TO
implementations. It does not work if the user has defined TO to mean something else (the curse of EVALUATE).
However, instead of going for recognizers, you might play the same
trick by letting TO( parse the strings up to ")" and push them on the TO(-stack. And that's simpler to implement, so yes, the recognizer
API does not help here.
But then recognizers are not designed for more than a word (the string recognizer is already a stretch). So what Gforth has is a REC-TO that recognizes "-><word>".
So here's the implementation (untested):
: to(
0 0 2>r begin
parse-name dup 0= abort" unfinished TO("
2dup 2>r
s" )" str= until
2r> 2drop \ get rid of ")"
begin
2r> dup while
[: "to " type ;] >string-execute evaluate
\ freeing the strings is left as exercise to the reader
repeat
2drop \ get rid of 0 0
; immediate
Gforth has an API for defining words with user-defined TO <https://net2o.de/gforth/Words-with-user_002ddefined-TO-etc_002e.html>,
but there is currently no proper API for defining words that perform
the function of TO or one of its siblings (+TO etc), in particular
there is no API that would support a user-defined REC-TO or TO(. This
shows in using internal words like TO-SLOTS in REC-TO.
Given the large differences between TO implementations in systems, I
expect that we will have a hard time (as in: it won't happen)
standardizing TO-related APIs.
- anton
On 7 Apr 2026 at 21:55:37 CEST, "Gerry Jackson" <[email protected]> wrote:
On 07/04/2026 12:35, [email protected] wrote:
A similar situation applies to "TO must scan". It turns out there
is no standard program that can detect this. It steers implementation
towards a scanning TO.
On the contrary, Ruvim posted some code that is a standard program and
which distinguishes between a parsing TO and one that sets a flag for a
following VALUE to act on.
I can't find the post but the gist of it was (I think):
1 value v1
2 value v2 immediate
: test to v2 v1 ;
Running test with
3 test
A parsing TO will set v2 to 3
A flagging TO will execute v2 during compilation of test because it is
immediate. So test will set v1 to 3 leaving v2 unchanged.
The test depends on being able to define V2 as IMMEDIATE.
Where in the standard does it specify that children of VALUE are not IMMEDIATE?
If it does so specify, then the test is implementation dependent,
and contradicts the intention of the standard.
Be careful what you wish for.
VFX and its predecessor ProForth have used a flagging TO for well over 30 years and I am not breaking the world's largest Forth application
(32 bit, over 30 Mb binary) for a language lawyer and a probably invalid test.
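A toy model may help readers follow the distinction being argued here. The sketch below is not how VFX, ProForth, or any other system implements TO; the names TOY-TO, TOY-VALUE and TOY-TO-FLAG are hypothetical. It only illustrates the flagging idea: the setter word raises a flag, and the next value-like word that runs consumes the flag and stores instead of fetching.

```forth
variable toy-to-flag   false toy-to-flag !

\ Flagging setter: parses nothing, only raises the flag.
: toy-to ( -- )  true toy-to-flag ! ;

\ Value-like defining word whose children check the flag at run time.
: toy-value ( x "name" -- )
  create ,
  does> ( -- x | x -- )
    toy-to-flag @ if
      false toy-to-flag !  !     \ flag was set: store x and clear the flag
    else
      @                          \ flag clear: fetch the current value
    then ;

5 toy-value v
: set-v ( x -- )  toy-to v ;     \ TOY-TO and V are compiled as ordinary words
9 set-v  v .                     \ prints 9
```

A parsing TO, by contrast, would take the name `v` from the input stream itself, which is why making `v2` immediate in the quoted test separates the two designs.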
On 2026-06-06 16:17, Anton Ertl wrote:
Ruvim <[email protected]> writes:
Here is an example of a program that relies on a parsing `to`, but is
not compliant due to that very ambiguous condition regarding `postpone`.
Let's introduce a multiple assignment construct of the following form:
`1 2 3 to( a b c )`.
: ?comp ( -- ) state @ if exit then -14 throw ;
: equals ( sd2 sd1 -- flag )
dup 3 pick <> if 2drop 2drop false exit then
compare 0=
;
: source-offset ( -- u )
>in @
;
: set-source-offset ( u -- )
source nip over u< invert if >in ! exit then
-18 throw \ "parsed string overflow"
;
synonym take-lexeme-maybe parse-name
: take-lexeme ( "ccc" -- sd )
take-lexeme-maybe dup if exit then -16 throw
;
: to( ( "ccc<rparen>" -- ) \ " a b c )"
?comp source-offset ( u.offset )
take-lexeme s" )" equals if drop exit then
( u.offset ) recurse ( u.offset )
source-offset swap set-source-offset
postpone to
set-source-offset
; immediate
\ usage example
0 value a
0 value b
: init-foo ( -- ) 2 3 to( a b ) ;
init-foo a . b . \ it should print "2 3"
Do you know a Forth system in which `to` parses the parse area and in
which the definition for `to(` given above *does not* work?
Depending on what you mean by "work".
By "does not work" I mean that the provided program does not behave as
specified in the usage example (a kind of test).
Anything that contains ?COMP is deficient by design.
Do you mean that preventing accidental execution of some word in
interpretation state (by throwing an exception) is deficient?
Also, we could replace `postpone to` with
`state @ if postpone to else ['] to execute then`
Also deficient by design.
Do you mean an ambiguous condition on postponing and ticking `to`?
But then recognizers are not designed for more than a word (the string
recognizer is already a stretch). So what Gforth has is a REC-TO that
recognizes "-><word>".
So here's the implementation (untested):
: to(
0 0 2>r begin
parse-name dup 0= abort" unfinished TO("
2dup 2>r
s" )" str= until
2r> 2drop \ get rid of ")"
begin
2r> dup while
[: "to " type ;] >string-execute evaluate
\ freeing the strings is left as exercise to the reader
repeat
2drop \ get rid of 0 0
; immediate
Thus, this implementation is more complex and unhygienic [2], and these
drawbacks are introduced solely for the sake of a few Forth systems that
provide non-standard `to`. The cost seems unjustified.
[2] It is unhygienic, as it requires `to` to be present in the context.
See: <https://en.wikipedia.org/wiki/Hygienic_macro>
Gforth has an API for defining words with user-defined TO
<https://net2o.de/gforth/Words-with-user_002ddefined-TO-etc_002e.html>,
but there is currently no proper API for defining words that perform
the function of TO or one of its siblings (+TO etc), in particular
there is no API that would support a user-defined REC-TO or TO(. This
shows in using internal words like TO-SLOTS in REC-TO.
There could be a basic factor similar to `defer!`:
`execute-setter` Execution: ( any1 xt1 -- )
Set `xt1` to return `any1` on execution. An ambiguous condition exists
if `xt1` cannot be set to return `any1`.
`xt1` is the execution token of a word created with `value`, `2value`,
or `fvalue`.
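To make the proposed factor concrete, here is a minimal sketch under a strong assumption: MY-VALUE is a hypothetical stand-in for `value`, built on CREATE so that a setter taking an xt can be written in standard Forth. It says nothing about how `execute-setter` would be implemented for real `value`, `2value`, or `fvalue` children on any particular system.

```forth
\ Hypothetical stand-in for VALUE: the data field holds the current value.
: my-value ( x "name" -- )  create ,  does> ( -- x ) @ ;

\ EXECUTE-SETTER restricted to MY-VALUE children (an assumption of this sketch).
: execute-setter ( x xt -- )  >body ! ;

0 my-value a
0 my-value b

\ Multiple assignment without a parsing TO: the last name takes the top item.
: init-foo ( -- )  2 3  ['] b execute-setter  ['] a execute-setter ;
init-foo a . b .   \ prints "2 3"
```

With such a factor, a construct like `to( a b )` would only need to find the xt of each name and compile it as a literal followed by the setter.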
Given the large differences between TO implementations in systems, I
expect that we will have a hard time (as in: it won't happen)
standardizing TO-related APIs.
A side note: I think this is another argument against methods based on
`TO` or `IS` in the Recognizers API, and in APIs in general.
Surely this only serves to demonstrate that recognisers are not the answer
to all maidens' prayers.
Ruvim <[email protected]> writes:
On 2026-06-06 16:17, Anton Ertl wrote:
Ruvim <[email protected]> writes:
Here is an example of a program that relies on a parsing `to`, but is
not compliant due to that very ambiguous condition regarding `postpone`.
Let's introduce a multiple assignment construct of the following form:
`1 2 3 to( a b c )`.
: ?comp ( -- ) state @ if exit then -14 throw ;
: equals ( sd2 sd1 -- flag )
dup 3 pick <> if 2drop 2drop false exit then
compare 0=
;
: source-offset ( -- u )
>in @
;
: set-source-offset ( u -- )
source nip over u< invert if >in ! exit then
-18 throw \ "parsed string overflow"
;
synonym take-lexeme-maybe parse-name
: take-lexeme ( "ccc" -- sd )
take-lexeme-maybe dup if exit then -16 throw
;
: to( ( "ccc<rparen>" -- ) \ " a b c )"
?comp source-offset ( u.offset )
take-lexeme s" )" equals if drop exit then
( u.offset ) recurse ( u.offset )
source-offset swap set-source-offset
postpone to
set-source-offset
; immediate
\ usage example
0 value a
0 value b
: init-foo ( -- ) 2 3 to( a b ) ;
init-foo a . b . \ it should print "2 3"
Do you know a Forth system in which `to` parses the parse area and in
which the definition for `to(` given above *does not* work?
Depending on what you mean by "work".
By "does not work" I mean that the provided program does not behave as
specified in the usage example (a kind of test).
That is already satisfied by:
: to-b-to-a to b to a ;
: to(
')' parse 2drop
postpone to-b-to-a ; immediate
A much simpler implementation that does not have the deficiency
discussed below. It also demonstrates that you cannot use a test as a specification.
Anything that contains ?COMP is deficient by design.
Do you mean that preventing accidental execution of some word in
interpretation state (by throwing an exception) is deficient?
Any word that is not the text interpreter and that uses "STATE @" is deficient.
So here's the implementation (untested):
: to(
0 0 2>r begin
parse-name dup 0= abort" unfinished TO("
2dup 2>r
s" )" str= until
2r> 2drop \ get rid of ")"
begin
2r> dup while
[: "to " type ;] >string-execute evaluate
\ freeing the strings is left as exercise to the reader
repeat
2drop \ get rid of 0 0
; immediate
Thus, this implementation is more complex and unhygienic [2], and these
drawbacks are introduced solely for the sake of a few Forth systems that
provide non-standard `to`. The cost seems unjustified.
[2] It is unhygienic, as it requires `to` to be present in the context.
See: <https://en.wikipedia.org/wiki/Hygienic_macro>
Accidental capture of identifiers is not the only problem of
EVALUATE-based macros, even in this case. Another is accidental nonvisibility of TO. And yet, the EVALUATE-based implementation of
TO( is superior to the one you give above in several aspects:
1) Shorter. And I would say it is less complex (as evidenced by its shortness), but you claim that it is more complex without explaining
why you think so.
2) No STATE @ deficiency.
3) Implementable in standard Forth (>STRING-EXECUTE can be implemented
in standard Forth), while the implementation you give above uses
non-standard POSTPONE TO and ['] TO.
This non-standard use was the main purpose of my implementation. I
On 2026-06-07 13:23, Anton Ertl wrote:[...]
Ruvim <[email protected]> writes:
On 2026-06-06 16:17, Anton Ertl wrote:
Ruvim <[email protected]> writes:
Do you know a Forth system in which `to` parses the parse area and in
which the definition for `to(` given above *does not* work?
Depending on what you mean by "work".
By "does not work" I mean that the provided program does not behave as
specified in the usage example (a kind of test).
That is already satisfied by:
: to-b-to-a to b to a ;
: to(
')' parse 2drop
postpone to-b-to-a ; immediate
Whether the definition works as expected is checked by
`init-foo a . b .`
To be tested automatically, the last two lines can be written as:
t{ 0 value a 0 value b -> }t
t{ :noname 2 3 to( a b ) ; execute a b -> 2 3 }t
Any word that is not the text interpreter and that uses "STATE @" is
deficient.
I would classify my word `to(` as a text interpreter. Is it still deficient?
[...]
So here's the implementation (untested):
: to(
0 0 2>r begin
parse-name dup 0= abort" unfinished TO("
2dup 2>r
s" )" str= until
2r> 2drop \ get rid of ")"
begin
2r> dup while
[: "to " type ;] >string-execute evaluate
\ freeing the strings is left as exercise to the reader
repeat
2drop \ get rid of 0 0
; immediate
Thus, this implementation is more complex and unhygienic [2], and these
drawbacks are introduced solely for the sake of a few Forth systems that
provide non-standard `to`. The cost seems unjustified.
[2] It is unhygienic, as it requires `to` to be present in the context.
See: <https://en.wikipedia.org/wiki/Hygienic_macro>
Accidental capture of identifiers is not the only problem of
EVALUATE-based macros, even in this case.
If we take the standard words as the basis, I expect the approach with two
loops and `evaluate` will be longer than the approach with recursion and
`postpone`.
2) No STATE @ deficiency.
Your definition also contains `STATE @`, but indirectly, inside
`evaluate`.
There is one thing that this TO( cannot do: It does not work as
intended inside ]] ... [[. And if somebody writes
POSTPONE TO( a b )
they will not get the equivalent of
POSTPONE ->b POSTPONE ->a
Can we deal with this by defining a recognizer for TO(, which could
then use the TRANSLATE-TO designed for REC-TO?
'translate-to' ( n xt - translation ) gforth-experimental
xt belongs to a value-flavoured (or defer-flavoured) word, n is the
index into the 'to-table:' for xt (see Words with user-defined TO etc.).
Interpreting run-time: '( ... -- ... )'
Perform the to-action with index n in the 'to-table:' of xt. Additional
stack effects depend on n and xt.
It is probably possible to do that, but it would be complex: The
recognizer would produce a translation that contains all the xts of
the value-like words, and a TRANSLATE-TO(.
The TRANSLATE-TO( action
for interpreting would have to get all of the xts out of the way. Then
it would get the xt for the last one and perform TRANSLATE-TO
INTERPRETING; repeat for the next one, and repeat until all the xts
are done. The action for compiling and postponing could be
implemented in a similar way (but call COMPILING and POSTPONING
instead of INTERPRETING), or maybe simpler because there is no need to
get the xts out of the way.
In any case, that's quite a bit of work. Maybe someone (Stephen
Pelc?) thinks this demonstrates that the recognizer words are too
limited. I think it demonstrates that TO( is not a good idea.
Ruvim <[email protected]> writes:[...]
On 2026-06-07 13:23, Anton Ertl wrote:
Any word that is not the text interpreter and that uses "STATE @" is
deficient.
I would classify my word `to(` as a text interpreter. Is it still deficient?
Yes, because it is no text interpreter, and playing Humpty-Dumpty does
not change that. E.g.,
to( : foo 1 + . ; 2 foo )
does not print 3.
[...]
So here's the implementation (untested):
: to(
0 0 2>r begin
parse-name dup 0= abort" unfinished TO("
2dup 2>r
s" )" str= until
2r> 2drop \ get rid of ")"
begin
2r> dup while
[: "to " type ;] >string-execute evaluate
\ freeing the strings is left as exercise to the reader
repeat
2drop \ get rid of 0 0
; immediate
2) No STATE @ deficiency.
Your definition also contains `STATE @`, but indirectly, inside
`evaluate`.
Yes, good point, as mentioned: Accidental capture of identifiers is
not the only problem of EVALUATE-based macros, even in this case.
The problem is that the text interpreter inside the EVALUATE
invocations uses the STATE at the run-time of TO(, not at the parsing
time.
One way to handle that would be to disallow using ', ['], POSTPONE and [COMPILE] on TO(, but
1) The Forth standard does not give us a way to enforce that.
2) As your TO( implementation shows, such a restriction can become a hindrance.
Another way to handle that is to have a recognizer that deals with
TO(.
That solves objection 1) above, but not necessarily objection 2).
However, maybe the user who wants to do something which would
require POSTPONEing or ticking a word TO( can manage to do what
they want with the recognizer, the translator, or the translator's
actions.
In general, recognizers are suitable in the following cases:
- the behavior cannot be implemented with a single word (as for a
string literal starter, numeric literals, etc);
- the recognizing of a lexeme (or the beginning of a construct)
should not depend on the search order (or be shadowed by words from the search order);
- reuse of the Forth text interpreter for translating text by
different rules (especially, through nested input sources; for example, `execute-parsing` can be implemented using the Recognizer API);
Otherwise, using a parsing word is also perfectly suitable.
Whether it is a parsing word or another construct, it is better if the affected lexical block (its beginning and end) is visually marked.
Therefore, `to( foo )` is better than `to foo`.
<https://forth-standard.org/proposals/recognizer-committee-proposal-2025-09-11?hideDiff#reply-1623>.
REC-FORTH is proposed as a deferred word, but there is no requirement
to use IS on it. And unlike for value-flavoured words, there are
non-parsing words for dealing with deferred words: DEFER@ and DEFER!.
So if you want to define IS( ... ), there is no need to concern
yourself with non-standard words like (TO).
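As an illustration of that point, here is a minimal sketch of an `is( ... )` built only from standard words (`'`, `literal`, `defer!`, `parse-name`, `compare`). It is an assumption of mine about what such a word could look like, not something proposed in this thread: it is compile-only (it always compiles and never consults STATE), it handles names on a single line only, and it relies on `'` to throw when a name is missing or unknown.

```forth
: is( ( "name1 ... namen )" -- )
  >in @  parse-name s" )" compare 0= if  drop exit  then
  >in !  '            \ not ")": step back and get the xt of the deferred word
  recurse             \ compile the setters for the later names first
  postpone literal  postpone defer! ; immediate

\ usage example
defer op1  defer op2
: setup ( -- )  ['] + ['] *  is( op1 op2 ) ;
setup  3 4 op1 .  3 4 op2 .   \ prints "7 12"
```

The recursion reverses the order of assignment, so the last name listed takes the top stack item, matching the `to( a b )` convention used earlier in the thread.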
On 2026-06-07 13:23, Anton Ertl wrote:
[...]
<https://forth-standard.org/proposals/recognizer-committee-proposal-2025-09-11?hideDiff#reply-1623>.
REC-FORTH is proposed as a deferred word, but there is no requirement
to use IS on it. And unlike for value-flavoured words, there are
non-parsing words for dealing with deferred words: DEFER@ and DEFER!.
As I recall, you had objections regarding the `BASE` variable
(specifically, the use `@` and `!` as methods to get/set its value). I
agree with them as well.
These objections also apply to the use of `DEFER@` and `DEFER!` (or
`ACTION-OF` and `IS`) to get/set the value of a `DEFER` child.
For example, you wrote on 2024-10-05 in
<[email protected]>:
| Forth-94 seems to have had some of that, though,
| with words like GET-CURRENT and SET-CURRENT instead
| of a (user) variable CURRENT that had existing
| practice at the time.
| I wish they had defined GET-BASE and SET-BASE
| instead of BASE.
I would like to understand why you do not apply the same reasoning to
the Recognizers API variant you proposed.
Ruvim <[email protected]> writes:
On 2026-06-07 13:23, Anton Ertl wrote:
[...]
<https://forth-standard.org/proposals/recognizer-committee-proposal-2025-09-11?hideDiff#reply-1623>.
REC-FORTH is proposed as a deferred word, but there is no requirement
to use IS on it. And unlike for value-flavoured words, there are
non-parsing words for dealing with deferred words: DEFER@ and DEFER!.
As I recall, you had objections regarding the `BASE` variable
(specifically, the use `@` and `!` as methods to get/set its value). I
agree with them as well.
These objections also apply to the use of `DEFER@` and `DEFER!` (or
`ACTION-OF` and `IS`) to get/set the value of a `DEFER` child.
For example, you wrote on 2024-10-05 in
<[email protected]>:
| Forth-94 seems to have had some of that, though,
| with words like GET-CURRENT and SET-CURRENT instead
| of a (user) variable CURRENT that had existing
| practice at the time.
| I wish they had defined GET-BASE and SET-BASE
| instead of BASE.
I would like to understand why you do not apply the same reasoning to
the Recognizers API variant you proposed.
The reason why I would have preferred SET-BASE GET-BASE over the
(user) variable BASE is that it would have allowed me to do the first
stage of two-stage division in SET-BASE, and then perform the second
stage in #.
If BASE had been a UVALUE, that would have been ok, too, because I
could have used SET-TO on BASE to perform the same optimization.
Eventually I found a different way to achieve much of the same
benefit, see <[email protected]>.
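For readers unfamiliar with the two-stage idea mentioned above, here is a simplified single-cell sketch. It is not Gforth's implementation; the names MY-SET-BASE, BASE-RECIP, U/BASE and U/MOD-BASE are hypothetical. Stage one performs one real division when the base is set and stores a scaled reciprocal; stage two replaces each later division by a multiplication. In this simplified form the quotient is only guaranteed correct for dividends smaller than 2^bits divided by the base, where bits is the cell width.

```forth
variable base-recip   \ ceil(2^bits / base), computed once per base change

: my-set-base ( u -- )
  dup base !
  -1 0 rot um/mod nip 1+   \ stage one: floor((2^bits - 1) / u) + 1
  base-recip ! ;

: u/base ( u1 -- u2 )      \ stage two: high cell of u1 * reciprocal
  base-recip @ um* nip ;

: u/mod-base ( u1 -- u-rem u-quot )
  dup u/base tuck base @ * - swap ;

decimal  10 my-set-base
12345 u/base .             \ prints 1234
```

A word like `#` would use the remainder from U/MOD-BASE as the next digit and carry the quotient forward.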