...
However we could squeeze for 16 bits, without logically affecting the model.
code field: one byte, an offset into a code area of 256 bytes.
data field: 16-bit pointer
name field: 4 bytes, the first 3 chars and the last char. Only 7 bits each;
the 8th bits count as flags
flag field: hidden in the name
link field: 8-bit offset, d.e. are at most 256 bytes long.
The total of 8 bytes will put even the original fig model to shame.
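To make the 8-byte layout concrete, here is a minimal sketch in Python (not Forth, purely for illustration). The field order in memory, the space padding of names shorter than 3 chars, and the helper names pack_name, unpack_name and pack_header are assumptions of this sketch, not part of the proposal above.

```python
import struct

# Assumed field order: code offset (1 byte), data pointer (2 bytes),
# name field (4 bytes), link offset (1 byte). "<" disables padding.
HEADER_FORMAT = "<BH4sB"


def pack_name(name, flags=0):
    """Pack the first 3 chars and the last char into 4 bytes.

    Each char uses 7 bits; the 8th bit of byte i carries flag bit i.
    Names shorter than 3 chars are padded with spaces (an assumption).
    """
    if not name or not name.isascii():
        raise ValueError("name must be non-empty ASCII")
    if not 0 <= flags < 16:
        raise ValueError("only four flag bits are available")
    chars = name[:3].ljust(3) + name[-1]
    return bytes((ord(ch) & 0x7F) | (((flags >> i) & 1) << 7)
                 for i, ch in enumerate(chars))


def unpack_name(field):
    """Return (stored chars, flags) from a 4-byte name field."""
    chars = "".join(chr(b & 0x7F) for b in field)
    flags = sum((b >> 7) << i for i, b in enumerate(field))
    return chars, flags


def pack_header(code_offset, data_ptr, name, flags, link_offset):
    """Build one 8-byte header under the assumed field order."""
    return struct.pack(HEADER_FORMAT, code_offset, data_ptr,
                       pack_name(name, flags), link_offset)


if __name__ == "__main__":
    hdr = pack_header(1, 0x1234, "DUP", 0b0101, 8)
    print(len(hdr), unpack_name(hdr[3:7]))
```

Note that only the stored chars can be recovered, so names that share their first 3 chars and last char become indistinguishable.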
However we could squeeze for 16 bits, without logically affecting the model.
...name field: 4 byte, 3 first and last char. Only 7 bits,
8th bit counts are flags
Don't know who was first but Fig-forth's variable length names is something
that Forth Inc and pretty much everyone adopted. Moore attempted to defend
'3 chars plus count' but to no avail. That ship had sailed.
dxf <[email protected]> writes:
...name field: 4 byte, 3 first and last char. Only 7 bits,
8th bit counts are flags
Don't know who was first but Fig-forth's variable length names is something
that Forth Inc and pretty much everyone adopted. Moore attempted to defend
'3 chars plus count' but to no avail. That ship had sailed.
Looking at the traditional length+3 chars and
[email protected]'s 3 first and last, at least one pair of
words in Forth-94 conflicts on both systems, and WORDS could show them
as REA?????? and REA*E, respectively. It would be interesting to
determine (say, by checking the words from an existing Forth system),
which scheme produces more conflicts.
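The comparison suggested here could be sketched as follows, in Python for illustration only. The word list SAMPLE is a small hand-picked set of standard names, not the dictionary of an actual system, and the function names are made up for this sketch; a real experiment would feed in the full output of WORDS.

```python
from collections import defaultdict


def key_len_plus_3(name):
    """Traditional scheme: name length plus the first 3 chars."""
    return (len(name), name[:3])


def key_first3_last(name):
    """Alternative scheme: the first 3 chars plus the last char."""
    return (name[:3], name[-1])


def conflicts(names, key):
    """Group names by key and return the groups with more than one member."""
    groups = defaultdict(list)
    for name in names:
        groups[key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Hand-picked sample, NOT a complete word list.
SAMPLE = ["READ-FILE", "READ-LINE", "CHAR+", "CHARS",
          "CONSTANT", "CONVERT", "DUP", "DROP"]

if __name__ == "__main__":
    print("length+3 chars:", conflicts(SAMPLE, key_len_plus_3))
    print("3 first + last:", conflicts(SAMPLE, key_first3_last))
```

In this sample READ-FILE and READ-LINE collide under both schemes, CHAR+ and CHARS only under length+3, and CONSTANT and CONVERT only under 3 first + last; the sample says nothing about which scheme is better overall.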
- anton
[email protected] writes:
However we could squeeze for 16 bits, without logically affecting the model.
These days for such a constrained target, it's probably best to tether
from a bigger machine.
The argument was that an extreme impractical Forth can be implemented
with this model as guideline, to counter the argument of extreme waste
that I expected.
[email protected] writes:
The argument was that an extreme impractical Forth can be implemented
with this model as guideline, to counter the argument of extreme waste
that I expected.
The users have voted with their feet: They have usually used the
"extremely wasteful" fig-Forth with the default settings (names of
up to 31 chars) rather than using the "extremely impractical" option to
only store the first n chars of a name (with n being configurable in
fig-Forth). "Wastefulness" won over "Impracticality" so convincingly
that even Forth, Inc. switched from "impracticality" to
"wastefulness", as well as almost everyone else. The exception is
Chuck Moore, who continues with the "impractical" approach in
ColorForth. Even in the Forth universe, few people seem to use
ColorForth and I have not heard of systems that follow its approach to
names.
Concerning memory consumption, Gforth's development version includes
quite a bit of meta-information for two purposes:
1) How the threaded code relates to the source code, not just where
each definition starts, but also where words are used.
2) How the machine code addresses relate to what is COMPILE,d, to get
proper decompilation with SEE-CODE, SIMPLE-SEE, and SEE.
Both sets of information are in big tables that are stored
out-of-line, not with the headers, and each takes about as much space
as the inline stuff (headers, threaded code, other bodies); the
information where each definition is defined is stored in the headers,
however. The result is that gforth.fi takes 2.3MB on my system; for
comparison, the inline dictionary stuff is 686464 bytes, and the native
code is 465671 bytes for gforth-fast and 892251 bytes for gforth (also
out-of-line, both not in the image).
One may consider this wasteful, but I consider it good use of the RAM
that our machines have; my PC from 1993 had 16MB, the one from 2015
16GB, my current one 64GB.
Concerning header size, here's what we have in Gforth (on a 64-bit system):
here 5 constant five here over - dump
403AC9C0: 20 20 20 20 66 69 76 65 - 04 00 00 00 00 00 00 00 five........
403AC9D0: 08 C7 3A 40 00 00 00 00 - 81 EA A1 9D EA 55 00 00 ..:@.........U..
403AC9E0: 60 AF 30 40 00 00 00 00 - 05 00 00 00 00 00 00 00 `.0@............
I.e., 6 cells for a word with a name that fits in one cell, and one
cell of body. The cells are:
403AC9C0: Name padded with spaces at the front to align it to a cell boundary
403AC9C8: Name length and flags
403AC9D0: link field (pointer to previous word in the same wordlist)
403AC9D8: header methods (pointer to a method table)
403AC9E0: code field (contains the code address of docon in this case)
403AC9E8: body aka parameter field, contains the value in this case
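As a cross-check of the cell-by-cell reading, the 48 dumped bytes can be decoded mechanically. The following Python sketch is only an illustration: the bytes are copied from the dump above, and the variable names are made up here. It assumes a little-endian 64-bit machine, which the dump suggests.

```python
import struct

# Bytes copied from the dump above (addresses and ASCII column dropped).
DUMP_HEX = """
20 20 20 20 66 69 76 65  04 00 00 00 00 00 00 00
08 C7 3A 40 00 00 00 00  81 EA A1 9D EA 55 00 00
60 AF 30 40 00 00 00 00  05 00 00 00 00 00 00 00
"""
raw = bytes.fromhex("".join(DUMP_HEX.split()))

# One 8-byte name cell followed by five little-endian 64-bit cells.
(padded_name, count_and_flags, link,
 header_methods, code, body) = struct.unpack("<8s5Q", raw)

if __name__ == "__main__":
    print("name            :", padded_name.lstrip(b" ").decode("ascii"))
    print("length and flags:", count_and_flags)
    print("link field      :", hex(link))
    print("header methods  :", hex(header_methods))
    print("code field      :", hex(code))
    print("body            :", body)
```

The body cell holds 5, the value of the constant, and the length cell holds 4, matching the four characters of "five".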
For more information, read
@InProceedings{paysan&ertl19,
author = {Bernd Paysan and M. Anton Ertl},
title = {The new {Gforth} Header},
crossref = {euroforth19},
pages = {5--20},
url = {http://www.euroforth.org/ef19/papers/paysan.pdf},
url-slides = {http://www.euroforth.org/ef19/papers/paysan-slides.pdf},
video = {https://wiki.forth-ev.de/doku.php/events:ef2019:header},
OPTnote = {refereed},
abstract = {The new Gforth header is designed to directly
implement the requirements of Forth-94 and
Forth-2012. Every header is an object with a fixed
set of fields (code, parameter, count, name, link)
and methods (\texttt{execute}, \texttt{compile,},
\texttt{(to)}, \texttt{defer@}, \texttt{does},
\texttt{name>interpret}, \texttt{name>compile},
\texttt{name>string}, \texttt{name>link}). The
implementation of each method can be changed
per-word (prototype-based object-oriented
programming). We demonstrate how to use these
features to implement optimization of constants,
\texttt{fvalue}, \texttt{defer}, \texttt{immediate},
\texttt{to} and other dual-semantics words, and
\texttt{synonym}.}
}
@Proceedings{euroforth19,
title = {35th EuroForth Conference},
booktitle = {35th EuroForth Conference},
year = {2019},
key = {EuroForth'19},
url = {http://www.euroforth.org/ef19/papers/proceedings.pdf}
}
There have been a few changes since that paper: The body address is
now used as nt and xt, and the "name length and flags" field has
gained another flag or two.
- anton
In article <[email protected]>,
Anton Ertl <[email protected]> wrote:
Concerning memory consumption, Gforth's development version includes
quite a bit of meta-information for two purposes:
1) How the threaded code relates to the source code, not just where
each definition starts, but also where words are used.
2) How the machine code addresses relate to what is COMPILE,d, to get
proper decompilation with SEE-CODE, SIMPLE-SEE, and SEE.
Both sets of information are in big tables that are stored
out-of-line, not with the headers, and each takes about as much space
as the inline stuff (headers, threaded code, other bodies); the
information where each definition is defined is stored in the headers,
however. The result is that gforth.fi takes 2.3MB on my system; for
comparison, the inline dictionary stuff is 686464 bytes, the native
code of 465671 bytes for gforth-fast and 892251 for gforth (also
out-of-line, both not in the image).
Given that ctags understands Forth code, this seems to be a duplicated
effort.
ctags --lang=forth *.frt
The advantage is that you can use emacs (or other sophisticated
editors) to go to a function in a familiar way that you were used
to in other languages too.
For those not familiar with ctags, in addition to definitions
it also finds references. It is also blindingly fast:
under 100 ms for hundreds of files.
One may consider this wasteful, but I consider it good use of the RAM
that our machines have; my PC from 1993 had 16MB, the one from 2015
16GB, my current one 64GB.
Concerning header size, here's what we have in Gforth (on a 64-bit system):
here 5 constant five here over - dump
403AC9C0: 20 20 20 20 66 69 76 65 - 04 00 00 00 00 00 00 00 five........
403AC9D0: 08 C7 3A 40 00 00 00 00 - 81 EA A1 9D EA 55 00 00 ..:@.........U..
403AC9E0: 60 AF 30 40 00 00 00 00 - 05 00 00 00 00 00 00 00 `.0@............
I.e., 6 cells for a word with a name that fits in one cell, and one
cell of body. The cells are:
This was exactly my point.
[email protected] writes:
In article <[email protected]>,
Anton Ertl <[email protected]> wrote:
Concerning memory consumption, Gforth's development version includes
quite a bit of meta-information for two purposes:
1) How the threaded code relates to the source code, not just where
each definition starts, but also where words are used.
This is used for making backtraces more informative.
2) How the machine code addresses relate to what is COMPILE,d, to get
proper decompilation with SEE-CODE, SIMPLE-SEE, and SEE.
3) There is also the where table that records where each word is
actually used in the loaded source code, whether there is threaded
code for it or not; it records interpretive use as well as immediate
words where the threaded code is for a different word, if there is
threaded code at all.
I have run ctags and etags with this option on the files from the Gray
directory, and neither finds any of the definitions of TERM (which
exist in calc.fs and oberon.fs); that's probably because they have
been defined with a user-defined defining word.
The advantage is that you can use emacs (or other sophisticated
editors) to go to a function in a familiar way that you were used
to in other languages too.
For a long time, I thought that etags.fs is sufficient and we do not
need to add LOCATE to Gforth, but once we implemented LOCATE, I found
that I use it much more often than M-. (forth-find-tag).
For those not familiar with ctags, in additions to definitions
it finds also references. It also is blindingly fast.
Under 100 mS for hundreds of files.
So you may be claiming that ctags covers the job of the where table.
I do not see how to achieve that. ctags has an option --cxref, but it
just outputs the definitions in a different format. E.g., when I say
ctags --lang=forth --cxref *.fs<SNIP>
with the DUP use being highlighted. The where table, which consumes
quite a bit of memory (827_776 bytes in gforth.fi for a 64-bit
system), contains that information. I do not see anything in the
ctags/etags manual that provides this functionality. So the LOCATE
information (a cell for each dictionary entry) may be seen as
duplicate information, but the WHERE information does not duplicate
anything.
One may consider this wasteful, but I consider it good use of the RAM
that our machines have; my PC from 1993 had 16MB, the one from 2015
16GB, my current one 64GB.
Concerning header size, here's what we have in Gforth (on a 64-bit system):
here 5 constant five here over - dump
403AC9C0: 20 20 20 20 66 69 76 65 - 04 00 00 00 00 00 00 00 five........
403AC9D0: 08 C7 3A 40 00 00 00 00 - 81 EA A1 9D EA 55 00 00 ..:@.........U..
403AC9E0: 60 AF 30 40 00 00 00 00 - 05 00 00 00 00 00 00 00 `.0@............
I.e., 6 cells for a word with a name that fits in one cell, and one
cell of body. The cells are:
This was exactly my point.
Your point was that Gforth uses 6 cells for a word with a name that
fits in a cell and where the body takes one cell?
- anton
dxf <[email protected]> writes:
...name field: 4 byte, 3 first and last char. Only 7 bits,
8th bit counts are flags
Don't know who was first but Fig-forth's variable length names is something
that Forth Inc and pretty much everyone adopted. Moore attempted to defend
'3 chars plus count' but to no avail. That ship had sailed.
Looking at the traditional length+3 chars and
[email protected]'s 3 first and last, at least one pair of
words in Forth-94 conflicts on both systems, and WORDS could show them
as REA?????? and REA*E, respectively. It would be interesting to
determine (say, by checking the words from an existing Forth system),
which scheme produces more conflicts.
Moore continues with this approach in Color Forth, but he uses some compression approach to usually store more characters in the number of
bits he reserves for the name (IIRC 2 cells, with cell sizes of 20
bits, 18 bits, and 32 bits on different hardware). I don't remember
if he stores the length.
Another option would be to store a hash value that is computed using
all characters in the name. If a good hash function is used, the
probability of a conflict is relatively small with, e.g. 4000 names in
a wordlist (about the number of names that Gforth has in the Forth
wordlist), and even the 28 bits that [email protected]
provides. The probability of no conflict is approximately
((2^28-1)/(2^28))^((4000*3999)/2)
i.e.
1 28 lshift s>f fdup 1e f- fswap f/ 4000 dup 1- * 2/ s>f f** f.
The result is 0.97, i.e., there is a 3% probability of conflict for
these numbers.
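For readers without a Forth system at hand, the same estimate can be reproduced with a few lines of Python. This is only a restatement of the formula above; the function name p_no_conflict is made up for this sketch, and math.log1p is used to keep the tiny per-pair probability accurate in floating point.

```python
import math


def p_no_conflict(hash_bits, names):
    """Approximate probability that no two of `names` hashes collide.

    Uses the pairwise approximation from the text:
    ((2^bits - 1) / 2^bits) ^ (names * (names - 1) / 2)
    """
    pairs = names * (names - 1) // 2
    # log1p(-x) is log(1 - x), accurate for very small x.
    return math.exp(pairs * math.log1p(-2.0 ** -hash_bits))


if __name__ == "__main__":
    p = p_no_conflict(28, 4000)
    print(f"no conflict: {p:.4f}, conflict: {1 - p:.4f}")
```

With 28 bits and 4000 names this gives roughly 0.97, in line with the figure quoted above.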
The disadvantage of this approach is that WORDS or SEE cannot even
show the little about the name that Chuck Moore's approaches or [email protected]'s approach shows. But then, if you are so
pressed for memory that you use one of these approaches, why not also
save the memory for WORDS and SEE?
Another disadvantage is that the system cannot tell if a redefinition
warning comes from a hash conflict or from the name actually being
redefined; but it shares this disadvantage with all approaches that do
not store the full name.
- anton
Another disadvantage is that the system cannot tell if a redefinition
warning comes from a hash conflict or from the name actually being
redefined; but it shares this disadvantage with all approaches that do
not store the full name.
- anton
In this context a hash conflict is a redefinition. Avoid,
[email protected] writes:
In this context a hash conflict is a redefinition. Avoid,
Unclear how to avoid, other than by storing the entire name so you can
detect the conflict.
If you have a conflict, you rename the new offending definition,
as you do now.
[email protected] writes:
If you have a conflict, you rename the new offending definition,
as you do now.
How do you know when there is a conflict? We're talking about a hash
collision, right? Are we supposed to guarantee that the hash function
won't change between interpreter versions and that sort of thing?
You know there is a conflict because the message:
"As you do now": well, no; I've never used a Forth that faced this
issue. All the ones I've used have stored the entire name instead of
hashing. I thought (or at least hoped) that the different lossy
compression schemes from the early days were historical artifacts due to
the very small machines of the era. By the time of the Commodore 64,
those tricks were not needed.
Yes you do encounter name collisions. See below.
On Sat, 18 Apr 2026 10:26:11 GMT
[email protected] (Anton Ertl) wrote:
[email protected] writes:
The argument was that an extreme impractical Forth can be implemented
with this model as guideline, to counter the argument of extreme waste
that I expected.
The users have voted with their feet: They usually have used the
"extremely wasteful" fig-Forth with the default settings (names with
up to 31-char) rather than using the "extremely impractical" option to
only store the first n chars of a name (with n being configurable in
fig-Forth). "Wastefulness" won over "Impracticality" so convincingly
that even Forth, Inc. switched from "impracticality" to
"wastefulness", as well as almost everyone else. The exception is
Chuck Moore, who continues with the "impractical" approach in
ColorForth. Even in the Forth universe, few people seem to use
ColorForth and I have not heard of systems that follow its approach to
names.
Concerning memory consumption, Gforth's development version includes quite a bit of meta-information for two purposes:
1) How the threaded code relates to the source code, not just where
each definition starts, but also where words are used.
2) How the machine code addresses relate to what is COMPILE,d, to get
proper decompilation with SEE-CODE, SIMPLE-SEE, and SEE.
Both sets of information are in big tables that are stored
out-of-line, not with the headers, and each takes about as much space
as the inline stuff (headers, threaded code, other bodies); the
information where each definition is defined is stored in the headers,
however. The result is that gforth.fi takes 2.3MB on my system; for
comparison, the inline dictionary stuff is 686464 bytes, the native
code of 465671 bytes for gforth-fast and 892251 for gforth (also
out-of-line, both not in the image).
One may consider this wasteful, but I consider it good use of the RAM
that our machines have; my PC from 1993 had 16MB, the one from 2015
16GB, my current one 64GB.
Concerning header size, here's what we have in Gforth (on a 64-bit system):
here 5 constant five here over - dump
403AC9C0: 20 20 20 20 66 69 76 65 - 04 00 00 00 00 00 00 00 five........
403AC9D0: 08 C7 3A 40 00 00 00 00 - 81 EA A1 9D EA 55 00 00 ..:@.........U..
403AC9E0: 60 AF 30 40 00 00 00 00 - 05 00 00 00 00 00 00 00 `.0@............
I.e., 6 cells for a word with a name that fits in one cell, and one
cell of body. The cells are:
403AC9C0: Name padded with spaces at the front to align it to a cell boundary
403AC9C8: Name length and flags
403AC9D0: link field (pointer to previous word in the same wordlist)
403AC9D8: header methods (pointer to a method table)
403AC9E0: code field (contains the code address of docon in this case)
403AC9E8: body aka parameter field, contains the value in this case
For more information, read
I did go ahead and change the LXF64 header to something more like the
gforth one! This is what it looks like:
\ offset length purpose
\ -24-8n 8+8n counted name aligned and patched with zeros n=0,1,2,3
\ -16 8 xt (xt token + xt native)
\ -8 4 link
\ -4 2 Tlen Token code length
\ -2 2 Nlen Native code length
\ 0 1 flag byte <- NT points here
\ 1 1 offset to name from NT
\ 2 2 unused
\ 4 4 pointer to translate-name
\ 8 Tlen token code
Your example of five becomes
align-h here-h 5 constant five here-h over - dump
000000007227A0 04 46 49 56 45 00 00 00 C0 27 72 00 30 68 42 00 .FIVE....'r.0hB.
000000007227B0 F8 D0 71 00 03 00 10 00 20 18 00 00 50 00 A0 00 ..q..... ...P...
000000007227C0 26 05 25 00 00 00 00 00 00 00 00 00 00 00 00 00 &.%.............
I have the counted name aligned and zero-padded at the start.
This allows comparing 8 bytes at a time:
: NCOMP ( addr addr' - f) \ compare counted name strings, 0= match
dup c@ 1+
0 ?do
over i + @ over i + @
<> if 2drop unloop true exit then
8 +loop 2drop false ;
I have checked forth-wordlist and 71.5% of all words will require only
one comparison. 1.5% will require more than 2 comparisons.
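The effect of the aligned, zero-padded counted name can be modelled outside Forth. The Python sketch below mirrors the logic of NCOMP (compare count byte plus characters in 8-byte steps, return true on mismatch) and additionally reports how many 8-byte comparisons were made; the helper names counted_padded and ncomp are made up for this illustration and are not LXF64 words.

```python
def counted_padded(name):
    """Counted string (count byte + chars), zero-padded to a multiple of 8."""
    data = bytes([len(name)]) + name.encode("ascii")
    return data + b"\0" * (-len(data) % 8)


def ncomp(a, b):
    """Compare two padded counted names 8 bytes at a time.

    Returns (differs, comparisons); differs is False on a match,
    following the "0= match" convention of NCOMP.
    """
    total = a[0] + 1          # count byte plus characters
    comparisons = 0
    for i in range(0, total, 8):
        comparisons += 1
        if a[i:i + 8] != b[i:i + 8]:
            return True, comparisons
    return False, comparisons


if __name__ == "__main__":
    for x, y in [("FIVE", "FIVE"), ("READ-FILE", "READ-LINE"),
                 ("CONSTANT", "CONSTANT")]:
        print(x, y, ncomp(counted_padded(x), counted_padded(y)))
```

Because the count byte is part of the first 8-byte chunk, names of different length already differ in the first comparison, and any name of up to 7 chars is decided in a single step.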
The other interesting change I did was to put in a link to translate-name.
Each word now knows how to interpret, compile and postpone itself!
I have now 3 standard word types
translate-name
translate-name-immediate
translate-name-macro
This takes away all checks of the flag and following conditionals.
I could actually remove the flag byte.
I also introduced SET-TRANSLATOR that sets the translator of the
last defined word. This lets me define all state smart words
without state! S" illustrates this:
: [S"]
34 parse slit ; immediate
' ht-execute
:noname drop postpone [S"] ;
:noname drop [n'] [S"] lit, postpone ht-execute ;
create translate-s"
, , ,
: S"
34 parse dup >r pocket dup >r swap move r> r> ;
translate-s" set-translator
ht-execute executes the NT. [n'] returns the NT
C", TO, ACTION-OF, IS and S\" are implemented in similar ways.
Now the recognizers are starting to make good sense!
peter <[email protected]> writes:
I did go ahead and change the LXF64 header to something more like the
gforth one! This is what it looks like:
\ offset length purpose...
\ -24-8n 8+8n counted name aligned and patched with zeros n=0,1,2,3
\ -16 8 xt (xt token + xt native)
\ -8 4 link
\ -4 2 Tlen Token code length
\ -2 2 Nlen Native code length
\ 0 1 flag byte <- NT points here
\ 1 1 offset to name from NT
\ 2 2 unused
\ 4 4 pointer to translate-name
\ 8 Tlen token code
The other interesting change I did was to put in a link to translate-name.
Each word now knows how to interpret, compile and postpone itself!
I have now 3 standard word types
translate-name
translate-name-immediate
translate-name-macro
This takes away all checks of the flag and following conditionals.
I could actually remove the flag byte.
In Gforth we did this by making the implementations of NAME>INTERPRET
and NAME>COMPILE word-specific:
Words with default compilation semantics have DEFAULT-NAME>COMP as
implementation, immediate words have IMM>COMP as implementation, and
other words (e.g., S") have other implementations.
\ the actual implementation is a bit different, but this is the
\ easier-to-understand version.
: default-name>comp ( nt -- xt1 xt2 )
name>interpret ['] compile, ;
: imm>comp ( nt -- xt1 xt2 )
name>interpret ['] execute ;
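The idea of replacing flag tests by a per-word method can be modelled in a few lines of Python. This is a deliberately simplified illustration, not how Gforth or LXF64 is implemented: the Word class, the representation of a definition as a Python list, and the convention that every xt takes the definition being built are all assumptions of this sketch.

```python
class Word:
    """A dictionary entry with a word-specific name>compile method."""

    def __init__(self, name, xt, name_to_comp):
        self.name = name
        self.xt = xt                      # what the word does
        self.name_to_comp = name_to_comp  # how the word is compiled


def compile_comma(xt, definition):
    """Default compilation action: append the xt to the definition."""
    definition.append(xt)


def execute(xt, definition):
    """Immediate compilation action: run the xt right now."""
    xt(definition)


def default_name_to_comp(word):
    return word.xt, compile_comma


def imm_to_comp(word):
    return word.xt, execute


def compile_word(word, definition):
    """The compiler never tests an immediate flag; it just asks the word."""
    xt, action = word.name_to_comp(word)
    action(xt, definition)


def dup_xt(definition):
    pass  # placeholder for run-time behaviour


def note_xt(definition):
    definition.append("compiled-by-immediate")


DUP = Word("dup", dup_xt, default_name_to_comp)
NOTE = Word("note", note_xt, imm_to_comp)

if __name__ == "__main__":
    body = []
    compile_word(DUP, body)
    compile_word(NOTE, body)
    print(body)
```

A word with unusual compilation semantics, such as S", would simply get a third function in place of default_name_to_comp or imm_to_comp.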
In Gforth translate-name does not differentiate between different
kinds of words; it always produces "nt translate-name" on success, and
NAME>COMPILE takes care of the differences. My guess is that you do
it differently because you do not have NAME>COMPILE. Am I correct?
I also introduced SET-TRANSLATOR that sets the translator of the
last defined word. This lets me define all state smart words
without state! S" illustrates this:
: [S"]
34 parse slit ; immediate
' ht-execute
:noname drop postpone [S"] ;
:noname drop [n'] [S"] lit, postpone ht-execute ;
create translate-s"
, , ,
: S"
34 parse dup >r pocket dup >r swap move r> r> ;
translate-s" set-translator
Interesting.
ht-execute executes the NT. [n'] returns the NT
So you have NTs. Do you have NT>COMPILE? If so, the differences
between default and immediate and other words should already be
implemented there.
- anton
On Sat, 23 May 2026 18:12:20 GMT
[email protected] (Anton Ertl) wrote:
peter <[email protected]> writes:
The other interesting change I did was to put in a link to translate-name.
Each word now knows how to interpret, compile and postpone itself!
I have now 3 standard word types
translate-name
translate-name-immediate
translate-name-macro
This takes away all checks of the flag and following conditionals.
I could actually remove the flag byte.
In Gforth we did this by making the implementations of NAME>INTERPRET
and NAME>COMPILE word-specific:
Words with default compilation semantics have DEFAULT-NAME>COMP as
implementation, immediate words have IMM>COMP as implementation, and
other words (e.g., S") have other implementations.
\ the actual implementation is a bit different, but this is the
\ easier-to-understand version.
: default-name>comp ( nt -- xt1 xt2 )
name>interpret ['] compile, ;
: imm>comp ( nt -- xt1 xt2 )
name>interpret ['] execute ;
I studied your linked document and slides and understood it worked
something like that. Seeing your VT table gave me the idea to
put in a link to the translate record
int: default-name>int
comp: default-name>comp
string: named>string
link: named>link
: NAME>COMPILE ( nt -- w xt )
dup nt>trans l@ cell+ @ ;
With the 3 translate-name-xxx all the flag testing is gone!
The ability to set a specific translation record for the
state smart words comes as an extra benefit.
To really integrate the recognizers well I made REC-NAME the primary
name finding function. FIND-NAME is then defined as:
: FIND-NAME ( caddr u -- ht | 0)
rec-name dup if drop then ;
Translate-none returns a null pointer in my system.
peter <[email protected]> writes:
On Sat, 23 May 2026 18:12:20 GMT
[email protected] (Anton Ertl) wrote:
peter <[email protected]> writes:
The other interesting change I did was to put in a link to translate-name.
Each word now knows how to interpret, compile and postpone itself!
I have now 3 standard word types
translate-name
translate-name-immediate
translate-name-macro
This takes away all checks of the flag and following conditionals.
I could actually remove the flag byte.
In Gforth we did this by making the implementations of NAME>INTERPRET
and NAME>COMPILE word-specific:
Words with default compilation semantics have DEFAULT-NAME>COMP as
implementation, immediate words have IMM>COMP as implementation, and
other words (e.g., S") have other implementations.
\ the actual implementation is a bit different, but this is the
\ easier-to-understand version.
: default-name>comp ( nt -- xt1 xt2 )
name>interpret ['] compile, ;
: imm>comp ( nt -- xt1 xt2 )
name>interpret ['] execute ;
I studied your linked document and slides and understood it worked
something like that. Seeing your VT table gave me the idea to
put in a link to the translate record
Nowadays we call the table HM, for header methods. VT is too generic.
In development Gforth, you can see the header methods for a word by
using .HM on its NT. E.g.:
``+ .hm
opt: $7FA3C4A363D8
to: n/a
extra: $0
int: default-name>int
comp: default-name>comp
string: named>string
link: named>link
: NAME>COMPILE ( nt -- w xt )
dup nt>trans l@ cell+ @ ;
That's interesting. Instead of defining TRANSLATE-NAME's compilation
action in terms of NAME>COMPILE, you put the differences between
different names into TRANSLATE-NAME, and implement NAME>COMPILE by
accessing the internals of TRANSLATE-NAME.
With the 3 translate-name-xxx all the flag testing is gone!
Yes, we also eliminated nearly all flags with the new header format.
We kept a compile-only flag (for warning about compile-only words),
and added an obsolete flag (for warning about words that are going to
be removed from a future Gforth), because warnings do not introduce
complicated control flow.
The ability to set a specific translation record for the
state smart words comes as an extra benefit.
I guess you mean words with non-immediate non-default compilation
semantics, and yes, being able to tell the Forth system how it should
treat such a word at text interpretation time avoids the unpleasant
surprises caused by STATE-smart immediate words. Such words try to
figure out at run-time, by inspecting STATE, what they should do, but
the STATE at run-time does not provide information about whether their
interpretation semantics or compilation semantics is being performed.
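A small sketch of such a surprise, in standard Forth. WHICH, SMART, COMPILE-SMART and PROBE are hypothetical names made up for this illustration. POSTPONE SMART asks for the compilation semantics of SMART, but COMPILE-SMART is then run between [ and ], so STATE is 0 and SMART takes its interpretation branch anyway.

```forth
\ Illustration only; all names here are made up.
variable which   \ records the branch taken: 1 = interpreting, 2 = compiling

: smart ( -- )
  state @ if 2 else 1 then which ! ; immediate

\ Intended meaning: "perform the compilation semantics of SMART".
: compile-smart ( -- )
  postpone smart ;

\ COMPILE-SMART runs while STATE is 0, so SMART guesses wrong.
: probe ( -- )
  [ compile-smart ] ;
```

After loading this, `which @ .` shows which branch SMART took while PROBE was being compiled.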
To really integrate the recognizers well I made REC-NAME the primary
name finding function. Find-name is then defined as:
: FIND-NAME ( caddr u -- ht | 0)
rec-name dup if drop then ;
Translate-none returns a null pointer in my system.
Yes, we have written about the idea of unifying recognizers and
wordlists [paysan20]. Development Gforth implements this idea. E.g.,
if you do
s" dup" forth-wordlist execute
you find a translation on the stack, consisting of the nt of DUP and
of TRANSLATE-NAME. The implementations of FIND-NAME-IN and FIND-NAME
are:
: find-name-in ( c-addr u wid -- nt | 0 ) \ gforth
execute translate-none = IF 0 THEN ;
: find-name ( c-addr u -- nt | 0 ) \ gforth
['] rec-name find-name-in ;
The latter makes use of the fact that the recognizer sequence in
REC-NAME can be treated as a wordlist. That relies on the fact that
only wordlists are in the search order (and the search order is in the
deferred word REC-NAME). If you put, e.g., REC-NUMBER into the search
order, the result will be that FIND-NAME will push a single-cell or
double-cell number when you pass it something that is recognized by
that REC-NUMBER. But that's the usual fare in Forth: if you hold it
wrong, it produces the wrong result.
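A toy model of this in standard Forth, as a sketch only. Every TOY- name is hypothetical and the stack effects are simplified; this is not Gforth's implementation. A toy recognizer returns either 0 or a value with a nonzero id on top, a sequence of recognizers is itself a recognizer, and a lookup that just drops the id hands back a number when a number recognizer is in the sequence.

```forth
\ Toy model only; all TOY- names are hypothetical, not Gforth's.
\ A toy recognizer is any xt with the effect ( c-addr u -- x id | 0 ).
1 constant toy-name   \ id: x is an xt found in a wordlist
2 constant toy-num    \ id: x is a single-digit number

: toy-rec-forth ( c-addr u -- xt toy-name | 0 )
  forth-wordlist search-wordlist   \ -- 0 | xt 1 | xt -1
  if toy-name else 0 then ;

: toy-rec-digit ( c-addr u -- n toy-num | 0 )
  1 = if
    c@ [char] 0 - dup 10 u< if toy-num exit then
  then
  drop 0 ;

\ A sequence of recognizers is itself a recognizer.
: toy-rec-seq ( c-addr u -- x id | 0 )
  2dup toy-rec-forth ?dup if 2swap 2drop exit then
  toy-rec-digit ;

\ Name lookup that drops the id and keeps x, without checking the id.
: toy-find ( c-addr u xt-rec -- x | 0 )
  execute ?dup if drop then ;
```

Here `s" DUP" ' toy-rec-seq toy-find` yields the xt of DUP, while `s" 7" ' toy-rec-seq toy-find` yields the number 7, which a caller expecting a name would misuse (assuming the system defines no word named 7).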
The current proposal includes nested recognizers, but does not
require wordlists to work as recognizers.
@InProceedings{paysan20,
author = {Bernd Paysan and M. Anton Ertl},
title = {The Grand Recognizer Unification},
crossref = {euroforth20},
pages = {19--22},
url = {http://www.euroforth.org/ef20/papers/paysan.pdf},
url-slides = {http://www.euroforth.org/ef20/papers/paysan-slides.pdf},
video = {https://www.youtube.com/watch?v=VUi6uYqIbTI},
OPTnote = {not refereed},
abstract = {There is an obvious similarity between the search
order and a recognizer sequence, which has led to
similarities in proposed words (e.g.,
\code{get-recognizer} is modeled on
\code{get-order}). By turning word lists into
recognizers, we unify these concepts. We also turn
recognizer sequences (and by extension the search
order) into a recognizer, which allows nestable
recognizer sequences and wordlist sequences in the
search order. The implementation becomes simpler,
too.}
}
@Proceedings{euroforth20,
title = {36th EuroForth Conference},
booktitle = {36th EuroForth Conference},
year = {2020},
key = {EuroForth'20},
url = {http://www.euroforth.org/ef20/papers/proceedings.pdf}
}
- anton