On 2026-01-11 08:32, Michael S wrote:
On Sun, 11 Jan 2026 04:59:47 -0800
Keith Thompson <[email protected]> wrote:
Michael S <[email protected]> writes:
On Sat, 10 Jan 2026 22:02:03 -0500
"James Russell Kuyper Jr." <[email protected]> wrote:
On 2026-01-09 07:18, Michael S wrote:
On Thu, 8 Jan 2026 19:31:13 -0500...
"James Russell Kuyper Jr." <[email protected]>
wrote:
I'd have no problem with your approach if you hadn't falsely
claimed that "It is correct on all platforms".
Which I didn't.
On 2026-01-07 19:38, Michael S wrote:
...
No, it is correct on all implementation.
The quote is taken out of context.
The context was that on platforms that have properties (a) and (b)
(see below) printing variables declared as uint32_t via %u is
probably UB according to the Standard (I don't know for sure,
however it is probable),
I'm sure. uint32_t is an alias for some predefined integer type.
This:
uint32_t n = 42;
printf("%u\n", n);
has undefined behavior *unless* uint32_t happens to be an alias for
unsigned int in the current implementation -- not just any 32-bit
unsigned integer type, only unsigned int.
If uint32_t is an alias for unsigned long (which implies that
unsigned long is exactly 32 bits), then the call's behavior is
undefined. (It might happen to "work".)
What exactly, assuming that conditions (a) and (b) fulfilled, should
implementation do to prevent it from working?
I mean short of completely crazy things that will make maintainer
immediately fired?
I'm quite positive that you would consider anything that might give unexpected behavior to such code to be "crazy". The simplest example I
can think of is that unsigned int is big-endian, while unsigned long is little-endian, and I would even agree that such an implementation would
be peculiar, but such an implementation could be fully conforming to the
C standard.
If uint32_t and unsigned long have different sizes, it still might
happen to "work", depending on calling conventions. Passing a
32-bit argument and telling printf to expect a 64-bit value clearly
has undefined behavior, but perhaps both happen to be passed in 64-bit
registers, for example.
And that is sort of intimate knowledge of the ABI that I don't want to
exploit, as already mentioned in my other post in this sub-thread.
Which is precisely what's wrong about your approach - it relies upon intimate knowledge of the ABI. Specifically, it relies on unsigned int
and unsigned long happening to have exactly the same size and representation.
On 2026-01-11 06:20, Michael S wrote:
On Sat, 10 Jan 2026 22:02:03 -0500
"James Russell Kuyper Jr." <[email protected]> wrote:
On 2026-01-09 07:18, Michael S wrote:
On Thu, 8 Jan 2026 19:31:13 -0500...
"James Russell Kuyper Jr." <[email protected]>
wrote:
I'd have no problem with your approach if you hadn't falsely
claimed that "It is correct on all platforms".
Which I didn't.
On 2026-01-07 19:38, Michael S wrote:
...
No, it is correct on all implementation.
The quote is taken out of context.
The context was that on platforms that have properties (a) and (b)
(see below) printing variables declared as uint32_t via %u is
probably UB according to the Standard (I don't know for sure,
however it is probable), but it can't cause troubles with
production C compiler. Or with any C compiler that is made in
intention of being used rather than crafted to prove theoretical
points. Properties are:
a) uint32_t aliased to 'unsigned long'
b) 'unsigned int' is at least 32-bit wide.
I never claimed that it is good idea on targets with 'unsigned int'
that is narrower.
I've looked for a previous restriction of this discussion to cases
covered by a) and b) above. The closest I could find is the following:
In the case I am talking about n declared as uint32_t.
uint32_t is an alias of 'unsigned long' on 32-bit embedded targets,
on 32-bit Linux, on 32-bit Windows and on 64-bit Windows. It is
alias of 'unsigned int' on 64-bit Linux.
Note several points: that is a period after the first use of
"uint32_t", so "the case" you're specifying ends there. I read the
next three lines as information about your working environment, not restrictions on the claimed validity of your preference for "%u" over
"%lu". There is no mention of a restriction on the size of "unsigned
int".
Would it be allowed (in the sense of being possible in a hypothetical
but fully conforming implementation) to have "unsigned long" be
32-bit, without padding, while "unsigned int" is 64-bit wide with 32
value bits and 32 padding bits? A cpu might be able to handle 64-bit
lumps faster than 32-bit lumps and choose such a setup to make
"unsigned int" as fast as it can. (uint32_t in this case would be an
alias for "unsigned long", as it can't have padding bits.)
On Tue, 13 Jan 2026 21:24:16 -0500...
"James Russell Kuyper Jr." <[email protected]> wrote:
I'm quite positive that you would consider anything that might give
unexpected behavior to such code to be "crazy". The simplest example
I can think of is that unsigned int is big-endian, while unsigned
long is little-endian, and I would even agree that such an
implementation would be peculiar, but such an implementation could be
fully conforming to the C standard.
You are inventive!
Which is precisely what's wrong about your approach - it relies upon
intimate knowledge of the ABI. Specifically, it relies on unsigned
int and unsigned long happening to have exactly the same size and
representation.
I consider the latter a basic knowledge of ABI rather than an intimate.
For me programming feels uncomfortable without such knowledge. That is,
I can manage without, but do not want to.
Your mileage appears to vary.
David Brown <[email protected]> writes:
[...]
Would it be allowed (in the sense of being possible in a hypothetical
but fully conforming implementation) to have "unsigned long" be
32-bit, without padding, while "unsigned int" is 64-bit wide with 32
value bits and 32 padding bits? A cpu might be able to handle 64-bit
lumps faster than 32-bit lumps and choose such a setup to make
"unsigned int" as fast as it can. (uint32_t in this case would be an
alias for "unsigned long", as it can't have padding bits.)
The *width* of an integer type is the number of value bits plus the sign
bit, if any, so "64-bit wide" is an incorrect description.
What would be possible is:
- CHAR_BIT * sizeof (unsigned int) == 64
- UINT_WIDTH == 32 (32 padding bits)
- CHAR_BIT * sizeof (unsigned long) == 32
- ULONG_WIDTH == 32 (no padding bits)
The *_WIDTH macros are new in C23.
[...]
On 2026-01-14 04:03, Michael S wrote:
On Tue, 13 Jan 2026 21:24:16 -0500...
"James Russell Kuyper Jr." <[email protected]> wrote:
I'm quite positive that you would consider anything that might give
unexpected behavior to such code to be "crazy". The simplest
example I can think of is that unsigned int is big-endian, while
unsigned long is little-endian, and I would even agree that such an
implementation would be peculiar, but such an implementation could
be fully conforming to the C standard.
You are inventive!
As a programmer, I paid close attention to what was and was not
guaranteed about the software that I used. As a result, I've noticed
many things, such as the fact that the standard imposes no
requirements on the order of the bytes (or even of the bits) that are
used to represent arithmetic values.
...
Which is precisely what's wrong about your approach - it relies
upon intimate knowledge of the ABI. Specifically, it relies on
unsigned int and unsigned long happening to have exactly the same
size and representation.
I consider the latter a basic knowledge of ABI rather than an
intimate. For me programming feels uncomfortable without such
knowledge. That is, I can manage without, but do not want to.
Your mileage appears to vary.
I've spent most of my career working under rules that explicitly
prohibited me from writing code that depends upon such details. As a
result, I actually have relatively little knowledge of how any
particular implementation of C that I used decided to handle issues
that the C standard left unspecified. I've always written my code so
that it would do what it's supposed to do, regardless of which
choices any given implementation made about things that are
unspecified.
On Tue, 13 Jan 2026 22:17:09 -0500
"James Russell Kuyper Jr." <[email protected]> wrote:
On 2026-01-11 06:20, Michael S wrote:
On Sat, 10 Jan 2026 22:02:03 -0500
"James Russell Kuyper Jr." <[email protected]> wrote:
On 2026-01-09 07:18, Michael S wrote:
On Thu, 8 Jan 2026 19:31:13 -0500...
"James Russell Kuyper Jr." <[email protected]>
wrote:
I'd have no problem with your approach if you hadn't falsely
claimed that "It is correct on all platforms".
Which I didn't.
On 2026-01-07 19:38, Michael S wrote:
...
No, it is correct on all implementation.
The quote is taken out of context.
The context was that on platforms that have properties (a) and (b)
(see below) printing variables declared as uint32_t via %u is
probably UB according to the Standard (I don't know for sure,
however it is probable), but it can't cause troubles with
production C compiler. Or with any C compiler that is made in
intention of being used rather than crafted to prove theoretical
points. Properties are:
a) uint32_t aliased to 'unsigned long'
b) 'unsigned int' is at least 32-bit wide.
I never claimed that it is good idea on targets with 'unsigned int'
that is narrower.
I've looked for a previous restriction of this discussion to cases
covered by a) and b) above. The closest I could find is the following:
In the case I am talking about n declared as uint32_t.
uint32_t is an alias of 'unsigned long' on 32-bit embedded targets,
on 32-bit Linux, on 32-bit Windows and on 64-bit Windows. It is
alias of 'unsigned int' on 64-bit Linux.
Note several points: that is a period after the first use of
"uint32_t", so "the case" you're specifying ends there. I read the
next three lines as information about your working environment, not
restrictions on the claimed validity of your preference for "%u" over
"%lu". There is no mention of a restriction on the size of "unsigned
int".
Ignoring for a minute that what I claimed about 32-bit Linux is
at best non-universal and at worst universally wrong, how would you
formulate what I meant?
My knowledge of English punctuation rules is rather minimal and even
less than that for its US American variant.
I'm not sure exactly what you intended. And, as I mentioned in another sub-thread, I've worked for most of my career under rules that
prohibited me from writing code that depends upon the kinds of details
that you're talking about - as a result, I've had little reason to familiarize myself with those details. However, I can say that using
"%u" to print a value of unsigned long type has no chance of working
unless unsigned int and unsigned long have the same size and
representation. Even if they do, the behavior is still undefined, but
there's a pretty good chance it will work.
James Kuyper <[email protected]> writes:
[...]
I'm not sure exactly what you intended. And, as I mentioned in another
sub-thread, I've worked for most of my career under rules that
prohibited me from writing code that depends upon the kinds of details
that you're talking about - as a result, I've had little reason to
familiarize myself with those details. However, I can say that using
"%u" to print a value of unsigned long type has no chance of working
unless unsigned int and unsigned long have the same size and
representation. Even if they do, the behavior is still undefined, but
there's a pretty good chance it will work.
On one implementation (gcc, glibc, 64 bits), it *can* "work":
```
#include <stdio.h>
int main(void) {
unsigned long x = 123456789;
printf("sizeof (unsigned) = %zu\n", sizeof (unsigned));
printf("sizeof (unsigned long) = %zu\n", sizeof (unsigned long));
printf("x = %u\n", x);
}
```
The output on my system (after some compiler warnings):
```
sizeof (unsigned) = 4
sizeof (unsigned long) = 8
x = 123456789
```
Apparently printf tries to grab a 32-bit value and happens to get
the low-order 32 bits of the 64-bit value that was passed. A value
exceeding UINT_MAX is not printed correctly, but in principle it
could be.
On 2026-01-15 07:00, Keith Thompson wrote:
James Kuyper <[email protected]> writes:
[...]
I'm not sure exactly what you intended. And, as I mentioned in another
sub-thread, I've worked for most of my career under rules that
prohibited me from writing code that depends upon the kinds of details
that you're talking about - as a result, I've had little reason to
familiarize myself with those details. However, I can say that using
"%u" to print a value of unsigned long type has no chance of working
unless unsigned int and unsigned long have the same size and
representation. Even if they do, the behavior is still undefined, but
there's a pretty good chance it will work.
On one implementation (gcc, glibc, 64 bits), it *can* "work":
```
#include <stdio.h>
int main(void) {
unsigned long x = 123456789;
printf("sizeof (unsigned) = %zu\n", sizeof (unsigned));
printf("sizeof (unsigned long) = %zu\n", sizeof (unsigned long));
printf("x = %u\n", x);
}
```
The output on my system (after some compiler warnings):
```
sizeof (unsigned) = 4
sizeof (unsigned long) = 8
x = 123456789
```
Apparently printf tries to grab a 32-bit value and happens to get
the low-order 32 bits of the 64-bit value that was passed. A value
exceeding UINT_MAX is not printed correctly, but in principle it
could be.
I knew about that possibility, and had intended to word my comment to
cover it, but I forgot. Thanks for covering it. The key point is that
this only works for a large but limited range of values - it cannot work
in general.
Tim Rentsch <[email protected]> writes:
[email protected] (Scott Lurndal) writes:
Tim Rentsch <[email protected]> writes:
I was responding to Scotty Lurndal's statement that the C
standard was being paraphrased (by someone, it didn't matter to
me who). I don't care about whether his statement is true; my
interest is only in what part of the C standard he thinks is
being paraphrased. He is in a position to answer that question,
and more to the point he is the only person who is.
It's pretty clear that the standard describes the printf
function and the methods used to match the format characters
to the data types of the arguments. The fact that James
framed that as advice doesn't change interpretation of
the text of the standard, whether or not you consider
that to be a paraphrase.
"The main rules for paraphrasing are to fully understand the
original text, restate its core idea in your own words and
sentence structure, use synonyms, and always cite the original
source to avoid plagiarism, even if the wording is different."
I see where the C standard says the macros in inttypes.h are
suitable for use with printf (and scanf). That isn't at all
the same as saying people should use them.
Why on earth would they put them there if they didn't expect
them to be used?
On Sun, 11 Jan 2026 11:51:43 -0800
Tim Rentsch <[email protected]> wrote:
Michael S <[email protected]> writes:
On Sat, 10 Jan 2026 22:02:03 -0500
"James Russell Kuyper Jr." <[email protected]> wrote:
On 2026-01-09 07:18, Michael S wrote:
On Thu, 8 Jan 2026 19:31:13 -0500
"James Russell Kuyper Jr." <[email protected]>
wrote:
...
I'd have no problem with your approach if you hadn't falsely
claimed that "It is correct on all platforms".
Which I didn't.
On 2026-01-07 19:38, Michael S wrote:
...
No, it is correct on all implementation.
The quote is taken out of context.
The context was that on platforms that have properties (a) and (b)
(see below) printing variables declared as uint32_t via %u is
probably UB according to the Standard (I don't know for sure,
however it is probable), but it can't cause troubles with
production C compiler. Or with any C compiler that is made in
intention of being used rather than crafted to prove theoretical
points. Properties are:
a) uint32_t aliased to 'unsigned long'
b) 'unsigned int' is at least 32-bit wide.
It seems unlikely that any implementation would make such a
choice. Can you name one that does?
Four out of four targets for which I write C programs for a living in this decade:
- Altera Nios2 (nios2-elf-gcc)
- Arm Cortex-M bare metal (arm-none-eabi-gcc)
- Win32-i386, various compilers
- Win64-Amd64, various compilers
Well, if I would be pedantic, then in this decade I also wrote several programs for Arm32 Linux, where I don't know whether uint32_t is an alias
of 'unsigned int' or 'unsigned long', a few programs for AMD64 Linux,
where I know that uint32_t is an alias of 'unsigned int' and maybe one program for ARM64 Linux that is the same as AMD64 Linux.
But all those outliers together constitute a tiny fraction of the code
that I wrote recently.
Michael S <[email protected]> writes:
On Sun, 11 Jan 2026 11:51:43 -0800
Tim Rentsch <[email protected]> wrote:
Michael S <[email protected]> writes:
On Sat, 10 Jan 2026 22:02:03 -0500
"James Russell Kuyper Jr." <[email protected]> wrote:
On 2026-01-09 07:18, Michael S wrote:
On Thu, 8 Jan 2026 19:31:13 -0500
"James Russell Kuyper Jr." <[email protected]>
wrote:
...
I'd have no problem with your approach if you hadn't falsely
claimed that "It is correct on all platforms".
Which I didn't.
On 2026-01-07 19:38, Michael S wrote:
...
No, it is correct on all implementation.
The quote is taken out of context.
The context was that on platforms that have properties (a) and (b)
(see below) printing variables declared as uint32_t via %u is
probably UB according to the Standard (I don't know for sure,
however it is probable), but it can't cause troubles with
production C compiler. Or with any C compiler that is made in
intention of being used rather than crafted to prove theoretical
points. Properties are:
a) uint32_t aliased to 'unsigned long'
b) 'unsigned int' is at least 32-bit wide.
It seems unlikely that any implementation would make such a
choice. Can you name one that does?
Four out of four targets for which I write C programs for a living in this
decade:
- Altera Nios2 (nios2-elf-gcc)
- Arm Cortex-M bare metal (arm-none-eabi-gcc)
- Win32-i386, various compilers
- Win64-Amd64, various compilers
Interesting. I wonder what factors motivated such a choice.
Well, if I would be pedantic, then in this decade I also wrote several
programs for Arm32 Linux, where I don't know whether uint32_t is an alias
of 'unsigned int' or 'unsigned long', a few programs for AMD64 Linux,
where I know that uint32_t is an alias of 'unsigned int' and maybe one
program for ARM64 Linux that is the same as AMD64 Linux.
But all those outliers together constitute a tiny fraction of the code
that I wrote recently.
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
Tim Rentsch <[email protected]> writes:
[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
I prefer
printf("u is %lu\n", (unsigned)long_u);
I find it clearer.
On 03/02/2026 22:43, Keith Thompson wrote:
Tim Rentsch <[email protected]> writes:
[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
I prefer
printf("u is %lu\n", (unsigned)long_u);
I find it clearer.
Is there a typo in there, or is the variable actually called 'long_u'?
Then the message doesn't match.
David Brown <[email protected]> writes:
[...]
C23 includes length specifiers with explicit bit counts, so "%w32u" is
for an unsigned integer argument of 32 bits:
"""
wN Specifies that a following b, B, d, i, o, u, x, or X conversion
specifier applies to an integer argument with a specific width
where N is a positive decimal integer with no leading zeros
(the argument will have been promoted according to the integer
promotions, but its value shall be converted to the unpromoted
type); or that a following n conversion specifier applies to a
pointer to an integer type argument with a width of N bits. All
minimum-width integer types (7.22.1.2) and exact-width integer
types (7.22.1.1) defined in the header <stdint.h> shall be
supported. Other supported values of N are implementation-defined.
"""
That looks to me that it would be a correct specifier for uint32_t,
Yes, so for example this:
uint32_t n = 42;
printf("n = %w32u\n", n);
is correct, if I'm reading it correctly. It's also correct for uint_least32_t, which is expected to be the same type as uint32_t
if the latter exists. There's also support for the [u]int_fastN_t
types, using for example "%wf32u" in place of "%w32u".
and should also be fully defined behaviour for unsigned int and
unsigned long if these are 32 bits wide.
No, I don't think C23 says that.
If int and long happen to be the same
width, they are still incompatible, and there is no printf format
specifier that has defined behavior for both.
That first sentence is a bit ambiguous
wN Specifies that a following b, B, d, i, o, u, x, or X conversion
specifier applies to an integer argument with a specific width ...
but I don't think it means that it must accept *any* integer type
of the specified width.
Keith Thompson <[email protected]> writes:
David Brown <[email protected]> writes:
[...]
C23 includes length specifiers with explicit bit counts, so "%w32u" is
for an unsigned integer argument of 32 bits:
"""
wN Specifies that a following b, B, d, i, o, u, x, or X conversion
specifier applies to an integer argument with a specific width
where N is a positive decimal integer with no leading zeros
(the argument will have been promoted according to the integer
promotions, but its value shall be converted to the unpromoted
type); or that a following n conversion specifier applies to a
pointer to an integer type argument with a width of N bits. All
minimum-width integer types (7.22.1.2) and exact-width integer
types (7.22.1.1) defined in the header <stdint.h> shall be
supported. Other supported values of N are implementation-defined.
"""
That looks to me that it would be a correct specifier for uint32_t,
Yes, so for example this:
uint32_t n = 42;
printf("n = %w32u\n", n);
is correct, if I'm reading it correctly. It's also correct for
uint_least32_t, which is expected to be the same type as uint32_t
if the latter exists. There's also support for the [u]int_fastN_t
types, using for example "%wf32u" in place of "%w32u".
and should also be fully defined behaviour for unsigned int and
unsigned long if these are 32 bits wide.
No, I don't think C23 says that.
Right, it doesn't.
If int and long happen to be the same
width, they are still incompatible, and there is no printf format
specifier that has defined behavior for both.
That first sentence is a bit ambiguous
wN Specifies that a following b, B, d, i, o, u, x, or X conversion
specifier applies to an integer argument with a specific width ...
but I don't think it means that it must accept *any* integer type
of the specified width.
As I read the standard there is no ambiguity. The first sentence
says what the length modifier means. The second sentence says
which types (if any) correspond to the description in the first
sentence.
Michael S <[email protected]> writes:
On Tue, 06 Jan 2026 16:29:04 -0800
Keith Thompson <[email protected]> wrote:
Michael S <[email protected]> writes:
On Tue, 6 Jan 2026 10:31:41 -0500
James Kuyper <[email protected]> wrote:
On 2026-01-06 04:29, Michael S wrote:
On Tue, 6 Jan 2026 00:27:04 -0000 (UTC)
Lawrence D'Oliveiro <[email protected]d> wrote:
...
Section 7.8 of the C spec defines macros you can use so you
don't have to hard-code assumptions about the lengths of
integers in printf-format strings.
Did you ever try to use them? They look ugly.
Which is more important, correctness or beauty?
It depends.
When I know for sure that incorrectness has no consequences, like
in case of using %u to print 'unsigned long' on target with 32-bit
longs, or like using %llu to print 'unsigned long' on target with
64-bit longs, then beauty wins. Easily.
Seriously?
An example:
unsigned long n = 42;
printf("%u\n", n); // incorrect
printf("%lu\n", n); // correct
Are you really saying that the second version is so much uglier
than the first that you'd rather write incorrect code?
No, I don't think that it is much uglier. At worst, I think that
correct version is tiny bit uglier. Not enough for beauty to win
over "correctness", even when correctness is non-consequential.
I hoped that you followed the sub-thread from the beginning and
did not lose the context yet.
The context to which I replied was you favoring beauty over
correctness and using "%u" to print an unsigned long value as
an example.
I find it difficult to express how strongly I disagree.
On 03/02/2026 15:47, Tim Rentsch wrote:
[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
What about a compound expression of several variables of mixed
integer types, possibly even mixed with floats, some of whose types
might either be conditional (depending on some macro), or opaque?
Bart <[email protected]> writes:
On 03/02/2026 15:47, Tim Rentsch wrote:
[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
What about a compound expression of several variables of mixed
integer types, possibly even mixed with floats, some of whose types
might either be conditional (depending on some macro), or opaque?
What is an example of a conditional/macro-dependent type?
Also what sort of opaque types do you have in mind?
What is the problem you want to solve here?
On 04/02/2026 14:22, Tim Rentsch wrote:
Bart <[email protected]> writes:
On 03/02/2026 15:47, Tim Rentsch wrote:
[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
What about a compound expression of several variables of mixed
integer types, possibly even mixed with floats, some of whose types
might either be conditional (depending on some macro), or opaque?
What is an example of a conditional/macro-dependent type?
Example from SDL2:
#if defined(_MSC_VER) && (_MSC_VER < 1600)
...
#ifndef _UINTPTR_T_DEFINED
#ifdef _WIN64
typedef unsigned __int64 uintptr_t;
#else
typedef unsigned int uintptr_t;
...
Example from SQLITE3:
#ifdef SQLITE_OMIT_FLOATING_POINT
# define double sqlite3_int64
#endif
Also what sort of opaque types do you have in mind?
Things like time_t and clock_t, or the equivalent from libraries.
Yes you could hunt down the exact underlying type (for clock_t in one
case, it was under 6 layers of typedefs and macros), but that would be
for a specific set of headers.
For system headers, somebody could be using a header with different definitions. For user-libraries, it might be a slightly different version.
What is the problem you want to solve here?
The problem is that C expects an exact format-code when trying to use *printf functions, and for that you need to know the exact types of the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
On 04/02/2026 17:44, Bart wrote:
On 04/02/2026 14:22, Tim Rentsch wrote:
Bart <[email protected]> writes:
On 03/02/2026 15:47, Tim Rentsch wrote:
[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
What about a compound expression of several variables of mixed
integer types, possibly even mixed with floats, some of whose types
might either be conditional (depending on some macro), or opaque?
What is an example of a conditional/macro-dependent type?
Example from SDL2:
#if defined(_MSC_VER) && (_MSC_VER < 1600)
...
#ifndef _UINTPTR_T_DEFINED
#ifdef _WIN64
typedef unsigned __int64 uintptr_t;
#else
typedef unsigned int uintptr_t;
...
Example from SQLITE3:
#ifdef SQLITE_OMIT_FLOATING_POINT
# define double sqlite3_int64
#endif
Also what sort of opaque types do you have in mind?
Things like time_t and clock_t, or the equivalent from libraries.
Yes you could hunt down the exact underlying type (for clock_t in one
case, it was under 6 layers of typedefs and macros), but that would be
for a specific set of headers.
For system headers, somebody could be using a header with different
definitions. For user-libraries, it might be a slightly different
version.
What is the problem you want to solve here?
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
So are you asking because you don't know what Tim's construction does
with these types, or because you want to know if there is a portable and
safe way to print out any arithmetic type, or because you are perfectly
aware that C's printf has limitations and you want to post about how
terrible C is and how great your own language is?
The point of both Tim's and Keith's solutions is that you do /not/ need
to know the exact type of the expression you are printing - C's
conversion rules let them work with a range of different original types.
They were both picked specifically for the case of "uint32_t", but can
handle more types. Tim's can be used for any integer type of rank up to
"unsigned long int" (and thus not "long long" types), while Keith's will
be fine with any integer type and any floating point type as long as the
value of the integer part of the floating point value is within the
range of "unsigned long int".
On 04/02/2026 17:12, David Brown wrote:
On 04/02/2026 17:44, Bart wrote:
On 04/02/2026 14:22, Tim Rentsch wrote:
Bart <[email protected]> writes:
On 03/02/2026 15:47, Tim Rentsch wrote:[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
What about a compound expression of several variables of mixed
integer types, possibly even mixed with floats, some of whose types
might either be conditional (depending on some macro), or opaque?
What is an example of a conditional/macro-dependent type?
Example from SDL2:
#if defined(_MSC_VER) && (_MSC_VER < 1600)
...
#ifndef _UINTPTR_T_DEFINED
#ifdef _WIN64
typedef unsigned __int64 uintptr_t;
#else
typedef unsigned int uintptr_t;
...
Example from SQLITE3:
#ifdef SQLITE_OMIT_FLOATING_POINT
# define double sqlite3_int64
#endif
Also what sort of opaque types do you have in mind?
Things like time_t and clock_t, or the equivalent from libraries.
Yes you could hunt down the exact underlying type (for clock_t in one
case, it was under 6 layers of typedefs and macros), but that would
be for a specific set of headers.
For system headers, somebody could be using a header with different
definitions. For user-libraries, it might be a slightly different
version.
What is the problem you want to solve here?
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
So are you asking because you don't know what Tim's construction does
with these types, or because you want to know if there is a portable
and safe way to print out any arithmetic type, or because you are
perfectly aware that C's printf has limitations and you want to post
about how terrible C is and how great your own language is?
I was responding to the example of a single variable with ONE type,
that happens to be uint32_t, apparently a standard C type.
Yes maybe that particular strategy might work (you know it is an integer
and that it is unsigned).
But it doesn't solve the general problem: even if there is a single
type involved, it might be conditional or opaque (or its type is
changed, requiring all format codes to be revised).
Or there is an expression of mixed types.
and you want to post about how
terrible C is and how great your own language is?
I think pretty much every language except C seems to have solved this.
The point of both Tim's and Keith's solutions is that you do /not/
need to know the exact type of the expression you are printing - C's
conversion rules let them work with a range of different original types.
OK.
They were both picked specifically for the case of "uint32_t", but
can handle more types. Tim's can be used for any integer type of rank
up to "unsigned long int" (and thus not "long long" types),
So not such a great range.
while Keith's will
be fine with any integer type and any floating point type as long as
the value of the integer part of the floating point value is within
the range of "unsigned long int".
Better, *if* you know the expression has an unsigned integer type.
So as far as I'm concerned, the general problem remains. There are only workarounds and special cases that every user has to work out for themselves.
Meanwhile C11 (_Generic) and C23 ("%w" formats) don't appear to have
made much impact. It's not fixing it at the right level. But at least
you can now have a 999-bit type that you probably can't print even if
you wrote "%w999d"; or can you?
On 04/02/2026 19:11, Bart wrote:
On 04/02/2026 17:12, David Brown wrote:
So are you asking because you don't know what Tim's construction does
with these types, or because you want to know if there is a portable
and safe way to print out any arithmetic type, or because you are
perfectly aware that C's printf has limitations and you want to post
about how terrible C is and how great your own language is?
I was responding to the example of a single variable with ONE type,
that happens to be uint32_t, apparently a standard C type.
You know perfectly well that "uint32_t" is not a standard type - it is a typedef for a standard or extended integer type.
And you know perfectly well that the constructions here from Tim and
Keith demonstrate safe ways to print values of type "uint32_t",
regardless of whether it is a typedef for "unsigned int", "unsigned
long int", or an extended integer type. That was the point of their posts.
Yes maybe that particular strategy might work (you know it is an
integer and that it is unsigned).
What did you think an "uint32_t" was, if not a type of unsigned integer?
And there is no "maybe" about it - the strategies work.
If you have other arithmetic types, then you need to adapt the strategy
to fit - you need something that covers the ranges of the data you are dealing with.
But it doesn't solve the general problem: even if there is a single
type involved, it might be conditional or opaque (or its type is
changed, requiring all format codes to be revised).
Or there is an expression of mixed types.
There is no such thing as an "expression of mixed types". There are expressions formed with operators applied to subexpressions of different types - the rules of C state very clearly how those subexpressions are converted. (For most binary operators, these are the "usual arithmetic conversions".) You know this too.
and you want to post about how
terrible C is and how great your own language is?
I think pretty much every language except C seems to have solved this.
No, not all - but certainly many languages have more convenient handling
of printing expressions. C's method works - it has done its job for
half a century - but no one will argue that it isn't a bit clumsy. And if
you are rough or lazy about it, it can be unsafe. If it bothers you too much, you can make a reasonable enough type-safe print facility with _Generic and variadic macros.
Changing it to "%g", "double" and "0.0"
covers all integer types and floats and doubles. (Supporting long
doubles is left as an exercise for the reader.)
On 04/02/2026 17:12, David Brown wrote:
and you want to post about how
terrible C is and how great your own language is?
I think pretty much every language except C seems to have solved this.
On Wed, 4 Feb 2026 18:11:34 +0000
Bart <[email protected]> wrote:
On 04/02/2026 17:12, David Brown wrote:
> and you want to post about how
> terrible C is and how great your own language is?
I think pretty much every language except C seems to have solved this.
How do you do it in Fortran?
Also, there are many languages that "solved" it only at the very high
cost of making their formatting features primitive. E.g. Pascal.
I don't remember where Ada stands in this picture. In the case of Ada95
or newer, it is more "don't know" than "don't remember".
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
Bart <[email protected]> writes:
[...]
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
Since you asked...
'?' is 'f' (or 'g' or 'e', or 'a', or any of those in upper case).
`x * y` is of type double.
Meanwhile C11 (_Generic) and C23 ("%w" formats) don't appear to have
made much impact. It's not fixing it at the right level. But at least
you can now have a 999-bit type that you probably can't print even if
you wrote "%w999d"; or can you?
On 04/02/2026 20:09, David Brown wrote:
On 04/02/2026 19:11, Bart wrote:
On 04/02/2026 17:12, David Brown wrote:
So are you asking because you don't know what Tim's construction
does with these types, or because you want to know if there is a
portable and safe way to print out any arithmetic type, or because
you are perfectly aware that C's printf has limitations and you want
to post about how terrible C is and how great your own language is?
I was responding to the example of a single variable with ONE type,
that happens to be uint32_t, apparently a standard C type.
You know perfectly well that "uint32_t" is not a standard type - it is
a typedef for a standard or extended integer type.
And you know perfectly well that the constructions here from Tim and
Keith demonstrate safe ways to print values of type "uint32_t",
regardless of whether it is a typedef for "unsigned int", "unsigned
long int", or an extended integer type. That was the point of their posts.
And one of mine is that you might not know the type is 'uint32_t'.
Even if you were 100% sure, an update might change it, and the format
might no longer be appropriate. (Eg. it might become signed, but gcc
will not report that, at least not with Wall + Wextra + Wpedantic.)
Yes maybe that particular strategy might work (you know it is an
integer and that it is unsigned).
What did you think an "uint32_t" was, if not a type of unsigned
integer? And there is no "maybe" about it - the strategies work.
See above.
If you have other arithmetic types, then you need to adapt the
strategy to fit - you need something that covers the ranges of the
data you are dealing with.
But it doesn't solve the general problem: even if there is a single
type involved, it might be conditional or opaque (or its type is
changed, requiring all format codes to be revised).
Or there is an expression of mixed types.
There is no such thing as an "expression of mixed types". There are
expressions formed with operators applied to subexpressions of
different types - the rules of C state very clearly how those
subexpressions are converted. (For most binary operators, these are
the "usual arithmetic conversions".) You know this too.
An expression of mixed types means one that involves a number of
different types amongst its types.
Sure, the rules will tell you what the result will be, but you have to
work it out, and to do that, you have to know what each of those types
are (again, see above).
Try this one for example; T, U and V are three numeric types exported by version 2.1 of some library:
T x;
U y;
V z;
printf("%?", x + y * z);
You can spend some time hunting down those types and figuring out the
result type (either one of T U V or maybe W). But how confident will you
be that it will still work on 2.2?
The change may be subtle enough that no warning is given, but enough to
give a wrong result.
and you want to post about how
terrible C is and how great your own language is?
I think pretty much every language except C seems to have solved this.
No, not all - but certainly many languages have more convenient
handling of printing expressions. C's method works - it has done its
job for half a century - but no one will argue that it isn't a bit
clumsy. And if you are rough or lazy about it, it can be unsafe. If
it bothers you too much, you can make a reasonable enough type-safe
print facility with _Generic and variadic macros.
So, a workaround that every user has to bother with. That's a bad sign.
On 04/02/2026 23:39, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
Since you asked...
'?' is 'f' (or 'g' or 'e', or 'a', or any of those in upper case).
`x * y` is of type double.
The 'from above examples' applies to both x and y. That means that
'double' /may/ have been redefined like this (from my post):
#ifdef SQLITE_OMIT_FLOATING_POINT
# define double sqlite3_int64
#endif
I don't know what 'sqlite3_int64' is, but it sounds like a signed integer.
I was asked to give examples of conditional types, and thought it best
to do so from actual programs.
On 04/02/2026 21:42, Bart wrote:
Usually when you have a type T in your code, you know some things about
it - you typically know if it is arithmetic, integer, floating point,
you know something about its range.
How much you know will vary, but a
type you know absolutely nothing about is unlikely to be of any use in
code.
Yes maybe that particular strategy might work (you know it is an
integer and that it is unsigned).
What did you think an "uint32_t" was, if not a type of unsigned
integer? And there is no "maybe" about it - the strategies work.
See above.
If you have other arithmetic types, then you need to adapt the
strategy to fit - you need something that covers the ranges of the
data you are dealing with.
But it doesn't solve the general problem: even if there is a single
type involved, it might be conditional or opaque (or its type is
changed, requiring all format codes to be revised).
Or there is an expression of mixed types.
There is no such thing as an "expression of mixed types". There are
expressions formed with operators applied to subexpressions of
different types - the rules of C state very clearly how those
subexpressions are converted. (For most binary operators, these are
the "usual arithmetic conversions".) You know this too.
An expression of mixed types means one that involves a number of
different types amongst its types.
Okay, that's what /you/ mean by that phrase. It is not an accurate description - in any statically typed language, an expression will have
a single type. Subexpressions can be different types. But while I do
not approve of your terms here, I do understand what you are talking about.
Sure, the rules will tell you what the result will be, but you have to
work it out, and to do that, you have to know what each of those types
are (again, see above).
No, the /compiler/ has to work it out. Whether /you/ need to work it
out or not, depends on what you are doing with the result.
If you have "T x;", and you write "(unsigned long) x" (as Keith
suggested), then you know the type of that expression - without knowing
the type of T. You need to know that "T" is a type that can be
converted to "unsigned long" (any arithmetic or pointer type will do),
and you need to know that the value of "x" is suitable for the
conversion to be defined (so if "x" is floating point, it needs to be in range). If you don't know at least that much about "x", you probably should not be writing code with it.
On 05/02/2026 11:41, David Brown wrote:
On 04/02/2026 21:42, Bart wrote:
Usually when you have a type T in your code, you know some things about
it - you typically know if it is arithmetic, integer, floating point,
you know something about its range.
How about time_t, clock_t, off_t?
How many people do you know who have actually written and use a
C11 print system using _Generic and variadic macros? I don't know
any. (I've written simple examples as proofs of concept, posted
in this group, but not for real use.) It turns out that people
/don't/ have to have workarounds. "printf" has its limitations -
there's no doubt there. But it is good enough for most people
and most uses.
On 05/02/2026 00:52, Bart wrote:
On 04/02/2026 23:39, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
Since you asked...
'?' is 'f' (or 'g' or 'e', or 'a', or any of those in upper case).
`x * y` is of type double.
The 'from above examples' applies to both x and y. That means that
'double' /may/ have been redefined like this (from my post):
#ifdef SQLITE_OMIT_FLOATING_POINT
# define double sqlite3_int64
#endif
I don't know what 'sqlite3_int64' is, but it sounds like a signed
integer. I was asked to give examples of conditional types, and
thought it best to do so from actual programs.
What you have found is an idiocy in SQLITE, not a problem in the C
language or printf. If the macro "SQLITE_OMIT_FLOATING_POINT" is
defined, then the type named "sqlite3_int64" is not an integer type,
nor can it hold arbitrary 64-bit integers (as Michael S pointed out,
and I assume accurately, it can hold 53 bit integers). I do not know
what this type is used for in the code, but something like
"sqlite3_bignum" would be a far better choice of name. And if it is
intended that people print out these values directly, defining "PRsqlite3_bignum" to "%g" or "%llu" as appropriate would be helpful.
(Yes, the resulting printf statements would be ugly - better ugly and
correct than wrong).
On 05/02/2026 11:41, David Brown wrote:
No, the /compiler/ has to work it out. Whether /you/ need to work it
out or not, depends on what you are doing with the result.
The compiler will not tell you the format codes to use!
On 2026-02-05 18:42, Bart wrote:
On 05/02/2026 11:41, David Brown wrote:
No, the /compiler/ has to work it out. Whether /you/ need to work it
out or not, depends on what you are doing with the result.
The compiler will not tell you the format codes to use!
Well, it seems the compiler I have here does it quite verbosely...
$ cc -o prtfmt prtfmt.c
prtfmt.c: In function ‘main’:
prtfmt.c:8:19: warning: format ‘%d’ expects argument of type ‘int’, but
argument 2 has type ‘double’ [-Wformat=]
8 | printf ("%d\n", f);
| ~^ ~
| | |
| int double
| %f
prtfmt.c:9:19: warning: format ‘%f’ expects argument of type ‘double’,
but argument 2 has type ‘int’ [-Wformat=]
9 | printf ("%f\n", i);
| ~^ ~
| | |
| | int
| double
| %d
...giving information of every kind - here for two basic types, but
it has also the same verbose diagnostics with the '_t' types I tried
(e.g. suggesting '%ld' for a 'time_t' argument).
Note: I still acknowledge the unfortunate type/formatter coupling, the above notwithstanding.
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your program,
using a slightly different set of headers where certain types are
defined, and then it might either give compiler messages that they
have to fix, or it shows wrong results.
If I compile this code with 'gcc -Wall -Wextra -Wpedantic':
#include <stdio.h>
int main() {
int a = -1;
printf("%u", a);
}
it says nothing. The program displays 4294967295 instead of -1.
If I compile this version (using %v) using a special extension:
#include <stdio.h>
int main() {
int a = -1;
printf("%v", a);
}
it shows -1. Which is better?
Bart <[email protected]> writes:
If I compile this code with 'gcc -Wall -Wextra -Wpedantic':
#include <stdio.h>
int main() {
int a = -1;
printf("%u", a);
}
it says nothing. The program displays 4294967295 instead of -1.
The behavior is unsurprising. The lack of a warning is very mildly inconvenient.
On 04/02/2026 14:22, Tim Rentsch wrote:
Bart <[email protected]> writes:
On 03/02/2026 15:47, Tim Rentsch wrote:
[...]
If variable 'u' is declared as uint32_t, a way to print it that is
easy and also type-safe is
printf( " u is %lu\n", u+0LU );
What about a compound expression of several variables of mixed
integer types, possibly even mixed with floats, some of whose types
might either be conditional (depending on some macro), or opaque?
What is an example of a conditional/macro-dependent type?
Example from SDL2:
#if defined(_MSC_VER) && (_MSC_VER < 1600)
...
#ifndef _UINTPTR_T_DEFINED
#ifdef _WIN64
typedef unsigned __int64 uintptr_t;
#else
typedef unsigned int uintptr_t;
...
Example from SQLITE3:
#ifdef SQLITE_OMIT_FLOATING_POINT
# define double sqlite3_int64
#endif
Also what sort of opaque types do you have in mind?
Things like time_t and clock_t, or the equivalent from libraries.
Yes you could hunt down the exact underlying type (for clock_t in one
case, it was under 6 layers of typedefs and macros), but that would be
for a specific set of headers.
For system headers, somebody could be using a header with different definitions. For user-libraries, it might be a slightly different
version.
What is the problem you want to solve here?
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
David Brown <[email protected]> writes:
[...]
How many people do you know who have actually written and use a
C11 print system using _Generic and variadic macros? I don't know
any. (I've written simple examples as proofs of concept, posted
in this group, but not for real use.) It turns out that people
/don't/ have to have workarounds. "printf" has its limitations -
there's no doubt there. But it is good enough for most people
and most uses.
I recently played around with an attempted framework using _Generic.
The goal was to be able to write something like
print(s(x), s(y), s(z));
where x, y, and z can be of more or less arbitrary types (integer,
floating-point, char*). The problem I ran into was that only one of
the generic associations is evaluated (which one is determined at
compile time), but *all* of them have to be valid code.
On 05/02/2026 11:41, David Brown wrote:
On 04/02/2026 21:42, Bart wrote:
Usually when you have a type T in your code, you know some things
about it - you typically know if it is arithmetic, integer, floating
point, you know something about its range.
How about time_t, clock_t, off_t?
How much you know will vary, but a type you know absolutely nothing
about is unlikely to be of any use in code.
The problem is that that format code is tied to the type of the
expression. That means that as your program evolves and the types
change, or the expression changes (so another term's type becomes
dominant), then you have to check all such format codes.
Yes maybe that particular strategy might work (you know it is an
integer and that it is unsigned).
What did you think an "uint32_t" was, if not a type of unsigned
integer? And there is no "maybe" about it - the strategies work.
See above.
If you have other arithmetic types, then you need to adapt the
strategy to fit - you need something that covers the ranges of the
data you are dealing with.
But it doesn't solve the general problem: even if there is a single
type involved, it might be conditional or opaque (or its type is
changed, requiring all format codes to be revised).
Or there is an expression of mixed types.
There is no such thing as an "expression of mixed types". There are
expressions formed with operators applied to subexpressions of
different types - the rules of C state very clearly how those
subexpressions are converted. (For most binary operators, these are
the "usual arithmetic conversions".) You know this too.
An expression of mixed types means one that involves a number of
different types amongst its types.
(Here I meant 'amongst its terms'.)
Okay, that's what /you/ mean by that phrase. It is not an accurate
description - in any statically typed language, an expression will
have a single type. Subexpressions can be different types. But while
I do not approve of your terms here, I do understand what you are
talking about.
Sure, the rules will tell you what the result will be, but you have
to work it out, and to do that, you have to know what each of those
types are (again, see above).
No, the /compiler/ has to work it out. Whether /you/ need to work it
out or not, depends on what you are doing with the result.
The compiler will not tell you the format codes to use!
If you have "T x;", and you write "(unsigned long) x" (as Keith
suggested), then you know the type of that expression - without
knowing the type of T. You need to know that "T" is a type that can
be converted to "unsigned long" (any arithmetic or pointer type will
do), and you need to know that the value of "x" is suitable for the
conversion to be defined (so if "x" is floating point, it needs to be
in range). If you don't know at least that much about "x", you
probably should not be writing code with it.
I tried this program:
#include <stdio.h>
#include "t.h" // defines T
T F();
int main() {
T x;
x=F();
printf("%lu\n", (unsigned long)x);
}
T happens to be 'int', and F() returns -1. This program however prints 4294967295.
If I change it so that T is 'long long int' and F returns 5000000000,
then it shows 705032704. Not really ideal.
Here a better bet for an unknown type is %f, which gives the right
values, but they appear as -1.00000 etc.
Better still is to use exactly the right format, but that has the issues
I mentioned.
David Brown <[email protected]> writes:
[...]
How many people do you know who have actually written and use a
C11 print system using _Generic and variadic macros? I don't know
any. (I've written simple examples as proofs of concept, posted
in this group, but not for real use.) It turns out that people
/don't/ have to have workarounds. "printf" has its limitations -
there's no doubt there. But it is good enough for most people
and most uses.
I recently played around with an attempted framework using _Generic.
The goal was to be able to write something like
print(s(x), s(y), s(z));
where x, y, and z can be of more or less arbitrary types (integer,
floating-point, char*). The problem I ran into was that only one of
the generic associations is evaluated (which one is determined at
compile time), but *all* of them have to be valid code. There's a
proposal to change this for C 202y.
I didn't spend a lot of time on it.
David Brown <[email protected]> writes:
On 05/02/2026 00:52, Bart wrote:
On 04/02/2026 23:39, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
The problem is that C expects an exact format-code when trying to use
*printf functions, and for that you need to know the exact types of
the expressions being passed. For example:
uintptr_t x; // from above examples
double y; //
printf("x * y is %?", x * y); // What's '?'
Since you asked...
'?' is 'f' (or 'g' or 'e', or 'a', or any of those in upper case).
`x * y` is of type double.
The 'from above examples' applies to both x and y. That means that
'double' /may/ have been redefined like this (from my post):
#ifdef SQLITE_OMIT_FLOATING_POINT
# define double sqlite3_int64
#endif
I don't know what 'sqlite3_int64' is, but it sounds like a signed
integer. I was asked to give examples of conditional types, and
thought it best to do so from actual programs.
What you have found is an idiocy in SQLITE, not a problem in the C
language or printf. If the macro "SQLITE_OMIT_FLOATING_POINT" is
defined, then the type named "sqlite3_int64" is not an integer type,
nor can it hold arbitrary 64-bit integers (as Michael S pointed out,
and I assume accurately, it can hold 53 bit integers). I do not know
what this type is used for in the code, but something like
"sqlite3_bignum" would be a far better choice of name. And if it is
intended that people print out these values directly, defining
"PRsqlite3_bignum" to "%g" or "%llu" as appropriate would be helpful.
(Yes, the resulting printf statements would be ugly - better ugly and
correct than wrong).
The macro doesn't define "sqlite3_int64", which as far as I can tell is always an integer type. It redefines "double".
That macro in isolation does seem deeply silly, but I haven't worked on sqlite3's source code. Apparently the authors found it convenient. Presumably anyone working on the source code has to keep in mind that
the word "double" doesn't necessarily mean what it normally means. It's
not the way I would have written it. I probably would have defined a
type name that can be either "double" or "sqlite3_int64", depending on
the setting of SQLITE_OMIT_FLOATING_POINT. But I don't know enough
about the sqlite3 source code to be able to meaningfully criticize it.
In almost all contexts, it's perfectly reasonable to assume that the
word "double" in C code refers to the predefined floating-point type.
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your program,
using a slightly different set of headers where certain types are
defined, and then it might either give compiler messages that they
have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers do
it.
I know the rules well enough that I can usually write a correct format
string in the first place. If I make a mistake, gcc's warnings are a
nice check.
On 2026-02-06 06:10, Keith Thompson wrote:
Bart <[email protected]> writes:
If I compile this code with 'gcc -Wall -Wextra -Wpedantic':
#include <stdio.h>
int main() {
int a = -1;
printf("%u", a);
}
it says nothing. The program displays 4294967295 instead of -1.
Yes. You instruct 'printf' with '%u' to interpret and display it
(the variable 'a') as unsigned. ('-1' is not an unsigned numeric representation.) - I wonder what you are thinking here.
The behavior is unsurprising. The lack of a warning is very mildly
inconvenient.
Indeed unsurprising. And I even don't see any inconvenience given
that even an initialized declaration of 'unsigned a = -1;' is not
considered a problem in "C". I rather learned that to be a useful
code pattern when programming in "C".
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler messages
that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers
do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Keith Thompson <[email protected]> writes:[...]
I recently played around with an attempted framework using _Generic.
The goal was to be able to write something like
print(s(x), s(y), s(z));
where x, y, and z can be of more or less arbitrary types (integer,
floating-point, char*). The problem I ran into was that only one of
the generic associations is evaluated (which one is determined at
compile time), but *all* of them have to be valid code.
That is annoying but it shouldn't be too hard to work around
it. To verify that hypothesis I wrote this test case:
#include <stdio.h>[30 lines deleted]
#include <time.h>
#include <stdint.h>
#include "h/show.h"
int
main(){
show([23 lines deleted]
uc,sc,us,ss,ui,si,ul,sl,ull,sll,
c,f,d,ld,yes,no,u16,s16,uge32,sge32,
runtime,now,offset,uf32,sf32,
c * now / 1e8 * ld,
foo, bas
);
printf( "\n" );
return 0;
}
which compiles under C11 and (along with the show.h include file)
produces output:
uc = 255
sc = -1
us = 65535
foo = "foo"
bas = (const char *) "bas"
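For readers wondering how the "all associations must be valid" problem can be sidestepped: the following is only a minimal sketch, not the show.h from the post above (which was not shown). The names FMT and FORMAT are made up for illustration. The idea is to let _Generic choose nothing but the format string, and to pass the argument to snprintf outside the generic selection, so the unselected associations never have to type-check against the argument.

```c
#include <stdio.h>

/* Hypothetical helper (not the show.h from the post): map an
   expression's type to a printf conversion specification. */
#define FMT(x) _Generic((x),                         \
    char: "%c",                                      \
    signed char: "%hhd", unsigned char: "%hhu",      \
    short: "%hd",        unsigned short: "%hu",      \
    int: "%d",           unsigned int: "%u",         \
    long: "%ld",         unsigned long: "%lu",       \
    long long: "%lld",   unsigned long long: "%llu", \
    float: "%f", double: "%f", long double: "%Lf",   \
    char *: "%s", const char *: "%s")

/* The argument is passed outside the generic selection, so no
   unselected association ever has to type-check against it. */
#define FORMAT(buf, size, x) snprintf((buf), (size), FMT(x), (x))
```

A call such as FORMAT(buf, sizeof buf, uc) with an unsigned char uc holding 255 selects "%hhu" at compile time. A type that is not listed makes the generic selection fail to compile, which is arguably the behavior one wants here.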
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler messages
that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers
do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
Tim Rentsch <[email protected]> writes:
Keith Thompson <[email protected]> writes:[...]
[30 lines deleted]
I recently played around with an attempted framework using _Generic.
The goal was to be able to write something like
print(s(x), s(y), s(z));
where x, y, and z can be of more or less arbitrary types (integer,
floating-point, char*). The problem I ran into was that only one of
the generic associations is evaluated (which one is determined at
compile time), but *all* of them have to be valid code.
That is annoying but it shouldn't be too hard to work around
it. To verify that hypothesis I wrote this test case:
#include <stdio.h>
#include <time.h>
#include <stdint.h>
#include "h/show.h"
int
main(){
show(
[23 lines deleted]
uc,sc,us,ss,ui,si,ul,sl,ull,sll,
c,f,d,ld,yes,no,u16,s16,uge32,sge32,
runtime,now,offset,uf32,sf32,
c * now / 1e8 * ld,
foo, bas
);
printf( "\n" );
return 0;
}
which compiles under C11 and (along with the show.h include file)
produces output:
uc = 255
sc = -1
us = 65535
foo = "foo"
bas = (const char *) "bas"
Were you planning to show us what show.h looks like?
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler messages
that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers
do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
Bart <[email protected]> writes:
[...]
I guess you've never used printf-family functions via the FFI of
another language!
As it happens, I haven't.
I presume if there were a point, you would have made it by now.
On 06/02/2026 13:08, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
I guess you've never used printf-family functions via the FFI of
another language!
As it happens, I haven't.
I presume if there were a point, you would have made it by now.
I thought you might have inferred it.
All shortcomings of C can apparently be fixed by employing a selection
of gcc's 200+ options.
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your program,
using a slightly different set of headers where certain types are
defined, and then it might either give compiler messages that they
have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers do
it.
I know the rules well enough that I can usually write a correct format
string in the first place. If I make a mistake, gcc's warnings are a
nice check.
If I compile this code with 'gcc -Wall -Wextra -Wpedantic':
#include <stdio.h>
int main() {
int a = -1;
printf("%u", a);
}
it says nothing. The program displays 4294967295 instead of -1.
The behavior is unsurprising. The lack of a warning is very mildly
inconvenient.
On 06/02/2026 13:08, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
I guess you've never used printf-family functions via the FFI of
another language!
As it happens, I haven't.
I presume if there were a point, you would have made it by now.
I thought you might have inferred it.
All shortcomings of C can apparently be fixed by employing a selection
of gcc's 200+ options. So no point in making the language better
instead.
But that doesn't work when some bits of raw C need to be used from
another language. For example, library headers containing a C API,
which some languages may use as the basis for their own bindings.
Bart <[email protected]> writes:
On 06/02/2026 13:08, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
I guess you've never used printf-family functions via the FFI of
another language!
As it happens, I haven't.
I presume if there were a point, you would have made it by now.
I thought you might have inferred it.
All shortcomings of C can apparently be fixed by employing a selection
of gcc's 200+ options. So no point in making the language better
instead.
Nobody said any of that.
But that doesn't work when some bits of raw C need to be used from
another language. For example, library headers containing a C API,
which some languages may use as the basis for their own bindings.
Sure, FFIs can be tricky.
You randomly introduced FFIs into a discussion of printf formats. What irrelevant argument are you going to make next?
https://en.wikipedia.org/wiki/Gish_gallop
On 06/02/2026 19:21, Keith Thompson wrote:
Bart <[email protected]> writes:
On 06/02/2026 13:08, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
I guess you've never used printf-family functions via the FFI of
another language!
As it happens, I haven't.
I presume if there were a point, you would have made it by now.
I thought you might have inferred it.
All shortcomings of C can apparently be fixed by employing a selection
of gcc's 200+ options. So no point in making the language better
instead.
Nobody said any of that.
But that doesn't work when some bits of raw C need to be used from
another language. For example, library headers containing a C API,
which some languages may use as the basis for their own bindings.
Sure, FFIs can be tricky.
You randomly introduced FFIs into a discussion of printf formats. What
irrelevant argument are you going to make next?
https://en.wikipedia.org/wiki/Gish_gallop
You seem to have introduced some nonsense of your own.
I'm simply saying that people discussing C here are often blind to its
problems because they employ an advanced C compiler or other
analytical tools to mitigate them.
You can't do that if working with raw C like I do. I've been using C
libraries via FFIs and C header files for about 30 years.
And so, if you want to use *printf functions like that, then the fact
that gcc can sometimes report on incorrect format codes is no help at
all; I'm not using gcc, /or/ writing C!
It doesn't help when I'm writing C either, as I either use lesser
compilers, or use gcc without any fancy options.
I believe a language should stand by itself and not rely on complex
tooling to make it practical to use. Not even syntax highlighting
should be necessary.
On 05/02/2026 22:55, Janis Papanagnou wrote:
On 2026-02-05 18:42, Bart wrote:
On 05/02/2026 11:41, David Brown wrote:
No, the /compiler/ has to work it out. Whether /you/ need to work it
out or not, depends on what you are doing with the result.
The compiler will not tell you the format codes to use!
Well, it seems the compiler I have here does it quite verbosely...
$ cc -o prtfmt prtfmt.c
prtfmt.c: In function ‘main’:
prtfmt.c:8:19: warning: format ‘%d’ expects argument of type ‘int’, but
argument 2 has type ‘double’ [-Wformat=]
8 | printf ("%d\n", f);
| ~^ ~
| | |
| int double
| %f
prtfmt.c:9:19: warning: format ‘%f’ expects argument of type ‘double’,
but argument 2 has type ‘int’ [-Wformat=]
9 | printf ("%f\n", i);
| ~^ ~
| | |
| | int
| double
| %d
...giving information of every kind - here for two basic types, but
it has also the same verbose diagnostics with the '_t' types I tried
(e.g. suggesting '%ld' for a 'time_t' argument).
Note: I'm still acknowledging the unfortunate type/formatter-coupling
notwithstanding.
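As an aside on the '_t' types: a suggested '%ld' for time_t is only right where time_t happens to be long. A minimal sketch of the usual portable workarounds follows. The function names are made up for illustration, and the time_t version assumes time_t is an integer type, which the C standard does not guarantee.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Assumes time_t is an integer type, which C does not guarantee.
   Converting to intmax_t makes the argument match %jd everywhere. */
int format_time(char *buf, size_t size, time_t t)
{
    return snprintf(buf, size, "%jd", (intmax_t)t);
}

/* PRIu32 expands to the right conversion for whatever type
   uint32_t is an alias for on this implementation. */
int format_u32(char *buf, size_t size, uint32_t v)
{
    return snprintf(buf, size, "%" PRIu32, v);
}
```

Either way the format string no longer depends on which underlying type a particular set of headers picked.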
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
Eventually, it will compile. Until someone else builds your program,
using a slightly different set of headers where certain types are
defined, and then it might either give compiler messages that they have
to fix, or it shows wrong results.
If I compile this code with 'gcc -Wall -Wextra -Wpedantic':
#include <stdio.h>
int main() {
int a = -1;
printf("%u", a);
}
it says nothing. The program displays 4294967295 instead of -1.
If I compile this version (using %v) using a special extension:
#include <stdio.h>
int main() {
int a = -1;
printf("%v", a);
}
it shows -1. Which is better?
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler messages
that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers
do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
On 2026-02-06, Michael S <[email protected]> wrote:
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler messages
that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers
do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
I support it in TXR Lisp. However, it's limited in that the FFI
definition is nailed to a particular choice of arguments.
For instance we could make a function foo which takes two arguments:
a str and an int, and calls the variadic printf.
Then we can call (foo "%d" 42). It will call printf("%d", 42).
We cannot pass fewer or more than two arguments to foo, and they have to
be compatible with str and int.
Demo:
$ txr
This is the TXR Lisp interactive listener of TXR 302.
Quit with :quit or Ctrl-D on an empty line. Ctrl-X ? for cheatsheet.
(with-dyn-lib nil (deffi printf-int "printf" int (str : int)))
printf-int
(printf-int "%d\n" 42)
42
3
42 is output; 3 is the result value (3 characters output).
The : syntax in the deffi macro call indicates the variadic list.
It's not the case that we can make a variadic Lisp function pass its arguments
as an arbitrarily long variadic list with arbitrary types to the wrapped FFI
function. Fixed parameters must be declared after the colon.
A dynamic treatment could be arranged via a heavyweight wrapper mechanism which
dynamically analyzes the actual arguments, builds a libffi function descriptor
on the fly, then uses it to make the call; it could be worthwhile for someone,
but I didn't implement such a thing. Metaprogramming tricks revolving around
dynamically evaluating deffi are also possible.
On 2026-02-05, Bart <[email protected]> wrote:
On 05/02/2026 22:55, Janis Papanagnou wrote:
On 2026-02-05 18:42, Bart wrote:
On 05/02/2026 11:41, David Brown wrote:
No, the /compiler/ has to work it out. Whether /you/ need to work it
out or not, depends on what you are doing with the result.
The compiler will not tell you the format codes to use!
Well, it seems the compiler I have here does it quite verbosely...
$ cc -o prtfmt prtfmt.c
prtfmt.c: In function ‘main’:
prtfmt.c:8:19: warning: format ‘%d’ expects argument of type ‘int’, but
argument 2 has type ‘double’ [-Wformat=]
8 | printf ("%d\n", f);
| ~^ ~
| | |
| int double
| %f
prtfmt.c:9:19: warning: format ‘%f’ expects argument of type ‘double’,
but argument 2 has type ‘int’ [-Wformat=]
9 | printf ("%f\n", i);
| ~^ ~
| | |
| | int
| double
| %d
...giving information of every kind - here for two basic types, but
it has also the same verbose diagnostics with the '_t' types I tried
(e.g. suggesting '%ld' for a 'time_t' argument).
Note: I'm still acknowledging the unfortunate type/formatter-coupling
notwithstanding.
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
That's an excellent reason to keep the bulk of your code portable, and
offer it to multiple compilers.
I think the only way you are going to run into a crappy compiler in a
real job situation in 2026 is if you're an embedded developer working
with some very proprietary processor for which the only compiler comes
from its vendor. Even so the bits of your code not specific to that
chip can be compiled with something else. Which you want to do not just
for diagnostics but to be able to run unit tests on that code on a
regular developer machine.
Eventually, it will compile. Until someone else builds your program,
using a slightly different set of headers where certain types are
defined, and then it might either give compiler messages that they have
to fix, or it shows wrong results.
If I compile this code with 'gcc -Wall -Wextra -Wpedantic':
#include <stdio.h>
int main() {
int a = -1;
printf("%u", a);
}
it says nothing. The program displays 4294967295 instead of -1.
For that you need this:
$ gcc -Wall -pedantic -W -Wformat -Wformat-signedness printf.c
There is probably a good reason for that; passing a signed argument
to an unsigned conversion specifier de facto works fine, and
some code relies on it; i.e. the 4294967295 is what the programmer
wanted.
You often see that with %x, which also takes unsigned int;
the programmer wants -16 to come out as "FFFFFFF0", and not -10.
Someone with code like that might want to catch other problems with
printf calls, and not be bothered with those.
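For code that wants the reinterpreted value and also wants to stay clean under -Wformat-signedness, an explicit conversion is enough. A minimal sketch, with a made-up function name; the "FFFFFFF0" result assumes a 32-bit unsigned int:

```c
#include <stdio.h>

/* Converts explicitly, so the argument type matches what %X expects.
   The int-to-unsigned conversion is well defined (modulo UINT_MAX + 1). */
int format_hex(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%X", (unsigned int)value);
}
```

With a 32-bit unsigned int, format_hex(buf, sizeof buf, -16) stores "FFFFFFF0", and the call has defined behavior because the argument really is an unsigned int.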
If I compile this version (using %v) using a special extension:
#include <stdio.h>
int main() {
int a = -1;
printf("%v", a);
}
it shows -1. Which is better?
Both are undefined behavior. The latter is a documented extension
that works where it works, which is good.
Using %u with int /de facto/ works (and could also be a documented extension).
/de facto/ is weaker than documented. But on the other hand, /de facto/
works in more places than %v.
If you hit a library that doesn't have %v, it doesn't work at all.
I've never seen int passed to %u or %x not work in the manner you
would expect if int and unsigned int arguments were passed in
exactly the same way and subject to a reinterpretation of the bits.
On 07/02/2026 18:07, Kaz Kylheku wrote:
On 2026-02-06, Michael S <[email protected]> wrote:
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you when
you've got it wrong.
But you first have to make an educated guess, or put in some dummy
format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler messages
that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most programmers
do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
I support it in TXR Lisp. However, it's limited in that the FFI
definition is nailed to a particular choice of arguments.
For instance we could make a function foo which takes two arguments:
a str and an int, and calls the variadic printf.
Then we can call (foo "%d" 42). It will call printf("%d", 42).
We cannot pass fewer or more than two arguments to foo, and they have to
be compatible with str and int.
Demo:
$ txr
This is the TXR Lisp interactive listener of TXR 302.
Quit with :quit or Ctrl-D on an empty line. Ctrl-X ? for cheatsheet.
(with-dyn-lib nil (deffi printf-int "printf" int (str : int)))
printf-int
(printf-int "%d\n" 42)
42
3
42 is output; 3 is the result value (3 characters output).
The : syntax in the deffi macro call indicates the variadic list.
It's not the case that we can make a variadic Lisp function pass its arguments
as an arbitrarily long variadic list with arbitrary types to the wrapped FFI
function. Fixed parameters must be declared after the colon.
There's another issue with calling variadic functions, unrelated to the
number of args. I can't tell from the above whether it's covered.
Normally an arg that is passed in a register, will be passed in GPR for
integer, or a floating point register if not.
But a variadic float argument has to be passed in both, so for Win64
ABI/x64 it might be in both rcx and xmm1. I think it is similar on SYS V
for both x64 and arm64 (maybe on the latter both are passed in the GPR;
I'd have to go and look it up).
A dynamic treatment could be arranged via a heavyweight wrapper mechanism which
dynamically analyzes the actual arguments, builds a libffi function descriptor
on the fly, then uses it to make the call; it could be worthwhile for someone,
but I didn't implement such a thing. Metaprogramming tricks revolving around
dynamically evaluating deffi are also possible.
My LIBFFI approach just uses assembly; it's the simplest way to do it.
(The LIBFFI 'C' library also uses assembly to do the tricky bits.)
There, for Win64 ABI, I found it easiest to just load all the register
args to both integer and float registers, whether the called function
was variadic or not. That's far more efficient than figuring out the
right register argument by argument.
I haven't implemented that for SYS V; that's more of a nightmare ABI
where up to 6-12 args (8-16 on aarch64) can be passed between int and
float registers depending on the mix of types.
On Win64 ABI, it is 4 args, always.
Bart <[email protected]> wrote:
On 07/02/2026 18:07, Kaz Kylheku wrote:
The : syntax in the deffi macro call indicates the variadic list.
It's not the case that we can make a variadic Lisp function pass its arguments
as an arbitrarily long variadic list with arbitrary types to the wrapped FFI
function. Fixed parameters must be declared after the colon.
There's another issue with calling variadic functions, unrelated to the
number of args. I can't tell from the above whether it's covered.
Normally an arg that is passed in a register, will be passed in GPR for
integer, or a floating point register if not.
But a variadic float argument has to be passed in both, so for Win64
ABI/x64 it might be in both rcx and xmm1. I think it is similar on SYS V
for both x64 and arm64 (maybe on the latter both are passed in the GPR;
I'd have to go and look it up).
In SYS V convention argument is passed in exactly one place. It may
be GPR, may be XMM register, may be on the stack. If you put right
thing in RAX, then your arguments are valid regardless if the function
is a vararg function or not.
A dynamic treatment could be arranged via a heavyweight wrapper mechanism which
dynamically analyzes the actual arguments, builds a libffi function descriptor
on the fly, then uses it to make the call; it could be worthwhile for someone,
but I didn't implement such a thing. Metaprogramming tricks revolving around
dynamically evaluating deffi are also possible.
My LIBFFI approach just uses assembly; it's the simplest way to do it.
(The LIBFFI 'C' library also uses assembly to do the tricky bits.)
There, for Win64 ABI, I found it easiest to just load all the register
args to both integer and float registers, whether the called function
was variadic or not. That's far more efficient than figuring out the
right register argument by argument.
I haven't implemented that for SYS V; that's more of a nightmare ABI
where up to 6-12 args (8-16 on aarch64) can be passed between int and
float registers depending on the mix of types.
On Win64 ABI, it is 4 args, always.
My code works fine for SYS V on amd64 and arm32. I do not think FFI
for aarch64 will be any harder, but ATM I do not have code generator
for aarch64, no need for FFI there.
I did not bother with Windows, since I do not use it; it would be
untested and hence buggy code anyway.
On 06/02/2026 12:47, Michael S wrote:
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you
when you've got it wrong.
But you first have to make an educated guess, or put in some
dummy format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler
messages that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most
programmers do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
It's covered by platform ABIs of both Windows and SYS V.
This is printf called from an interpreted language with dynamic
typing:
a := 12345678987654321
b := pi
c := "A"*10
d := &a
printf("%lld %f %s %p\n", a, b, c, d)
Output is:
12345678987654321 3.141593 AAAAAAAAAA 00000000036A1D48
(Strings are converted to zero-terminated form for the FFI.)
If I try it like this however:
printf("%lld %f %s %p\n", d, c, b, a)
It will go wrong (crashing inside the C library). With the built-in
print feature, that doesn't happen:
println a, b, c, d
println d, c, b, a
On Fri, 6 Feb 2026 13:04:34 +0000
Bart <[email protected]> wrote:
On 06/02/2026 12:47, Michael S wrote:
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
[...]
I asked whether it is a good idea.
Is not it simpler for you and for your potential users to declare that
your language can not call external C functions with variable number of arguments? To me it does not sound like this ability is either necessary
or very valuable.
[...]
C's "varargs" mechanism always appeared kludgy to me. Though I
haven't followed "varargs" in "C" since K&R times, but I've a
faint impression that something has been done and changed since
back then. What was it, or is it still supported [only] in its
original form?
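On what changed since K&R times: pre-standard code typically used <varargs.h> with va_alist/va_dcl, whereas ISO C (since C89) specifies <stdarg.h>, where the function is declared with a prototype ending in "..." and va_start names the last fixed parameter. A minimal sketch of the standard form, with a made-up function name:

```c
#include <stdarg.h>

/* Sums `count` int arguments; the count is how the callee learns
   how many variadic arguments to read. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);          /* ISO C: names the last fixed parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int); /* caller and callee must agree on the type */
    va_end(ap);
    return total;
}
```

For example, sum_ints(3, 1, 2, 3) returns 6. Nothing checks that the caller really passed three ints, which is the same weakness printf has with its format string.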
On Fri, 6 Feb 2026 13:04:34 +0000
Bart <[email protected]> wrote:
On 06/02/2026 12:47, Michael S wrote:
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you
when you've got it wrong.
But you first have to make an educated guess, or put in some
dummy format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler
messages that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most
programmers do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
It's covered by platform ABIs of both Windows and SYS V.
My question was not about technical possibility. I understand that with enough of effort everything is possible.
I asked whether it is a good idea.
Is not it simpler for you and for your potential users to declare that
your language can not call external C functions with variable number of arguments? To me it does not sound like this ability is either necessary
or very valuable.
Above I assume that we are talking about your scripting language.
W.r.t. your other language, I have no strong opinion. But my weak
opinion is that it also does not need it, possibly with exception for
ability to do few (very few, hopefully) historically idiotically
defined Unix system calls that can be handled individually as special
cases.
Michael S <[email protected]> wrote:
On Fri, 6 Feb 2026 13:04:34 +0000
Bart <[email protected]> wrote:
On 06/02/2026 12:47, Michael S wrote:
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you
when you've got it wrong.
But you first have to make an educated guess, or put in some
dummy format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler
messages that they have to fix, or it shows wrong results.
That's not how I do it, and I don't think it's how most
programmers do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
It's covered by platform ABIs of both Windows and SYS V.
My question was not about technical possibility. I understand that with
enough of effort everything is possible.
I asked whether it is a good idea.
Is not it simpler for you and for your potential users to declare that
your language can not call external C functions with variable number of
arguments? To me it does not sound like this ability is either necessary
or very valuable.
Above I assume that we are talking about your scripting language.
W.r.t. your other language, I have no strong opinion. But my weak
opinion is that it also does not need it, possibly with exception for
ability to do few (very few, hopefully) historically idiotically
defined Unix system calls that can be handled individually as special
cases.
Well, some important interfaces depend on varargs functions. And
while you may handle such cases via user-written wrappers, it is
much nicer if FFI machinery handles varargs.
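To make the "user-written wrapper" option concrete, here is a minimal sketch on the C side. Both function names are hypothetical. Each wrapper pins the variadic call to one fixed signature, so an FFI that cannot express "..." can bind it like any ordinary function, much as the deffi example earlier in the thread pins printf to (str : int).

```c
#include <stdio.h>

/* Hypothetical fixed-arity wrapper: an FFI can bind this like any
   ordinary two-argument function. */
int printf_int(const char *fmt, int value)
{
    return printf(fmt, value);
}

/* Same idea with a buffer target, which is easier to check in a test. */
int snprintf_int(char *buf, size_t size, const char *fmt, int value)
{
    return snprintf(buf, size, fmt, value);
}
```

The cost is one wrapper per argument combination, and the format string is still unchecked: passing "%s" to printf_int is as undefined as it would be in a direct call.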
On 07/02/2026 22:48, Waldek Hebisch wrote:
Bart <[email protected]> wrote:
On 07/02/2026 18:07, Kaz Kylheku wrote:
The : syntax in the deffi macro call indicates the variadic list.
It's not the case that we can make a variadic Lisp function pass its arguments
as an arbitrarily long variadic list with arbitrary types to the wrapped FFI
function. Fixed parameters must be declared after the colon.
There's another issue with calling variadic functions, unrelated to the
number of args. I can't tell from the above whether it's covered.
Normally an arg that is passed in a register, will be passed in GPR for
integer, or a floating point register if not.
But a variadic float argument has to be passed in both, so for Win64
ABI/x64 it might be in both rcx and xmm1. I think it is similar on SYS V
for both x64 and arm64 (maybe on the latter both are passed in the GPR;
I'd have to go and look it up).
In SYS V convention argument is passed in exactly one place. It may
be GPR, may be XMM register, may be on the stack. If you put right
thing in RAX, then your arguments are valid regardless if the function
is a vararg function or not.
I had to go and check this, and you're right. SYS V does nothing special when calling variadic functions.
I guess that makes implementing the body of variadic functions harder,
since it doesn't know where to look for the n'th variadic argument
unless it knows the type.
And even then, because the int and non-int args are spilled to separate blocks, it has to keep track of where the next arg is in which block.
I think MS made the better call here; the necessary code is trivial for Win64 ABI.
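Seen from C source, that bookkeeping is hidden behind va_arg, but the requirement that the callee know each argument's type is visible. A minimal sketch, where the function name and the 'i'/'d' descriptor string are made up for illustration:

```c
#include <stdarg.h>

/* `types` is a hypothetical descriptor: 'i' means int, 'd' means double.
   The callee cannot fetch an argument without naming its type. */
double sum_mixed(const char *types, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, types);
    for (const char *p = types; *p != '\0'; p++) {
        if (*p == 'i')
            total += va_arg(ap, int);     /* integer-class argument */
        else if (*p == 'd')
            total += va_arg(ap, double);  /* floating-point argument */
    }
    va_end(ap);
    return total;
}
```

A call like sum_mixed("idi", 1, 2.5, 3) yields 6.5. How va_arg finds each value (one spill area or two) is up to the ABI and the compiler; the source looks the same on both.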
A dynamic treatment could be arranged via a heavyweight wrapper mechanism which
dynamically analyzes the actual arguments, builds a libffi function descriptor
on the fly, then uses it to make the call; it could be worthwhile for someone,
but I didn't implement such a thing. Metaprogramming tricks revolving around
dynamically evaluating deffi are also possible.
My LIBFFI approach just uses assembly; it's the simplest way to do it.
(The LIBFFI 'C' library also uses assembly to do the tricky bits.)
There, for Win64 ABI, I found it easiest to just load all the register
args to both integer and float registers, whether the called function
was variadic or not. That's far more efficient than figuring out the
right register argument by argument.
I haven't implemented that for SYS V; that's more of a nightmare ABI
where up to 6-12 args
(Actually, 6-14 args; 6 max in GPRs and 8 in xmm regs)
(8-16 on aarch64) can be passed between int and
float registers depending on the mix of types.
On Win64 ABI, it is 4 args, always.
My code works fine for SYS V on amd64 and arm32. I do not think FFI
for aarch64 will be any harder, but ATM I do not have code generator
for aarch64, no need for FFI there.
I started generating code for ARM64, but gave up because it was too hard
and not fun (the RISC processor turned out to be a LOT more complex than
the CISC x64!).
The last straw was precisely to do with the SYS V call-conventions, and
I hadn't even gotten to variadic arguments yet, nor to structs passed
by-value, where the rules are labyrinthine.
(I understand that neither LLVM nor Cranelift backends support them
directly; they need to be dealt with earlier on, so the user of those
IRs needs to figure it out.)
I did not bother with Windows, since I do not use it; it would be
untested and hence buggy code anyway.
I think you'd find it much simpler.
On Fri, 6 Feb 2026 13:04:34 +0000
Bart <[email protected]> wrote:
Vararg via FFI? Is it really a good idea?
It's covered by platform ABIs of both Windows and SYS V.
My question was not about technical possibility. I understand that with enough of effort everything is possible.
I asked whether it is a good idea.
Is not it simpler for you and for your potential users to declare that
your language can not call external C functions with variable number of arguments? To me it does not sound like this ability is either necessary
or very valuable.
Above I assume that we are talking about your scripting language.
W.r.t. your other language, I have no strong opinion. But my weak
opinion is that it also does not need it, possibly with exception for
ability to do few (very few, hopefully) historically idiotically
defined Unix system calls that can be handled individually as special
cases.
On 08/02/2026 17:50, Waldek Hebisch wrote:
Michael S <[email protected]> wrote:
On Fri, 6 Feb 2026 13:04:34 +0000
Bart <[email protected]> wrote:
On 06/02/2026 12:47, Michael S wrote:
On Fri, 6 Feb 2026 12:39:55 +0000
Bart <[email protected]> wrote:
On 06/02/2026 05:10, Keith Thompson wrote:
Bart <[email protected]> writes:
[...]
/Some/ compilers with /some/ options will /sometimes/ tell you
when you've got it wrong.
But you first have to make an educated guess, or put in some
dummy format code.
Eventually, it will compile. Until someone else builds your
program, using a slightly different set of headers where certain
types are defined, and then it might either give compiler
messages that they have to fix, or it show wrong results.
That's not how I do it, and I don't think it's how most
programmers do it.
I know the rules well enough that I can usually write a correct
format string in the first place. If I make a mistake, gcc's
warnings are a nice check.
I guess you've never used printf-family functions via the FFI of
another language!
Vararg via FFI? Is it really a good idea?
It's covered by platform ABIs of both Windows and SYS V.
My question was not about technical possibility. I understand that with
enough of effort everything is possible.
I asked whether it is a good idea.
Is not it simpler for you and for your potential users to declare that
your language can not call external C functions with variable number of
arguments? To me it does not sound like this ability is either necessary
or very valuable.
Above I assume that we are talking about your scripting language.
W.r.t. your other language, I have no strong opinion. But my weak
opinion is that it also does not need it, possibly with exception for
ability to do few (very few, hopefully) historically idiotically
defined Unix system calls that can be handled individually as special
cases.
Well, some important interfaces depend on varargs functions. And
while you may handle such cases via user-written wrappers, it is
much nicer if FFI machinery handles varargs.
There are only a few standard C functions that are variadic, but they
are quite important ones (with the *printf family at the top). But
there is little doubt that C's handling of variadic functions is very unsafe, and implementations are often challenging (the SysV ABI for
x86-64 is awkward and complex for variadic functions - the Win64 ABI is easier for variadics at the expense of making many other function calls
less efficient). It is a weak point in the design of C.
And for other languages trying to access C code via a FFI, variadic functions are going to be inefficient, complicated, and unsafe (unless
you are /very/ sure of the calling parameters). If your alternative language is safer, neater, or easier to write than C (and if it isn't,
why bother with it?), you'll want a better way to handle formatted
printing than C's printf.
On 08/02/2026 17:55, David Brown wrote:
There are only a few standard C functions that are variadic, but they
are quite important ones (with the *printf family at the top). But
there is little doubt that C's handling of variadic functions is very
unsafe, and implementations are often challenging (the SysV ABI for
x86-64 is awkward and complex for variadic functions - the Win64 ABI
is easier for variadics at the expense of making many other function
calls less efficient). It is a weak point in the design of C.
And for other languages trying to access C code via a FFI, variadic
functions are going to be inefficient, complicated, and unsafe (unless
you are /very/ sure of the calling parameters). If your alternative
language is safer, neater, or easier to write than C (and if it isn't,
why bother with it?), you'll want a better way to handle formatted
printing than C's printf.
For about the last 30 years I've used the C library to do i/o, as it was
simpler than Win32 calls. (I didn't even know it was supposed to be for
C; it just looked like an easier library with shorter function names,
that also shipped with Windows.)
So while I do do most of my own conversion and formatting, actual i/o,
and a few cases such as floating point which are fiddly, use calls like
this:
sprintf(s, "%f", x)
sscanf(str, "%lf%n", &x, &numlength)
printf("%.*s", length, s)
fprintf(f, "%.*s", length, s)
The last two are generally called when I have a buffer-full of output.
The sscanf call is the only time I ever use *scanf functions.
In the prior 15 years of course, I did have to do everything.
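For readers following along in C, the four calls listed above could be wrapped roughly as below. This is only a sketch: the helper names (`format_double`, `parse_double`, `write_counted`) are made up for illustration, and `snprintf` is swapped in for `sprintf` so the write is bounded.

```c
#include <stdio.h>
#include <stddef.h>

/* Format x with "%f" into buf; returns the length snprintf reports. */
int format_double(char *buf, size_t size, double x)
{
    return snprintf(buf, size, "%f", x);
}

/* Parse a double from the start of str. On success, stores the value in *x
   and the number of characters consumed in *numlength, and returns 1. */
int parse_double(const char *str, double *x, int *numlength)
{
    *numlength = 0;
    /* %n stores the count of characters read so far; it is not counted
       in the value sscanf returns. */
    return sscanf(str, "%lf%n", x, numlength) == 1;
}

/* Write at most length bytes of s to f (stopping early at a NUL).
   With an explicit precision, s does not need to be NUL-terminated.
   The '*' precision consumes an int argument. */
int write_counted(FILE *f, const char *s, int length)
{
    return fprintf(f, "%.*s", length, s);
}
```

The `%n` count is what lets the caller know where the number ended, and `%.*s` is what makes it possible to flush a buffer that carries its own length.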
Bart <[email protected]> wrote:
On 07/02/2026 22:48, Waldek Hebisch wrote:
In SYS V convention argument is passed in exactly one place. It may
be GPR, may be XMM register, may be on the stack. If you put right
thing in RAX, then your arguments are valid regardless if the function
is a vararg function or not.
I had to go and check this, and you're right. SYS V does nothing special
when calling variadic functions.
Well, there is special thing: RAX should contain number of SSE
registers used for passing parameters. You do not need to set
RAX for normal calls (at least on Linux, some other systems
require it for all calls).
I guess that makes implementing the body of variadic functions harder,
since it doesn't know where to look for the n'th variadic argument
unless it knows the type.
Well, if a function wants to do actual computation with an argument
it should better know its type.
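A small illustration of that point, as a sketch only: a variadic callee can fetch each argument only once something tells it the type. The function `sum_tagged` and its tag string ('i' for int, 'd' for double) are invented for this example.

```c
#include <stdarg.h>

/* Sum a mixed list of ints and doubles. The tags string carries one
   character per variadic argument: 'i' for int, 'd' for double. */
double sum_tagged(const char *tags, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, tags);
    for (const char *t = tags; *t != '\0'; t++) {
        if (*t == 'i') {
            total += va_arg(ap, int);      /* type must be named to fetch */
        } else if (*t == 'd') {
            total += va_arg(ap, double);
        } else {
            break;  /* unknown tag: the remaining arguments cannot be read safely */
        }
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_tagged("idi", 1, 2.5, 3)` works only because the tag string matches the actual argument types; a mismatch is undefined behavior, which is the same hazard printf has with its format string.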
I started generating code for ARM64, but gave up because it was too hard
and not fun (the RISC processor turned out to be a LOT more complex than
the CISC x64!).
Well, RISC processor means that compiler has to do work which is
frequently done by hardware on a CISC. Concerning arm32, most
annoying for me was limited range of constants, especially limit
on offsets that can be part of an instruction. With my current
implementation that puts something like 2kB limit on size of local
variables. And my generator mixes instructions and constant data
(otherwise it could not access constant data using limited available
offsets), which works but complicates code generator and probably
gives suboptimal performance.
The last straw was precisely to do with the SYS V call-conventions, and
I hadn't even gotten to variadic arguments yet, nor to structs passed
by-value, where the rules are labyrinthine.
My low-level code only handles scalar arguments. That includes pointer
to structures, but not structures passed by value. Structures passed by value could be handled by higher-level code, but up to now there was
no need to do this.
BTW, my amd64 code is assembler, so off-topic here, but arm32 code
is mostly C. I use two helper structures:
struct registers_buffer {
int i_reg[4];
union {double d; struct {float sl; float sh;} sf2;} f_reg[8];
};
typedef struct registers_buffer reg_buff;
typedef struct arg_state { int ni; int sfi; int dfi; int si;} arg_state;
C code fills 'reg_buff' with values and later low-level assembly
copies values from the buffer to registers. I allocate enough space on
the stack so that C code can write to the stack without risk of
stack overflow.
There are 3 helper routines:
On 08/02/2026 19:21, Waldek Hebisch wrote:
Bart <[email protected]> wrote:
On 07/02/2026 22:48, Waldek Hebisch wrote:
In SYS V convention argument is passed in exactly one place. It may
be GPR, may be XMM register, may be on the stack. If you put right
thing in RAX, then your arguments are valid regardless if the function
is a vararg function or not.
I had to go and check this, and you're right. SYS V does nothing special
when calling variadic functions.
Well, there is special thing: RAX should contain number of SSE
registers used for passing parameters. You do not need to set
RAX for normal calls (at least on Linux, some other systems
require it for all calls).
I looked out for that but don't remember seeing it on godbolt.org, and I
think it was for SYS V.
But I tried it again, and AL is being set to some count, which appears
to be the total number of float arguments (and rereading your comment,
you say the same thing).
I guess that makes implementing the body of variadic functions harder,
since it doesn't know where to look for the n'th variadic argument
unless it knows the type.
Well, if a function wants to do actual computation with an argument
it should better know its type.
On Windows, it will know the location of the next vararg and can access
its value before it knows the type. The user-provided type (eg.
'va_arg(p, int)') can simply do a type-punning cast on the value.
All args: fixed, variadic-reg, variadic-pushed, will also all be in consecutive stack slots, regardless of type (This is the real reason why floats should be loaded to GPRs for variadics: entry code just needs to spill those 4 GPRs, it anyway won't know the mix of types.)
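To make the "location first, type later" idea concrete, here is a portable model of it, not real ABI access. It assumes 32-bit int and 64-bit double (checked with a static assertion), and every name in it (`slot_cursor`, `next_int`, `next_double`, `slot_from_int`, `slot_from_double`) is hypothetical.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(int) == 4 && sizeof(double) == 8,
               "sketch assumes 32-bit int and 64-bit double");

/* Model of the layout described above: every argument, whatever its
   type, occupies one 8-byte slot, and the slots are consecutive. */
typedef struct { const uint64_t *next; } slot_cursor;

/* Build slots the way a caller would fill them. */
uint64_t slot_from_int(int v)       { return (uint64_t)(uint32_t)v; }
uint64_t slot_from_double(double d) { uint64_t r; memcpy(&r, &d, sizeof r); return r; }

/* The raw slot can be fetched before the type is known. */
static uint64_t next_slot(slot_cursor *c) { return *c->next++; }

/* The type is applied afterwards, as a reinterpretation of the bits. */
int next_int(slot_cursor *c)
{
    uint32_t low = (uint32_t)next_slot(c);  /* int lives in the low 32 bits */
    int v;
    memcpy(&v, &low, sizeof v);
    return v;
}

double next_double(slot_cursor *c)
{
    uint64_t raw = next_slot(c);
    double d;
    memcpy(&d, &raw, sizeof d);             /* bit copy, no numeric conversion */
    return d;
}
```

The cursor always advances by one slot regardless of type, which is the property that makes the callee side simple in this model.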
I started generating code for ARM64, but gave up because it was too hard
and not fun (the RISC processor turned out to be a LOT more complex than
the CISC x64!).
Well, RISC processor means that compiler have to do work which is
frequently done by hardware on a CISC. Concerning arm32, most
annoying for me was limited range of constants, especially limit
on offsets that can be part of an instruction. With my current
implementation that puts something like 2kB limit on size of local
variables. And my generator mixes instructions and constant data
(otherwise it could not access constant data using limited available
offsets), which works but complicates code generator and probably
gives suboptimal performance.
There are a dozen annoying things like this on arm64. Even when you give
up and decide to load 64-bit constants from a memory pool, you find you can't even directly access that pool as it has an absolute address. That
can involve first loading the page address (ie. minus lower 12 bits) to
R, then you have to use an address mode involving R and the lower 12
bits as an offset.
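The page-plus-offset split itself is plain bit arithmetic. A minimal sketch, assuming 4 KiB pages; `page_base` and `page_offset` are names made up here, and the instruction pair a code generator would actually emit (typically `adrp` followed by a load or `add` using the low 12 bits) is not shown.

```c
#include <stdint.h>

/* Address with the lower 12 bits cleared: the part loaded into R first. */
uint64_t page_base(uint64_t addr)
{
    return addr & ~UINT64_C(0xFFF);
}

/* Lower 12 bits: the part used as the offset in the address mode. */
uint32_t page_offset(uint64_t addr)
{
    return (uint32_t)(addr & UINT64_C(0xFFF));
}
```

Adding the two parts back together always reproduces the original address, so the split loses nothing; the cost is the extra instruction.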
The last straw was precisely to do with the SYS V call-conventions, and
I hadn't even gotten to variadic arguments yet, nor to structs passed
by-value, where the rules are labyrinthine.
My low-level code only handles scalar arguments. That includes pointer
to structures, but not structures passed by value. Structures passed by
value could be handled by higher-level code, but up to now there was
no need to do this.
BTW, my amd64 code is assembler, so off-topic here, but arm32 code
is mostly C. I use two helper structures:
struct registers_buffer {
int i_reg[4];
union {double d; struct {float sl; float sh;} sf2;} f_reg[8];
};
typedef struct registers_buffer reg_buff;
typedef struct arg_state { int ni; int sfi; int dfi; int si;} arg_state;
C code fills 'reg_buff' with values and later low-level assembly
copies values from the buffer to registers. I allocate enough space on
the stack so that C code can write to the stack without risk of
stack overflow.
There are 3 helper routines:
This looks pretty complicated, but what is it for: is it still to do
with variadic functions, or is it to do with the LIBFFI problem?
Bart <[email protected]> wrote:
On 08/02/2026 19:21, Waldek Hebisch wrote:
Bart <[email protected]> wrote:
On 07/02/2026 22:48, Waldek Hebisch wrote:
In SYS V convention argument is passed in exactly one place. It may
be GPR, may be XMM register, may be on the stack. If you put right
thing in RAX, then your arguments are valid regardless if the function
is a vararg function or not.
I had to go and check this, and you're right. SYS V does nothing special
when calling variadic functions.
Well, there is special thing: RAX should contain number of SSE
registers used for passing parameters. You do not need to set
RAX for normal calls (at least on Linux, some other systems
require it for all calls).
I looked out for that but don't remember seeing it on godbolt.org, and I
think it was for SYS V.
But I tried it again, and AL is being set to some count, which appears
to be the total number of float arguments (and rereading your comment,
you say the same thing).
I guess that makes implementing the body of variadic functions harder,
since it doesn't know where to look for the n'th variadic argument
unless it knows the type.
Well, if a function wants to do actual computation with an argument
it should better know its type.
On Windows, it will know the location of the next vararg and can access
its value before it knows the type. The user-provided type (eg.
'va_arg(p, int)') can simply do a type-punning cast on the value.
All args: fixed, variadic-reg, variadic-pushed, will also all be in
consecutive stack slots, regardless of type (This is the real reason why
floats should be loaded to GPRs for variadics: entry code just needs to
spill those 4 GPRs, it anyway won't know the mix of types.)
I started generating code for ARM64, but gave up because it was too hard
and not fun (the RISC processor turned out to be a LOT more complex than
the CISC x64!).
Well, RISC processor means that compiler have to do work which is
frequently done by hardware on a CISC. Concerning arm32, most
annoying for me was limited range of constants, especially limit
on offsets that can be part of an instruction. With my current
implementation that puts something like 2kB limit on size of local
variables. And my generator mixes instructions and constant data
(otherwise it could not access constant data using limited available
offsets), which works but complicates code generator and probably
gives suboptimal performance.
There are a dozen annoying things like this on arm64. Even when you give
up and decide to load 64-bit constants from a memory pool, you find you
can't even directly access that pool as it has an absolute address. That
can involve first loading the page address (ie. minus lower 12 bits) to
R, then you have to use an address mode involving R and the lower 12
bits as an offset.
As I wrote, I generate constant pool as part of instruction stream.
I use PC-relative addressing so as long as constant is close
enough to instruction using it I can use short offsets.
There is some extra effort, normally I am trying to put constants
after unconditional jump and before next label, but I may need
extra jump to "jump around" constants.
The last straw was precisely to do with the SYS V call-conventions, and
I hadn't even gotten to variadic arguments yet, nor to structs passed
by-value, where the rules are labyrinthine.
My low-level code only handles scalar arguments. That includes pointer
to structures, but not structures passed by value. Structures passed by
value could be handled by higher-level code, but up to now there was
no need to do this.
BTW, my amd64 code is assembler, so off-topic here, but arm32 code
is mostly C. I use two helper structures:
struct registers_buffer {
int i_reg[4];
union {double d; struct {float sl; float sh;} sf2;} f_reg[8];
};
typedef struct registers_buffer reg_buff;
typedef struct arg_state { int ni; int sfi; int dfi; int si;} arg_state;
C code fills 'reg_buff' with values and later low-level assembly
copies values from the buffer to registers. I allocate enough space on
the stack so that C code can write to the stack without risk of
stack overflow.
There are 3 helper routines:
This looks pretty complicated, but what is it for: is it still to do
with variadic functions, or is it to do with the LIBFFI problem?
This is to handle a call, it does not matter variadic or not.
The call is from dynamically typed language and argument types are
known only at runtime (actually, argument types _may_ be statically
known at higher level, but for simplicity "all" (see below) calls go
through a single low-level routine that handles general dynamic case).
Dispatcher routine (which I did not show) loops over arguments,
decodes their types and converts them to C representation. Then
it calls one of the 3 helper routines to place each argument in the buffer
or on the stack.
There is simpler integer only code which is mainly used to perform
low level system calls. This interface does not convert arguments
(the assumption is that caller passes C-compatible representation)
and logic is simpler as it just puts what fits in registers and
the rest on the stack.
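The integer-only placement just described can be sketched in a few lines. This is not the code from the post: `place_int_args`, `NUM_INT_REGS` and the explicit `stack` array are hypothetical stand-ins, and the only ABI fact assumed is that arm32 passes the first four integer words in r0-r3.

```c
#include <stddef.h>

#define NUM_INT_REGS 4   /* r0-r3 carry the first four integer words */

/* Place n integer arguments: the first four go into the register image,
   the remainder into the outgoing stack area, in order.
   Returns the number of stack words used, or -1 if the area is too small. */
int place_int_args(const int *args, size_t n,
                   int i_reg[NUM_INT_REGS],
                   int *stack, size_t stack_cap)
{
    size_t si = 0;   /* next free stack word */

    for (size_t k = 0; k < n; k++) {
        if (k < NUM_INT_REGS) {
            i_reg[k] = args[k];
        } else {
            if (si == stack_cap)
                return -1;
            stack[si++] = args[k];
        }
    }
    return (int)si;
}
```

A later low-level step would copy the register image into the real registers; that part needs assembly and is outside this sketch.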
Both interfaces spill all registers used by calling language to
the stack before actual processing of the call. This is because
C code may perform a callback and callback may trigger garbage
On 09/02/2026 01:27, Waldek Hebisch wrote:
Bart <[email protected]> wrote:
On 08/02/2026 19:21, Waldek Hebisch wrote:
Bart <[email protected]> wrote:
On 07/02/2026 22:48, Waldek Hebisch wrote:
The last straw was precisely to do with the SYS V call-conventions, and
I hadn't even gotten to variadic arguments yet, nor to structs passed
by-value, where the rules are labyrinthine.
My low-level code only handles scalar arguments. That includes pointer
to structures, but not structures passed by value. Structures passed by
value could be handled by higher-level code, but up to now there was
no need to do this.
BTW, my amd64 code is assembler, so off-topic here, but arm32 code
is mostly C. I use two helper structures:
struct registers_buffer {
int i_reg[4];
union {double d; struct {float sl; float sh;} sf2;} f_reg[8];
};
typedef struct registers_buffer reg_buff;
typedef struct arg_state { int ni; int sfi; int dfi; int si;} arg_state;
C code fills 'reg_buff' with values and later low-level assembly
copies values from the buffer to registers. I allocate enough space on
the stack so that C code can write to the stack without risk of
stack overflow.
There are 3 helper routines:
This looks pretty complicated, but what is it for: is it still to do
with variadic functions, or is it to do with the LIBFFI problem?
This is to handle a call, it does not matter variadic or not.
The call is from dynamically typed language and argument types are
known only at runtime
OK, so it is probably performing the LIBFFI thing, or rather what needs
to come before, which is to marshal the arguments into a homogeneous
array of values. The actual LIBFFI task needs some ASM support as you say.
In that case, my own code to do the same (not in C) is probably more
fiddly:
https://github.com/sal55/langs/blob/master/calldll.m
The interpreter calls 'calldll()'. 'vartopacked()' does the translation
from tagged dynamic values to suitable FFI types. Here the args are assembled into an i64 array.
The actual call is done with 'os_calldllfunction' which uses inline assembly; that has been appended.
I also have a version of that in pure HLL code, but it only handles the
most common combinations. (I used that if transpiling to C.)
(actually, argument types _may_ be statically
known at higher level, but for simplicity "all" (see below) calls go
through a single low-level routine that handles general dynamic case).
Dispatcher routine (which I did not show) loops over arguments,
decodes their types and converts them to C representation. Then
it calls one of the 3 helper routines to place each argument in the buffer
or on the stack.
There is simpler integer only code which is mainly used to perform
low level system calls. This interface does not convert arguments
(the assumption is that caller passes C-compatible representation)
and logic is simpler as it just puts what fits in registers and
the rest on the stack.
Both interfaces spill all registers used by calling language to
the stack before actual processing of the call. This is because
C code may perform a callback and callback may trigger garbage
So here the dynamic code is not interpreted?
That's a problem for me: I can't pass the address of a callback
function which only exists as bytecode!
(For the purpose of running WinAPI GUI programs, I set up a special
callback function within the compiler, and that then has to start a new interpreter instance briefly.)
Tim Rentsch <[email protected]> writes:
Keith Thompson <[email protected]> writes:
[...]
I recently played around with an attempted framework using _Generic.
The goal was to be able to write something like
print(s(x), s(y), s(z));
where x, y, and z can be of more or less arbitrary types (integer,
floating-point, char*). The problem I ran into was that only one of
the generic associations is evaluated (which one is determined at
compile time), but *all* of them have to be valid code.
That is annoying but it shouldn't be too hard to work around
it. To verify that hypothesis I wrote this test case:
#include <stdio.h>
#include <time.h>
#include <stdint.h>
#include "h/show.h"
int
main(){
[30 lines deleted]
show(
uc,sc,us,ss,ui,si,ul,sl,ull,sll,
c,f,d,ld,yes,no,u16,s16,uge32,sge32,
runtime,now,offset,uf32,sf32,
c * now / 1e8 * ld,
foo, bas
);
printf( "\n" );
return 0;
}
which compiles under C11 and (along with the show.h include file)
produces output:
uc = 255
sc = -1
us = 65535
[23 lines deleted]
foo = "foo"
bas = (const char *) "bas"
Were you planning to show us what show.h looks like?
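Since show.h is not shown in the thread, the following is not that header. It is only a sketch of one common workaround for the "all associations must be valid" problem: let `_Generic` pick a function, then apply the argument outside the selection, so no branch ever contains an expression that is invalid for the actual type. The names `FORMAT`, `fmt_int`, `fmt_double` and `fmt_str` are invented here.

```c
#include <stdio.h>
#include <stddef.h>

/* One formatter per supported type; each writes into buf like snprintf. */
int fmt_int(char *buf, size_t n, int v)         { return snprintf(buf, n, "%d", v); }
int fmt_double(char *buf, size_t n, double v)   { return snprintf(buf, n, "%g", v); }
int fmt_str(char *buf, size_t n, const char *v) { return snprintf(buf, n, "\"%s\"", v); }

/* _Generic selects a function designator only. The call is made after the
   selection, so (x) is only ever passed to the matching formatter. */
#define FORMAT(buf, n, x)          \
    _Generic((x),                  \
        int:          fmt_int,     \
        double:       fmt_double,  \
        char *:       fmt_str,     \
        const char *: fmt_str      \
    )((buf), (n), (x))
```

With this shape, `FORMAT(buf, sizeof buf, 42)` and `FORMAT(buf, sizeof buf, "bas")` both compile under C11, because the unselected associations are just function names and never see the argument.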
Tim Rentsch <[email protected]> writes:
Keith Thompson <[email protected]> writes:
David Brown <[email protected]> writes:
[...]
C23 includes length specifiers with explicit bit counts, so "%w32u" is
for an unsigned integer argument of 32 bits:
"""
wN Specifies that a following b, B, d, i, o, u, x, or X conversion
specifier applies to an integer argument with a specific width
where N is a positive decimal integer with no leading zeros
(the argument will have been promoted according to the integer
promotions, but its value shall be converted to the unpromoted
type); or that a following n conversion specifier applies to a
pointer to an integer type argument with a width of N bits. All
minimum-width integer types (7.22.1.2) and exact-width integer
types (7.22.1.1) defined in the header <stdint.h> shall be
supported. Other supported values of N are implementation-defined.
"""
That looks to me that it would be a correct specifier for uint32_t,
Yes, so for example this:
uint32_t n = 42;
printf("n = %w32u\n", n);
is correct, if I'm reading it correctly. It's also correct for
uint_least32_t, which is expected to be the same type as uint32_t
if the latter exists. There's also support for the [u]int_fastN_t
types, using for example "%wf32u" in place of "%w32u".
and should also be fully defined behaviour for unsigned int and
unsigned long if these are 32 bits wide.
No, I don't think C23 says that.
Right, it doesn't.
If int and long happen to be the same
width, they are still incompatible, and there is no printf format
specifier that has defined behavior for both.
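For comparison, the pre-C23 ways of staying within defined behavior can be sketched as follows. The function names are made up for the example; `PRIu32` from `<inttypes.h>` is standard since C99, and the second function relies on nothing more than converting the value to the exact type that "%lu" expects.

```c
#include <inttypes.h>
#include <stdio.h>

/* PRIu32 expands to the conversion matching whatever type the
   implementation chose for uint32_t, so this is correct whether that
   type is unsigned int or unsigned long. */
int format_u32(char *buf, size_t size, uint32_t n)
{
    return snprintf(buf, size, "%" PRIu32, n);
}

/* Alternative: convert to a type with a fixed conversion specifier.
   The parameter type performs the conversion at the call site. */
int format_as_ulong(char *buf, size_t size, unsigned long n)
{
    return snprintf(buf, size, "%lu", n);
}
```

Neither approach gives one format string that accepts both int-sized and long-sized arguments directly; each picks the right specifier for one specific type, which is consistent with the point made above.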
That first sentence is a bit ambiguous
wN Specifies that a following b, B, d, i, o, u, x, or X conversion
specifier applies to an integer argument with a specific width ...
but I don't think it means that it must accept *any* integer type
of the specified width.
As I read the standard there is no ambiguity. The first sentence
says what the length modifier means. The second sentence says
which types (if any) correspond to the description in the first
sentence.
The descriptions for all the other length modifiers name the types
to which they apply in the first sentence. "hh" applies to signed
char or unsigned char, "l" applies to long int or unsigned long int,
"z" applies to size_t, and so forth. The first sentence of the
description for "wN" says it "applies to an integer argument with
a specific width".
The intent is that "%w32d" applies to an argument of type
int_least32_t or int32_t (if the latter exists, it must be the same
type as the former).
Suppose, hypothetically, that it had been the intent that "%w32d"
applies to *any* signed integer type with a width of 32 bits (e.g.,
both int and long if both are 32 bits wide). I think that the
current wording could express that intent. The second sentence
could be taken as a clarification rather than a restriction. (An
irrelevant aside: That might actually be a nice feature.)
Assume an implementation with 32-bit int, 32-bit long, and 64-bit
long long, where int32_t and int64_t are int and long long,
respectively (e.g., "gcc -m32" with glibc on 64-bit Ubuntu), so
none of the intN_t types are defined as long. Then this:
printf("%w32d\n", 0L);
has undefined behavior if we assume (as I do) that "%w32d" applies
only to the type defined as int32_t (and int_least32_t). But the
0L argument *is* "an integer argument with a specific width", and
the following sentence "All minimum-width integer types (7.22.1.2)
and exact-width integer types (7.22.1.1) defined in the header
<stdint.h> shall be supported." does not contradict that.
I think the phrase "an integer argument with a specific width" was
an attempt to describe a specific set of types, but it was worded
in a way that applies to a larger set of types. I think the following
sentence is not sufficiently clear in its attempt to restrict the
list of applicable types.
I understand the intent. Adding a format string that can apply
to distinct incompatible types would be a major change that would
surely be discussed in greater detail. But the current wording
does not clearly express that intent, [...]
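One way to stay clear of the question entirely, sketched under the assumption that the implementation defines int32_t (it is optional): match each argument type with the conversion specification documented for that exact type. The helper names format_i32 and format_long are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format an int32_t. PRId32 expands to the
   conversion that matches whichever type int32_t is an alias for. */
static int format_i32(char *buf, size_t size, int32_t value)
{
    int n = snprintf(buf, size, "%" PRId32, value);
    assert(n >= 0 && (size_t)n < size);  /* buffer must hold the result */
    return n;
}

/* Hypothetical helper: format a long. A long gets its own length
   modifier even on implementations where it is exactly 32 bits wide. */
static int format_long(char *buf, size_t size, long value)
{
    int n = snprintf(buf, size, "%ld", value);
    assert(n >= 0 && (size_t)n < size);  /* buffer must hold the result */
    return n;
}
```

With these, a long such as 0L goes through format_long, or is converted to int32_t explicitly before format_i32, rather than being passed to a conversion whose applicability to long is in dispute.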
On 2026-01-07 08:02, Tim Rentsch wrote:
James Kuyper <[email protected]> writes:
On 2026-01-05 03:17, Andrey Tarasevich wrote:
...
You can't. As far as the language is concerned, `time_t` is intended
to be an opaque type. It has to be a real type, ...
In C99, it was only required to be an arithmetic type. I pointed out
that this would permit it to be, for example, double _Imaginary. [...]
It's hard to imagine how time_t being an imaginary type could
provide the semantics described in the C standard for time_t.
You'll need to elaborate on that. time_t is an opaque type which
could, on one implementation, have been long double. Another
implementation could have stored the same value as the imaginary
component of _Imaginary long double, and could work with that value
the same way as the first one. [...]
"James Russell Kuyper Jr." <[email protected]> writes:
On 2026-01-07 08:02, Tim Rentsch wrote:
James Kuyper <[email protected]> writes:
On 2026-01-05 03:17, Andrey Tarasevich wrote:
You can't. As far as the language is concerned, `time_t` is intended
to be an opaque type. It has to be a real type, ...
In C99, it was only required to be an arithmetic type. I pointed out
that this would permit it to be, for example, double _Imaginary. [...]
It's hard to imagine how time_t being an imaginary type could
provide the semantics described in the C standard for time_t.
You'll need to elaborate on that. time_t is an opaque type which
could, on one implementation, have been long double. Another
implementation could have stored the same value as the imaginary
component of _Imaginary long double, and could work with that value
the same way as the first one. [...]
The C standard doesn't say that time_t is an opaque type.
Besides all of that, _Imaginary types don't satisfy the condition
for time_t to be an arithmetic type. Annex G says the imaginary
types are floating types, but in C99 Annex G is informative, not
normative.
Tim Rentsch <[email protected]> writes:
[...]
Besides all of that, _Imaginary types don't satisfy the condition
for time_t to be an arithmetic type. Annex G says the imaginary
types are floating types, but in C99 Annex G is informative, not
normative.
True, but if _Imaginary types are supported by an implementation, Annex
G describes what it is that is being supported. I don't see a problem,
though I'm sure that you do.
I hadn't remembered that the imaginary types were removed from the draft version of C202X. They were still present in n3220.pdf, dated 2024. In
that version, Annex G (which describes imaginary types) is described as normative. 7.3.1p6 indicates that they were optional in that version. However, if they were supported, they did meet the requirements of an arithmetic type.
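Whatever arithmetic type an implementation picks for time_t, code can avoid depending on the choice by going through the <time.h> functions only. A minimal sketch follows; the helper name seconds_between is hypothetical, and the sketch assumes both broken-down times are representable as local time.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: seconds between two broken-down local times,
   computed without assuming how time_t encodes its value. The struct tm
   arguments are passed by value because mktime may normalize them. */
static double seconds_between(struct tm start, struct tm end)
{
    time_t t0 = mktime(&start);
    time_t t1 = mktime(&end);
    /* mktime returns (time_t)-1 if the time cannot be represented. */
    assert(t0 != (time_t)-1 && t1 != (time_t)-1);
    return difftime(t1, t0);  /* never subtract time_t values directly */
}
```

The subtraction is left to difftime, so nothing here relies on time_t being an integer count of seconds.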