• Re: UB or not UB? was: On Undefined Behavior

    From James Russell Kuyper Jr.@[email protected] to comp.lang.c on Tue Jan 13 22:20:03 2026
    From Newsgroup: comp.lang.c

    On 2026-01-12 12:36, Michael S wrote:
    On Mon, 12 Jan 2026 08:03:31 -0800
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 1/12/2026 6:28 AM, Michael S wrote:

    According to C Standard, access to p->table[4] in foo1() is UB.
    ...
    Now the question.
    What The Standard says about foo2() ? Is there UB in foo2() as
    well?
    ...
    gcc code generator does not think so.

    When the behavior is undefined, there's no such thing as incorrect
    generated code. In particular, undefined behavior includes the
    possibility of your code producing precisely the same behavior that you incorrectly thought it was required to have.

    Do you have citation from the Standard?

    table[4] is defined as equivalent to *(table+4), and the relevant
    rule for that expression is "If the addition or subtraction produces
    an overflow, the behavior is undefined." (6.5.7p9)

    ...
    But I was interested in the "opinion" of C Standard rather than of gcc compiler.
    Is it full nasal UB or merely "implementation-defined behavior"?

    UB.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From James Russell Kuyper Jr.@[email protected] to comp.lang.c on Tue Jan 13 22:20:28 2026
    From Newsgroup: comp.lang.c

    On 2026-01-12 15:41, Michael S wrote:
    ...
    May be. But it's not expressed by gcc code generator or by any warnings.
    So, how can we know?

    It's impossible to determine whether the behavior of a piece of code is defined or undefined by examining the output of code generator, because there's nothing that a code generator is required to do when the
    behavior is undefined, that it's not allowed to do when the behavior is defined (and vice-versa). The only way to determine whether the behavior
    is defined is by examining the code and understanding what the relevant clauses of the standard say about its behavior.

    ...
    I am reading N3220 draft https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf
    Here section 6.5.6 has no paragraph 8 :(

    The latest version is n3685, but the behavior is still undefined.

    There is amplification in Annex J.2, roughly three pages
    after the start of J.2. You can search for "an array
    subscript is out of range", where there is a clarifying
    example.

    I see the following text:
    "An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression
    a[1][7] given the declaration int a[4][5]) (6.5.7)."

    That's what you had in mind?

    Yes, a[1][7] is defined by the standard as being equivalent to
    *(*(a+1)+7). The +7 produces the overflow referred to in 6.5.7p9,
    because 7 is greater than the 5 in "int a[4][5]", which makes it clear
    that it's the length of the sub-array that matters, the fact that
    there's another sub-array immediately following it does not render the behavior defined.

  • From James Russell Kuyper Jr.@[email protected] to comp.lang.c on Tue Jan 13 22:21:00 2026
    From Newsgroup: comp.lang.c

    On 2026-01-13 04:31, Michael S wrote:
    On Mon, 12 Jan 2026 21:09:25 -0500
    "James Russell Kuyper Jr." <[email protected]> wrote:

    On 2026-01-12 15:02, Scott Lurndal wrote:
    Michael S <[email protected]> writes:
    On Mon, 12 Jan 2026 15:58:15 +0000
    bart <[email protected]> wrote:
    On 12/01/2026 14:28, Michael S wrote:
    ...
    struct bar1 {
    int table[4];
    int other_table[4];
    };
    ...
    So you want to deliberately read one element past the end because
    you know it will be the first element of other_table?

    Yes. I primarily want it for multi-dimensional arrays.

    So declare it as int table[4][4].


    Note that this suggestion does not make the behavior defined. It is
    undefined behavior to dereference table[0]+4, and it is
    undefined behavior to make any use of table[0]+5.


    Pay attention that Scott didn't suggest that dereferencing table[0][4]
    in his example is defined.
    Not that I understood what he wanted to suggest :(
    That's the difference - I did understand. In your struct, other_table is
    not required to immediately follow table, but in the 2D array, table[1]
    is guaranteed to immediately follow table[0]. That's not sufficient to make
    table[0][5] have defined behavior, but many people are unaware of that,
    or are willing to take the chance.

  • From antispam@[email protected] (Waldek Hebisch) to comp.lang.c on Wed Jan 14 03:57:41 2026
    From Newsgroup: comp.lang.c

    Kaz Kylheku <[email protected]> wrote:
    On 2026-01-10, Michael S <[email protected]> wrote:
    On Fri, 9 Jan 2026 20:14:04 -0000 (UTC)
    Kaz Kylheku <[email protected]> wrote:

    On 2026-01-09, Michael S <[email protected]> wrote:
    On Fri, 09 Jan 2026 01:42:53 -0800
    Tim Rentsch <[email protected]> wrote:


    The important thing to realize is that the fundamental issue here
    is not a technical question but a social question. In effect what
    you are asking is "why doesn't gcc (or clang, or whatever) do what
    I want or expect?". The answer is different people want or expect
    different things. For some people the behavior described is
    egregiously wrong and must be corrected immediately. For other
    people the compiler is acting just as they think it should,
    nothing to see here, just fix the code and move on to the next
    bug. Different people have different priorities.


    I have hard time imagining sort of people that would have
    objections in case compiler generates the same code as today, but
    issues diagnostic.

    If false positives occur for the diagnostic frequently, there
    will be legitimate complaint.

    If there is only a simple switch for it, it will get turned off
    and then it no longer serves its purpose of catching errors.

    There are all kinds of optimizations compilers commonly do that could
    also be erroneous situations. For instance, eliminating dead code.


    <snip>

    I am not talking about some general abstraction, but about specific
    case.
    Your example is irrelevant.
    -Warray-bounds exists for a long time.
    -Warray-bounds=1 is a part of -Wall set.

    In your particular example, it is crystal clear that the "return 0"
    statement is elided away due to being considered unreachable, and the
    only reason for that can be undefined behavior, and the only undefined behavior is accessing the array out of bounds.

    The compiler has decided to use the undefined behavior of the OOB array access as an unreachable() assertion, and at the same time neglected to
    issue the -Warray-bounds diagnostic which is expected to be issued for
    OOB access situations that the compiler can identify.

    No one can claim that the OOB situation in the code has escaped identification, because a code-eliminating optimization was predicated
    on it.

    It looks as if the logic for identifying OOB accesses for diagnosis is
    out of sync with the logic for identifying OOB accesses as assertions of undefined behavior.

    AFAIK gcc warning machinery depends on information gathered during optimization. In this case reasonable guess is that optimizer
    deleted offending access before warning machinery could see it.
    I do not know how hard it is to fix this.
    --
    Waldek Hebisch
  • From Keith Thompson@[email protected] to comp.lang.c on Tue Jan 13 22:02:46 2026
    From Newsgroup: comp.lang.c

    "James Russell Kuyper Jr." <[email protected]> writes:
    On 2026-01-13 16:54, Tristan Wibberley wrote:
    [...]
    IIRC indexing a table follows the rules of pointers and doing so
    outside of a table's bounds is generally U/B except for very peculiar
    specific cases. You can do it in a struct across members /sometimes/
    because a struct is a single object. ...

    No, there is no such exception in the standard. It is still undefined behavior. One of the most annoying ways undefined behavior can
    manifest is that you get exactly the same behavior that you
    incorrectly thought you were guaranteed to get. That's a problem,
    because it can leave you unaware of your error.
    [...]

    Perhaps the exception Tristan was referring to (though it doesn't apply
    to indexing) is this, in N3220 6.5.10p7:

    Two pointers compare equal if and only if both are null pointers,
    both are pointers to the same object (including a pointer to an
    object and a subobject at its beginning) or function, both are
    pointers to one past the last element of the same array object,
    or one is a pointer to one past the end of one array object and
    the other is a pointer to the start of a different array object
    that happens to immediately follow the first array object in
    the address space.

    with a footnote:

    Two objects can be adjacent in memory because they are adjacent
    elements of a larger array or adjacent members of a structure
    with no padding between them, or because the implementation
    chose to place them so, even though they are unrelated. If prior
    invalid pointer operations (such as accesses outside array
    bounds) produced undefined behavior, subsequent comparisons
    also produce undefined behavior.

    The idea, I think, is that without that paragraph, given something like
    this:

    #include <stdio.h>
    int main(void) {
        struct {
            int a[10];
            int b[10];
        } obj;

        printf("obj.a+10 %s obj.b\n",
               obj.a+10 == obj.b ? "==" : "!=");
    }

    the compiler would have to go out of its way to treat obj.a+10 and obj.b
    as unequal. (The output on my system is "obj.a+10 == obj.b", but the
    pointers could be unequal if there's padding between a and b -- which is unlikely.)

    (I reported a relevant bug in gcc, where for objects that happen to be
    adjacent the addresses are reported as unequal with -O1 or higher; the
    gcc maintainers disagreed. <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63611>)
    --
    Keith Thompson (The_Other_Keith) [email protected]
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Tristan Wibberley@[email protected] to comp.lang.c on Wed Jan 14 08:06:50 2026
    From Newsgroup: comp.lang.c

    On 13/01/2026 23:53, Tristan Wibberley wrote:
    Combining these, and padding requirements, you can definedly reach

    I recall padding requirements for the extremes of array object types
    from discussions on usenet years ago, however, perhaps they were for C++ because I can find nonsuch, nonsuch at all, not even the slightest peep,
    in the standard final-drafts. There are several lingering evidences of
    the requirement to have no padding even at the extremes of arrays with
    element size 2 and above but all direct statement of such is missing.
    The lingering evidence leaves a nondeterminism or unspecified nature to
    some matters such as whether sizeof includes any padding at the extremes
    of an array or not, while it is explicit about the matter for structs
    and unions.

    Even the example in the drafts of the use of sizeof array/sizeof
    array[0] to determine the number of objects is excluded for arrays with elements of size 1 due to being unspecified by the standard by the
    decree of the limitation in the section on representation that all representation constraints are found in that section alone and are
    otherwise unspecified.

    No constraints on representation of arrays is provided in any way
    because the contiguousness of elements is mooted outside the
    representation subclause, as is the sizeof trick, and the sizeof trick
    can only work if the array is represented as contiguous representations
    of its elements and is represented with no padding at its extremes,
    neither of which is stipulated in the representation subclause that
    allows stipulations on the matter only within its own bounds.

    Furthermore, the representation stipulation horizon is in the "General" subclause preventing the "integer types" subclause from effecting specifications of representations.

    If the drafts I can see actually reflect the standards as they were
    rather than merely a history of them as the history now is then it was
    always impractical to use ISO C anywhere a system had to be safe to use
    and nearly all advice from outside the standard was unreliable and some
    of the advice within it. An implementers document for C implementations
    that ought be implemented but ought not have any programs to translate.

    I have to recommend avoiding it everywhere that matters.
    --
    Tristan Wibberley

    The message body is Copyright (C) 2026 Tristan Wibberley except
    citations and quotations noted. All Rights Reserved except that you may,
    of course, cite it academically giving credit to me, distribute it
    verbatim as part of a usenet system or its archives, and use it to
    promote my greatness and general superiority without misrepresentation
    of my opinions other than my opinion of my greatness and general
    superiority which you _may_ misrepresent. You definitely MAY NOT train
    any production AI system with it but you may train experimental AI that
    will only be used for evaluation of the AI methods it implements.

  • From David Brown@[email protected] to comp.lang.c on Wed Jan 14 09:35:22 2026
    From Newsgroup: comp.lang.c

    On 14/01/2026 04:19, James Russell Kuyper Jr. wrote:
    On 2026-01-12 13:08, Michael S wrote:
    On Mon, 12 Jan 2026 15:58:15 +0000
    bart <[email protected]> wrote:
    ...
    struct bar1 {
    union {
    struct {
    int table[4];
    int other_table[4];
    };
    int xtable[8];
    };
    };
    ...
    I'm not even sure about there being no padding between .table and
    .other_table.

    Considering that they both 'int' I don't think that it could happen,
    even in standard C.

    "Each non-bit-field member of a structure or union object is aligned in
    an implementation-defined manner appropriate to its type." (6.7.3.2p16)
    "... There can be unnamed padding within a structure object, but not
    at its beginning." (6.7.3.2p17)


    Does this allow different alignment rules for a type when it is
    stand-alone, in an array, or in a struct? I don't think so - I have
    always interpreted this to mean that the alignment is tied to the type,
    not where the type is used.

    Thus if "int" has 4-byte size and 4-byte alignment, and you have :

    struct X {
    char a;
    int b;
    int c;
    int ds[4];
    };

    then there will be 3 bytes of padding between "a" and "b", but cannot be
    any between "b" and "c" or between "c" and "ds".

    Even if you have a weird system that has, say, 3-byte "int" with 4-byte alignment, where you would have a byte of padding between "b" and "c",
    you would have the same padding there as between "ds[0]" and "ds[1]".

    (None of this means you are allowed to access data with "p[i]" or "p +
    i" outside of the range of the object that "p" points to or into.)

    While I can't think of any good reason for an implementation to insert padding between those objects, it would not violate any requirement of
    the standard if one did.


  • From Michael S@[email protected] to comp.lang.c on Wed Jan 14 10:47:21 2026
    From Newsgroup: comp.lang.c

    On Tue, 13 Jan 2026 23:31:26 -0000 (UTC)
    Kaz Kylheku <[email protected]> wrote:

    On 2026-01-10, Michael S <[email protected]> wrote:
    On Fri, 9 Jan 2026 20:14:04 -0000 (UTC)
    Kaz Kylheku <[email protected]> wrote:

    On 2026-01-09, Michael S <[email protected]> wrote:
    On Fri, 09 Jan 2026 01:42:53 -0800
    Tim Rentsch <[email protected]> wrote:


    The important thing to realize is that the fundamental issue
    here is not a technical question but a social question. In
    effect what you are asking is "why doesn't gcc (or clang, or
    whatever) do what I want or expect?". The answer is different
    people want or expect different things. For some people the
    behavior described is egregiously wrong and must be corrected
    immediately. For other people the compiler is acting just as
    they think it should, nothing to see here, just fix the code
    and move on to the next bug. Different people have different
    priorities.

    I have hard time imagining sort of people that would have
    objections in case compiler generates the same code as today, but
    issues diagnostic.

    If false positives occur for the diagnostic frequently, there
    will be legitimate complaint.

    If there is only a simple switch for it, it will get turned off
    and then it no longer serves its purpose of catching errors.

    There are all kinds of optimizations compilers commonly do that
    could also be erroneous situations. For instance, eliminating dead
    code.

    <snip>

    I am not talking about some general abstraction, but about specific
    case.
    You example is irrelevant.
    -Warray-bounds exists for a long time.
    -Warray-bounds=1 is a part of -Wall set.

    In your particular example, it is crystal clear that the "return 0"
    statement is elided away due to being considered unreachable, and the
    only reason for that can be undefined behavior, and the only undefined behavior is accessing the array out of bounds.

    The compiler has decided to use the undefined behavior of the OOB
    array access as an unreachable() assertion, and at the same time
    neglected to issue the -Warray-bounds diagnostic which is expected to
    be issued for OOB access situations that the compiler can identify.

    No one can claim that the OOB situation in the code has escaped identification, because a code-eliminating optimization was predicated
    on it.

    It looks as if the logic for identifying OOB accesses for diagnosis is
    out of sync with the logic for identifying OOB accesses as assertions
    of undefined behavior.

    In some situations, a surprising optimization occurs not because of
    undefined behavior, but because the compiler is assuming well-defined behavior (absence of UB).

    That's not the case here; it is relying on the presence of UB.

    Or rather, it is relying on the absence of UB in an asinine way:
    it is assuming that the program does not reach the out-of-bounds
    access, because the sought-after value is found in the array.

    But that reasoning requires awareness of the existence of the
    out-of-bounds access.

    That's the crux of the issue there.

    There is an unreachable() assertion in modern C. And it works by
    invoking undefined behavior; it means "let's have undefined behavior
    in this spot of the code". And then, since the compiler assumes
    behavior is well-defined, assumes that that statement is not reached,
    nor anything after it, and can eliminate it.

    The problem is that an OOB array access should not be treated
    as the same thing, as if it were unreachable(). Or, rather, no,
    sure it's okay to treat an OOB array access as unreachable() --- IF
    you generate the diagnostic about OOB array access that you
    were asked to generate!!!


    Would you be so kind to submit a bug report to gcc bugzilla?
    In theory, I can do it myself, but I have a tendency to be lazy.








  • From Tristan Wibberley@[email protected] to comp.lang.c on Wed Jan 14 14:24:08 2026
    From Newsgroup: comp.lang.c

    On 14/01/2026 06:02, Keith Thompson wrote:
    Perhaps the exception Tristan was referring to (though it doesn't apply
    to indexing) is this, in N3220 6.5.10p7:

    I think I was referring either to C++ or to an inference somebody else
    had made once upon a time - or else the historical final-drafts
    documents are different now than they were :/ Possibly K&R C specified something more defined than ISO C, I have lent my book out so I can't
    check. It was supposedly updated to ISO C. Possibly it was pre-K&R C or implementation specific that I picked up as a teen and my uni professor cemented it in my mind as correct C when he gave me 100% for an
    assignment in which I used it - and he was a stickler regarding
    undefined behaviour.

    Furthermore, to prevent similar lingering misconceptions indexing is not specified in the standards I'm looking at beyond that it's just pointer arithmetic - that is, a pointer derived from an array-name may not be
    used with pointer arithmetic /alone/ that adjusts the pointer down by
    any nor up by more than the number of elements in the array + 1, and if adjusted up by num_elements it can't be used with *. Note that in some standards' final-drafts I see that intermediate (perhaps, generally, non-lvalue) pointers may be created by pointer arithmetic when they are de-created again - and perhaps the pattern of arithmetic is very limited.

    Some standard version says if you convert it to a large enough integer
    type and back then it is not undefined behaviour but
    "implementation-specific" instead, which is a new term on me ("implementation-specific" is not the same term as "implementation
    specified").


    ... pointer equality snipped ...


    The idea, I think, is that without that paragraph, given something like
    this:

    #include <stdio.h>
    int main(void) {
        struct {
            int a[10];
            int b[10];
        } obj;

        printf("obj.a+10 %s obj.b\n",
               obj.a+10 == obj.b ? "==" : "!=");
    }

    the compiler would have to go out of its way to treat obj.a+10 and obj.b
    as unequal

    No it wouldn't. The standard could have just made the comparison
    undefined behaviour or unspecified, or implementation specified in all
    those cases when dereferencing was undefined or unspecified.
    --
    Tristan Wibberley

  • From Michael S@[email protected] to comp.lang.c on Wed Jan 14 16:48:24 2026
    From Newsgroup: comp.lang.c

    On Wed, 14 Jan 2026 14:24:08 +0000
    Tristan Wibberley <[email protected]>
    wrote:

    On 14/01/2026 06:02, Keith Thompson wrote:


    The idea, I think, is that without that paragraph, given something
    like this:

    #include <stdio.h>
    int main(void) {
        struct {
            int a[10];
            int b[10];
        } obj;

        printf("obj.a+10 %s obj.b\n",
               obj.a+10 == obj.b ? "==" : "!=");
    }

    the compiler would have to go out of its way to treat obj.a+10 and
    obj.b as unequal

    No it wouldn't. The standard could have just made the comparison
    undefined behaviour or unspecified, or implementation specified in all
    those cases when dereferencing was undefined or unspecified.


    In this particular case both pointers are defined and there is no dereferencing.

    Issues like the one above are treated in depth in this paper:
    https://gustedt.wordpress.com/2025/06/30/the-provenance-memory-model-for-c/
    Which I naturally didn't read.

  • From Tristan Wibberley@[email protected] to comp.lang.c on Wed Jan 14 14:49:06 2026
    From Newsgroup: comp.lang.c

    On 13/01/2026 23:31, Kaz Kylheku wrote:
    No one can claim that the OOB situation in the code has escaped identification, because a code-eliminating optimization was predicated
    on it.

    One can. "Identification" means that inference or definition happened of
    a proposition that two are the same. Here, merely behaviour was affected consistent with some of the consequences of identification, which is weaker.
    --
    Tristan Wibberley

  • From antispam@[email protected] (Waldek Hebisch) to comp.lang.c on Wed Jan 14 17:23:25 2026
    From Newsgroup: comp.lang.c

    David Brown <[email protected]> wrote:
    On 14/01/2026 04:19, James Russell Kuyper Jr. wrote:
    On 2026-01-12 13:08, Michael S wrote:
    On Mon, 12 Jan 2026 15:58:15 +0000
    bart <[email protected]> wrote:
    ...
    struct bar1 {
    union {
    struct {
    int table[4];
    int other_table[4];
    };
    int xtable[8];
    };
    };
    ...
    I'm not even sure about there being no padding between .table and
    .other_table.

    Considering that they both 'int' I don't think that it could happen,
    even in standard C.

    "Each non-bit-field member of a structure or union object is aligned in
    an implementation-defined manner appropriate to its type." (6.7.3.2p16)
    "... There can be unnamed padding within a structure object, but not
    at its beginning." (6.7.3.2p17)


    Does this allow different alignment rules for a type when it is
    stand-alone, in an array, or in a struct? I don't think so - I have
    always interpreted this to mean that the alignment is tied to the type,
    not where the type is used.

    Thus if "int" has 4-byte size and 4-byte alignment, and you have :

    struct X {
    char a;
    int b;
    int c;
    int ds[4];
    };

    then there will be 3 bytes of padding between "a" and "b", but cannot be
    any between "b" and "c" or between "c" and "ds".

    Why not? Assuming 4 byte int with 4 byte alignment I see nothing
    wrong with adding 4 byte padding between b and c. More precisely, implementation could say that after first int field in a struct
    there is always 4 byte padding. AFAICS alignment constraints
    and initial segment rule are satisfied, padding is not at start
    of the struct. Are there any other restrictions?

    Even if you have a weird system that has, say, 3-byte "int" with 4-byte alignment, where you would have a byte of padding between "b" and "c",
    you would have the same padding there as between "ds[0]" and "ds[1]".

    (None of this means you are allowed to access data with "p[i]" or "p +
    i" outside of the range of the object that "p" points to or into.)

    While I can't think of any good reason for an implementation to insert
    padding between those objects, it would not violate any requirement of
    the standard if one did.


    --
    Waldek Hebisch
  • From Tim Rentsch@[email protected] to comp.lang.c on Wed Jan 14 12:53:22 2026
    From Newsgroup: comp.lang.c

    [email protected] (Waldek Hebisch) writes:

    David Brown <[email protected]> wrote:

    On 14/01/2026 04:19, James Russell Kuyper Jr. wrote:

    On 2026-01-12 13:08, Michael S wrote:

    On Mon, 12 Jan 2026 15:58:15 +0000
    bart <[email protected]> wrote:

    ...

    struct bar1 {
    union {
    struct {
    int table[4];
    int other_table[4];
    };
    int xtable[8];
    };
    };

    ...

    I'm not even sure about there being no padding between .table and
    .other_table.

    Considering that they both 'int' I don't think that it could happen,
    even in standard C.

    "Each non-bit-field member of a structure or union object is aligned in
    an implementation-defined manner appropriate to its type." (6.7.3.2p16)
    "... There can be unnamed padding within a structure object, but not
    at its beginning." (6.7.3.2p17)

    Does this allow different alignment rules for a type when it is
    stand-alone, in an array, or in a struct? I don't think so - I have
    always interpreted this to mean that the alignment is tied to the type,
    not where the type is used.

    Thus if "int" has 4-byte size and 4-byte alignment, and you have :

    struct X {
    char a;
    int b;
    int c;
    int ds[4];
    };

    then there will be 3 bytes of padding between "a" and "b", but cannot be
    any between "b" and "c" or between "c" and "ds".

    Why not? Assuming 4 byte int with 4 byte alignment I see nothing
    wrong with adding 4 byte padding between b and c.

    Right. As long as alignment requirements are satisfied, an
    implementation is free to put as much padding as it wants
    between struct members.

    More precisely,
    implementation could say that after first int field in a struct
    there is always 4 byte padding. AFAICS alignment constraints
    and initial segment rule are satisfied, padding is not at start
    of the struct. Are there any other restrictions?

    There are some consistency requirements with respect to other
    struct types that have a common starting sequence of members.
    Basically, as long as the rules stay the same from struct to
    struct (and alignment rules are respected), then there can be
    however much padding the implementation chooses to add, at any
    point between struct members (with some additional restrictions
    for bitfields).
  • From Keith Thompson@[email protected] to comp.lang.c on Wed Jan 14 14:43:08 2026
    From Newsgroup: comp.lang.c

    David Brown <[email protected]> writes:
    On 14/01/2026 04:19, James Russell Kuyper Jr. wrote:
    On 2026-01-12 13:08, Michael S wrote:
    On Mon, 12 Jan 2026 15:58:15 +0000
    bart <[email protected]> wrote:
    ...
    struct bar1 {
    union {
    struct {
    int table[4];
    int other_table[4];
    };
    int xtable[8];
    };
    };
    ...
    I'm not even sure about there being no padding between .table and
    .other_table.

    Considering that they both 'int' I don't think that it could happen,
    even in standard C.
    "Each non-bit-field member of a structure or union object is aligned
    in an implementation-defined manner appropriate to its type."
    (6.7.3.2p16)
    "... There can be unnamed padding within a structure object, but not
    at its beginning." (6.7.3.2p17)

    Does this allow different alignment rules for a type when it is
    stand-alone, in an array, or in a struct? I don't think so - I have
    always interpreted this to mean that the alignment is tied to the
    type, not where the type is used.

    Note that the alignof operator applies to a type, not to an expression
    or object.

    Thus if "int" has 4-byte size and 4-byte alignment, and you have :

    struct X {
    char a;
    int b;
    int c;
    int ds[4];
    };

    then there will be 3 bytes of padding between "a" and "b", but cannot
    be any between "b" and "c" or between "c" and "ds".

    There can be arbitrary padding between struct members, or after the
    last member. Almost(?) all implementations add padding only to
    satisfy alignment requirements, but the standard doesn't state any
    restrictions. There can be no padding before the first member, and
    offsets of members must be increasing.

    If alignof (int) is 4, a compiler must place an int object at an address
    that's a multiple of 4. It's free to place it at a multiple of 8, or
    16, if it chooses.

    Even if you have a weird system that has, say, 3-byte "int" with
    4-byte alignment, where you would have a byte of padding between "b"
    and "c", you would have the same padding there as between "ds[0]" and
    "ds[1]".

    sizeof (int) == 3 and alignof (int) == 4 is not possible. Each type's
    size is a multiple of its alignment. There is no padding between array
    elements.
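
    A minimal sketch of the two statements above, assuming a C11 (or
    later) compiler. The helper name int_array_stride is hypothetical
    and only there for illustration. The static_assert checks that the
    size of int is a multiple of its alignment, and the helper measures
    the distance between adjacent array elements:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* A type's size is a multiple of its alignment; otherwise the second
   element of an array could not be correctly aligned. */
static_assert(sizeof(int) % alignof(int) == 0,
              "sizeof(int) must be a multiple of alignof(int)");

/* Hypothetical helper: byte distance between ds[0] and ds[1]. */
static size_t int_array_stride(void)
{
    int ds[4] = {0};
    return (size_t)((char *)&ds[1] - (char *)&ds[0]);
}
```

    Since array elements are contiguous, the measured stride should be
    exactly sizeof(int), whatever that value is on the platform.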

    [...]
    --
    Keith Thompson (The_Other_Keith) [email protected]
    void Void(void) { Void(); } /* The recursive call of the void */
  • From David Brown@[email protected] to comp.lang.c on Thu Jan 15 11:45:00 2026
    From Newsgroup: comp.lang.c

    On 14/01/2026 23:43, Keith Thompson wrote:
    David Brown <[email protected]> writes:
    On 14/01/2026 04:19, James Russell Kuyper Jr. wrote:
    On 2026-01-12 13:08, Michael S wrote:
    On Mon, 12 Jan 2026 15:58:15 +0000
    bart <[email protected]> wrote:
    ...
    struct bar1 {
        union {
            struct {
                int table[4];
                int other_table[4];
            };
            int xtable[8];
        };
    };
    ...
    I'm not even sure about there being no padding between .table and
    .other_table.

    Considering that they are both 'int', I don't think that it could
    happen, even in standard C.
    "Each non-bit-field member of a structure or union object is aligned
    in an implementation-defined manner appropriate to its type."
    (6.7.3.2p16)
    "... There can be unnamed padding within a structure object, but not
    at its beginning." (6.7.3.2p17)

    Does this allow different alignment rules for a type when it is
    stand-alone, in an array, or in a struct? I don't think so - I have
    always interpreted this to mean that the alignment is tied to the
    type, not where the type is used.

    Note that the alignof operator applies to a type, not to an expression
    or object.

    Thus if "int" has 4-byte size and 4-byte alignment, and you have :

    struct X {
        char a;
        int b;
        int c;
        int ds[4];
    };

    then there will be 3 bytes of padding between "a" and "b", but cannot
    be any between "b" and "c" or between "c" and "ds".

    There can be arbitrary padding between struct members, or after the
    last member. Almost(?) all implementations add padding only to
    satisfy alignment requirements, but the standard doesn't state any
    restrictions. There can be no padding before the first member, and
    offsets of members must be increasing.


    On closer reading, I agree with you here. I find it a little surprising
    that this is not implementation-defined. If an implementation can
    arbitrarily add extra padding within a struct, it severely limits the
    use of structs in contexts outside the current translation unit.

    If alignof (int) is 4, a compiler must place an int object at an
    address that's a multiple of 4. It's free to place it at a multiple
    of 8, or 16, if it chooses.

    Even if you have a weird system that has, say, 3-byte "int" with
    4-byte alignment, where you would have a byte of padding between "b"
    and "c", you would have the same padding there as between "ds[0]" and
    "ds[1]".

    sizeof (int) == 3 and alignof (int) == 4 is not possible. Each type's
    size is a multiple of its alignment. There is no padding between array
    elements.


    I have not, as yet, found a justification for those statements in the
    standards. But I'll keep looking!



  • From James Kuyper@[email protected] to comp.lang.c on Thu Jan 15 06:16:35 2026
    From Newsgroup: comp.lang.c

    On 2026-01-15 05:45, David Brown wrote:
    On 14/01/2026 23:43, Keith Thompson wrote:
    ...
    sizeof (int) == 3 and alignof (int) == 4 is not possible. Each type's
    size is a multiple of its alignment. There is no padding between array
    elements.


    I have not, as yet, found a justification for those statements in the
    standards. But I'll keep looking!

    They follow from a couple of facts:
    Each element in an array of type T must be correctly aligned for an
    object of type T.
    No space is allowed between the elements of an array. Note, in
    particular, that this implies that if a type uses only 3 bytes, but has
    an alignment requirement of 2, it must be padded to a length of 4 bytes,
    and sizeof(T) must reflect that size, and not the number of bytes that
    the type actually uses.
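
    A short sketch of that point, assuming a C11 (or later) compiler.
    The type three_used and the helper three_used_stride are
    hypothetical names. On a typical ABI with a 2-byte short the struct
    holds 3 bytes of data and is padded to 4, but the code only asserts
    what should hold everywhere:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical type: typically 3 bytes of data (2-byte short plus
   1-byte char) with an alignment requirement of 2. */
struct three_used {
    short s;
    char  c;
};

/* sizeof includes any tail padding, so it is a multiple of the
   alignment and at least as large as the members it holds. */
static_assert(sizeof(struct three_used) % alignof(struct three_used) == 0,
              "size must be a multiple of alignment");
static_assert(sizeof(struct three_used) >= sizeof(short) + sizeof(char),
              "size must cover all members");

/* Distance between consecutive array elements, in bytes. */
static size_t three_used_stride(void)
{
    struct three_used arr[2] = {{0, 0}, {0, 0}};
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}
```

    The stride equals sizeof(struct three_used), which is why the tail
    padding has to be counted in sizeof rather than inserted between
    array elements.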
  • From Keith Thompson@[email protected] to comp.lang.c on Thu Jan 15 04:04:05 2026
    From Newsgroup: comp.lang.c

    David Brown <[email protected]> writes:
    On 14/01/2026 23:43, Keith Thompson wrote:
    [...]
    There can be arbitrary padding between struct members, or after the
    last member. Almost(?) all implementations add padding only to
    satisfy alignment requirements, but the standard doesn't state any
    restrictions. There can be no padding before the first member, and
    offsets of members must be increasing.

    On closer reading, I agree with you here. I find it a little
    surprising that this is not implementation-defined. If an
    implementation can arbitrarily add extra padding within a struct, it
    severely limits the use of structs in contexts outside the current
    translation unit.

    In practice, struct layouts are (I think) typically specified by
    a system's ABI, and ABIs generally permit/require only whatever
    padding is necessary to meet alignment requirements.

    And I think C has rules about type compatibility that are intended to
    cover the same struct definition being used in different translation
    units within a program, though I'm too lazy to look up the details.

    [...]
    --
    Keith Thompson (The_Other_Keith) [email protected]
    void Void(void) { Void(); } /* The recursive call of the void */
  • From David Brown@[email protected] to comp.lang.c on Thu Jan 15 13:56:09 2026
    From Newsgroup: comp.lang.c

    On 15/01/2026 13:04, Keith Thompson wrote:
    David Brown <[email protected]> writes:
    On 14/01/2026 23:43, Keith Thompson wrote:
    [...]
    There can be arbitrary padding between struct members, or after the
    last member. Almost(?) all implementations add padding only to
    satisfy alignment requirements, but the standard doesn't state any
    restrictions. There can be no padding before the first member, and
    offsets of members must be increasing.

    On closer reading, I agree with you here. I find it a little
    surprising that this is not implementation-defined. If an
    implementation can arbitrarily add extra padding within a struct, it
    severely limits the use of structs in contexts outside the current
    translation unit.

    In practice, struct layouts are (I think) typically specified by
    a system's ABI, and ABIs generally permit/require only whatever
    padding is necessary to meet alignment requirements.

    Sure. I would be very surprised to see a real compiler add extra
    padding in a struct, beyond what was needed for alignment. And real
    compilers usually use well-documented ABIs that go beyond the
    requirements of C's implementation-defined behaviours in their detail.
    It just strikes me as a little odd that the standard says
    implementations must document things like how they split bit-fields
    between addressable units, but makes no requirements at all about how
    much extra padding they can have between struct fields.


    And I think C has rules about type compatibility that are intended to
    cover the same struct definition being used in different translation
    units within a program, though I'm too lazy to look up the details.

    [...]


  • From scott@[email protected] (Scott Lurndal) to comp.lang.c on Thu Jan 15 15:10:34 2026
    From Newsgroup: comp.lang.c

    David Brown <[email protected]> writes:
    On 14/01/2026 23:43, Keith Thompson wrote:
    David Brown <[email protected]> writes:
    On 14/01/2026 04:19, James Russell Kuyper Jr. wrote:

    There can be arbitrary padding between struct members, or after the last
    member. Almost(?) all implementations add padding only to satisfy
    alignment requirements, but the standard doesn't state any restrictions.
    There can be no padding before the first member, and offsets of members
    must be increasing.


    On closer reading, I agree with you here. I find it a little surprising
    that this is not implementation-defined. If an implementation can
    arbitrarily add extra padding within a struct, it severely limits the
    use of structs in contexts outside the current translation unit.

    Including representing typical networking packet headers as structs.

    Fortunately, most C compilers have some form of __attribute__((packed))
    to inform the compiler that the structure layout should not be padded.
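
    As an illustration only, here is a sketch of the GCC/Clang spelling
    of that attribute. The struct demo_header and the PACKED macro are
    hypothetical, and the exact-size check is guarded because other
    compilers use different syntax (or none):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PACKED is a hypothetical convenience macro; it expands to nothing
   on compilers that do not understand the GNU attribute syntax. */
#if defined(__GNUC__) || defined(__clang__)
#define PACKED __attribute__((packed))
#else
#define PACKED
#endif

/* Hypothetical header layout, not taken from any real protocol. */
struct demo_header {
    uint8_t  type;
    uint32_t length;
    uint16_t checksum;
} PACKED;

#if defined(__GNUC__) || defined(__clang__)
/* With the attribute applied, the members are laid out back to back. */
static_assert(sizeof(struct demo_header) == 7,
              "packed header should have no padding");
static_assert(offsetof(struct demo_header, length) == 1,
              "length should directly follow type");
#endif
```

    Note that a packed member such as length may be misaligned, so
    taking its address and dereferencing the pointer elsewhere can be a
    problem on some targets.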
  • From David Brown@[email protected] to comp.lang.c on Thu Jan 15 16:23:43 2026
    From Newsgroup: comp.lang.c

    On 15/01/2026 16:10, Scott Lurndal wrote:
    David Brown <[email protected]> writes:
    On 14/01/2026 23:43, Keith Thompson wrote:
    David Brown <[email protected]> writes:
    On 14/01/2026 04:19, James Russell Kuyper Jr. wrote:

    There can be arbitrary padding between struct members, or after the last
    member. Almost(?) all implementations add padding only to satisfy
    alignment requirements, but the standard doesn't state any restrictions.
    There can be no padding before the first member, and offsets of members
    must be increasing.


    On closer reading, I agree with you here. I find it a little surprising
    that this is not implementation-defined. If an implementation can
    arbitrarily add extra padding within a struct, it severely limits the
    use of structs in contexts outside the current translation unit.

    Including representing typical networking packet headers as structs.

    Fortunately, most C compilers have some form of __attribute__((packed))
    to inform the compiler that the structure layout should not be padded.

    I very rarely find any benefit in using packed structs - typically only
    if the layout was originally designed without thought for alignment, or
    where the maximum considered alignment was smaller than on modern
    systems. My preference is to add padding fields manually, then have a
    static_assert on the size of the struct to check for problems. That
    does not necessarily mean the code is always portable, but at least if
    it is not going to work on a given platform, I get a compile-time error
    rather than a mystical bug!
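
    A minimal sketch of that approach, assuming a C11 (or later)
    compiler and 8-bit bytes. The struct wire_header and its field
    names are hypothetical, chosen only to show the pattern of explicit
    padding fields plus compile-time checks:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with every padding byte written out by hand. */
struct wire_header {
    uint8_t  type;
    uint8_t  pad0[3];   /* explicit padding so length sits at offset 4 */
    uint32_t length;
    uint16_t checksum;
    uint8_t  pad1[2];   /* explicit tail padding up to 12 bytes */
};

/* If the implementation adds padding of its own, compilation fails
   here instead of producing a silently different layout. */
static_assert(sizeof(struct wire_header) == 12,
              "unexpected padding in struct wire_header");
static_assert(offsetof(struct wire_header, length) == 4,
              "length is not at offset 4");
static_assert(offsetof(struct wire_header, checksum) == 8,
              "checksum is not at offset 8");
```

    The checks cover layout only; byte order still has to be handled
    separately if the struct is exchanged between different machines.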

  • From Tim Rentsch@[email protected] to comp.lang.c on Tue Feb 3 05:29:36 2026
    From Newsgroup: comp.lang.c

    Michael S <[email protected]> writes:

    On Sun, 11 Jan 2026 11:48:08 -0800
    Tim Rentsch <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Fri, 09 Jan 2026 01:42:53 -0800
    Tim Rentsch <[email protected]> wrote:

    highcrew <[email protected]> writes:

    Hello,

    While I consider myself reasonably good as a C programmer, I still
    have difficulties in understanding undefined behavior.
    I wonder if anyone in this NG could help me.

    Let's take an example. There's plenty here:
    https://en.cppreference.com/w/c/language/behavior.html
    So let's focus on https://godbolt.org/z/48bn19Tsb

    For the lazy, I report it here:

    int table[4] = {0};
    int exists_in_table(int v)
    {
        // return true in one of the first 4 iterations
        // or UB due to out-of-bounds access
        for (int i = 0; i <= 4; i++) {
            if (table[i] == v) return 1;
        }
        return 0;
    }

    This is compiled (with no warning whatsoever) into:

    exists_in_table:
            mov eax, 1
            ret
    table:
            .zero 16


    Well, this is *obviously* wrong. And sure, so is the original
    code, but I find it hard to think that the compiler isn't able to
    notice it, given that it is even "exploiting" it to produce very
    efficient code.

    I understand the formalism: the resulting assembly is formally
    "correct", in that UB implies that anything can happen.
    Yet I can't think of any situation where the resulting assembly
    could be considered sensible. The compiled function will
    basically return 1 for any input, and the final program will be
    buggy.

    Wouldn't it be more sensible to have a compilation error, or
    at least a warning? The compiler will be happy even with -Wall
    -Wextra -Werror.

    There's plenty of documentation, articles and presentations that
    explain how this can make very efficient code... but nothing
    will answer this question: do I really want to be efficiently
    wrong?

    I mean, yes I would find the problem, thanks to my 100% coverage
    unit testing, but couldn't the compiler give me a hint?

    Could someone drive me into this reasoning? I know there is a lot
    of thinking behind it, yet everything seems to me very incorrect!
    I'm in deep cognitive dissonance here! :) Help!
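
    For comparison, a sketch of the loop with the bound fixed so that no
    out-of-range subscript is ever evaluated. The TABLE_LEN macro is a
    hypothetical addition; everything else follows the example above:

```c
#include <assert.h>
#include <stddef.h>

static int table[4] = {0};

/* Hypothetical macro: element count derived from the array itself,
   so the loop bound cannot drift away from the declaration. */
#define TABLE_LEN (sizeof table / sizeof table[0])

static int exists_in_table(int v)
{
    /* i < TABLE_LEN keeps every access inside table[0..3]. */
    for (size_t i = 0; i < TABLE_LEN; i++) {
        if (table[i] == v) return 1;
    }
    return 0;
}
```

    With the bound corrected, the function has defined behavior for
    every input and the compiler can no longer assume that a match
    always occurs.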

    The important thing to realize is that the fundamental issue here
    is not a technical question but a social question. In effect what
    you are asking is "why doesn't gcc (or clang, or whatever) do what
    I want or expect?". The answer is different people want or expect
    different things. For some people the behavior described is
    egregiously wrong and must be corrected immediately. For other
    people the compiler is acting just as they think it should,
    nothing to see here, just fix the code and move on to the next
    bug. Different people have different priorities.

    I have a hard time imagining the sort of people who would object if
    the compiler generated the same code as today but issued a
    diagnostic.

    It depends on what the tradeoffs are. For example, given a
    choice, I would rather have an option to prevent this particular
    death-by-UB optimization than an option to issue a diagnostic.
    Having both costs more effort than having just one.

    Me too.
    But there are limits to what is considered negotiable by worshippers
    of nasal demons and what is beyond that. A warning is negotiable;
    turning off the transformation is most likely beyond.

    What other people think on that matter doesn't change
    my comment.
  • From Tim Rentsch@[email protected] to comp.lang.c on Tue Feb 3 05:34:09 2026
    From Newsgroup: comp.lang.c

    Keith Thompson <[email protected]> writes:

    David Brown <[email protected]> writes:

    On 14/01/2026 23:43, Keith Thompson wrote:

    [...]

    There can be arbitrary padding between struct members, or after the
    last member. Almost(?) all implementations add padding only to
    satisfy alignment requirements, but the standard doesn't state any
    restrictions. There can be no padding before the first member, and
    offsets of members must be increasing.

    On closer reading, I agree with you here. I find it a little
    surprising that this is not implementation-defined. If an
    implementation can arbitrarily add extra padding within a struct, it
    severely limits the use of structs in contexts outside the current
    translation unit.

    In practice, struct layouts are (I think) typically specified by
    a system's ABI, and ABIs generally permit/require only whatever
    padding is necessary to meet alignment requirements.

    And I think C has rules about type compatibility that are intended to
    cover the same struct definition being used in different translation
    units within a program, though I'm too lazy to look up the details.

    In fact, the rules in the C standard imply that any two struct
    types that have members of the same types and in the same order
    have the same layout (conceivably with different amounts of
    padding at the end), regardless of whether the two struct types
    are compatible or not.
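
    A small sketch of what that claim looks like in code, assuming a
    C11 (or later) compiler. The two struct types and the helper
    same_member_offsets are hypothetical. The types are distinct and
    incompatible, yet their members should sit at the same offsets:

```c
#include <assert.h>
#include <stddef.h>

/* Two distinct (incompatible) struct types with members of the same
   types in the same order. Names are hypothetical. */
struct point_a { char tag; int x; double y; };
struct point_b { char tag; int x; double y; };

/* Returns 1 when every member has the same offset in both types. */
static int same_member_offsets(void)
{
    return offsetof(struct point_a, tag) == offsetof(struct point_b, tag)
        && offsetof(struct point_a, x)   == offsetof(struct point_b, x)
        && offsetof(struct point_a, y)   == offsetof(struct point_b, y);
}
```

    This only observes the layout on the implementation at hand; it
    does not by itself demonstrate what the standard requires.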
  • From Tim Rentsch@[email protected] to comp.lang.c on Tue Feb 3 21:53:49 2026
    From Newsgroup: comp.lang.c

    Michael S <[email protected]> writes:

    On Mon, 12 Jan 2026 12:03:36 -0800
    Tim Rentsch <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Mon, 12 Jan 2026 08:03:31 -0800
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 1/12/2026 6:28 AM, Michael S wrote:

    According to C Standard, access to p->table[4] in foo1() is UB.
    ...
    Now the question.
    What The Standard says about foo2() ? Is there UB in foo2() as
    well?

    Yes, in the same sense as in `foo1`.

    gcc code generator does not think so.

    It definitely does.

    Right.

    Maybe. But it's not expressed by the gcc code generator or by any
    warnings. So, how can we know?

    I know the behavior is undefined by what is said in the C standard.

    For what gcc developers think of the question, for me the totality
    of circumstantial evidence suffices. I have nothing to offer if the
    goal is to convince skeptics.

    Do you have citation from the Standard?

    The short answer is section 6.5.6 paragraph 8.

    I am reading N3220 draft https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf
    Here section 6.5.6 has no paragraph 8 :(

    I hope it isn't too much to expect that if N3220 doesn't have
    what you are looking for then you would try looking in earlier
    versions of the C standard, either C99 (N1256) or C11 (N1570).

    There is amplification in Annex J.2, roughly three pages
    after the start of J.2. You can search for "an array
    subscript is out of range", where there is a clarifying
    example.

    I see the following text:
    "An array subscript is out of range, even if an object is apparently
    accessible with the given subscript (as in the lvalue expression
    a[1][7] given the declaration int a[4][5]) (6.5.7)."

    That's what you had in mind?

    Yes.

    Note that the quoted section number, 6.5.7, is the correct section
    number in N3220 for locating the aforementioned reference.
    I see that in N3220 the relevant paragraph is paragraph 9 rather
    than paragraph 8. I hope that would be evident by looking at the
    contents of section 6.5.7.
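
    To make the Annex J.2 example concrete, here is a sketch assuming
    the declaration int a[4][5] from the quoted text. The helper
    at_flat and the ROWS/COLS constants are hypothetical. Instead of
    writing a[1][7], it converts a flat position into a row and column
    that are each within range:

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 4, COLS = 5 };

/* Hypothetical helper: read the element at flat position k, where
   0 <= k < ROWS * COLS, using only in-range subscripts. */
static int at_flat(int (*a)[COLS], size_t k)
{
    return a[k / COLS][k % COLS];
}
```

    The position that a[1][7] appears to name is 1 * COLS + 7 = 12, so
    at_flat(a, 12) reads a[2][2] with both subscripts in range.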
  • From Tim Rentsch@[email protected] to comp.lang.c on Sun Mar 1 22:53:28 2026
    From Newsgroup: comp.lang.c

    Andrey Tarasevich <[email protected]> writes:

    On Mon 1/12/2026 9:36 AM, Michael S wrote:

    But I was interested in the "opinion" of C Standard rather than of gcc
    compiler.
    Is it full nasal UB or merely "implementation-defined behavior"?

    It is full nasal UB per the standard. And, of course, it is as
    "implementation-defined" as any other UB in the sense that the
    standard permits implementations to _extend_ the language in any way
    they please, as long as they don't forget to issue diagnostics when
    diagnostics are required (by the standard).

    There are two schools of thought on that question. For example, if
    an implementation extends the ISO standard by adding a syntax rule,
    then using a construct following the added rule does not violate the
    syntax and hence no diagnostic is required. Conversely, it would be
    silly for the C standard to say extensions are allowed if what the
    extensions do could be done anyway under the umbrella of undefined
    behavior (after a diagnostic is issued). The point of saying the
    standard allows extensions is so that an implementation may decline
    to issue a diagnostic in certain cases where one would otherwise be
    required.

    I'm not claiming that this view is the only view possible, only that
    it is consistent with what is said in the C standard.