Undefined behavior when accessing inactive union members
Summary
Half.h writes one member of a union and then reads a different member. Reading a union member that was not the most recently written one is undefined behavior in C++.
Environment
- Operating System : archlinux
- Architecture : x64
- Eigen Version : master branch
- Compiler Version : gcc14.2
- Compile Flags : -O3 -march=native
- Vector Extension :
Minimal Example
- https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/arch/Default/Half.h?ref_type=heads#L525
- https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/arch/Default/Half.h?ref_type=heads#L552
- https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/arch/Default/Half.h?ref_type=heads#L553
- https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/arch/Default/Half.h?ref_type=heads#L562
union float32_bits {
unsigned int u;
float f;
};
float32_bits f;
f.f = ff; // f.f is now the active member: its lifetime has begun.
unsigned int sign = f.u & sign_mask; // undefined behavior: the lifetime of f.u has not begun.
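For comparison, here is a minimal sketch of how the same bits could be read without touching an inactive union member, by copying the object representation with std::memcpy. The helper names (float_to_bits, bits_to_float, sign_bit) are hypothetical and not part of Eigen, and the value of sign_mask is an assumption for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not part of Eigen.

// Copy the object representation of a float into a 32-bit unsigned integer.
// No union is involved, so no inactive member is ever read.
inline std::uint32_t float_to_bits(float value) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "float must be 32 bits wide");
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return bits;
}

// Inverse direction: rebuild a float from its 32-bit pattern.
inline float bits_to_float(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}

// Mirrors `f.u & sign_mask` from the snippet above.
inline std::uint32_t sign_bit(float value) {
  const std::uint32_t sign_mask = 0x80000000u;  // assumed value, for illustration
  return float_to_bits(value) & sign_mask;
}
```

With optimizations enabled, mainstream compilers typically lower a fixed-size memcpy like this to a single register move, so this form should not be slower than the union version.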
Steps to reproduce
Compile and run.
What is the current bug behavior?
Standard C++ (Working Draft):
3.65 [defns.undefined] undefined behavior: behavior for which this document imposes no requirements [Note 1: Undefined behavior may be expected when this document omits any explicit definition of behavior or when a program uses an incorrect construct or invalid data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message ([defns.diagnostic])), to terminating a translation or execution (with the issuance of a diagnostic message). Many incorrect program constructs do not engender undefined behavior; they are required to be diagnosed. Evaluation of a constant expression ([expr.const]) never exhibits behavior explicitly specified as undefined in [intro] through [cpp]. -- end note]
3.50 [defns.undefined.runtime] runtime-undefined behavior: behavior that is undefined except when it occurs during constant evaluation [Note 1: During constant evaluation, it is implementation-defined whether runtime-undefined behavior results in the expression being deemed non-constant (as specified in [expr.const]) and runtime-undefined behavior has no other effect. -- end note]
[basic.life]: ... except that if the object is a union member or subobject thereof, its lifetime only begins if that union member is the initialized member in the union ([dcl.init.aggr], [class.base.init]), or as described in [class.union], [class.copy.ctor], and [class.copy.assign], and except as described in [allocator.members]. The lifetime of an object o of type T ends when:
- if T is a non-class type, the object is destroyed, or
- if T is a class type, the destructor call starts, or
- the storage which the object occupies is released, or is reused by an object that is not nested within o ([intro.object]).
[class.union]: In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended ([basic.life]). At most one of the non-static data members of an object of union type can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time.
[basic.life]: Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see [class.cdtor]. Otherwise, such a glvalue refers to allocated storage ([basic.stc.dynamic.allocation]), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if
- the glvalue is used to access the object, or
- the glvalue is used to call a non-static member function of the object, or
- the glvalue is bound to a reference to a virtual base class ([dcl.init.ref]), or
- the glvalue is used as the operand of a dynamic_cast ([expr.dynamic.cast]) or as the operand of typeid.
https://en.cppreference.com/w/cpp/language/union :
It is undefined behavior to read from the member of the union that wasn't most recently written.
#include <cstdint>
#include <iostream>
union S
{
std::int32_t n; // occupies 4 bytes
std::uint16_t s[2]; // occupies 4 bytes
std::uint8_t c; // occupies 1 byte
}; // the whole union occupies 4 bytes
int main()
{
S s = {0x12345678}; // initializes the first member, s.n is now the active member
// At this point, reading from s.s or s.c is undefined behavior,
// but most compilers define it.
std::cout << std::hex << "s.n = " << s.n << '\n';
s.s[0] = 0x0011; // s.s is now the active member
// At this point, reading from s.n or s.c is undefined behavior,
// but most compilers define it.
std::cout << "s.c is now " << +s.c << '\n' // 11 or 00, depending on platform
<< "s.n is now " << s.n << '\n'; // 12340011 or 00115678
}
What is the expected correct behavior?
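Half.h should obtain the bit pattern of a float (and convert it back) without reading an inactive union member, for example by copying the bytes with std::memcpy or, from C++20 on, with std::bit_cast.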
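One possible shape for such a fix, sketched under the assumption that the code must still build with language standards older than C++20: a small wrapper that uses std::bit_cast when the standard library provides it and falls back to std::memcpy otherwise. The name bit_cast_compat is hypothetical and is not an existing Eigen function.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>  // defines __cpp_lib_bit_cast when std::bit_cast is available
#  endif
#endif

// Hypothetical helper, not part of Eigen: reinterpret the bytes of `from`
// as an object of type To without going through a union.
template <typename To, typename From>
inline To bit_cast_compat(const From& from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
  static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
#if defined(__cpp_lib_bit_cast)
  // C++20 path: the standard facility.
  return std::bit_cast<To>(from);
#else
  // Pre-C++20 path: copy the object representation byte by byte.
  // Requires To to be default-constructible.
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
#endif
}
```

Under this sketch, the read in the minimal example would become something like `unsigned int sign = bit_cast_compat<std::uint32_t>(ff) & sign_mask;`, with no union involved.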
Relevant logs
Warning Messages
Benchmark scripts and results
Anything else that might help
-
Have a plan to fix this issue.