I received comments on a recent question which suggest that pointers which are not aligned can cause undefined behaviour when dereferenced (at least in newer C++ standards).
My question consists of two parts.
I am not sure I fully understand what constitutes an aligned pointer, or more accurately, what might make a pointer address mal-aligned or mis-aligned.
Going further, I am certainly not sure why this may cause undefined behaviour.
BTW - since my base level understanding of this subject matter is clearly quite limited, it may be the case that I misunderstood the comments which were made, or did not understand them completely. So it is possible this question doesn't make much sense.
asked Feb 6 at 19:44 by user2138149, edited Feb 6 at 20:06 by Remy Lebeau
1 Answer
What is alignment?
Typical memory doesn't deliver data a byte at a time, and typical CPUs don't want it that way. The data bus between them is 2, 4, or even 8 bytes wide. (Sometimes even more, but we'll go with this for now.) Moreover, you can't just read 8 bytes from any address, but only from an address that is divisible by 8. Such an address is called aligned.
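As a rough illustration of "divisible by 8", here is a minimal sketch of an alignment check. The helper name is_aligned is made up for this example, and it assumes that converting a pointer to std::uintptr_t yields the numeric address, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the address in `p` is a multiple of `alignment`.
// Assumes `alignment` is non-zero and that the pointer-to-integer conversion
// gives the plain numeric address (implementation-defined, but usual).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

For a buffer declared with alignas(8), is_aligned(buf, 8) holds while is_aligned(buf + 1, 8) does not.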
So if you want to read 8 bytes from an address that isn't divisible by 8 (unaligned), what can you do? Well, you can read some bytes from the next lower aligned address, and some from the next upper, and then combine those you want together and throw the rest away. That's an unaligned load.
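The "read two aligned words and combine them" idea can be sketched in software. This is only a model, not what any particular CPU does: memory is represented as an array of 4-byte words that can only be read whole, byte k of a word is taken to be bits 8k to 8k+7 (a little-endian numbering chosen for the example), and the function name load_u32_unaligned is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Model of an unaligned 4-byte load built from aligned 4-byte reads.
// `words` stands in for memory that can only be read one aligned word at a time.
// When byte_offset is not a multiple of 4, words[byte_offset / 4 + 1] must exist.
inline std::uint32_t load_u32_unaligned(const std::uint32_t* words,
                                        std::size_t byte_offset) {
    const std::size_t index = byte_offset / 4;  // next lower aligned word
    const unsigned shift = static_cast<unsigned>(byte_offset % 4) * 8;

    const std::uint32_t low = words[index];
    if (shift == 0) {
        return low;  // already aligned, one read is enough
    }
    const std::uint32_t high = words[index + 1];  // next upper aligned word

    // Keep the upper bytes of `low` and the lower bytes of `high`,
    // and throw the rest away.
    return (low >> shift) | (high << (32 - shift));
}
```

With the two words 0x44332211 and 0x88776655, a load at byte offset 1 yields 0x55443322 in this model.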
But while you the programmer can do that explicitly, whether the CPU will do it if you give it an unaligned address to load is a different question. Some CPU architectures will (e.g. x86), but usually at the cost of performance. Some won't, and instead will fault. Sometimes it depends on the instruction, e.g. SSE has both an aligned load instruction that will fault on unaligned addresses and an unaligned load instruction that won't.
In programming language terms, where we abstract away from the hardware, types have alignments, e.g. a 4-byte int typically has 4-byte alignment, and so if your int object doesn't sit at an aligned address, it's unaligned, and reading it is undefined behavior.
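If the bytes of an integer really do live at an arbitrary address (inside a byte buffer, say), the usual way to stay within defined behavior is to copy them with std::memcpy instead of dereferencing a misaligned int pointer. A minimal sketch follows; the names read_i32 and write_i32 are invented for this example, and it assumes the platform provides std::int32_t.

```cpp
#include <cstdint>
#include <cstring>

// char types have alignment 1, so a byte pointer is never misaligned.
static_assert(alignof(unsigned char) == 1, "byte types have alignment 1");

// Hypothetical helper: read a 32-bit integer stored at any byte address.
// No std::int32_t* pointing into the buffer is ever formed.
inline std::int32_t read_i32(const unsigned char* src) {
    std::int32_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}

// Hypothetical helper: store a 32-bit integer at any byte address.
inline void write_i32(unsigned char* dst, std::int32_t value) {
    std::memcpy(dst, &value, sizeof value);
}
```

Compilers commonly turn such a fixed-size memcpy into a single load or store where the target allows unaligned access, so this need not cost anything compared with the cast.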
So why undefined behavior?
Because that's generally how C dealt with differences between platforms. If doing something yields different results on different platforms, then very often the language just says it is undefined. Compare signed integer overflow (has different behavior on some old platforms) or shifting beyond the integer width (has different behavior on some rather recent platforms).
char buffer[100]; int* p = (int*)&buffer[1]; *p = 1234; Notice that the pointer is intentionally misaligned (on some platforms). On a DEC Alpha, reading or writing to the memory would crash the program (that's due to undefined behavior), or if a compiler flag was provided the OS would "fix up" the misalignment using a trap and low level read/shift/write instructions to place the value into the misaligned memory, albeit x1000 slower than if it were aligned. – Eljay Commented Feb 6 at 19:58
mallocing an array of uint8_t, which is interpreted as a uint64_t followed by a uint8_t followed by a uint64_t. Close packed with no padding. The final element will not be aligned. If the compiler aligns it, and malloc was called to allocate 17 bytes of memory, then presumably the final uint64_t is partially off the end of the allocated block of memory. – user2138149 Commented Feb 6 at 22:23
malloc is properly aligned for all types. If you want to store mixed types in there, use a struct and its members will also be aligned. If you try to store a 64-bit value in the 10th byte, you have done something wrong to get there. – BoP Commented Feb 6 at 22:31
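To make the 17-byte scenario from these comments concrete, here is a hedged sketch of both options: a struct, where the compiler places each member at a suitably aligned offset, and a close-packed 17-byte layout that is only ever touched through std::memcpy. The names Record, kPackedSize, pack and unpack are made up for illustration, and the packed bytes are stored in the host's byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Struct layout: the compiler adds padding as needed, so `second` sits at an
// offset that suits its alignment and sizeof(Record) is usually more than 17.
struct Record {
    std::uint64_t first;
    std::uint8_t  tag;
    std::uint64_t second;
};

// Close-packed layout: 8 + 1 + 8 bytes, no padding.
constexpr std::size_t kPackedSize = 17;

// Copy the members into a packed byte buffer of at least kPackedSize bytes.
inline void pack(const Record& r, unsigned char* out) {
    std::memcpy(out, &r.first, sizeof r.first);
    std::memcpy(out + 8, &r.tag, sizeof r.tag);
    std::memcpy(out + 9, &r.second, sizeof r.second);  // offset 9 is misaligned for uint64_t
}

// Copy the members back out; no std::uint64_t* into the buffer is ever formed.
inline Record unpack(const unsigned char* in) {
    Record r{};
    std::memcpy(&r.first, in, sizeof r.first);
    std::memcpy(&r.tag, in + 8, sizeof r.tag);
    std::memcpy(&r.second, in + 9, sizeof r.second);
    return r;
}
```

The packed buffer can come from a 17-byte malloc without anything falling off the end, because the compiler never realigns data inside a byte buffer; padding only appears when the data is declared as a struct.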