Given a union that contains const char * and char * members, can data set through the char * be safely accessed through the const char *, in C99 and newer versions?
For example, does the following have any undefined, unspecified, or implementation-defined behavior, assuming a hosted environment?
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    union {
        const char *ref;
        char *alloc;
    } data;
    bool is_allocated;
} msg;

void display_msg(msg m) {
    puts(m.data.ref);
    if (m.is_allocated) free(m.data.alloc);
}

int main(void) {
    msg hello = {.data.alloc = malloc(32), .is_allocated = true};
    if (!hello.data.alloc) abort();
    strcpy(hello.data.alloc, "Hello, world!");
    display_msg(hello);
}
Asked Apr 1 at 2:04 by Eli Minkoff; edited Apr 1 at 6:55 by Lundin.
1 Answer
Your program writes data using the alloc member and reads it using the ref member. That read reinterprets the corresponding bytes of the union using the new type. (Accessing a union member this way was allowed in C 1999, but the original text did not state the behavior explicitly. A footnote saying that the read reinterprets the bytes was added by Technical Corrigendum 3 to C 1999, appears in C 2011, and persists into C 2024.)
If you wrote a char * type and read an int * type (const or not), the behavior would not be guaranteed by the C standard because it does not require char * and int * to use the same representation (the manner in which the bytes in memory represent a value).
However, with char * and const char *, or generally T * and const T * for any type T, the pointer types have the same representation, per C 1999 6.2.5 26:
… Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements.
That language persists into C 2024. Any type T is compatible with itself (C 1999 6.2.7 1: “Two types have compatible type if their types are the same.”), and const T is a qualified version of T. So reading the ref member will produce the same pointer value as was written into the alloc member.
Supplement
If your intent is to limit access to the data to reduce bugs where software intended only to read the data accidentally writes to it, then a more common method in C is to hide the structure contents from clients and give them access only through controlled interface routines. Here is a simple example in one file. In more developed code, the definition of struct msg and the routines that access it would be placed in a separate source file, one header would be provided to be included by source files that only read the data, and another header could be provided for source files that modify the data. That first header would declare GetRef but not declare any routines that could modify the data.
This method avoids messy things like reinterpreting bytes and gives more control over which software can do what.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct msg msg;

extern const char *GetRef(const msg *m);
extern void FreeAlloc(msg *m);

void display_msg(msg *m) {
    puts(GetRef(m));
    FreeAlloc(m);
}

struct msg { char *alloc; bool is_allocated; };

const char *GetRef(const msg *m)
{
    if (!m->is_allocated)
    {
        fprintf(stderr, "Internal error, attempt to use unallocated memory.\n");
        abort();
    }
    return m->alloc;
}

void FreeAlloc(msg *m)
{
    if (m->alloc)
    {
        free(m->alloc);
        m->is_allocated = false;
    }
}

int main(void) {
    msg hello = {.alloc = malloc(32), .is_allocated = true};
    if (!hello.alloc) abort();
    strcpy(hello.alloc, "Hello, world!");
    display_msg(&hello);
}
<stdio.h>, <stdlib.h>, and <string.h> are not required in a freestanding implementation. What happens with main is not defined in a freestanding implementation. – Eric Postpischil Commented Apr 1 at 2:50
void display_msg (msg *m) { puts (m->data.ref); ... } to simply allow passing the pointer instead of a copy of the struct itself (and then display_msg(&hello); in main())? In your case with a union of char type and a bool, the copy isn't much of a concern, but the larger the struct, the larger the copy unless you pass a pointer. – David C. Rankin Commented Apr 1 at 2:56