I am aware that the compiler can collapse multiple non-volatile loads into a single load. Does this mean that memory mapping functions can create undefined behavior?
For example:
int k = *p;
unmapFile(fm, fileHandle1); // the unmapped memory includes the address pointed to by p
// p is no longer valid: it points to a virtual address that is unmapped
mapFile(fm, fileHandle2);   // p now points to an address somewhere in a different file
int j = *p;
// use k and j here
Can the two dereferences of p be collapsed into one, causing the code to behave unpredictably?
As an aside, I know file mapping functions don't allow you to specify the starting address of the map (at least in POSIX and Windows), but this was the simplest hypothetical I could come up with. There are indeed use cases for this paradigm.
Asked Feb 11 at 8:54 by Badasahog
2 Answers
"Does this mean that memory mapping functions can create undefined behavior?"
C itself does not define any such functions or speak to how they could or should work if defined. Therefore, as far as the C language spec is concerned, anything and everything associated with memory mapping functions has undefined behavior. There is no other answer at the level of generality of the question. As @AtsushiYokoyama observes, there are practical issues that may dissuade the compiler from combining the two reads of *p, but that should not be taken as a guarantee that any given compiler indeed won't combine them.
In a programming and execution environment that makes memory mapping functions available, most details related to their behavior, to the extent that they are defined, are implementation specific. Considering a POSIX environment supporting file mappings and the POSIX mmap() and munmap() functions as an example, POSIX says:
"The munmap() function shall remove any mappings for those entire pages containing any part of the address space of the process starting at addr and continuing for len bytes. Further references to these pages shall result in the generation of a SIGSEGV signal to the process."
And it says:
"The mmap() function shall establish a mapping between the address space of the process at an address pa for len bytes to the memory object represented by the file descriptor fildes at offset off for len bytes [...]"
and
"The mapping established by mmap() shall replace any previous mappings for those whole pages containing any part of the address space of the process starting at pa and continuing for len bytes."
It is the responsibility of the compiler to ensure that any optimizations it performs preserve program semantics as defined by the combination of all relevant specifications. I don't see any way that a compiler for a POSIX environment where your unmapFile() corresponds to munmap() and your mapFile() corresponds to mmap() could justify combining the two reads of *p. That doesn't mean that no compiler would, but it does mean that a compiler that did would risk producing program behavior that did not conform to POSIX. For this case, then, I'd strengthen my remark above to: there are practical issues that should dissuade the compiler from combining the two reads of *p.
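As a rough illustration of the POSIX case, here is a minimal sketch. It assumes a POSIX system with a writable /tmp. The helper names make_file_with_int and remap_demo, the temp-file path pattern, and the values 111 and 222 are all hypothetical and chosen only for the demo. It reads *p, replaces the mapping under p with mmap() and MAP_FIXED, then reads *p again.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temp file holding one int. */
static int make_file_with_int(int value)
{
    char path[] = "/tmp/remap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* the file lives until fd is closed */
    if (write(fd, &value, sizeof value) != (ssize_t)sizeof value) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Reads *p, replaces the mapping under p, then reads *p again.
   Returns 0 on success and stores the two values in *k and *j. */
int remap_demo(int *k, int *j)
{
    int rc = -1;
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int fd1 = make_file_with_int(111);
    int fd2 = make_file_with_int(222);

    if (fd1 >= 0 && fd2 >= 0) {
        int *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd1, 0);
        if (p != MAP_FAILED) {
            *k = *p;                    /* first load: backed by file 1 */
            /* MAP_FIXED replaces whatever was mapped at address p. */
            if (mmap(p, len, PROT_READ, MAP_PRIVATE | MAP_FIXED, fd2, 0)
                    != MAP_FAILED) {
                *j = *p;                /* second load: backed by file 2 */
                rc = 0;
            }
            munmap(p, len);
        }
    }
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return rc;
}
```

Calling remap_demo(&k, &j) should leave the first file's value in k and the second file's value in j. For that to happen the second load has to really take place after the mapping is replaced, which is the behavior the POSIX text quoted above describes.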
When the compiler performs optimization on this kind of code, if it does not know the implementation of unmapFile() or mapFile() (for example, if they are external functions), then in general, the two references through pointer p will not be collapsed into one. This is because the compiler cannot guarantee that these functions do not have side effects that affect the reference to p.
On the other hand, if the compiler does know the implementation of unmapFile() and mapFile() (for example, if they are inline static functions), and it is confident that these functions do not affect the reference to p, then it might collapse the references through p.
Of course, as you pointed out, if the accesses through p are volatile, that would be a different story.
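The three situations can be sketched as follows. All names here (shared_cell, touches_memory, pure_helper, and the three reads_* functions) are hypothetical, and touches_memory merely stands in for a function the compiler cannot see into, such as one defined in another translation unit. Whether an optimizer actually merges the loads in the second function depends on the compiler and its flags.

```c
int shared_cell = 1;

/* Stand-in for an external function with side effects on memory.
   Imagine it defined in a different translation unit. */
void touches_memory(void)
{
    shared_cell = 2;
}

/* A function whose body is visible and clearly has no side effects. */
static inline void pure_helper(void)
{
}

/* The call may change *p, so the second load has to be performed. */
int reads_around_opaque(const int *p)
{
    int k = *p;
    touches_memory();
    int j = *p;
    return k + j;
}

/* The compiler can see that nothing changes *p, so it may reuse k for j. */
int reads_around_visible(const int *p)
{
    int k = *p;
    pure_helper();
    int j = *p;
    return k + j;
}

/* volatile: both loads must be performed, even with nothing in between. */
int reads_volatile(const volatile int *p)
{
    int k = *p;
    int j = *p;
    return k + j;
}
```

In every variant the observable result has to match what two separate loads would produce. The optimization is only permitted where the compiler can show that merging the loads makes no difference.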
Hope this helps!
[...] p vs the types of the parameters to those functions. The compiler can assume that the function does not modify the data pointed at by p if the types do not alias and the pointed-at data has internal linkage, and if the functions have internal or external linkage. Otherwise, it can't assume much. It depends on context, so you need a more concrete example than this pseudo-code. – Lundin Commented Feb 11 at 9:36
[...] p assigned to in the above code? Did you mean p = mapFile(…);? – Ian Abbott Commented Feb 11 at 10:33
[...] mmap with MAP_FIXED. – Nate Eldredge Commented Feb 11 at 15:46
[...] float * parameter and convert it to uint32_t * and access memory with that. Further, even if the function is passed no parameters, it might obtain the value of p in other ways. Particularly if p was obtained by some system memory-mapping routine, the compiler would not know if that routine did or did not store the address in some extern void *foo that unMapFile and mapFile could access. – Eric Postpischil Commented Feb 11 at 16:44