On the PowerPC platform, the book A Primer on Memory Consistency and Cache Coherence states:
As depicted in Table 5.18, Power’s HWSYNCs can be used to make the Independent Read Independent Write Example (IRIW) of Table 5.10 behave sensibly (i.e., disallowing the result r1==NEW, r2==0, r3==NEW and r4==0). Using LWSYNCs is not sufficient. For example, core C3’s F1 HWSYNC must cumulatively order core C1’s store S1 before core C3’s load L2.
My question is: why can't LWSYNC make the Independent Read Independent Write (IRIW) example behave sensibly?
Core C1’s store S1 is obviously before core C3’s load L2 when r1 is NEW (even with LWSYNC), right?
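For reference, here is a minimal sketch of the IRIW test as I understand it, written with C++ atomics rather than Power assembly. The names `data1`, `data2`, `IriwResult`, `run_iriw_once`, `is_forbidden` and `count_forbidden` are my own for illustration, and the value `1` standing in for NEW is an assumption. As far as I know, compilers targeting Power commonly implement `memory_order_seq_cst` with HWSYNC-based sequences and acquire/release with LWSYNC-based ones, so this version corresponds roughly to the HWSYNC variant from the book.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

constexpr int NEW = 1;  // assumed stand-in for the book's NEW value

struct IriwResult { int r1, r2, r3, r4; };

// The outcome the book calls non-sensible: C3 and C4 disagree on the
// order in which the two independent stores became visible.
bool is_forbidden(const IriwResult& r) {
    return r.r1 == NEW && r.r2 == 0 && r.r3 == NEW && r.r4 == 0;
}

// Runs the four-core IRIW test once with sequentially consistent atomics.
IriwResult run_iriw_once() {
    std::atomic<int> data1{0}, data2{0};
    IriwResult res{};

    std::thread c1([&] { data1.store(NEW, std::memory_order_seq_cst); });  // S1
    std::thread c2([&] { data2.store(NEW, std::memory_order_seq_cst); });  // S2
    std::thread c3([&] {
        res.r1 = data1.load(std::memory_order_seq_cst);  // L1
        res.r2 = data2.load(std::memory_order_seq_cst);  // L2
    });
    std::thread c4([&] {
        res.r3 = data2.load(std::memory_order_seq_cst);  // L3
        res.r4 = data1.load(std::memory_order_seq_cst);  // L4
    });

    c1.join(); c2.join(); c3.join(); c4.join();
    return res;
}

// Repeats the test and counts how often the forbidden outcome shows up.
int count_forbidden(int iterations) {
    assert(iterations >= 0);
    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        if (is_forbidden(run_iriw_once())) ++hits;
    }
    return hits;
}
```

With `seq_cst` everywhere, the C++ memory model forbids the outcome, so `count_forbidden(n)` should always return 0. If the loads are weakened to `memory_order_acquire` and the stores to `memory_order_release`, the C++ model allows the outcome, which I believe is the language-level analogue of the "LWSYNC is not sufficient" statement above. A run that never observes it proves nothing, of course.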
UPDATE: I found another explanation for IRIW on PowerPC, from .pdf:
Consider the IRIW litmus test (cf. Figure 3a) assuming a non-MCA model such as Power. Note that the two fences