
c++ - GCC wiki memory barrier example - Stack Overflow


The following code comes from the GCC Wiki.

// -Thread 1-
y.store (20, memory_order_relaxed);
x.store (10, memory_order_relaxed);

// -Thread 2-
if (x.load (memory_order_relaxed) == 10)
 {
   assert (y.load(memory_order_relaxed) == 20); /* assert A */
   y.store (10, memory_order_relaxed);
 }

// -Thread 3-
if (y.load (memory_order_relaxed) == 10)
 assert (x.load(memory_order_relaxed) == 10); /* assert B */

Since threads don't need to be synchronized across the system, either assert in this example can actually FAIL.

I can figure out why assert A can fail. But how can assert B also fail?

Doesn't y.load() == 10 imply that thread 2 has run to completion, and therefore that x.load() == 10?

Asked Apr 2 at 3:03 by hk134579; edited Apr 2 at 9:30 by BoP.

1 Answer

This might only be possible on a machine that's not multi-copy-atomic (such as POWER), where IRIW reordering is possible. (See "Will two atomic writes to different locations in different threads always be seen in the same order by other threads?")

So T2 sees x == 10 before it's globally visible, and stores y=10.

T3 can then read T2's store of y before the x=10 store is visible to it. (This amounts to StoreStore reordering from the physical core running T1 and T2 to the physical core running T3.)

This could be possible on real POWER or NVidia ARMv7 hardware if T1 and T2 run on different logical cores of the same physical core, and T3 runs on a separate physical core.


In terms of the C or C++ memory models, the assert can fail because nothing guarantees visibility. The fact that one thread has seen a value doesn't imply that all threads can see that value.

There might be other, simpler mechanisms too, but the assert in T2 means y.store (10, relaxed) doesn't happen at all if that assert fails, so it's not as simple as just x.load running before y.load.
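
To experiment with this, here is a minimal sketch of the three threads as a repeatable litmus test. It is not from the GCC Wiki: `LitmusResult` and `run_litmus` are hypothetical names, the asserts are replaced by counters so a failure is recorded instead of aborting, and the early `return` in T2 mirrors the point that y = 10 is never stored once assert A has failed. The memory orderings are template parameters so the same code can be run with relaxed or with release/acquire.

```cpp
#include <atomic>
#include <thread>

// Hypothetical result type: how often each assert would have fired.
struct LitmusResult {
    int runs = 0;
    int a_failed = 0;  // T2 saw x == 10 but y != 20 (assert A)
    int b_failed = 0;  // T3 saw y == 10 but x != 10 (assert B)
};

// St is the ordering used for every store, Ld for every load.
template <std::memory_order St, std::memory_order Ld>
LitmusResult run_litmus(int iterations) {
    LitmusResult r;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        // Each flag is written by one thread only and read after join().
        bool a_fail = false, b_fail = false;

        std::thread t1([&] {
            y.store(20, St);
            x.store(10, St);
        });
        std::thread t2([&] {
            if (x.load(Ld) == 10) {
                if (y.load(Ld) != 20) {
                    a_fail = true;  // assert A would abort here,
                    return;         // so y = 10 is never stored
                }
                y.store(10, St);
            }
        });
        std::thread t3([&] {
            // y is loaded first; x is loaded only if y == 10.
            if (y.load(Ld) == 10 && x.load(Ld) != 10) {
                b_fail = true;  // assert B
            }
        });

        t1.join();
        t2.join();
        t3.join();

        ++r.runs;
        r.a_failed += a_fail ? 1 : 0;
        r.b_failed += b_fail ? 1 : 0;
    }
    return r;
}
```

With `run_litmus<std::memory_order_relaxed, std::memory_order_relaxed>(n)` the standard permits either counter to be nonzero, but whether you ever observe that depends on the hardware, the compiler, and thread timing. With `run_litmus<std::memory_order_release, std::memory_order_acquire>(n)` both counters must stay at zero, because each acquire load that reads the released value creates a happens-before edge, and those edges chain from T1 through T2 to T3.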
