I'm mostly looking for someone to double check my reading of the standard here.
TL;DR: does the standard require that the destructor of a shared_ptr order accesses to the shared object before the destruction of that object on a different thread?
It's always mentioned that objects managed by shared_ptr aren't thread safe, which has seemed obvious to me because conceptually all shared_ptr is doing is associating a reference counter with an object to manage its lifetime. The allocated object is still accessed through a normal pointer (via get() or the overloaded -> or * operators), so there's no way for shared_ptr to provide any additional protection even if it wanted to.
I was, however, under the assumption that the following code would not contain a data race:
#include <memory>
#include <semaphore>
#include <thread>

void foo(std::shared_ptr<int>* in, std::binary_semaphore* flag, int* out)
{
    int temp;
    {
        std::shared_ptr<int> in_copy = *in; // copy the shared_ptr
        flag->release(); // notify the parent thread that the copy has finished
        temp = *in_copy; // read the value through the copy
        // in_copy is destructed here
    }
    *out = temp; // copy the value read from the shared object into out
}

int bar()
{
    int out_value;
    std::binary_semaphore flag{ 0 };
    std::thread t1;
    {
        // create the shared_ptr and pass it to the thread
        std::shared_ptr<int> in_value = std::make_shared<int>(15);
        t1 = std::thread{ &foo, &in_value, &flag, &out_value };
        flag.acquire(); // wait for the thread to copy the shared_ptr
        // in_value is destroyed here
    }
    t1.join(); // wait for the thread to finish
    return out_value;
}
Specifically, I'm worried that the line *out = temp; could constitute a data race. That would be the case if the compiler could legally move the initial read into temp forward, past the shared_ptr destructor (at least in the case where the shared object isn't destructed on t1), making the definition of foo effectively (with some invented elaboration functions):
void foo(std::shared_ptr<int>* in, std::binary_semaphore* flag, int* out)
{
    // copy the shared_ptr
    __refcount* in_refcount = in->get_refcount();
    in_refcount->increment();
    int* in_copy = in->get();
    flag->release(); // notify the parent thread that the copy has finished
    if (in_refcount->decrement_and_check_zero())
    {
        *out = *in_copy;
        // destroy *in_copy (this thread held the last reference)
    }
    else
    {
        *out = *in_copy;
    }
}
There's a data race here in the else branch: between the call to in_refcount->decrement_and_check_zero() and the read *in_copy, the other thread has an opportunity to delete the shared object, meaning that *in_copy could end up reading from an object whose lifetime has ended.
So the question now becomes "is this a legal transformation?" None of the three big standard library implementations allow it, because in all of them the decrement of the refcount is a release operation, preventing the read temp = *in_copy from being moved past the destructor of the shared_ptr. A happens-before relationship is established between the read and the decrement of the refcount, and, because the delete is conditional on the result of the decrement (and the deleting thread performs a matching acquire operation or fence), the delete of the shared object also happens-after the decrement. Transitively this provides an inter-thread happens-before relationship between the read from the shared object and its destruction.
The big question I have now is whether the standard requires such a relationship to be established. Looking into the standard, there's a worrying silence on the ordering guarantees of shared_ptr. All I can see regarding shared_ptr's relationship to ordering guarantees is:
From [util.smartptr.shared]:

- shared_ptr implements semantics of shared ownership; the last remaining owner of the pointer is responsible for destroying the object
- Changes in use_count() do not reflect modifications that can introduce data races

And from [util.smartptr.shared.dest]:

- after *this has been destroyed all shared_ptr instances that shared ownership with *this will report a use_count() that is one less than its previous value
Those all seem to give the reference count of shared_ptr semantics similar to relaxed atomics, though notably without actually naming the operations as atomic. That omission matters because it means we can't fall back on the standard's rules for establishing inter-thread ordering relationships, which are explicitly defined in terms of atomic operations; and unlike mutex::lock() and mutex::unlock(), shared_ptr doesn't seem to be explicitly called out as an object whose member functions behave as operations on an atomic object.
Basically, I can't see anything in the wording of the standard that guarantees that operations on an object owned by a shared_ptr (or on any other object whose lifetime is bounded by that of the pointed-to object, such as an object owned by a unique_ptr member of the shared object) happen-before the call to the destructor of the shared object on a different thread. If that's right, any code using shared_ptr would need additional synchronisation (i.e. barriers) whenever a shared_ptr goes out of scope, in order to prevent data races between accesses to the object and the destructor of the object.
This seems wrong to me, since it would make shared_ptr nearly useless for most purposes: essentially the only safe uses would be those where the owned object is known to always be destroyed by the accessing thread (in which case shared_ptr isn't needed), or where the lifetime of the owned object is known to end only after some other synchronisation event on the accessing thread. But I can't find a way to interpret the standard where those ordering guarantees are mandated to be provided by the shared_ptr itself.
I'm hoping that I'm misunderstanding something here, and if so I'd love it if someone could point out where exactly I'm reading this wrong.