0

The Rust programming language offers Arc, an atomically reference-counted generic type, for use in multi-threaded environments, since Rc is optimized for performance in single-threaded applications and lacks multi-threading protection.

However, imagine a case where a thread tries to acquire a resource that another thread is releasing: the reference count was 1, and the release happens just before the acquisition ("happens before" as defined in the C standard), so the resource is acquired after it has been released and the thread runs into undefined behavior.

Timeline for illustration:

  1. object has ref-cnt 1
  2. thread 02 releases the object
  3. ref-cnt reaches 0, object deallocated
  4. thread 01 acquires the object,
  5. the object however is gone, resulting in undefined behavior.

Obviously, to ensure the resource can be owned by multiple parties, synchronization primitives must be applied. Considering this, what problem does Arc actually solve?

Alexander
  • 5,185
DannyNiu
  • 372

2 Answers

11

The problem is that the sequence of events violates its own assumptions:

  1. object has ref-cnt 1

    This implies that the object is only visible through one Arc reference. In Rust this also implies that there are no & references to the contained object that were obtained through any other Arc reference. This is because the borrow checker ensures that & references will not outlive the underlying Arc reference.

  2. thread 02 releases the object

    In Rust, this implies thread 2 had exclusive access to (that is, ownership of) the only reference. That in turn implies that there are no outstanding & references to the Arc reference, or to the contained object, that were obtained through this reference.

  3. ref-cnt reaches 0, object deallocated

    The object no longer exists, and no reference to it remains.

  4. thread 01 acquires the object,

    -> Thread 1 somehow has access to the object. This violates either the assumption in step 1 (there was a second reference) or the one in step 2 (thread 2 did not have exclusive access to the reference).

  5. the object however is gone, resulting in undefined behavior.

Rust ensures step 5 will not happen by preventing step 4. It does this by enforcing the assumption in step 2 through the borrow checker, and the assumption in step 1 through the interface of Arc in combination with the borrow checker.
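As a rough illustration of how step 4 is ruled out, here is a minimal sketch in which the only Arc handle is moved into the releasing thread. The function name release_on_other_thread is made up for this example; the commented-out line is the access that "thread 01" would need, and it is rejected at compile time.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: hands the only `Arc` handle to another thread and
// returns the strong count that thread observed just before releasing it.
fn release_on_other_thread(only_handle: Arc<String>) -> usize {
    let worker = thread::spawn(move || {
        // The closure now owns the handle, so the count is still 1 here.
        let count = Arc::strong_count(&only_handle);
        drop(only_handle); // count reaches 0 and the String is deallocated
        count
    });

    // Touching `only_handle` from this thread does not compile, because the
    // value was moved into the closure (error E0382, use of a moved value):
    // println!("{}", only_handle);

    worker.join().unwrap()
}

fn main() {
    let object = Arc::new(String::from("resource"));
    let seen = release_on_other_thread(object);
    // `object` was moved as well, so there is nothing left to "acquire".
    println!("count seen by the releasing thread: {}", seen);
}
```

For a second thread to reach the object at all, it would have to be given its own clone first, and then the count would have been 2 rather than 1.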

The borrow checker enforces this together with the Send and Sync traits. Arc<T> implements both Send and Sync whenever T itself is Send + Sync, so an individual Arc reference can be moved to another thread. What it cannot do is be borrowed by a thread that might outlive it: std::thread::spawn requires its closure to be 'static, so a spawned thread cannot hold a & reference to an Arc owned elsewhere and must instead be given its own Arc reference (a clone) to the underlying object. Where threads are allowed to borrow, as with scoped threads, the borrow checker guarantees that the borrowed Arc outlives every thread using it.
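The "each thread needs its own Arc reference" rule might look like this in practice. This is only a sketch: share_with_workers and the sample vector are invented for illustration, and the counts are read with Arc::strong_count.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: gives every worker thread its own clone of the handle,
// sums the shared data in each worker, and reports the strong count that is
// left once all workers have finished.
fn share_with_workers(workers: usize) -> (i32, usize) {
    let data = Arc::new(vec![1, 2, 3]);

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Atomically increments the reference count; `own` belongs to
            // exactly one thread from here on.
            let own = Arc::clone(&data);
            thread::spawn(move || own.iter().sum::<i32>())
        })
        .collect();

    // Each worker drops its clone when its closure returns, which happens
    // before `join` returns.
    let total: i32 = handles.into_iter().map(|h| h.join().unwrap()).sum();

    (total, Arc::strong_count(&data))
}

fn main() {
    let (total, remaining) = share_with_workers(4);
    println!("total = {}, handles left = {}", total, remaining);
}
```

Passing &data into thread::spawn instead of a clone would be rejected, because the spawned closure could outlive the borrow.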

The key to Arc is that whilst Arc allows multiple parties to access the underlying object, each reference is owned by exactly one party. In other words, Arc has an invariant that the reference count is greater than or equal to the number of threads that hold a reference to the underlying object. By enforcing this invariant through the borrow system, the problem of ensuring only one thread has access at the point of drop reduces to ensuring that only one reference exists.

Arc handles the synchronization for the creation and destruction of references, keeping track of how many parties currently have access to the object in a manner that is thread safe. And it ensures that the object is deleted at a time when only one party has access to it, because that party is the last one to hold a reference to it.
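To make the "deleted exactly once, by whoever holds the last reference" point concrete, here is a small sketch. The Tracked type and drops_after_sharing function are hypothetical; the Drop implementation just counts how many times the destructor runs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical type whose destructor records that it ran.
struct Tracked {
    drops: Arc<AtomicUsize>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

// Shares one `Tracked` with `workers` threads, lets every party release its
// handle in whatever order the scheduler picks, and returns how many times
// the destructor ran.
fn drops_after_sharing(workers: usize) -> usize {
    let drops = Arc::new(AtomicUsize::new(0));
    let object = Arc::new(Tracked { drops: Arc::clone(&drops) });

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let own = Arc::clone(&object);
            thread::spawn(move || drop(own))
        })
        .collect();

    // This thread's handle may or may not be the last one to go.
    drop(object);
    for h in handles {
        h.join().unwrap();
    }

    drops.load(Ordering::SeqCst)
}

fn main() {
    println!("destructor ran {} time(s)", drops_after_sharing(8));
}
```

Which thread ends up running the destructor is not predictable, but it runs once, and only after every other handle is gone.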

1

Just from my experience with Objective-C which has had automatic reference counting for years:

Reference counting isn't really about resources. It is about lifetimes of objects. An object is created, a reference is stored somewhere, and the reference count is set to 1. Then various bits of code need a reference to the same object, and each time the reference counter is increased. Or a bit of code stops needing the reference, and the reference counter is decreased. In Objective-C, when the reference counter becomes zero, the object is dead (it might not know it yet, but it is dead). The reference counting code itself starts destroying the object. At the same time, the reference counter cannot be changed anymore once it has reached zero, so unlike Java, once the reference count is 0, the object will go.

Now increasing and decreasing the reference count from two different threads is not a trivial problem. In Objective-C, a location where a reference is stored is marked as "atomic" or "non-atomic". The difference is that the actual act of increasing or decreasing the reference count works correctly when called from multiple threads if the variable is atomic, while it is not guaranteed to work if the variable is non-atomic and could crash or misbehave in some way.

However, "not crashing" and being atomic is not really enough. Imagine two threads, one decreases the reference count, one increases it, at precisely the same time, as close as possible, and we start with a reference count of 1. If the decrease happens first, then the object will be released, and by the time we try to increase the reference count, it is gone. If the increase happens first, then the reference count is set to 2, one nanosecond later is set to 1, and the object is still alive and kicking. So in this situation there are two very different outcomes, both possible and legal. That just cannot be right.

So in Objective-C, the "atomic" variant was used very, very rarely. Because in those cases where "atomic" avoided undefined behaviour on the code level, you still have unpredictable behaviour on the application level, which will not crash, but is most likely a bug in your code.
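For comparison with the Rust side of the question: as far as I can tell, the only way safe Rust lets you try to increase the count of an object that might already be gone is Weak::upgrade from the standard library. The sketch below plays out both orderings of the decrease and the increase deterministically on one thread; the two function names are invented for illustration.

```rust
use std::sync::{Arc, Weak};

// Decrease first: the last strong handle is dropped before the upgrade.
// Returns whether the upgrade still found the object.
fn upgrade_after_release() -> bool {
    let strong = Arc::new(5);
    let weak: Weak<i32> = Arc::downgrade(&strong);
    drop(strong); // strong count 1 -> 0, the value is destroyed
    weak.upgrade().is_some()
}

// Increase first: the upgrade takes the count to 2 before the original
// handle is dropped, so the object stays alive.
fn upgrade_before_release() -> bool {
    let strong = Arc::new(5);
    let weak: Weak<i32> = Arc::downgrade(&strong);
    let second = weak.upgrade(); // strong count 1 -> 2
    drop(strong); // strong count 2 -> 1
    second.is_some()
}

fn main() {
    println!("decrease first, object found: {}", upgrade_after_release());
    println!("increase first, object found: {}", upgrade_before_release());
}
```

Both outcomes remain possible when the two operations race on different threads, but the "too late" case shows up as None that the caller has to handle rather than as an access to a freed object.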

(In Swift, there is no distinction between atomic and non-atomic anymore. There are some tiny changes to the very commonly used ARM processors that make atomic changes very fast if an atomic variable hasn't ever actually been touched by two different cores. So in cases where "atomic" would not be needed, the atomic code is very, very fast. And in cases where it's needed, well, in those cases it is needed.)

gnasher729
  • 49,096