Earlier, when introducing the topic of data races, I mentioned that there was a more subtle reason why the first attempt at the code wasn't thread-safe. It's to do with volatility and data caching. Here's a sample which will make explaining the topic somewhat easier:
using System;
using System.Threading;

public class Test
{
    static bool stop;

    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        // Let the thread start running
        Thread.Sleep(2000);
        // Now tell it to stop counting
        stop = true;
    }

    static void ThreadJob()
    {
        int count = 0;
        while (!stop)
        {
            Console.WriteLine("Extra thread: count {0}", count);
            Thread.Sleep(100);
            count++;
        }
    }
}
Now, this code is fairly simple. We have a boolean variable (stop) which is polled by the new thread - it will keep counting until it notices that stop is true. In the main thread, we pause for a couple of seconds and then set stop to true.

So, the new thread should count for a couple of seconds and then stop, right? Well, in fact that's what will almost certainly happen if you run the code, but it's not guaranteed. The while loop in the new thread could keep running forever, never really checking whether or not the stop variable has been set to true. If that sounds bizarre to you, welcome to the weird and wonderful world of memory models.
Memory in modern computers is a very complicated business, with registers, multiple levels of cache, and multiple processors sharing main memory but possibly not caches, etc. The idea that there's just a single chunk of memory which is accessed in a simple way is very handy for programmers, but lousy for performance. In addition, if a processor knows it might have to read a bit of memory "soon", it could decide to read it early, etc. Hardware manufacturers and compiler writers (including JIT-compiler writers) have worked very hard to make fast code easy to write.

The memory model of a platform is the specification of what developers can do safely without knowing too much about the details of the hardware the platform is running on. This means (in our case) that you can run .NET code on any CPU which has a CLR, and so long as you follow the rules of the memory model, you should be okay - however "strong" or "weak" the memory model of the hardware itself is. (A "strong" memory model is one which guarantees a lot; a "weak" model is one which doesn't guarantee much at all, often giving better performance but requiring more work on the part of the developer. x86 processors have a stronger memory model than the CLR itself, which is one reason problems such as seeing stale data are relatively hard to demonstrate.)
The memory model in .NET talks about when reads and writes "actually" happen compared with when they occur in the program's instruction sequence. Reads and writes can be reordered in any way which doesn't violate the rules given by the memory model. As well as "normal" reads and writes there are volatile reads and writes. Every read which occurs after a volatile read in the instruction sequence occurs after the volatile read in the memory model too - they can't be reordered to before the volatile read. A volatile write goes the other way round - every write which occurs before a volatile write in the instruction sequence occurs before the volatile write in the memory model too.
Don't worry if the above doesn't make much sense - the resource section at the end of this page contains a few links which should help you out if you want to understand it thoroughly. The rule itself is pretty simple: when you have access to shared data, you need to make sure you read fresh data and write any changes back in a timely manner. There are two ways of doing this - volatile variables, and using lock again.
A variable which is declared volatile uses volatile reads and writes for all its accesses. You can only declare a variable to be volatile if it's one of the following types: a reference type, byte, sbyte, short, ushort, int, uint, char, float, or bool, or an enumeration with a base type of byte, sbyte, short, ushort, int, or uint. If you're only interested in sharing a single piece of data, and it's one of the above types, then using a volatile variable is probably the easiest way to go. Note, however, that for a reference type, only the access to the variable itself is volatile - if you write to something within the instance the reference refers to, that write won't be volatile. Personally I don't use volatile variables much, preferring the other approach: locking.
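Applied to the sample from the start of this section, the volatile fix really is a one-word change. Here's a sketch of the same program with just the declaration altered:

```csharp
using System;
using System.Threading;

public class Test
{
    // The only change: the field is now volatile, so every access
    // is a volatile read or write - the new thread can no longer
    // keep using a stale cached value.
    static volatile bool stop;

    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        Thread.Sleep(2000);
        stop = true; // volatile write - guaranteed to become visible
    }

    static void ThreadJob()
    {
        int count = 0;
        while (!stop) // volatile read on every iteration
        {
            Console.WriteLine("Extra thread: count {0}", count);
            Thread.Sleep(100);
            count++;
        }
    }
}
```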
We've already seen how locking is used to limit access to a single thread at a time. It also has another side effect: a call to Monitor.Enter performs an implicit volatile read, and a call to Monitor.Exit performs an implicit volatile write. The two effects combine nicely: if you're reading, you perform a volatile read, so you know that your next read will be from main memory - and because you're then in a lock, you know that nothing else will be trying to change the value. Similarly, if you're writing, you know that nothing else will be trying to read the value between you writing it and the volatile write, so nothing will see an old value - assuming all access to the variable is covered by the same lock, of course. If you lock using one monitor for some access to a variable, and another monitor for other access to the same variable, the volatility and the locking won't mesh quite as nicely, and you won't get as strong a guarantee of freshness of data. Fortunately, there's very little reason why you'd even want to try this.
So, to get back to our sample program: it's currently flawed because the new thread could read the value of stop once (perhaps into a register) and then never bother reading it from main memory. Alternatively, it could always read it from main memory, but the original thread may never write it there. To fix it, we could either just make stop volatile, or we could use a lock. The volatile solution is simple - just add the keyword volatile to the variable declaration, and you're done. The locking solution requires a bit more effort; I'll make things slightly easier by introducing a property to do the locking. So long as you then only refer to the variable via the property, you don't need to write the lock all over the place. Here's the full code with a property which locks:
using System;
using System.Threading;

public class Test
{
    static bool stop;
    static readonly object stopLock = new object();

    static bool Stop
    {
        get
        {
            lock (stopLock)
            {
                return stop;
            }
        }
        set
        {
            lock (stopLock)
            {
                stop = value;
            }
        }
    }

    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        // Let the thread start running
        Thread.Sleep(2000);
        // Now tell it to stop counting
        Stop = true;
    }

    static void ThreadJob()
    {
        int count = 0;
        while (!Stop)
        {
            Console.WriteLine("Extra thread: count {0}", count);
            Thread.Sleep(100);
            count++;
        }
    }
}
Unfortunately there's no way of getting the compiler to complain if you access stop directly, so you do need to be careful to always use the property.

As of .NET 1.1, there is another way of achieving a memory barrier: Thread.MemoryBarrier(). In future versions there may well be separate method calls for "write" memory barriers and "read" memory barriers. I would advise steering well clear of these unless you're an expert - even the experts seem to argue amongst themselves about what's needed when. (Read the links in the resource section for more information.)
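For completeness only - and bearing the warning above firmly in mind - here's a sketch of what the original sample looks like fixed with explicit full fences rather than volatile or locking. In real code I'd still use the lock-based version:

```csharp
using System;
using System.Threading;

public class Test
{
    static bool stop;

    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        Thread.Sleep(2000);
        stop = true;
        // Full fence: the write above can't be moved past this point,
        // so it must reach main memory.
        Thread.MemoryBarrier();
    }

    static void ThreadJob()
    {
        int count = 0;
        while (true)
        {
            // Full fence: the read of stop below can't be satisfied
            // from a value cached before this point.
            Thread.MemoryBarrier();
            if (stop)
            {
                break;
            }
            Console.WriteLine("Extra thread: count {0}", count);
            Thread.Sleep(100);
            count++;
        }
    }
}
```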
Just to reiterate: working things out to be as efficient as possible but still absolutely correct is hard. Fortunately, using locks whenever you want to access shared data is relatively easy and correct by the model. Stick to the simple way of doing things and you don't need to worry about all this too much.
This section is here almost as an aside - because if you're writing thread-safe code to start with, atomicity isn't particularly relevant to you. However, it's a good idea to clear up what atomicity is all about, because many people believe it's to do with volatility and the like.
An operation is atomic if it is indivisible - in other words, nothing else can happen in the middle. So, with an atomic write, you can't have another thread reading the value half way through the write, and ending up "seeing" half of the old value and half of the new value. Similarly, with an atomic read, you can't have another thread changing the value half way through the read, ending up (again) with a value which is neither the old value nor the new value.
The CLR guarantees that for types which are no bigger than the size of a native integer, if the memory is properly aligned (as it is by default - if you specify an explicit layout, that could change the alignment), reads and writes are atomic. In other words, if one thread is changing a properly aligned int variable's value from 0 to 5 and another thread is reading the variable's value, it will only ever see 0 or 5 - never 1 or 4, for instance. For a long, however, on a 32-bit machine, if one thread is changing the value from 0 to 0x0123456789abcdef, there's no guarantee that another thread won't see the value as 0x0123456700000000 or 0x0000000089abcdef. You'd have to be unlucky - but writing thread-safe code is all about taking luck out of the equation.
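To make the long case concrete, here's a minimal sketch of a lock-protected 64-bit counter (the class, field and lock names are my own, not from the samples above):

```csharp
public class Counter
{
    static long count;
    static readonly object countLock = new object();

    // All access goes through this property, so a read can never
    // overlap a write - even on a 32-bit machine, no thread can
    // observe a "torn" value that is half old and half new.
    public static long Count
    {
        get { lock (countLock) { return count; } }
        set { lock (countLock) { count = value; } }
    }
}
```

As a bonus, the implicit volatile read and write performed by the lock give you the freshness guarantees discussed earlier, too.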
Fortunately, using the techniques I've already mentioned, you rarely need to worry about atomicity at all. Certainly if you use locking, you don't need to worry as you're already making sure that a read and a write can't overlap. If you use volatile variables there may be a slight chance of problems, as although every type which can be volatile can be atomically written and read, if the alignment of the variable is wrong, you could still get non-atomic reads and writes - the volatility doesn't provide any extra guarantees. Just another reason to use locking :)
The Interlocked class
Just occasionally, locking is a bit too much effort (and possibly too much of a performance hit) for doing very simple operations such as counting. The Interlocked class provides a set of methods for performing atomic changes: exchanges (optionally performing a comparison first), increments and decrements. The Exchange and CompareExchange methods act on variables of type int, object or float; the Increment and Decrement methods act on variables of type int or long.
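As a sketch of the comparing exchange, here's a lock-free "first caller wins" flag (the names are my own; Interlocked.CompareExchange returns the value it found before making any change):

```csharp
using System.Threading;

public class OneShot
{
    static int initialized; // 0 = not yet done, 1 = done

    public static bool TryInitialize()
    {
        // Atomically: if initialized is still 0, set it to 1.
        // The method returns the value that was there beforehand,
        // so exactly one caller sees 0 and "wins" - even if many
        // threads race to call this at the same time.
        return Interlocked.CompareExchange(ref initialized, 1, 0) == 0;
    }
}
```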
Frankly I've never used the class myself in production code - I prefer to take the simple approach of
using one tool (locking) to sort out all my volatility, atomicity and race-avoidance problems. However,
that does come at the cost of a bit of performance. While that's never bothered me overly, if you're
writing code which needs to perform at its absolute fastest, you may want to consider using this class
as a fast way of performing the very specific operations it provides. Here's a sample - the first example
I used to illustrate data races, rewritten to be thread-safe using the Interlocked
class:
using System;
using System.Threading;

public class Test
{
    static int count = 0;

    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        for (int i = 0; i < 5; i++)
        {
            Interlocked.Increment(ref count);
        }

        thread.Join();
        Console.WriteLine("Final count: {0}", count);
    }

    static void ThreadJob()
    {
        for (int i = 0; i < 5; i++)
        {
            Interlocked.Increment(ref count);
        }
    }
}