Volatility, Atomicity and Interlocking

Earlier, when introducing the topic of data races, I mentioned that there was a more subtle reason why the first attempt at the code wasn't thread-safe. It's to do with volatility and data caching. Here's a sample which will make explaining the topic somewhat easier:

using System;
using System.Threading;

public class Test
{
    static bool stop;
    
    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        // Let the thread start running
        Thread.Sleep(2000);
        
        // Now tell it to stop counting
        stop = true;
    }
    
    static void ThreadJob()
    {
        int count = 0;
        while (!stop)
        {
            Console.WriteLine ("Extra thread: count {0}", count);
            Thread.Sleep(100);
            count++;
        }
    }
}

Now, this code is fairly simple. We have a boolean variable (stop) which is polled by the new thread - it will keep counting until it notices that stop is true. In the main thread, we pause for a couple of seconds and then set stop to true.

So, the new thread should count for a couple of seconds and then stop, right? Well, in fact that's what will almost certainly happen if you run the code, but it's not guaranteed. The while loop in the new thread could keep running forever, never really checking whether or not the stop variable has been set to true. If that sounds bizarre to you, welcome to the weird and wonderful world of memory models.

Memory in modern computers is a very complicated business, with registers, multiple levels of cache, and multiple processors sharing main memory but possibly not caches. The idea that there's just a single chunk of memory which is accessed in a simple way is very handy for programmers, but lousy for performance. In addition, if a processor knows it might have to read a bit of memory "soon", it could decide to read it early. Hardware manufacturers and compiler writers (including JIT-compiler writers) have worked very hard to make fast code easy to write.

The memory model of a platform is the specification of what developers can do safely without knowing too much about the details of the hardware the platform is running on. This means (in our case) that you can run .NET code on any CPU which has a CLR, and so long as you follow the rules of the memory model, you should be okay - however "strong" or "weak" the memory model of the hardware itself is. (A "strong" memory model is one which guarantees a lot; a "weak" model is one which doesn't guarantee much at all, often giving better performance but requiring more work on the part of the developer. x86 processors have a stronger memory model than the CLR itself, which is one reason problems such as seeing stale data are relatively hard to demonstrate.)

The memory model in .NET talks about when reads and writes "actually" happen compared with when they occur in the program's instruction sequence. Reads and writes can be reordered in any way which doesn't violate the rules given by the memory model. As well as "normal" reads and writes there are volatile reads and writes. Every read which occurs after a volatile read in the instruction sequence occurs after the volatile read in the memory model too - they can't be reordered to before the volatile read. A volatile write goes the other way round - every write which occurs before a volatile write in the instruction sequence occurs before the volatile write in the memory model too.

Don't worry if the above doesn't make much sense - the resources section at the end of this page contains a few links which should help if you want to understand it thoroughly. The practical rule, however, is pretty simple: when you have access to shared data, you need to make sure you read fresh data and write any changes back in a timely manner. There are two ways of doing this - volatile variables, and using lock again.

A variable which is declared volatile uses volatile reads and writes for all its accesses. You can only declare a variable to be volatile if it's one of the following types: a reference type, byte, sbyte, short, ushort, int, uint, char, float, or bool, or an enumeration with a base type of byte, sbyte, short, ushort, int, or uint. If you're only interested in sharing a single piece of data, and it's one of the above types, then using a volatile variable is probably the easiest way to go. Note, however, that for a reference type, only the access to the variable itself is volatile - if you write to something within the instance the reference refers to, that write won't be volatile. Personally I don't use volatile variables much, preferring the other approach: locking.
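Applied to the sample above, the volatile fix really is just one extra keyword on the field declaration:

```csharp
using System;
using System.Threading;

public class Test
{
    // volatile makes every access to this field a volatile read or write,
    // so the worker thread is guaranteed to see the updated value.
    static volatile bool stop;

    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        Thread.Sleep(2000);
        stop = true;   // volatile write - published to other threads
    }

    static void ThreadJob()
    {
        int count = 0;
        while (!stop)  // volatile read - fetches a fresh value each time
        {
            Console.WriteLine("Extra thread: count {0}", count);
            Thread.Sleep(100);
            count++;
        }
    }
}
```

With the field volatile, the loop is now guaranteed to notice the change and terminate shortly after the two-second sleep.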

We've already seen how locking is used to limit access to a single thread at a time. It also has another side effect: a call to Monitor.Enter performs an implicit volatile read, and a call to Monitor.Exit performs an implicit volatile write. The two effects combine nicely: if you're reading, you perform a volatile read, so you know that your next read will be from main memory - and because you're then in a lock, you know that nothing else will be trying to change the value. Similarly, if you're writing, you know that nothing else will be trying to read the value between you writing it and the volatile write, so nothing will see an old value - assuming all access to the variable is covered with the same lock, of course. If you lock using one monitor for some access to a variable, and another monitor for other access to the same variable, the volatility and the locking won't mesh quite as nicely, and you won't get as strong a guarantee of freshness of data. Fortunately, there's very little reason why you'd even want to try this.
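For what it's worth, the C# lock statement is just shorthand for a Monitor.Enter/Monitor.Exit pair wrapped in try/finally, which is where those implicit volatile operations come from:

```csharp
using System.Threading;

public class LockExpansion
{
    static readonly object stateLock = new object();
    static int sharedValue;

    static void Main()
    {
        // This lock statement...
        lock (stateLock)
        {
            sharedValue++;
        }

        // ...is equivalent to the following expansion:
        Monitor.Enter(stateLock);    // implicit volatile read
        try
        {
            sharedValue++;
        }
        finally
        {
            Monitor.Exit(stateLock); // implicit volatile write
        }
    }
}
```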

So, to get back to our sample program: it's currently flawed because the new thread could read the value of stop once (perhaps into a register) and then never bother reading it from main memory. Alternatively, it could always read it from main memory, but the original thread may never write it there. To fix it, we could either just make stop volatile, or we could use a lock. The volatile solution is simple - just add the keyword volatile to the variable declaration, and you're done. The locking solution requires a bit more effort, and I'd make things slightly easier by introducing a property to do the locking. So long as you then only refer to the variable via the property, you don't need to write the lock all over the place. Here's the full code with a property which locks:

using System;
using System.Threading;

public class Test
{
    static bool stop;
    static readonly object stopLock = new object();

    static bool Stop
    {
        get
        {
            lock (stopLock)
            {
                return stop;
            }
        }
        set
        {
            lock (stopLock)
            {
                stop = value;
            }
        }
    }
    
    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        // Let the thread start running
        Thread.Sleep(2000);
        
        // Now tell it to stop counting
        Stop = true;
    }
    
    static void ThreadJob()
    {
        int count = 0;
        while (!Stop)
        {
            Console.WriteLine ("Extra thread: count {0}", count);
            Thread.Sleep(100);
            count++;
        }
    }
}

Unfortunately there's no way of getting the compiler to complain if you access stop directly, so you do need to be careful to always use the property.

As of .NET 1.1, there is another way of achieving a memory barrier: Thread.MemoryBarrier(). In future versions there may well be separate method calls for "write" memory barriers and "read" memory barriers. I would advise steering well clear of these unless you're an expert - even the experts seem to argue amongst themselves about what's needed when. (Read the links in the resource section for more information.)
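For completeness, here's a sketch of how the stop flag could be handled with explicit barriers - but as I say, this is expert territory, and whether these particular barriers are exactly the right ones is just the sort of thing the experts argue about:

```csharp
using System;
using System.Threading;

public class BarrierSketch
{
    static bool stop;

    // Called by the controlling thread.
    static void RequestStop()
    {
        stop = true;
        Thread.MemoryBarrier(); // flush the write so other processors can see it
    }

    // Polled by the worker thread.
    static bool ShouldStop()
    {
        Thread.MemoryBarrier(); // don't reuse a stale cached value of stop
        return stop;
    }

    static void Main()
    {
        RequestStop();
        Console.WriteLine(ShouldStop()); // True
    }
}
```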

Just to reiterate: working things out to be as efficient as possible but still absolutely correct is hard. Fortunately, using locks whenever you want to access shared data is relatively easy and correct by the model. Stick to the simple way of doing things and you don't need to worry about all this too much.

Atomicity

This section is here almost as an aside - because if you're writing thread-safe code to start with, atomicity isn't particularly relevant to you. However, it's a good idea to clear up what atomicity is all about, because many people believe it's to do with volatility and the like.

An operation is atomic if it is indivisible - in other words, nothing else can happen in the middle. So, with an atomic write, you can't have another thread reading the value half way through the write, and ending up "seeing" half of the old value and half of the new value. Similarly, with an atomic read, you can't have another thread changing the value half way through the read, ending up (again) with a value which is neither the old value nor the new value.

The CLR guarantees that for types which are no bigger than the size of a native integer, if the memory is properly aligned (as it is by default - if you specify an explicit layout, that could change the alignment), reads and writes are atomic. In other words, if one thread is changing a properly aligned int variable's value from 0 to 5 and another thread is reading the variable's value, it will only ever see 0 or 5 - never 1 or 4, for instance. For a long, however, on a 32-bit machine, if one thread is changing the value from 0 to 0x0123456789abcdef, there's no guarantee that another thread won't see the value as 0x0123456700000000 or 0x0000000089abcdef. You'd have to be unlucky - but writing thread-safe code is all about taking luck out of the equation.
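If you do need to share a long without locking, the Interlocked class (covered in the next section) can help: Interlocked.Read performs an atomic 64-bit read even on a 32-bit machine, and Increment has a long overload. A sketch:

```csharp
using System;
using System.Threading;

public class LongCounter
{
    static long total;

    static void Main()
    {
        // Increment has a long overload, so the add itself is atomic...
        Interlocked.Increment(ref total);

        // ...and Interlocked.Read gives an atomic 64-bit read, so even on a
        // 32-bit machine you can never observe a half-written ("torn") value.
        long snapshot = Interlocked.Read(ref total);
        Console.WriteLine(snapshot); // 1
    }
}
```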

Fortunately, using the techniques I've already mentioned, you rarely need to worry about atomicity at all. If you use locking, you certainly don't need to worry, as you're already making sure that a read and a write can't overlap. With volatile variables there's a slight chance of problems: although every type which can be volatile can be read and written atomically, that only holds if the variable is properly aligned - volatility itself doesn't provide any extra atomicity guarantees. Just another reason to use locking :)

A shortcut for some cases: the Interlocked class

Just occasionally, locking is a bit too much effort (and possibly too much of a performance hit) for doing very simple operations such as counting. The Interlocked class provides a set of methods for performing atomic changes: exchanges (optionally performing a comparison first), increments and decrements. The Exchange and CompareExchange methods act on variables of type int, object or float; the Increment and Decrement methods act on variables of type int or long.

Frankly I've never used the class myself in production code - I prefer to take the simple approach of using one tool (locking) to sort out all my volatility, atomicity and race-avoidance problems. However, that does come at the cost of a bit of performance. While that's never bothered me overly, if you're writing code which needs to perform at its absolute fastest, you may want to consider using this class as a fast way of performing the very specific operations it provides. Here's a sample - the first example I used to illustrate data races, rewritten to be thread-safe using the Interlocked class:

using System;
using System.Threading;

public class Test
{
    static int count = 0;
    
    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();
        
        for (int i = 0; i < 5; i++)
        {
            Interlocked.Increment(ref count);
        }
        
        thread.Join();
        Console.WriteLine ("Final count: {0}", count);
    }
    
    static void ThreadJob()
    {
        for (int i = 0; i < 5; i++)
        {
            Interlocked.Increment(ref count);
        }
    }
}
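The counting sample only shows Increment; here's a quick sketch of Exchange and CompareExchange as well:

```csharp
using System;
using System.Threading;

public class ExchangeDemo
{
    static void Main()
    {
        int value = 5;

        // Exchange stores a new value and returns the old one, atomically.
        int old = Interlocked.Exchange(ref value, 10);
        Console.WriteLine("old={0} value={1}", old, value); // old=5 value=10

        // CompareExchange only stores the new value if the current value
        // matches the comparand; either way, it returns what it saw.
        int seen = Interlocked.CompareExchange(ref value, 20, 10);
        Console.WriteLine("seen={0} value={1}", seen, value); // seen=10 value=20

        // The comparand doesn't match this time, so nothing is stored.
        seen = Interlocked.CompareExchange(ref value, 99, 10);
        Console.WriteLine("seen={0} value={1}", seen, value); // seen=20 value=20
    }
}
```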
