Data races

Well, that's shown you how to create a new thread and start it. For a very few cases, it really is as simple as that - just occasionally, you end up with a thread which doesn't need access to any data other than its own (the counters in this case). Far more commonly, however, you need threads to access the same data, sooner or later - and that's where the problems start. Let's take a very simple program to start with:

using System;
using System.Threading;

public class Test
{
                  static int count=0;
    
                  static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();
        
                  for (int i=0; i < 5; i++)
        {
            count++;
        }
        
        thread.Join();
        Console.WriteLine ("Final count: {0}", count);
    }
    
                  static void ThreadJob()
    {
                  for (int i=0; i < 5; i++)
        {
            count++;
        }
    }
}    

This is very straightforward - each of the threads just increments the count variable, and then the main thread displays the final value of count at the end. The only really new thing here is the call in the main thread to Thread.Join, which basically pauses the main thread until the other thread has completed.

So, the result should always be Final count: 10, right? Well, no. In fact, chances are that that will be the result if you run the above code - but it isn't guaranteed to be. There are two reasons for this - one fairly simple, and one much subtler. We'll leave the subtle one for the moment, and just consider the simple one.

The statement count++; actually does three things: it reads the current value of count, increments that number, and then writes the new value back to the count variable. Now, if one thread gets as far as reading the current value, then the other thread takes over, does the whole increment operation, and then the first thread gets control again, its idea of the value of count is out of date - so it will increment the old value, and write that newly incremented (but wrong) value back into the variable.

The easiest way of showing this is by separating the three operations and introducing some Sleep calls into the code, just to make it more likely that the threads will clash heads, as it were. Note that introducing Sleep calls should never change the correctness of a program, in terms of threading - any thread can go to sleep at any time, basically. In other words, you can never rely on two operations both happening without another thread doing stuff in between. I've also put some diagnostics in to make it clearer what's happening. The "main" thread's activities appear on the left, while the "other" thread's activities are on the right. Here's the code:

using System;
using System.Threading;

public class Test
{
                  static int count=0;
    
                  static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();
        
                  for (int i=0; i < 5; i++)
        {
                  int tmp = count;
            Console.WriteLine ("Read count={0}", tmp);
            Thread.Sleep(50);
            tmp++;
            Console.WriteLine ("Incremented tmp to {0}", tmp);
            Thread.Sleep(20);
            count = tmp;
            Console.WriteLine ("Written count={0}", tmp);
            Thread.Sleep(30);
        }
        
        thread.Join();
        Console.WriteLine ("Final count: {0}", count);
    }
    
                  static void ThreadJob()
    {
                  for (int i=0; i < 5; i++)
        {
                  int tmp = count;
            Console.WriteLine ("\t\t\t\tRead count={0}", tmp);
            Thread.Sleep(20);
            tmp++;
            Console.WriteLine ("\t\t\t\tIncremented tmp to {0}", tmp);
            Thread.Sleep(10);
            count = tmp;
            Console.WriteLine ("\t\t\t\tWritten count={0}", tmp);
            Thread.Sleep(40);
        }
    }
}

... and here's one set of results I saw ...

Read count=0
                                Read count=0
                                Incremented tmp to 1
                                Written count=1
Incremented tmp to 1
Written count=1
                                Read count=1
                                Incremented tmp to 2
Read count=1
                                Written count=2
                                Read count=2
Incremented tmp to 2
                                Incremented tmp to 3
Written count=2
                                Written count=3
Read count=3
                                Read count=3
Incremented tmp to 4
                                Incremented tmp to 4
                                Written count=4
Written count=4
                                Read count=4
Read count=4
                                Incremented tmp to 5
                                Written count=5
Incremented tmp to 5
Written count=5
Read count=5
Incremented tmp to 6
Written count=6
Final count: 6

Just looking at the first few lines shows exactly the nasty behaviour described before the code: the main thread has read the value 0, the other thread has incremented count to 1, and then the main thread has incremented its "stale" value from 0 to 1, and written that value to the variable. The same thing happens a few more times, and the end result is that count is 6, instead of 10.

Exclusive access - Monitor.Enter/Exit and the lock statement

What we need to fix the problem above is to make sure that while one thread is in a read/increment/write operation, no other threads can try to do the same thing. This is where monitors come in. Every object in .NET has a (theoretical) monitor associated with it. A thread can enter (or acquire) a monitor only if no other thread has currently "got" it. Once a thread has acquired a monitor, it can acquire it more times, or exit (or release) it. The monitor is only available to other threads again once it has been exited as many times as it was entered. If a thread tries to acquire a monitor which is owned by another thread, it will block until it is able to acquire it. (There may be more than one thread trying to acquire the monitor, in which case when the current owner thread releases it for the last time, only one of the threads will acquire it - the other one will have to wait for the new owner to release it too.)

In our example, we want exclusive access to the count variable while we're performing the increment operation. First we need to decide on an object to use for locking. I'll discuss this choice in more detail later, but for the moment we'll introduce a new variable just for the purposes of locking: countLock. This is initialised to be a reference a new object, and thereafter is never changed. It's important that it's not changed - otherwise one thread would be locking on one object's monitor, and another object might be locking on a different object's monitor, so they could interfere with each other just like they did before.

We then simply need to put each increment operation in a Monitor.Enter and Monitor.Exit pair:

   
using System;
using System.Threading;

public class Test
{
                  static int count=0;
                  static readonly object countLock = new object();
    
                  static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();
        
                  for (int i=0; i < 5; i++)
        {
            Monitor.Enter(countLock);
                  int tmp = count;
            Console.WriteLine ("Read count={0}", tmp);
            Thread.Sleep(50);
            tmp++;
            Console.WriteLine ("Incremented tmp to {0}", tmp);
            Thread.Sleep(20);
            count = tmp;
            Console.WriteLine ("Written count={0}", tmp);
            Monitor.Exit(countLock);
            Thread.Sleep(30);
        }
        
        thread.Join();
        Console.WriteLine ("Final count: {0}", count);
    }
    
                  static void ThreadJob()
    {
                  for (int i=0; i < 5; i++)
        {
            Monitor.Enter(countLock);
                  int tmp = count;
            Console.WriteLine ("\t\t\t\tRead count={0}", tmp);
            Thread.Sleep(20);
            tmp++;
            Console.WriteLine ("\t\t\t\tIncremented tmp to {0}", tmp);
            Thread.Sleep(10);
            count = tmp;
            Console.WriteLine ("\t\t\t\tWritten count={0}", tmp);
            Monitor.Exit(countLock);
            Thread.Sleep(40);
        }
    }
}

The results look a lot better this time:

Read count=0
Incremented tmp to 1
Written count=1
                                Read count=1
                                Incremented tmp to 2
                                Written count=2
Read count=2
Incremented tmp to 3
Written count=3
                                Read count=3
                                Incremented tmp to 4
                                Written count=4
Read count=4
Incremented tmp to 5
Written count=5
                                Read count=5
                                Incremented tmp to 6
                                Written count=6
Read count=6
Incremented tmp to 7
Written count=7
                                Read count=7
                                Incremented tmp to 8
                                Written count=8
Read count=8
Incremented tmp to 9
Written count=9
                                Read count=9
                                Incremented tmp to 10
                                Written count=10
Final count: 10

The fact that the increments were strictly alternating here is just due to the sleeps - in a more normal system there could be two increments in one thread, then three in another, etc. The important thing is that they would always be thread-safe: each increment would be isolated from each other increment, with only one being processed at a time.

There's a chance - a tiny chance, but a chance nonetheless - that the code above would hang, however. If part of the increment operation (one of the calls to Console.WriteLine, for instance) threw an exception, the thread would still own the monitor, so the other thread would never be able to acquire it and move on. The obvious solution to this (if you're used to exception handling, at least) is to put the call to Monitor.Exit in a finally block, with everything after the call to Monitor.Enter in a try block. Just like the using statement which puts a call to Dispose in a finally block automatically, C# provides the lock statement to call Monitor.Enter and Monitor.Exit with a try/finally block automatically. This makes it much easier to get synchronization right, as you don't end up having to check for "balanced" calls to Enter and Exit everywhere. It also makes sure that you don't try to release a monitor you don't own: in the code we had above, if we changed the value of countLock to be a reference to a different object within the increment operation, we'd have failed to release the monitor we owned, and tried to release a monitor we didn't own - which would (in theory) have caused a SynchronizationLockException. (In fact, the exception wouldn't have been thrown because there's a bug in the framework in version 1.0/1.1, but that's another story.) The lock statement automatically takes a copy of the reference you specify, and calls both Enter and Exit with it. (In the example above, and everywhere else in this article, variables used to hold locks are declared as read-only. I have yet to come across a good reason to change what a particular piece of code locks on.)

So, we can rewrite our previous code into the somewhat clearer and more robust code below:

  
using System;
using System.Threading;

public class Test
{
                  static int count=0;
                  static readonly object countLock = new object();
    
                  static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();
        
                  for (int i=0; i < 5; i++)
        {
                  lock (countLock)
            {
                  int tmp = count;
                Console.WriteLine ("Read count={0}", tmp);
                Thread.Sleep(50);
                tmp++;
                Console.WriteLine ("Incremented tmp to {0}", tmp);
                Thread.Sleep(20);
                count = tmp;
                Console.WriteLine ("Written count={0}", tmp);
            }
            Thread.Sleep(30);
        }
        
        thread.Join();
        Console.WriteLine ("Final count: {0}", count);
    }
    
                  static void ThreadJob()
    {
                  for (int i=0; i < 5; i++)
        {
                  lock (countLock)
            {
                  int tmp = count;
                Console.WriteLine ("\t\t\t\tRead count={0}", tmp);
                Thread.Sleep(20);
                tmp++;
                Console.WriteLine ("\t\t\t\tIncremented tmp to {0}", tmp);
                Thread.Sleep(10);
                count = tmp;
                Console.WriteLine ("\t\t\t\tWritten count={0}", tmp);
            }
            Thread.Sleep(40);
        }
    }
}

Next page: Deadlocks, Waiting and Pulsing
Previous page: Passing Parameters to Threads

Back to the main C# page.