Code through the pain - Ladislav Mrnka's professional blog about software development

29 Sep 2012

Data access code follows simple implementation patterns. One very common pattern consists of fetching data from a database, modifying them and persisting them back to the database in the same transaction. Using this pattern in any concurrent environment where multiple threads (or processes) can work with the same record concurrently brings new challenges. How should our application handle the scenario where another thread updates the record after the first thread fetched it from the database but before the first thread was able to persist its changes back? This area of database access is generally referred to as concurrency. We distinguish between two common solutions: Optimistic concurrency handling and Pessimistic concurrency handling.

Optimistic concurrency uses a special column in the table to track whether the record was changed. The main requirement is that the column changes its value every time the record is updated. Relational databases typically offer a special data type for this purpose which is updated automatically whenever its record is updated. Microsoft SQL Server offers the deprecated TIMESTAMP and the newer, recommended ROWVERSION data type.
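In code first such a row-version column can be mapped with a data annotation. A minimal sketch (the `Order` entity and `RowVersion` property names are made up for this example):

```csharp
using System.ComponentModel.DataAnnotations;

// Illustrative entity. With code first, the [Timestamp] annotation maps
// the byte[] property to a SQL Server ROWVERSION column, and EF then adds
// the column to the WHERE clause of every UPDATE and DELETE it generates
// for this entity.
public class Order
{
    public int Id { get; set; }

    [Timestamp]
    public byte[] RowVersion { get; set; }
}
```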

If we use Optimistic concurrency to solve the initial scenario with two threads, the first thread loads the record together with the current value of the time stamp, and when it tries to persist its change it uses the time stamp's value in the WHERE condition of the UPDATE command together with the record's identification (for example its key). If another thread updates the record in the meantime, the UPDATE from the first thread doesn't find any record matching its WHERE condition. This results in 0 updated records (in Microsoft SQL Server this information is obtained by executing SELECT @@ROWCOUNT after the update command) and the thread has to react to this situation somehow. The reaction follows a common pattern: reload the record to get its current values, apply the changes to the reloaded data and persist them again. A real implementation of this pattern may involve, for example, user interaction where a user must resolve the conflict. Entity framework has direct support for Optimistic concurrency in both the ObjectContext API (including the required mapping in EDMX) and the DbContext API. Code first supports the mapping through both data annotations and the fluent API.
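The reload-and-reapply reaction can be sketched without a database. In the hypothetical in-memory store below, `Version` stands in for the time stamp column and the check inside `TryUpdate` plays the role of the time stamp comparison in the UPDATE's WHERE condition (all names are illustrative, not EF API):

```csharp
// Hypothetical in-memory "row"; Version stands in for the ROWVERSION column.
public static class OptimisticStore
{
    static readonly object Sync = new object();
    public static int Value = 1;
    public static int Version = 1;

    // Snapshot read, like a SELECT returning the value and the time stamp.
    public static (int value, int version) Read()
    {
        lock (Sync) return (Value, Version);
    }

    // Conditional write, like UPDATE ... WHERE Id = @id AND RowVersion = @orig.
    // Returns false ("0 rows affected") when the version no longer matches.
    public static bool TryUpdate(int newValue, int expectedVersion)
    {
        lock (Sync)
        {
            if (Version != expectedVersion) return false;
            Value = newValue;
            Version++;
            return true;
        }
    }

    // The common reaction pattern: on conflict, reload and reapply the change.
    public static void Increment()
    {
        while (true)
        {
            var (value, version) = Read();
            if (TryUpdate(value + 1, version)) return;
        }
    }
}
```

Two threads calling `Increment` concurrently always leave the value at 3: the loser of the race sees a failed conditional update and retries with fresh data.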

Pessimistic concurrency doesn't use any special column; instead it uses the transactional and locking mechanisms offered by the relational database. The whole idea of pessimistic concurrency can be compared to a critical section. Fetching the record, updating it and persisting it must be treated as one atomic operation which can be executed by only a single thread (per record or record set). When the first thread enters this critical section no other thread can modify the record or enter the critical section for the same record. Using this mechanism requires understanding the transaction isolation levels and the locking mechanism of the database in use. The bad news is that Entity framework doesn't have any built-in support for Pessimistic concurrency.
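The critical-section comparison maps directly onto in-process synchronization. As a sketch of the idea only (a `lock` statement standing in for the database's row-level lock, not real data access code), the fetch-modify-persist sequence becomes safe once it is guarded as one unit:

```csharp
using System.Threading;

public static class PessimisticSketch
{
    static readonly object RowLock = new object(); // stands in for a row-level DB lock
    public static int Value = 1;                   // stands in for the stored record

    // Fetch, modify and persist as one atomic unit; no other thread can
    // interleave between the read and the write of the same "record".
    public static void IncrementAtomically()
    {
        lock (RowLock)
        {
            int fetched = Value;  // fetch
            Thread.Sleep(10);     // widen the race window on purpose
            Value = fetched + 1;  // persist
        }
    }
}
```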

This article was inspired by a question asked by Ari on Stack Overflow. The rest of the article shows why the basic Entity framework update patterns do not provide pessimistic concurrency and why the approach described in the mentioned Stack Overflow question doesn't work. It also provides a real solution for pessimistic concurrency which requires the use of native SQL.

Simple scenario to simulate atomic modification

All following examples are based on a simple test scenario where two threads try to fetch the same entity from the database, increment its Value property and persist the entity back to the database. Each run creates a fresh database with a single entity whose Value is initialized to 1. The test prints the current value in the database, executes both threads in parallel and prints the current value again after the modifications. Each thread prints a message when it increments the value, when it saves changes and when it finishes persisting the changes to the database. The expected output of the test application is:

Current value = 1
Incrementing
Saving
Persisted
Incrementing
Saving
Persisted
Current value = 3

A different order of operations, a different value after the modifications, or an exception means that the example doesn't solve the problem and two threads can enter the critical section concurrently.

Shared Code for all examples

All examples use a very basic entity with just an id and a value column. The ultimate goal of all examples is performing an atomic increment of the value:

public class TestEntity {
    public int Id { get; set; }
    public int Value { get; set; }
}

public class Context : DbContext {
    public DbSet<TestEntity> TestEntities { get; set; }
}

All examples also share a simple test console application where only the implementation of RunTest changes:

public class Program {
    public static void Main(string[] args) {
        Initialize(); // Initial set up if the DB doesn't exist
        PrintValue(); // Get value before tests
        // Run test with specified isolation level
        RunTest(IsolationLevel.Serializable);
        PrintValue(); // Get value after tests
    }

    private static void PrintValue() {
        using (var context = new Context()) {
            Console.WriteLine("Current value = {0}",
                context.TestEntities.First().Value);
        }
    }

    private static void Initialize() {
        using (var context = new Context()) {
            if (!context.Database.Exists()) {
                context.Database.Create();
                context.TestEntities.Add(new TestEntity {
                    Value = 1
                });
                context.SaveChanges();
            }
        }
    }

    private static void RunTest(IsolationLevel isolationLevel) {
        // TODO
    }

    // Additional helper methods used in RunTest implementations

    private static void PrintError(Exception e) {
        while (e != null) {
            Console.WriteLine(e.Message);
            e = e.InnerException;
        }
    }

    private static TestEntity GetFirstEntity(Context context) {
        return context.TestEntities.Single(); // DB contains only single record
    }
}

Let's start with the basic approach

The most basic RunTest implementation simply runs two threads and executes the fetching, updating and persistence concurrently. This implementation uses the native transaction handled internally by the SaveChanges call, so it does not use the isolationLevel passed to the RunTest method:

private static void RunTest(IsolationLevel isolationLevel) {
    // Handles are used only to wait for both workers to complete
    var handles = new WaitHandle[] {
        new ManualResetEvent(false),
        new ManualResetEvent(false),
    };

    // Worker's job
    WaitCallback job = state => {
        try {
            using (var context = new Context()) {
                var entity = GetFirstEntity(context);
                Console.WriteLine("Incrementing");
                entity.Value++;
                Console.WriteLine("Saving");
                context.SaveChanges(); // Save changes to DB
            }

            Console.WriteLine("Persisted");
        } catch (Exception e) {
            PrintError(e);
        } finally {
            // Signal to the main thread that the worker has completed
            ((ManualResetEvent)state).Set();
        }
    };

    // Run workers
    ThreadPool.QueueUserWorkItem(job, handles[0]);
    ThreadPool.QueueUserWorkItem(job, handles[1]);

    // Wait for both workers to complete
    WaitHandle.WaitAll(handles);
}

If we run this code we will most probably receive the expected result. Well done! The issue is solved, isn't it? Not exactly.

Force basic approach to fail

The reason why the basic approach may work is that the first worker thread completes before the second thread even starts - it is a race condition. Let's modify the code a little to force some synchronization between the threads. We will use a Barrier for two participants. A Barrier blocks signalling threads until the defined number of participants signal that they have reached the barrier. We will put the barrier signalling after fetching the record from the database. If our implementation really used Pessimistic concurrency, the application should throw an exception, because there should never be two threads inside the atomic operation and the barrier should never get two signalling participants (the second participant should block on fetching, which happens prior to signalling):

private static void RunTest(IsolationLevel isolationLevel) {
    // Handles are used only to wait for both workers to complete
    var handles = new WaitHandle[] {
        new ManualResetEvent(false),
        new ManualResetEvent(false),
    };

    var barrier = new Barrier(2);

    // Worker's job
    WaitCallback job = state => {
        try {
            using (var context = new Context()) {
                var entity = GetFirstEntity(context);
                barrier.SignalAndWait();
                Console.WriteLine("Incrementing");
                entity.Value++;
                Console.WriteLine("Saving");
                context.SaveChanges(); // Save changes to DB
            }

            Console.WriteLine("Persisted");
        } catch (Exception e) {
            PrintError(e);
        } finally {
            // Signal to the main thread that worker has completed
            ((ManualResetEvent)state).Set();
        }
    };

    // Run workers
    ThreadPool.QueueUserWorkItem(job, handles[0]);
    ThreadPool.QueueUserWorkItem(job, handles[1]);

    // Wait for both workers to complete
    WaitHandle.WaitAll(handles);
}

Now if we try to run the test application we will receive output like:

Current value = 1
Incrementing
Saving
Incrementing
Saving
Persisted
Persisted
Current value = 2

or

Current value = 1
Incrementing
Incrementing
Saving
Saving
Persisted
Persisted
Current value = 2

This demonstration proves that the basic approach allows multiple threads to enter the critical section and doesn't implement Pessimistic concurrency.

Why does the basic approach fail?

The basic approach doesn't use Pessimistic concurrency because it doesn't couple the query fetching the data with the related modification command. The modification command is executed in a separate transaction inside SaveChanges, but the query is not part of that transaction. Coupling the query with the modification command is the first (but not the only!) step toward implementing Pessimistic concurrency. To include the fetching in the transaction we must use TransactionScope and execute both the fetching and SaveChanges inside that scope. Before we use TransactionScope, let's recapitulate some transactional basics.

Database transactions and locks

Database transactions are a programming construct for reliable and isolated work with data stored in the database. They are an essential tool for keeping our data consistent. Each transaction must have the properties known as ACID:

  • Atomic - all changes in the transaction must be performed successfully. If any change fails, all changes already performed in the same transaction must be reverted.
  • Consistent - all changes executed in the transaction must pass all constraints and rules defined in the database.
  • Isolated - transactions must not be affected by uncommitted changes made by other concurrent transactions.
  • Durable - successfully executed and committed transactions must be persisted.

The most important property for Pessimistic concurrency is isolation. To achieve isolation the database must somehow handle uncommitted modified or deleted records, and that is exactly what pessimistic concurrency is about. Traditional transaction processing changes records in the table directly (I will mention Snapshot isolation, which takes another approach, at the end of this section) and uses locks to control transactional isolation. There are three types of locks we are interested in:

  • Shared lock - shared locks (or read locks) are used by transactions when querying data. Any number of concurrent transactions can hold shared locks on the same records, because these locks mean that the data are not dirty (not modified by an uncommitted transaction).
  • Update lock - a special read lock used by a transaction when querying data with the intention to change them. An update lock is compatible with shared locks: it can be obtained while another transaction holds a shared lock on the same record, and other transactions can still obtain shared locks while the update lock is held. The difference from a shared lock is that no other transaction can place an update lock on the same record => only one transaction can hold the update lock for a record.
  • Exclusive lock - exclusive locks (or write locks) are used by transactions when modifying data (insert, update and delete operations). Only a single transaction can hold an exclusive lock on a record, and it can obtain it only when no other transaction holds a shared or update lock on the same record => if a transaction holds an exclusive lock on a record, no other well-behaved transaction can read or modify the record until the transaction commits. There is one exception abusing the exclusivity of write locks, known as dirty reads (or uncommitted reads): another transaction or query can run in a mode where it doesn't use locks and reads uncommitted data. This abuses the isolation property and should be used only in very special cases where uncommitted data cannot cause inconsistency or other problems.
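The compatibility rules of the update lock have an in-process counterpart in .NET's `ReaderWriterLockSlim`: its upgradeable read lock can be held by only one thread at a time, coexists with plain read locks, and can later be upgraded to a write lock. A sketch of the analogy (not database code; the method name is made up for this example):

```csharp
using System;
using System.Threading;

public static class UpdateLockAnalogy
{
    static readonly ReaderWriterLockSlim Lock = new ReaderWriterLockSlim();

    // Returns true if a second "update lock" could be taken while the first
    // one is held - with real update locks this must never happen.
    public static bool SecondUpdateLockSucceeds()
    {
        Lock.EnterUpgradeableReadLock(); // first transaction's "update lock"
        try
        {
            bool acquired = false;
            var second = new Thread(() =>
            {
                // Second transaction tries to take an "update lock" too;
                // give up after a short timeout instead of blocking forever.
                acquired = Lock.TryEnterUpgradeableReadLock(TimeSpan.FromMilliseconds(100));
                if (acquired) Lock.ExitUpgradeableReadLock();
            });
            second.Start();
            second.Join();
            return acquired;
        }
        finally
        {
            Lock.ExitUpgradeableReadLock();
        }
    }
}
```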

The usage of locks is normally controlled by setting the isolation level of the transaction. We have four common isolation levels for traditional transaction processing (processing based on locking records in the table). All isolation levels use exclusive locks for data modifications, with all their restrictions, to comply with the consistency requirement. The differences among the isolation levels are:

  • ReadUncommitted - the least restrictive level, where queries do not use any locks and can therefore read uncommitted changes (dirty reads) made by other running transactions.
  • ReadCommitted - the default isolation level of MS SQL Server. Queries place shared locks on processed records but release them immediately when the reading operation ends => shared locks are not held until the transaction completes.
  • RepeatableRead - a more restrictive level where shared locks issued by executed queries are held until the transaction completes. Holding shared locks on read records ensures that if the transaction executes a query again, it receives all records read by the first execution with the same data, because no other transaction can obtain an exclusive lock to update or delete them. A repeated query execution can still receive new records committed by other concurrent transactions.
  • Serializable - the most restrictive level, where shared locks and additional range locks on the keys of read records are held until the transaction completes. Range locks ensure that no other transaction can insert a record with a key falling into the locked range, so any subsequent execution of the same query within the serializable transaction always returns exactly the same data: no other transaction can obtain an exclusive lock to update or delete the data, or insert a record which would fall into the result set of the already executed query.

I didn't mention the fifth isolation level provided by MS SQL Server because it doesn't relate to pessimistic concurrency and the traditional locking implementation. The Snapshot isolation level does not modify data directly in the table while the transaction executes. Instead it keeps read and modified records separately in tempdb and uses timestamps to track the original versions of records. Because the data are not modified in the main table, the database doesn't need any exclusive locks to block reading of uncommitted records - this can improve the performance of reading operations, but it consumes more resources (CPU, memory and disk). The timestamps obtained when a modification begins are used later, when the transaction commits, to check whether the records were already modified by other transactions. The Snapshot isolation level is the database's internal implementation of optimistic concurrency. I didn't find any information about it, but I believe the snapshot isolation level still uses traditional locking when the transaction commits and the modified data have to be moved from tempdb to the real tables; in contrast to the traditional isolation levels, these exclusive locks would be held for a very short period of time.

Note: The Snapshot isolation level must be enabled on the database before we can use it for our transactions.

Let's use TransactionScope

This demonstration uses TransactionScope to implement the atomic modification. TransactionScope by default uses the most restrictive Serializable isolation level. The RunTest method is implemented to take the isolation level as a parameter and pass it to the TransactionScope. This allows us to easily test the scenario with different isolation levels. The initial implementation doesn't use the Barrier:

private static void RunTest(IsolationLevel isolationLevel) {
    // Handles are used only to wait for both workers to complete
    var handles = new WaitHandle[] {
        new ManualResetEvent(false),
        new ManualResetEvent(false),
    };

    var options = new TransactionOptions {
        IsolationLevel = isolationLevel,
        Timeout = new TimeSpan(0,0,0,10)
    };

    // Worker's job
    WaitCallback job = state => {
        try {
            using (var scope =
                new TransactionScope(TransactionScopeOption.RequiresNew, options)) {
                using (var context = new Context()) {
                    var entity = GetFirstEntity(context);
                    Console.WriteLine("Incrementing");
                    entity.Value++;
                    Console.WriteLine("Saving");
                    context.SaveChanges(); // Save changes to DB
                }

                scope.Complete(); // Commit transaction
                Console.WriteLine("Persisted");
            }
        } catch (Exception e) {
            PrintError(e);
        } finally {
            // Signal to the main thread that worker has completed
            ((ManualResetEvent)state).Set();
        }
    };

    // Run workers
    ThreadPool.QueueUserWorkItem(job, handles[0]);
    ThreadPool.QueueUserWorkItem(job, handles[1]);

    // Wait for both workers to complete
    WaitHandle.WaitAll(handles);
}

Executing this code with any isolation level will most probably again succeed and produce the expected output - but is it really working? Unfortunately no, because our test may again be affected by the execution speed of the first worker thread versus the start of the second one. To simulate real concurrent execution we will again use the Barrier for synchronization, but this time I will omit the full listing. Just add the Barrier initialization before the worker's job (as in the previous example) and the barrier signalling right after the GetFirstEntity call. The output of the application now differs based on the used isolation level.

Why does execution with the ReadCommitted isolation level fail?

The ReadCommitted isolation level produces the same outputs as our basic approach.

Current value = 1
Incrementing
Saving
Incrementing
Saving
Persisted
Persisted
Current value = 2

It again allows multiple threads to concurrently read and update the value with a non-deterministic result. The reason is the behavior of this isolation level: shared locks are not held until the transaction completes. Both threads can read the record concurrently and then save their changes.

Why does execution with the RepeatableRead and Serializable levels fail?

These two isolation levels behave in the same way. Only a single thread successfully saves its changes; the second one ends with an exception:

Current value = 1
Incrementing
Saving
Incrementing
Saving
Persisted
An error occurred while updating the entries. See the inner exception for details.
An error occurred while updating the entries. See the inner exception for details.
Transaction (Process ID 52) was deadlocked on lock resources with another process
and has been chosen as the deadlock victim. Rerun the transaction.
Current value = 2

The reason for the exception is the behavior of these isolation levels: shared locks are held until the transaction completes. Because our Barrier synchronizes the execution after fetching the data, both transactions have already placed a shared lock on the same database record. When they proceed to saving their changes, they both need to upgrade the lock to an exclusive one, but they cannot, because the other transaction is still holding its shared lock. SQL Server detects the deadlock after a while and fails one transaction to resolve it. Failing one transaction releases its shared lock, and the second transaction can then upgrade its shared lock to an exclusive lock and proceed.

In some cases this is considered a "sufficient" solution because we know that a transaction will never silently overwrite concurrent changes made by another transaction. Sometimes this is "resolved" automatically by executing the transaction with a retry mechanism which runs the whole transaction again if it fails due to a deadlock - but there is a better solution.
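Such a retry mechanism is typically a loop that re-runs the whole transaction when a deadlock-victim error is detected (SQL Server reports it as error number 1205 in the SqlException). A minimal sketch, with the deadlock simulated by a delegate that fails on its first attempts (the helper name and the use of TimeoutException as the simulated failure are illustrative):

```csharp
using System;

public static class DeadlockRetry
{
    // Re-run the transaction delegate up to maxAttempts times when it fails
    // with a (simulated) deadlock; anything else, or the final failure,
    // propagates to the caller.
    public static int Run(Func<int> transaction, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return transaction();
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // In real code: catch SqlException and check Number == 1205
                // (deadlock victim) before deciding to retry.
            }
        }
    }
}
```

A transaction failing twice with the simulated deadlock succeeds on the third attempt.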

How to make it work?

You may now ask: what is the trick to make this work? When describing database locks I mentioned three types, but until now our examples used only shared locks to read data and exclusive locks to modify data. The solution to the problem is to use update locks for reading instead of shared locks. Queries in a transaction do not use update locks automatically (because of their more restrictive behavior). If we want an update lock we must explicitly use the UPDLOCK table hint in the query. And here comes the biggest issue: using a table hint requires interacting with SQL directly, because Entity framework doesn't support table hints. We cannot write a Linq-to-entities query which uses an update lock instead of a shared lock.

Note: There is actually an option to use UPDLOCK with Linq-to-entities. At the time of writing this article it is only possible when using .NET 4.5 and the database first approach (EDMX), because it depends on mapping a table valued function which executes the query with the table hint. I plan to write a separate article about table valued functions because they bring a very powerful tool set to Entity framework.

To demonstrate the real atomic modification we will again use the example with TransactionScope, starting from the version using the Barrier to synchronize both threads inside the critical section. Instead of modifying the transaction itself we will modify the GetFirstEntity method to use native SQL for the record retrieval:

private static TestEntity GetFirstEntity(Context context) {
    return context.TestEntities
                  .SqlQuery("SELECT TOP 1 Id, Value FROM TestEntities WITH (UPDLOCK)")
                  .Single();
}

When executing our test application with the ReadCommitted, RepeatableRead or Serializable transaction isolation level we get the following output:

Current value = 1
Incrementing
Saving
Incrementing
Saving
The underlying provider failed on Open.
The operation is not valid for the state of the transaction.
Transaction Timeout
The underlying provider failed on Open.
The operation is not valid for the state of the transaction.
Transaction Timeout
Current value = 1

This output needs a little deeper discussion because it still looks like concurrent execution inside our critical section. When we execute our two worker threads, one of them fetches the data from the database and acquires an update lock on the record. It proceeds to the barrier and waits for the second thread to reach the barrier as well, but the second thread cannot reach it. The second thread is blocked on fetching the data because it needs to acquire an update lock on the record already locked by the first thread - there cannot be two update locks on the same record. At this point we have another deadlock, but this time the deadlock is not in our database but in our application.

The deadlock is resolved automatically after a while, when the database transaction used by the first thread times out. This releases the update lock and allows the second thread to fetch the data and proceed to the barrier. After both threads signal on the barrier they continue execution until saving changes. The first thread fails when calling SaveChanges because its transaction has already timed out (that timeout resolved our deadlock - without it our application would never proceed). The output also shows the timeout of the second transaction, because the transaction was running the whole time the thread was blocked on fetching the record. It timed out almost immediately after the first transaction, but still late enough to read the record from the database - the output of the second thread is a race condition and can differ per run. In theory the second transaction can even complete successfully.

If we remove the barrier from our test code and execute it with any transaction isolation level, we have a working solution for our atomic modification:

Current value = 1
Incrementing
Saving
Persisted
Incrementing
Saving
Persisted
Current value = 3

Manual control of locking by table hints overrides the defaults defined by the transaction isolation level, so this solution works even for the ReadUncommitted isolation level. The correct behavior of Pessimistic concurrency depends on the UPDLOCK hint being used correctly in the queries. If we forget to use update locks in some transaction, the Pessimistic concurrency will not work!

Conclusion

Pessimistic concurrency is not directly supported by Entity framework, but in case of .NET 4.0 we can use EF's helper methods like SqlQuery (DbContext API) or ExecuteStoreQuery (ObjectContext API) for executing native SQL with the UPDLOCK table hint. That allows us to handle Pessimistic concurrency manually inside a transaction, but we lose Linq-to-entities support for these queries. There is already a suggestion on Data UserVoice requesting support for locking hints. Another option would be improving the SqlQuery method to return IQueryable instead of DbSqlQuery.

.NET 4.5 offers better UPDLOCK integration, which is still dependent on native SQL provided in a table valued function - I plan to write a separate article about table valued functions and some interesting scenarios for their usage.

Posted on September 29, 2012 by Ladislav Mrnka
Filed under: Entity framework