Understand isolation level

Isolation is the I in ACID, one of the 4 famous properties in database transaction.

Isolation level play a critial role in concurrent scenarios. Higher isolation level offers better gurantee of data consistency, however worse performance in concurrency.


To better understand 4 common isolation levels, let’s see the following examples.

Here we have one user table which has only two fields, id and name. There is one record in the table.

-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

Two transactions(TA, TB) occur concurrently:

Read Uncommitted

Read uncommitted is the lowest isolation level, which may result in dirty data, let’s have a look at this example:

#1 TA:

>> Begin();
>> select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

#2 TB:

>> Begin();
>> update set name="Kim" from user where id=1;

#3 TA:

>> select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Kim  |
-------------

#4 TB:

>> Rollback();

#5 TA:

>> select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

In this example, name=Kim is the dirty data(which has been rollbacked). By right, in other transaction it shouldn’t have the knowledge of this value. But in step 3, TA accidently gets this temporary value.

In order to avoid such case, we can increase isolation level to read committed.

Read Committed

#1 TA:

>> Begin();
>> select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

#2 TB:

>> Begin();
>> update set name="Kim" from user where id=1;

#3 TA:

select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

#4 TB:

>> Commit();

#5 TA:

select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Kim  |
-------------

In this example, TA can only read name=Kim after TB is committed, thus it solves the dirty read issue. Changes which are not committed to database will not be seen by other transaction.

Read committed isolation level has limitation as well:

In the lifetime of TA, it queries id=1 twice, and gets 2 different names(1st: Tom, 2nd: Kim), even if TA doesn’t change the value for id=1, the result is NOT stable every time this query runs.

To mitigate the above case, we can continue to increase isolation level to repeated read.

Details behind the scene, for every query in transaction, DB will create a read view(snapshot), For read committed isolation level, the read view will be released after statement is executed(not transaction committed or rollbacked).

As a result, for query select * from user where id<10 for update, DB won’t add gap lock for id < 10, and won’t prevent the insertion or update for id < 10 in other transaction.

Read committed is the default isolation level for Oracle, SQL Server and most of distributed transaction manager.

Repeatable Read

#1 TA:

>> Begin();
>> select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

#2 TB:

>> Begin();
>> update set name="Kim" from user where id=1;
>> Commit();

#3 TA:

>> select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

In this example, even if TB updates the record and commits to DB, TA still gets the old name Tom in the second query. DB create a read view the first time TA reads such record.

For repeated read, the read view will be released after transaction is committed or rollbacked.

Another case:

#1 TA:

>> Begin();
>> select * from user where id=1 for update;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------
>> update set name="Kim" from user where id=1;

#2 TB:

>> Begin();
>> select * from user where id=1;
-------------
| id | name |
-----+-------
| 1  | Tom  |
-------------

In this case, TB still get the data(Tom) instead of TA(Kim) when TB start.

Repeatable Read is the default isolation level of InnoDB engine in MySQL.

Serializable

This isolation level is to make sure concurrent operation on same records will be handled in sequence.

InnoDB implicitly converts all plain SELECT statements to SELECT … FOR SHARE

Serializable is the highest isolation level with the worst concurrency performance. It’s not widely used in high concurrent system.

comments powered by Disqus