Concurrency




Section 0: Module Objectives or Competencies
Course Objective or Competency: The student will be introduced to some of the theory behind the development of database engines, and will understand the complexity and challenges of developing multiuser databases. The student will become familiar with the concept of concurrent transactions in a multiuser database.

Module Objectives or Competencies:

  • The student will be able to explain the need for concurrency control, stemming from the fact that the simultaneous execution of transactions over a shared database can lead to several data-integrity and consistency problems, including lost updates, uncommitted data, and inconsistent retrievals.
  • The student will learn what situations can result in a conflict between two database transactions.
  • The student will learn that a multiuser database relies on a scheduler to establish the order in which the operations within concurrent transactions are executed.
  • The student will learn about locks, lock granularity, locking methods, lock types, and time stamps, and that a multiuser database incorporates a lock manager to deal with locking conflicts and deadlocks.
  • The student will learn that some databases are accessed primarily for reading, and are therefore unlikely to have conflicting database operations.
  • The student will be introduced to database recovery management.
Section 1: Overview

Concurrency control refers to the coordination of the simultaneous execution of transactions in a multiuser database system.

Objective: to ensure the serializability of transactions in a multiuser database environment.

Simultaneous execution of transactions over a shared database can lead to several data-integrity and consistency problems: lost updates, uncommitted data, and inconsistent retrievals. Each is discussed in the sections that follow.

Section 2: Lost Updates

Lost updates occur when two concurrent transactions update the same data item and one of the updates is lost, overwritten by the other transaction.

example:

Note that the first transaction has not been committed when the second begins.
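
A minimal sketch of the interleaving (hypothetical account balance of 100; T1 adds 50 and T2 subtracts 30):

    # Lost update: two interleaved transactions read the same value,
    # then each writes back its own result; the last write wins and
    # the other update disappears.
    balance = 100                # shared data item

    t1_read = balance            # T1 reads 100
    t2_read = balance            # T2 reads 100 (T1 has not committed)

    balance = t1_read + 50       # T1 writes 150
    balance = t2_read - 30       # T2 writes 70, overwriting T1's update

    print(balance)               # 70 -- T1's +50 update is lost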

Section 3: Uncommitted Data

The uncommitted data problem occurs when two transactions T1 and T2 are executed concurrently and the first transaction, T1, is rolled back after the second transaction, T2, has already accessed the uncommitted data, violating the isolation property of transactions.

example:
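
A minimal sketch of the anomaly, often called a dirty read (hypothetical account balance; the rollback is simulated by restoring the committed value):

    # Uncommitted data: T2 reads a value written by T1, then T1 rolls
    # back, leaving T2's computation based on data that never
    # officially existed.
    committed = 100                    # last committed balance

    t1_working = committed + 50        # T1 writes 150 (not yet committed)
    t2_read = t1_working               # T2 reads the uncommitted 150

    t1_working = committed             # T1 fails and rolls back to 100

    print(t2_read)                     # 150 -- a value that was never committed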

Section 4: Inconsistent Retrievals

Inconsistent retrievals occur when a transaction calculates some summary, or aggregate, functions over a set of data while other transactions are updating the data.

example:
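
A minimal sketch (hypothetical accounts; T2 transfers 10 from A to C while T1 sums all balances, so the correct total is 600 both before and after the transfer):

    # Inconsistent retrieval: T1 reads A before T2's debit but C after
    # T2's credit, so the computed aggregate matches no real state.
    accounts = {"A": 100, "B": 200, "C": 300}   # consistent total is 600

    total = accounts["A"]        # T1 reads A = 100 (before T2's debit)
    accounts["A"] -= 10          # T2 debits A by 10 ...
    accounts["C"] += 10          # ... credits C, and commits
    total += accounts["B"]       # T1 reads B = 200
    total += accounts["C"]       # T1 reads C = 310 (after T2's credit)

    print(total)                 # 610 -- a total that never existed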

Section 5: The Scheduler

The scheduler is a special DBMS component that establishes the order in which the operations within concurrent transactions are executed. The scheduler interleaves the execution of database operations to ensure serializability, basing its decisions on a concurrency control method such as locking or time stamping.

Section 6: Locking Methods

A lock guarantees exclusive use of a data item to a current transaction.

example: Transaction T1 will be unable to access a data item that is currently being used by (and is therefore locked by) T2.

A transaction acquires a lock prior to data access; the lock is released when the transaction is completed.


Lock manager – initiates and enforces locking procedures; built into most multiuser DBMSes.

Lock Granularity – refers to the level or scope of the lock: database-level, table-level, page-level, row-level, or field-level.

Section 7: Lock Types

Regardless of the object (database, table, page, row, field) being locked, a DBMS may use different locking techniques, such as binary and shared/exclusive.

Binary locks:

  • A binary lock has only two states: locked (1) or unlocked (0). If an object is locked by a transaction, no other transaction can use that object until it is unlocked. Binary locks are simple to implement but are considered too restrictive, because they do not even permit two transactions to read the same object concurrently.

Shared/Exclusive locks:

  • A shared (read) lock is issued when a transaction wants to read a data item and no exclusive lock is held on it; multiple transactions can hold shared locks on the same item concurrently.
  • An exclusive (write) lock is issued when a transaction wants to update a data item and no other lock of any kind is held on it; only one transaction at a time can hold an exclusive lock on an item.
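
A minimal sketch of this compatibility rule (hypothetical in-memory lock table; a real lock manager also queues and blocks waiting transactions):

    # Shared/exclusive compatibility: many readers OR one writer.
    locks = {}   # data item -> ("S", {transactions}) or ("X", transaction)

    def acquire(item, txn, mode):
        """Grant a lock if compatible; return True on success."""
        held = locks.get(item)
        if held is None:                       # no lock held: grant either mode
            locks[item] = ("S", {txn}) if mode == "S" else ("X", txn)
            return True
        if held[0] == "S" and mode == "S":     # shared is compatible with shared
            held[1].add(txn)
            return True
        return False                           # any other combination conflicts

    def release(item, txn):
        held = locks.get(item)
        if held is None:
            return
        if held[0] == "S":                     # remove one reader; drop the
            held[1].discard(txn)               # lock when the last one leaves
            if not held[1]:
                del locks[item]
        elif held[1] == txn:                   # exclusive: only the holder releases
            del locks[item]

    print(acquire("row42", "T1", "S"))   # True  -- first reader
    print(acquire("row42", "T2", "S"))   # True  -- readers share
    print(acquire("row42", "T3", "X"))   # False -- writer must wait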

Section 8: Two-Phase Locking

Two-phase locking defines how locks are acquired and relinquished.

Two-phase locking guarantees serializability, but does not prevent deadlocks.

The two phases are:

  1. Growing phase:
    • In this phase a transaction acquires all the required locks without unlocking any data. Once all locks are acquired, the transaction is in its locked point.
  2. Shrinking phase:
    • In this phase a transaction releases all locks and cannot obtain a new lock.

Two-phase locking is subject to the following rules:

  1. Two transactions cannot have conflicting locks.
  2. No unlock operation can precede a lock operation in the same transaction.
  3. No data are affected until all locks are obtained, i.e., the transaction is in its locked point.
Figure: Two-phase locking protocol.
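
A minimal sketch of the rule the protocol enforces (hypothetical transaction object; in a real DBMS the lock manager enforces this):

    # Two-phase locking: every lock acquisition must precede the first
    # release within a transaction.
    class TwoPhaseTxn:
        def __init__(self):
            self.held = set()
            self.shrinking = False           # flips at the first unlock

        def lock(self, item):
            if self.shrinking:
                raise RuntimeError("2PL violation: lock after unlock")
            self.held.add(item)              # growing phase

        def unlock(self, item):
            self.shrinking = True            # shrinking phase begins
            self.held.discard(item)

    t = TwoPhaseTxn()
    t.lock("X"); t.lock("Y")    # growing phase; locked point at the last lock
    # ... read and write X and Y here ...
    t.unlock("X")               # shrinking phase: no new locks allowed
    t.unlock("Y")               # t.lock("Z") here would raise an error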
Section 9: Deadlocks

Deadlocks exist when two transactions T1 and T2 exist in the following mode: T1 holds a lock on data item Y and requests a lock on X, while T2 holds a lock on X and requests a lock on Y.

Figure: Deadlocked transactions.

If T1 has not unlocked item Y, T2 cannot continue; and if T2 has not unlocked item X, T1 cannot continue. Consequently, T1 and T2 wait indefinitely, each waiting for the other to unlock the required data item.

Figure: Deadlock condition.

In a real-world DBMS several transactions may be executed concurrently, thereby increasing the probability of generating deadlocks.

Deadlocks are possible only if one of the transactions wants to obtain an exclusive lock on a data item; no deadlock condition can exist among transactions that hold only shared locks.


Three basic techniques exist to control deadlocks:

  1. Deadlock prevention:
    • A transaction requesting a new lock is aborted if there is the possibility that a deadlock may occur.
    • If the transaction is aborted, all the changes made by the transaction are rolled back, and all locks obtained by the transaction are released.
    • The transaction is then rescheduled for execution.
    • Deadlock prevention works because it never allows the conditions that lead to deadlocks to arise in the first place.
    • The wait/die and wound/wait schemes described in the time stamping section below are deadlock prevention schemes.
  2. Deadlock detection:
    • The DBMS periodically tests the database for deadlocks.
    • If one is found, one of the transactions is aborted (rolled back and restarted), and the other transaction continues.
  3. Deadlock avoidance:
    • The transaction must obtain all the locks it needs before it can be executed.
    • This technique avoids rollback of conflicting transactions by requiring that locks be obtained in succession.
    • However, the serial lock assignment required in deadlock avoidance increases response times.

The database environment dictates the best deadlock control method: if the probability of deadlocks is low, deadlock detection is recommended; if the probability is high, deadlock prevention is recommended; and if response time is not high on the system's priority list, deadlock avoidance may be employed.
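
Deadlock detection is commonly implemented by searching for a cycle in a wait-for graph, in which each node is a transaction and each edge points to the transaction it waits on. A minimal sketch, assuming each transaction waits on at most one other:

    # A cycle in the wait-for graph means deadlock.
    def find_cycle(waits_for):
        """waits_for maps each transaction to the one it is waiting on."""
        for start in waits_for:
            seen, t = set(), start
            while t in waits_for:        # follow the chain of waiters
                if t in seen:
                    return True          # revisited a node: cycle found
                seen.add(t)
                t = waits_for[t]
        return False

    # T1 waits for T2 (which holds X); T2 waits for T1 (which holds Y).
    print(find_cycle({"T1": "T2", "T2": "T1"}))   # True  -- deadlock
    print(find_cycle({"T1": "T2"}))               # False -- T2 can finish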

For more information, read Deadlock Resolution....

Section 10: Time Stamping Methods

Each transaction is assigned a globally unique time stamp, which provides an explicit order in which transactions were submitted to the DBMS.

Time stamps must possess two properties: uniqueness, which assures that no two transactions can have the same time stamp value, and monotonicity, which assures that time stamp values always increase.

All operations within the same transaction must have the same time stamp.

The DBMS executes conflicting operations in time stamp order, thereby ensuring serializability of the transactions.

If two transactions conflict, one is stopped, rescheduled, and assigned a new time stamp value.

Time stamping increases overhead because each value stored in the database requires two time stamp fields: one for the last time the field was read and one for the last update.


Wait/Die and Wound/Wait Schemes

These schemes use transaction time stamps to prevent deadlocks. In the wait/die scheme, an older transaction requesting a lock held by a younger one waits, while a younger transaction requesting a lock held by an older one dies (is rolled back and rescheduled with its original time stamp). In the wound/wait scheme, an older transaction wounds (preempts and rolls back) a younger lock holder, while a younger transaction requesting a lock held by an older one waits. In both schemes the older transaction is never rolled back, so every transaction eventually completes.
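
A minimal sketch of the two decision rules (hypothetical helper function; a smaller time stamp means an older transaction):

    def resolve(scheme, requester_ts, holder_ts):
        """What the requesting transaction does when the lock is unavailable."""
        requester_is_older = requester_ts < holder_ts
        if scheme == "wait/die":
            return "wait" if requester_is_older else "die (rollback, reschedule)"
        if scheme == "wound/wait":
            return "preempt holder" if requester_is_older else "wait"
        raise ValueError(scheme)

    print(resolve("wait/die", 1, 2))     # wait -- older requester waits
    print(resolve("wait/die", 2, 1))     # die  -- younger requester rolls back
    print(resolve("wound/wait", 1, 2))   # preempt holder -- older wounds younger
    print(resolve("wound/wait", 2, 1))   # wait -- younger requester waits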

Section 11: Optimistic Methods

Optimistic methods are based on the assumption that the majority of database operations do not conflict; hence, the optimistic approach requires neither locking nor time stamping.

The optimistic method allows a transaction to execute without restrictions until it is committed.

Each transaction passes through two or three phases (a read-only transaction skips the write phase):

  1. Read phase:
    • The transaction reads the database, performs the necessary computations, and makes updates to a private copy of the database values.
    • All update operations of the transaction are recorded in a temporary update file, which cannot be accessed by the remaining transactions.
  2. Validation phase:
    • The transaction is validated to assure that the changes made will not affect the integrity and consistency of the database.
    • If the validation test is positive, the transaction moves to the write phase, otherwise the transaction is restarted and the changes are discarded.
  3. Write phase:
    • The changes are permanently applied to the database.

This approach is acceptable for database systems that perform mostly reads or queries, and require very few update transactions.
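
A minimal sketch of the read-validation-write cycle (hypothetical version counter standing in for the DBMS's validation test):

    # Optimistic concurrency: work on a private copy, validate at commit.
    db = {"value": 100, "version": 7}

    # Read phase: copy the data and remember the version that was seen.
    private = dict(db)
    private["value"] += 50        # updates go to the private copy only

    # Validation phase: commit only if no other transaction changed the data.
    if private["version"] == db["version"]:
        # Write phase: apply the private changes and bump the version.
        db.update(value=private["value"], version=db["version"] + 1)
        print("committed:", db)
    else:
        print("conflict: discard the private copy and restart the transaction")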

Section 12: Database Recovery Management

Database recovery restores a database from a given state, usually inconsistent, to a previously consistent state.

Recovery techniques are based on the atomicity property of transactions: all portions of a transaction must be treated as a single logical unit of work, in which all operations must be applied and completed to produce a consistent database.

If the transaction operation cannot be completed, the transaction is aborted, and all changes to the database are rolled back (undone) to the previous consistent state.


There are several important concepts that affect the recovery process, including the write-ahead-log protocol, redundant transaction logs, database buffers, and database checkpoints.


Deferred-write technique (also called a deferred update) – transaction operations do not immediately update the physical database; only the transaction log is updated, and the database is physically updated, using the log information, only after the transaction reaches its commit point.


Write-through technique (also called an immediate update) – the database is immediately updated by transaction operations during the transaction's execution, even before the transaction reaches its commit point; if the transaction aborts, the transaction log is used to roll the changes back.

See the example using Table 10.15.
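
A minimal sketch of log-based rollback (hypothetical log records holding before-and-after images; real transaction logs also record checkpoints and support redo):

    # Write-ahead logging: record the change in the log BEFORE applying
    # it to the database, so a failed transaction can always be undone.
    log, db = [], {"X": 100}

    def write(txn, item, new_value):
        log.append((txn, item, db[item], new_value))  # log the images first
        db[item] = new_value                          # then update the database

    def rollback(txn):
        for t, item, before, _after in reversed(log): # undo in reverse order
            if t == txn:
                db[item] = before

    write("T1", "X", 150)
    rollback("T1")          # T1 fails before committing
    print(db)               # {'X': 100} -- previous consistent state restored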

Buffers

Database buffers are temporary storage areas in primary memory used to speed up disk operations. To improve processing time, the DBMS software reads the data from the physical disk and stores a copy of it in a "buffer" in primary memory. When a transaction updates data, it actually updates the copy of the data in the buffer because that process is much faster than accessing the physical disk every time. Later, all buffers that contain updated data are written to a physical disk during a single operation, thereby saving significant processing time.