Transactions

Section 0: Module Objectives or Competencies

Course Objective or Competency	Module Objectives or Competency
The student will be able to explain the importance of transactions and will be able to define transactions in SQL.	The student will be able to list and explain the properties that a transaction must possess in order to ensure that a database remains in a consistent state.
	The student will be able to explain the SQL statements provided for transaction support.
	The student will be able to list and explain how the transaction log is used to record all transactions that update the database so that it is possible to restore the database.
	The student will be able to define multi-statement transactions.
	The student will be able to explain how save points and rollbacks can be used to reverse changes and will be able to use them.

Section 1: Overview

A transaction is a collection of SQL queries that together form a single logical task.

The queries are typically data manipulation language (DML) statements.

Transactions group a set of tasks into a single execution unit.

Each transaction begins with a specific task and ends when all the tasks in the group are successfully completed.
All statements in a transaction are treated as a single logical unit of work – either all of them succeed or all of them fail.
- If any of the tasks that make up the transaction fails, the transaction is aborted.
- Therefore, a transaction has only two results: success or failure.
- Incomplete steps result in the failure of the transaction; no intermediate states are acceptable.
  - Incomplete or improper transactions can have a devastating effect on database integrity.

A transaction that changes the contents of the database must alter the database from one consistent state to another.

A consistent database state is one in which all data-integrity constraints are satisfied.

Examples

Examine the current balance for the specified account number.

SELECT ACC_NUM, NAME, BALANCE
FROM CHECKACC
WHERE ACC_NUM = '0908110638';
Record the credit sale of 100 units of product '345TYX' to a customer '60120010' in the amount of $3500.00.

UPDATE PRODUCT
SET PROD_QOH = PROD_QOH - 100
WHERE PROD_CODE = '345TYX';

UPDATE ACCREC
SET AR_BALANCE = AR_BALANCE + 3500
WHERE AR_NUM = '60120010';
- Both transactions in example 2 must be completely executed or the transaction will yield an inconsistent database.
- If the first transaction is completed but the second is interrupted before completion, the PRODUCT is updated but customer Y is not charged.
Transfer $1000 from savings account 100 to checking account 200.

UPDATE ACCOUNTS
SET BALANCE = BALANCE - 1000
WHERE ACCT_NUM = '100';

UPDATE ACCOUNTS
SET BALANCE = BALANCE + 1000
WHERE ACCT_NUM = '200';
- Both transactions in example 3 must be completely executed or the transaction will yield an inconsistent (incorrect) database.
- If the first transaction is completed but the second is interrupted before completion, money has been transferred from one account, but not deposited in the other account.
In an earlier example we had an AGENT table and a CUSTOMER table. If agent Alex Alby (AGENT_CODE 501) leaves the company, we need to set the foreign key for his clients to NULL.

DELETE FROM AGENT
WHERE AGENT_CODE = '501';

UPDATE CUSTOMER
SET AGENT_CODE = NULL
WHERE AGENT_CODE = '501';
- Again, both transactions in example 4 must be completely executed or the transaction will yield an inconsistent (incorrect) database.
- If the first transaction is completed but the second is interrupted before completion, several customers remain linked to a non-existent agent, violating referential integrity.

The failure of any query in a multi-query transaction places the database in an inconsistent state and makes the data unreliable for subsequent transactions. The DBMS must make provisions to recover the database to a previous consistent condition.

Section 2: Tutorials

This section includes a series of video tutorials that do an excellent job explaining transactions.

Database Transactions, part 1: Introduction

Database Transactions, part 2: Examples

Database Transactions, part 3: ACID and Isolation

Database Transactions, part 4: Autocommit Mode

Database Transactions, part 5: Scripting

Section 3: Transaction Properties

In order to ensure that a database is in a consistent state, transactions must possess several properties.

The four key properties are often referred to as ACID, which is an acronym for Atomicity Consistency Isolation Durability.

Atomicity – All the work in the transaction is treated as a single unit; either all or no work is performed.
- Atomicity requires that all operations that make up a transaction be completed; if not the transaction is aborted
Consistency – A transaction must leave the database in a consistent internal state.
- The database moves from one consistent state to another.
Isolation – Each transaction is independent of all other transactions.
- That means each transaction will read data that was committed prior to the beginning or after the end of the other transactions.
- Data used during the execution of a transaction cannot be accessed by a second transaction until the first one has completed.
- This property is very important in multi-user databases because multiple users may be attempting to access and update the same database at the same time.
Durability – The results of the transaction are permanently stored in the system.
- After the transaction is committed, the data is in a persistent state, regardless of the circumstances.
- The RDBMS records the transaction in the transaction log, and marks it as being committed.
- If the transaction is not committed, then the RDBMS will roll all data changes back.
- Durability indicates the permanence of the database's consistent state.
- When a transaction is completed, the database reaches a consistent state, and that state cannot be lost, even in the event of a system failure.

An additional property that is sometimes included is Serializability.

Serializability describes the result of the concurrent execution of several transactions.
Concurrent transactions are treated as though they were executed in serial order.
This property is important in multi-user databases, where several transactions can be executed simultaneously.

Because single-user databases allow only one transaction to be executed at a time, both serializability and isolation are guaranteed.

However, multiuser databases must implement controls to ensure serializability and isolation of transactions, in addition to atomicity and durability, in order to guard the database's consistency and integrity.

BASE will be addressed in the NoSQL notes.

Section 4: Transaction Management with SQL

ANSI standards exist for SQL database transactions.

Transaction support is provided through the use of the COMMIT and the ROLLBACK statements.

A transaction begins implicitly when the first SQL statement is encountered.

ANSI standards require that when a transaction sequence is initiated it must continue through all succeeding SQL statements until one of the following four conditions is satisfied:

A COMMIT statement is encountered, in which case all changes are permanently recorded in the database. The COMMIT statement automatically ends an SQL transaction.
A ROLLBACK statement is encountered, in which case all modifications are aborted and the database is restored to its previous consistent state.
The end of the program is successfully reached, in which case all changes are permanently recorded in the database. This is equivalent to COMMIT.
The program is abnormally terminated, in which case the modifications made to the database are aborted and the database is restored to its previous consistent state. This is equivalent to ROLLBACK.

We add a COMMIT to the previous example below. In this example, the COMMIT is not really necessary if the UPDATE statement is the application's last statement and the program terminates normally. It is, however, good practice to include it.

UPDATE PRODUCT
SET PROD_QOH = PROD_QOH - 100
WHERE PROD_CODE = '345TYX';

UPDATE ACCREC
SET AR_BALANCE = AR_BALANCE + 3500
WHERE AR_NUM = '60120010';
COMMIT;

Section 5: Transaction Log

A transaction log is used to track all transactions that update the database so that it is possible to restore the database when a ROLLBACK statement is encountered, the program terminates abnormally, or the system crashes.

While the DBMS executes transactions that modify the database, it also automatically updates a transaction log.
This log stores before-and-after values about the database and any of the tables, tuples, and attribute values that participated in the transaction.
The starting and ending point of the transaction are also recorded.
While this increases the processing overhead of the DBMS, it is essential to the recovery of a corrupted database.

The transaction log is itself a database, and is managed by the DBMS like any other database. It is also subject to normal database threats, so the log is often located on a different storage device.

The following table shows a transaction log that reflects the previous UPDATE commands.

If a system failure occurs, the following sequence is followed:

The DBMS examines the transaction log for all uncommitted or incomplete transactions.
It then rolls back the database to its previous state based on this information.
Upon completion of the recovery process the DBMS writes all committed transactions in the log.

If a ROLLBACK is issued before the termination of a transaction, the DBMS will restore the database only for that transaction, rather than for all transactions. This preserves the durability of the previous transactions.

The following video provides more detail about Transaction Logs.

Section 6: Transaction Types

Single-Statement Transactions

The simplest transaction is a single data manipulation statement.

USE PUBS
UPDATE AUTHORS
SET AU_FNAME = 'John'
WHERE AU_ID = '172-32-1176'

This type of transaction is also called an Autocommit transaction.

The RDBMS first writes to the log file what it's going to do.
Then it does the actual update statement.
Finally, it writes to the log that it completed the update statement.
The writes to the log file are written directly to disk, but the update itself is typically done to a copy of the data that resides in memory.
At some future point that temporary copy will be written to disk.
If the server fails after a transaction has been committed and written to the log, the RDBMS will use the transaction log to "roll forward" that transaction when it starts up next.
Here is a script to create the pubs database if you want to experiment.

Multi-Statement Transactions

To make transactions a little more useful, they require two or more statements.

Such transactions are called Explicit Transactions.

USE PUBS
START TRANSACTION;
UPDATE AUTHORS SET AU_FNAME = 'John' WHERE AU_ID = '172-32-1176';
UPDATE AUTHORS SET AU_FNAME = 'Mary' WHERE AU_ID = '213-46-8915';
COMMIT;

Note that there is a START TRANSACTION at the beginning and a COMMIT at the end.

These statements start and complete a transaction.
Everything inside these statements is considered a logical unit of work.
If the statement fails after the first update, neither update statement will be applied when the RDBMS is restarted.
The log file will contain a START TRANSACTION but no corresponding COMMIT.
MySQL allows the use of the keyword BEGIN rather than START TRANSACTION. [link]

Note:

By default, MySQL runs with AUTOCOMMIT mode enabled. This means that as soon as you execute a statement that updates (modifies) a table, MySQL stores the update on disk to make it permanent.

If AUTOCOMMIT is set true, then COMMIT and ROLLBACK are useless.

To disable AUTOCOMMIT mode, use the following statement:

SET AUTOCOMMIT=0;

After disabling AUTOCOMMIT mode by setting the AUTOCOMMIT variable to zero, changes to transaction-safe tables are not made permanent immediately. You must use COMMIT to store your changes to disk or ROLLBACK to ignore the changes.

Section 7: Rolling Back

You can roll back a transaction if it doesn't do what you want. Consider the following transaction, which uses the pubs database:

USE pubs;
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
ROLLBACK; -- rollback any error in the transaction
END;

START TRANSACTION;
INSERT INTO AUTHORS
VALUES ('731-84-1654', 'Cunningtubby ', 'Waldorf', '505 397-2403',
'941 8th St.', 'Albuquerque', 'NM', '87101', 1);

INSERT INTO AUTHORS
VALUES ('525-22-1976', 'Akers', 'Fred', '806 221-8047',
'1014 Mesquite Ln.', 'Wolfforth', 'TX', '79382', '0'); -- Fails (data type mismatch)
COMMIT; -- this will not be executed

The second INSERT attempts to store character data in a numeric field, so all changes needed to be aborted. The process of reversing changes is called a rollback.

Notes:

The ROLLBACK statement "undoes" all the work since the matching START TRANSACTION statement. It will undo the results of both update statements.

Section 8: Save Points

You can save part of the transaction if it's important to save a portion of the work, even if the rest of the transaction fails. This is accomplished by using transaction savepoints with the following syntax:

SAVEPOINT <savepoint-name>;

Savepoints provide a mechanism to roll back portions of transactions.

A user can set a savepoint within a transaction to define a location to which a transaction can return if part of the transaction is conditionally canceled.

Scenario: Suppose that you have a transaction populating the sales table as well as the history table.

It might make sense to commit an INSERT statement populating the main sales table, even if the second insert into the history table fails. That's when the savepoints come in handy.

The ROLLBACK syntax also supports savepoints; in order to use a savepoint, it must be used along with one or more ROLLBACK TO SAVEPOINT statements.

if you use ROLLBACK with a savepoint syntax, the transaction will be rolled back up to that savepoint.
When using a ROLLBACK TO SAVEPOINT statement, you must provide a save-point name, as shown in the following syntax:
ROLLBACK TO SAVEPOINT <savepoint name>;

MySQL does not support nested transactions, but they can be simulated using savepoints. More here.

The following Transaction Controls in SQL video demonstrates SAVEPOINT fairly well. Note, however, that the examples are done using SQL Server, so instead of SAVEPOINT <savepoint-name> he uses SAVE TRAN <savepoint-name>.