Unit 7

Lesson 18: Database Transactions

What is a Transaction?

  • A transaction is a group of operations performed together as a single unit of work.
  • Example: Transferring money from one account to another:
    • Read balance of Account A
    • Subtract amount from A
    • Add amount to Account B
    • Save both changes together

If any step fails, the whole transaction is canceled (rolled back).
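
As a concrete illustration, here is a minimal sketch of such a transfer using Python's built-in sqlite3 module (the table, account names, and amounts are made up for this example):

```python
import sqlite3

# Minimal, illustrative sketch: a money transfer done as one transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)  # we manage BEGIN/COMMIT ourselves
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])

def transfer(src, dst, amount):
    try:
        conn.execute("BEGIN")                                  # start the transaction
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))                            # subtract from A
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))                            # add to B
        conn.execute("COMMIT")                                 # save both changes together
    except Exception:
        conn.execute("ROLLBACK")                               # any failure cancels the whole transfer
        raise

transfer("A", "B", 30)
print(conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
# -> [('A', 70), ('B', 80)]
```

If either UPDATE fails, the ROLLBACK undoes both, so the accounts never end up in a half-transferred state.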


Why Do Transactions Matter?

  • To keep the database correct and reliable, especially when:
    • Many users access it at the same time
    • There are unexpected issues (e.g., power failure)

ACID Properties of Transactions

  1. Atomicity – All operations happen or none do
  2. Consistency – Database remains correct before and after
  3. Isolation – Transactions don’t interfere with each other
  4. Durability – Once completed, the changes stay, even if system crashes

Transaction States

  • Active – Doing the work
  • Partially Committed – Final operation executed, but changes not yet made permanent
  • Committed – Successfully completed
  • Failed – Something went wrong
  • Aborted – Rolled back

Issues Without Proper Transactions

  • Lost Update – One update overwrites another (demonstrated in the sketch after this list)
  • Dirty Read – Reading data that was not yet committed
  • Unrepeatable Read – Getting different results when reading the same data twice
  • Phantom Read – Rows appear or disappear between reads
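
Here is a tiny, deterministic sketch of the lost-update problem; the in-memory dictionary simply stands in for a database table, and the interleaving is written out by hand for illustration:

```python
# Two "transactions" read the same balance, each add money,
# and the second write silently overwrites the first.
balance = {"A": 100}

t1_read = balance["A"]        # T1 reads 100
t2_read = balance["A"]        # T2 reads 100 (before T1 writes)

balance["A"] = t1_read + 20   # T1 writes 120
balance["A"] = t2_read + 50   # T2 writes 150 -- T1's +20 is lost

print(balance["A"])           # 150, but the correct result would be 170
```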

Concurrency Control Techniques

  • Locking – Prevents others from changing data while one is using it
  • Timestamp Ordering – Orders transactions by time
  • Validation – Checks before committing to avoid conflict

Recovery Techniques

  • Log-based Recovery – Keeps a record (log) of all operations
  • Shadow Paging – Keeps a copy of data before changes

Summary

  • Transactions ensure reliable, correct, and safe changes in databases.
  • ACID properties make them trustworthy.
  • Recovery and concurrency control protect data from crashes and conflicts.

Lesson 19: Concurrency Control & Recovery

What is Concurrency Control?

  • Concurrency control ensures that multiple transactions can run at the same time without leaving the database in an inconsistent state.
  • It avoids problems like:
    • Two users updating the same data
    • One user reading data while another is changing it

Problems Without Concurrency Control

  • Lost Update – Two changes at the same time; one gets lost
  • Temporary Update (Dirty Read) – Read uncommitted (temporary) data
  • Incorrect Summary – Read partial data while other updates are happening
  • Unrepeatable Read – Data changes between two reads
  • Phantom Read – New rows appear/disappear during a transaction

Concurrency Control Methods

1. Lock-Based Protocols

  • Shared Lock (S): For reading
  • Exclusive Lock (X): For writing
  • Two-Phase Locking (2PL) – see the sketch after this list:
    • Growing Phase: acquires locks; no lock may be released yet
    • Shrinking Phase: releases locks; no new lock may be acquired
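
Below is a simplified sketch of the 2PL rule, assuming a single process and exclusive locks only; the class and method names are made up for illustration:

```python
# Simplified sketch of two-phase locking: a transaction may not
# acquire a new lock after it has released one.
class TwoPhaseTxn:
    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table   # item -> owning transaction name
        self.held = set()
        self.shrinking = False         # becomes True once anything is released

    def lock(self, item):              # growing phase
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot lock after unlocking")
        if self.lock_table.get(item, self.name) != self.name:
            raise RuntimeError(f"{item} is held by {self.lock_table[item]}")
        self.lock_table[item] = self.name
        self.held.add(item)

    def commit(self):                  # shrinking phase: release everything
        self.shrinking = True
        for item in self.held:
            del self.lock_table[item]
        self.held.clear()

locks = {}
t1 = TwoPhaseTxn("T1", locks)
t1.lock("A"); t1.lock("B")   # growing phase
t1.commit()                  # shrinking phase (strict 2PL: release only at commit)
```

Releasing all locks only at commit, as above, is the stricter variant of 2PL, which also avoids cascading rollbacks.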

2. Timestamp-Based Protocols

  • Every transaction gets a timestamp when it starts
  • Conflicting operations are executed in timestamp order, so older transactions take priority over newer ones (see the sketch below)
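
A rough sketch of the basic timestamp-ordering check (the Item structure and function names are assumptions for illustration):

```python
# Each data item remembers the newest timestamps that read/wrote it.
# An operation is rejected if a younger transaction already touched
# the item in a conflicting way.
class Item:
    def __init__(self):
        self.read_ts = 0    # largest timestamp that has read this item
        self.write_ts = 0   # largest timestamp that has written this item

def read(item, ts):
    if ts < item.write_ts:                         # a younger txn already wrote it
        raise RuntimeError("abort reader (too old)")
    item.read_ts = max(item.read_ts, ts)

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:    # a younger txn already read/wrote it
        raise RuntimeError("abort writer (too old)")
    item.write_ts = ts

x = Item()
read(x, ts=5)      # T5 reads x
write(x, ts=7)     # T7 writes x
# write(x, ts=3)   # would be rejected: T3 is older than T7's write
```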

3. Validation-Based Protocols

  • Check for conflicts before committing

Serializability

  • A schedule (order of operations) is serializable if it gives the same result as if the transactions ran one by one.
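
One common way to test this is conflict serializability: build a precedence graph from conflicting operations and check it for a cycle. Here is a small sketch, where the (transaction, operation, item) schedule format is an assumption for illustration:

```python
from itertools import combinations

def serializable(schedule):
    """True if the schedule is conflict-serializable (no cycle in the precedence graph)."""
    edges = set()
    # two operations conflict if they come from different transactions,
    # touch the same item, and at least one of them is a write
    for (ti, op_a, x_a), (tj, op_b, x_b) in combinations(schedule, 2):
        if ti != tj and x_a == x_b and "W" in (op_a, op_b):
            edges.add((ti, tj))          # the earlier transaction must come first
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)

    def reaches(start, target, seen=frozenset()):
        return any(v == target or (v not in seen and reaches(v, target, seen | {v}))
                   for v in graph.get(start, ()))

    return not any(reaches(t, t) for t in graph)   # a cycle means NOT serializable

# T1 reads X, T2 writes X, then T1 writes X -> edges T1->T2 and T2->T1: a cycle
print(serializable([("T1", "R", "X"), ("T2", "W", "X"), ("T1", "W", "X")]))  # False
print(serializable([("T1", "R", "X"), ("T1", "W", "X"), ("T2", "W", "X")]))  # True
```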

Deadlock in Transactions

  • Happens when two (or more) transactions each wait for a lock the other holds
  • Solutions:
    • Wait-Die and Wound-Wait (timestamp-based; see the sketch after this list)
    • Timeouts – Cancel if waiting too long
    • Deadlock Detection – Use wait-for graph
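
For example, the Wait-Die rule can be expressed as a tiny decision function (timestamps here are assumed to be assigned at transaction start, with smaller meaning older); Wound-Wait simply flips who waits and who is aborted:

```python
# Wait-Die: when a transaction requests a lock another transaction holds,
# an older requester is allowed to wait, a younger requester "dies"
# (is aborted and restarted later with its original timestamp).
def wait_die(requester_ts, holder_ts):
    if requester_ts < holder_ts:   # requester is older -> allowed to wait
        return "wait"
    return "die"                   # requester is younger -> abort and retry later

print(wait_die(requester_ts=3, holder_ts=8))   # 'wait'  (older waits for younger)
print(wait_die(requester_ts=9, holder_ts=8))   # 'die'   (younger never waits)
```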

Recovery in Databases

Why Recovery?

  • To bring the database back to a correct state after:
    • System crash
    • Power failure
    • Software bugs

Types of Storage:

  • Volatile Storage – Loses data on crash (e.g., RAM)
  • Non-Volatile Storage – Keeps data safe (e.g., hard disk)

Recovery Techniques

1. Log-Based Recovery

  • Log stores all actions of transactions
  • Use UNDO (roll back) or REDO (reapply) during recovery, as in the sketch below
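
A toy version of that replay, using a made-up log format, might look like this:

```python
# REDO the updates of committed transactions, UNDO everything else.
log = [
    ("T1", "UPDATE", "X", {"old": 1, "new": 2}),
    ("T2", "UPDATE", "Y", {"old": 5, "new": 9}),
    ("T1", "COMMIT", None, None),
    # crash here: T2 never committed
]
db = {"X": 1, "Y": 9}   # Y's new value reached disk before the crash

committed = {t for t, kind, *_ in log if kind == "COMMIT"}

for t, kind, item, vals in log:              # redo pass: forward through the log
    if kind == "UPDATE" and t in committed:
        db[item] = vals["new"]

for t, kind, item, vals in reversed(log):    # undo pass: backward through the log
    if kind == "UPDATE" and t not in committed:
        db[item] = vals["old"]

print(db)   # {'X': 2, 'Y': 5} -- T1 redone, T2 rolled back
```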

2. Shadow Paging

  • Keeps a backup (shadow) copy of the data before changes are applied
  • The new version replaces the shadow copy only if the transaction commits (see the sketch below)
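
In a very simplified form (all structures below are made up for illustration), the shadow-paging idea might look like this:

```python
# Updates go to freshly copied pages referenced by a new page table;
# the old ("shadow") page table stays untouched until commit.
pages = {0: "old A", 1: "old B"}           # physical page id -> contents on disk
shadow_table = {0: 0, 1: 1}                # logical page -> physical page

current_table = dict(shadow_table)         # working copy for this transaction
pages[2] = "new A"                         # write the change to a fresh page
current_table[0] = 2                       # point the working table at it

committed = True
if committed:
    shadow_table = current_table           # atomic swap: the new state becomes visible
# on failure we simply discard current_table; the shadow table is intact

print(shadow_table)   # {0: 2, 1: 1}
```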

Summary

  • Concurrency control allows safe multi-user access
  • Locking, timestamps, and validation help avoid conflicts
  • Recovery methods like logging and shadow paging fix crashes
  • Together, they keep the database safe, correct, and reliable

Lesson 20: Database Recovery

Why Database Recovery?

Databases must maintain atomicity and durability, even in the face of failures. If a system crashes during a transaction, we need a way to roll back partial changes or reapply completed transactions.


Types of Failures

  • Transaction Failure – Errors in logic or constraint violations
  • System Crash – Hardware or software crashes
  • Disk Failure – Data loss due to disk corruption

Logging and Write-Ahead Logging (WAL)

  • All changes are logged before they’re written to the database.
  • WAL Principle: Log first, then update.

Each log record typically contains:

  • Transaction ID
  • Type (BEGIN, UPDATE, COMMIT, ABORT)
  • Old and new values

Why logs? So we can undo or redo changes during recovery.
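
Here is a minimal sketch of the write-ahead discipline; the record fields follow the list above, and everything else (names, values) is assumed for illustration:

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    txn_id: str
    kind: str            # BEGIN, UPDATE, COMMIT, ABORT
    item: str = ""
    old_value: object = None
    new_value: object = None

log = []                 # stands in for the log file on stable storage
db = {"X": 1}

def wal_update(txn_id, item, new_value):
    record = LogRecord(txn_id, "UPDATE", item, old_value=db[item], new_value=new_value)
    log.append(record)       # 1. log first (a real system would flush to disk here)
    db[item] = new_value     # 2. only then update the data page

log.append(LogRecord("T1", "BEGIN"))
wal_update("T1", "X", 42)
log.append(LogRecord("T1", "COMMIT"))
```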


Undo and Redo

  • Undo: Reverses incomplete transactions.
  • Redo: Reapplies committed transactions.

Think of it like a time machine:

  • Undo goes backward.
  • Redo goes forward.

Checkpointing

Imagine saving your progress in a game. Checkpointing does the same!

  • Periodically writes all dirty pages to disk.
  • Adds a checkpoint log entry so recovery doesn’t start from scratch.

Improves recovery speed dramatically!


The ARIES Algorithm

ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) is the de facto standard for crash recovery.

Three Phases:

  1. Analysis
    • Determines dirty pages and active transactions at the crash time.
  2. Redo
    • Repeats history: reapplies all logged actions since the last checkpoint, even those of transactions that had not committed.
  3. Undo
    • Rolls back uncommitted transactions using the log records.

ARIES uses a Log Sequence Number (LSN) to track and order operations.
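
The following is a heavily simplified sketch of the three passes; the log format is made up, there is no checkpoint (so recovery scans the whole log), and real ARIES also relies on pageLSNs and compensation log records:

```python
log = [  # (LSN, txn, kind, page, old, new)
    (1, "T1", "UPDATE", "P1", 0, 10),
    (2, "T2", "UPDATE", "P2", 0, 20),
    (3, "T1", "COMMIT", None, None, None),
    # crash here: T2 is a "loser"
]
pages = {"P1": 0, "P2": 20}   # whatever happened to reach disk before the crash

# 1. Analysis: find transactions still active at the crash (the losers)
losers = {t for _, t, kind, *_ in log if kind == "UPDATE"}
losers -= {t for _, t, kind, *_ in log if kind in ("COMMIT", "ABORT")}

# 2. Redo: repeat history -- reapply every logged update, winners and losers alike
for lsn, t, kind, page, old, new in log:
    if kind == "UPDATE":
        pages[page] = new

# 3. Undo: roll the losers back, scanning the log in reverse
for lsn, t, kind, page, old, new in reversed(log):
    if kind == "UPDATE" and t in losers:
        pages[page] = old

print(pages)   # {'P1': 10, 'P2': 0} -- T1's work survives, T2's is undone
```

Note how the redo pass here differs from the simpler scheme sketched earlier: ARIES first brings every page back to its state at the crash, and only then undoes the losers.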


Summary

  • Recovery = Logging + Undo/Redo + Checkpointing
  • WAL ensures logs are written before changes are made.
  • ARIES is a powerful recovery algorithm used in most modern DBMSs.

Fun Fact

Oracle, PostgreSQL, and SQL Server all use ARIES-inspired recovery techniques.


Conclusion

In this unit, we explored the core mechanisms that ensure reliable, consistent, and safe data operations in a database system. Transactions, concurrency control, and recovery aren’t just abstract concepts; they are real-world safeguards that protect data integrity and user trust, especially in multi-user and failure-prone environments. By mastering these topics, you now understand how modern databases keep operations atomic, isolated, durable, and correct, even under stress.


Key Takeaways

  • Transactions are groups of operations that must complete fully or not at all.

  • The ACID properties (Atomicity, Consistency, Isolation, Durability) are the foundation of reliable transactions.

  • Concurrency control prevents data conflicts when multiple users access the database at the same time.

  • Techniques like locking, timestamps, and validation manage concurrent transactions.

  • Problems like lost updates, dirty reads, and phantom reads occur without proper control.

  • Recovery mechanisms ensure data is restored after crashes or failures.

  • Logging (using Write-Ahead Logging), Undo/Redo, and Checkpointing are key to recovery.

  • The ARIES algorithm is a widely used recovery technique that efficiently handles system failures.

