Transactions
- Transaction
- A sequence of read and write operations treated as a single logical unit of work that either commits (all changes made permanent) or aborts (all changes rolled back).
- Commit
- The successful completion of a transaction, making all of its changes permanent and visible to other transactions.
- Abort (rollback)
- The cancellation of a transaction, undoing all of its changes and returning the system to the state it was in before the transaction began.
- Write-ahead log (WAL)
- A durability mechanism in which changes are written to a sequential log before being applied to the data, enabling crash recovery by replaying the log to redo committed and undo incomplete transactions.
- Stable storage
- Storage that survives crashes, power outages, and reboots; typically a disk-backed file system in which each write is flushed to disk before the write call returns.
- ACID
- The set of properties that define a correct database transaction: Atomicity, Consistency, Isolation, and Durability.
- Atomicity
- The property that a transaction is all-or-nothing: either all of its operations commit or none do.
- Isolation
- The property that concurrent transactions do not observe each other’s intermediate state; serializability is the standard isolation guarantee.
- Durability
- The property that committed transactions persist across crashes; typically implemented using write-ahead logging to stable storage.
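The write-ahead log entry above can be illustrated with a minimal recovery sketch. The record shapes (`("write", txid, key, value)` and `("commit", txid)`) and the `recover` function are assumptions made for this example, not a real database's log format. Because this toy version rebuilds the store from an empty state, incomplete transactions are simply skipped rather than explicitly undone.

```python
from typing import Dict, List, Tuple

# Hypothetical record shapes for this sketch:
#   ("write", txid, key, new_value)  or  ("commit", txid)
LogRecord = Tuple

def recover(log: List[LogRecord]) -> Dict[str, int]:
    """Rebuild the store by redoing, in log order, writes of committed transactions."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    store: Dict[str, int] = {}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, key, value = rec
            store[key] = value
    return store

log = [
    ("write", "T1", "x", 1),
    ("write", "T2", "y", 5),
    ("commit", "T1"),
    ("write", "T2", "x", 9),  # crash happens before T2 commits
]
state = recover(log)  # only T1's write survives
```

A system that updates data in place would also need the old values in the log so that writes from incomplete transactions can be undone.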
Concurrency Control
- Concurrency control
- The mechanism by which a system ensures that concurrent transactions do not interfere with each other, enforcing the isolation property.
- Serializability
- An isolation property for multi-operation transactions requiring that the outcome of concurrent transactions be equivalent to some serial execution of those transactions, with no real-time constraint on which serial order is chosen.
- Schedule
- A sequence of interleaved read and write operations from multiple concurrent transactions; a serializable schedule is one whose outcome is equivalent to some serial execution.
- Pessimistic concurrency control
- An approach that assumes conflicts between transactions are likely and prevents them proactively using locks.
- Optimistic concurrency control (OCC)
- An approach that assumes conflicts are rare, allows transactions to proceed without locks, and checks for conflicts only at commit time, aborting and restarting the transaction if a conflict is found.
- Two-phase locking (2PL)
- A pessimistic concurrency control protocol in which a transaction has a growing phase (acquires locks, releases none) followed by a shrinking phase (releases locks, acquires none), guaranteeing conflict serializability.
- Read lock (shared lock)
- A lock that allows multiple transactions to hold it simultaneously on the same data item, but prevents any transaction from acquiring a write lock on that item.
- Write lock (exclusive lock)
- A lock that grants exclusive access to a data item; no other transaction may hold any lock on the item while a write lock is held.
- Cascading abort
- A failure condition in plain 2PL where one transaction’s abort forces other transactions that read its uncommitted data to also abort; prevented by strict or strong strict 2PL.
- Strict two-phase locking
- A variant of 2PL in which write locks are held until the transaction commits or aborts, preventing cascading aborts.
- Strong strict two-phase locking (SS2PL)
- A variant of 2PL in which all locks, both read and write, are held until the transaction commits or aborts; the form of 2PL most commonly implemented in commercial databases.
- Multi-Version Concurrency Control (MVCC)
- A concurrency control technique in which the system maintains multiple versions of each data item, allowing readers to see a consistent snapshot without blocking writers.
- Snapshot isolation
- A property of MVCC systems in which each transaction reads from a consistent snapshot of the data taken at the transaction’s start time, so reads never block and are unaffected by concurrent writes.
- Lease
- A lock with a time limit that is automatically released when the lease expires, allowing the system to recover from the failure of a lock holder without waiting for explicit release.
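The MVCC and snapshot isolation entries above can be sketched with a toy multi-version store. The class `MVCCStore`, its method names, and the use of a logical counter as the commit timestamp are all assumptions made for illustration. The sketch covers only snapshot reads; it does not detect write-write conflicts.

```python
from typing import Any, Dict, List, Optional, Tuple

class MVCCStore:
    """Toy multi-version store: each key maps to a list of (commit_ts, value)."""

    def __init__(self) -> None:
        self.versions: Dict[str, List[Tuple[int, Any]]] = {}
        self.clock = 0  # logical commit timestamp counter

    def begin(self) -> int:
        """Return the snapshot timestamp for a new transaction."""
        return self.clock

    def commit_write(self, key: str, value: Any) -> int:
        """Install a new committed version and return its commit timestamp."""
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read(self, key: str, snapshot_ts: int) -> Optional[Any]:
        """Return the newest version committed at or before snapshot_ts."""
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

store = MVCCStore()
store.commit_write("x", "old")
reader_ts = store.begin()                      # reader's snapshot is taken here
store.commit_write("x", "new")                 # a writer commits afterwards
reader_view = store.read("x", reader_ts)       # still sees the snapshot value
latest_view = store.read("x", store.begin())   # a new transaction sees the new value
```

The reader is never blocked by the writer: the old version stays available for as long as a snapshot may need it.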
Deadlock
- Deadlock
- A situation in which a set of transactions each hold locks needed by another transaction in the set, forming a cycle of dependencies with no transaction able to proceed.
- Wait-for graph (WFG)
- A directed graph used for deadlock detection where each node represents a transaction and a directed edge from T1 to T2 means T1 is waiting for a resource held by T2; a cycle indicates deadlock.
- Phantom deadlock
- A false positive in deadlock detection caused by asynchronous collection of wait-for graph snapshots; the cycle appears in the merged graph but does not exist in the current system state.
- Centralized deadlock detection
- A deadlock detection approach in which one designated node collects local wait-for graphs from all nodes, merges them into a global graph, and searches for cycles.
- Edge chasing
- A distributed deadlock detection approach in which probe messages propagate along wait-for graph edges; a deadlock is confirmed when a probe returns to its origin.
- Chandy-Misra-Haas algorithm
- A distributed edge-chasing deadlock detection algorithm that uses probe messages to detect cycles in a distributed wait-for graph without requiring a central coordinator.
- Wait-die
- A deadlock prevention scheme in which an older transaction waits for a younger one to release a resource, but a younger transaction requesting a resource held by an older one is aborted and retried.
- Wound-wait
- A deadlock prevention scheme in which an older transaction preempts a younger one holding a needed resource, but a younger transaction requesting a resource held by an older one waits.
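The wait-for graph, wait-die, and wound-wait entries above can be sketched as follows. The function names and the convention that a smaller timestamp means an older transaction are assumptions for this example. Cycle detection uses a plain depth-first search over a single, already-merged graph, so it does not model the snapshot-timing problems that produce phantom deadlocks.

```python
from typing import Dict, List, Set

def has_deadlock(wait_for: Dict[str, List[str]]) -> bool:
    """Return True if the wait-for graph contains a cycle."""
    visiting: Set[str] = set()  # transactions on the current DFS path
    done: Set[str] = set()      # transactions fully explored, known cycle-free

    def dfs(txn: str) -> bool:
        if txn in visiting:
            return True         # reached a transaction already on the path: cycle
        if txn in done:
            return False
        visiting.add(txn)
        for holder in wait_for.get(txn, []):
            if dfs(holder):
                return True
        visiting.remove(txn)
        done.add(txn)
        return False

    return any(dfs(t) for t in list(wait_for))

def wait_die(requester_ts: int, holder_ts: int) -> str:
    """Older requester waits; younger requester is aborted and retried."""
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts: int, holder_ts: int) -> str:
    """Older requester preempts the younger holder; younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"

cycle = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
chain = {"T1": ["T2"], "T2": ["T3"]}
```

In both prevention schemes the transaction that gets aborted is always the younger one, so waits only ever go in one direction by age and no cycle can form.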
Atomic Commit Protocols
- Two-Phase Commit (2PC)
- A distributed protocol that coordinates a commit-or-abort decision across multiple nodes by separating a voting phase from a decision phase, requiring unanimous agreement to commit.
- Three-Phase Commit (3PC)
- An extension of 2PC that adds a pre-commit phase to eliminate blocking under single-node failure, at the cost of an extra round of messages and an assumption of bounded network delays; rarely used in practice.
- Coordinator
- The node in a distributed transaction that drives the commit protocol, collects votes from participants, and broadcasts the final commit or abort decision.
- Participant (cohort)
- A node in a distributed transaction that executes part of the transaction, votes on whether it can commit, and carries out the coordinator’s final decision.
- Prepare phase
- The first phase of 2PC in which the coordinator requests that all participants vote on whether they can commit, and each participant writes a durable prepare record.
- Uncertain state
- The state a 2PC participant enters after voting yes but before receiving the coordinator’s decision; the participant cannot unilaterally commit or abort while in this state.
- Blocking protocol
- A commit protocol in which the failure of one node can prevent other nodes from making progress; 2PC is a blocking protocol because coordinator failure leaves participants in the uncertain state.
- Fail-recover model
- A failure model in which nodes that crash eventually recover and resume normal operation; assumed by 2PC.
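The two phases of 2PC described above can be sketched in a single process. The `Participant` class, its `can_commit` flag, and the state names are hypothetical; the sketch models no crashes, timeouts, or durable logging, so it shows the voting rule but not the blocking behavior.

```python
from typing import List

class Participant:
    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit  # stands in for local checks (locks, constraints)
        self.state = "working"

    def prepare(self) -> bool:
        """Vote. A yes vote puts the participant in the uncertain state."""
        if self.can_commit:
            # A real participant would write a durable prepare record here.
            self.state = "prepared"
            return True
        self.state = "aborted"
        return False

    def finish(self, decision: str) -> None:
        """Carry out the coordinator's final decision."""
        self.state = "committed" if decision == "commit" else "aborted"

def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only if the vote is unanimous.
    decision = "commit" if all(votes) else "abort"
    for p in participants:
        p.finish(decision)
    return decision

all_yes = [Participant("A", True), Participant("B", True)]
one_no = [Participant("A", True), Participant("B", False)]
outcome_yes = two_phase_commit(all_yes)
outcome_no = two_phase_commit(one_no)
```

If the coordinator crashed between the two loops, every participant in the `prepared` state would be stuck waiting, which is the blocking case described above.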
Consistency Models
- Consistency model
- A specification of what values a read operation is permitted to return given a history of writes; stronger models provide more intuitive guarantees at the cost of coordination.
- Linearizability
- The strongest practical consistency model, requiring that every operation appear to take effect instantaneously at some point between its invocation and completion, in an order consistent with real time.
- Sequential consistency
- A consistency model requiring that all operations appear in some total order consistent with each process’s program order, without requiring agreement with real-time order.
- Causal consistency
- A consistency model that requires causally related operations to appear in the same order for all processes; causally independent operations may be observed in different orders.
- Eventual consistency
- A weak consistency model that guarantees only that, if no new updates are made, all replicas will eventually converge to the same value; reads may return stale or divergent data in the interim.
- Strong Eventual Consistency (SEC)
- A strengthening of eventual consistency guaranteeing that any two nodes that have received the same set of updates will be in identical states, regardless of the order updates were received.
- CRDT (Conflict-Free Replicated Data Type)
- A data structure designed so that concurrent updates from any replica can be merged automatically in any order without conflicts, enabling strong eventual consistency.
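As an illustration of the CRDT entry above, here is a minimal grow-only counter (commonly called a G-Counter). The class and method names are chosen for this sketch. Each replica increments only its own slot, and merging takes the element-wise maximum, so replicas that have seen the same updates end in the same state regardless of merge order.

```python
from typing import Dict

class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        # A replica only ever increases its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())

a, b = GCounter("a"), GCounter("b")
a.increment(2)   # concurrent update at replica a
b.increment(3)   # concurrent update at replica b
a.merge(b)       # a learns about b's update
b.merge(a)       # b learns about a's update
b.merge(a)       # merging again changes nothing
```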
CAP Theorem and PACELC
- CAP theorem
- The theorem stating that when a network partition occurs, a distributed system cannot simultaneously guarantee both consistency (linearizability) and availability.
- Partition tolerance
- The ability of a distributed system to continue operating correctly despite arbitrary message loss or delay between nodes; required of any real-world distributed system.
- CP system
- A distributed system that prioritizes consistency over availability during a network partition, preferring to return an error rather than serve potentially stale data.
- AP system
- A distributed system that prioritizes availability over consistency during a network partition, preferring to serve potentially stale data rather than return an error.
- PACELC
- A framework extending CAP: if a Partition occurs, a distributed system must trade off Availability against Consistency; Else, during normal operation, it must trade off Latency against Consistency.
- Latency-consistency trade-off
- The tension between responding quickly from a local replica (low latency, possibly stale) and coordinating with a quorum of replicas before responding (higher latency, strongly consistent).
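The latency-consistency trade-off above can be made concrete with a toy comparison of a local read and a quorum read. The replica representation as `(version, value)` pairs and all function names are assumptions for this sketch. The overlap condition `r + w > n` is the usual rule for read and write quorums to intersect in at least one replica.

```python
from typing import List, Tuple

Replica = Tuple[int, str]  # (version, value); a hypothetical representation

def local_read(replicas: List[Replica]) -> str:
    """Low latency: answer from the local replica (index 0), which may be stale."""
    return replicas[0][1]

def quorum_read(replicas: List[Replica], r: int) -> str:
    """Higher latency: contact r replicas and return the highest-versioned value."""
    contacted = replicas[:r]  # simplification: always the first r replicas
    return max(contacted, key=lambda rep: rep[0])[1]

def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum shares at least one replica with every write quorum."""
    return r + w > n

# Three replicas; a write with w=2 reached replicas 1 and 2 but not the local replica 0.
replicas = [(1, "old"), (2, "new"), (2, "new")]
fast = local_read(replicas)       # stale
safe = quorum_read(replicas, 2)   # includes a replica that saw the write
```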
BASE
- BASE
- A design philosophy for large-scale distributed systems trading strict consistency for availability and scale: Basically Available, Soft State, Eventually Consistent.
- Basically Available
- The property that a system prioritizes responding to requests even if the response may be stale or incomplete, rather than refusing requests to preserve consistency.
- Soft state
- The property that a system’s state may change over time even without new input, because replicas are asynchronously reconciling diverged data.