CS 417 Exam info

When & where

The second exam will be held in our regular classroom on November 6, 2017. It will take up about half the lecture, starting approximately during the second half of the class period. Please be sure to arrive on time and do not plan on coming in just to take the exam. If you arrive after the exam has started, you will not be allowed to take it.

Exam rules

Be sure to arrive on time. If you arrive after the exam starts, you will not be allowed to take it.

This will be a closed book, closed notes exam. Calculators, phones, augmented reality glasses, laptops, and tablets are neither needed nor permitted. If you have these devices, you must turn them off, put them out of sight, and not access them for the duration of the exam.

No other electronic devices are permitted except for hearing aids, pacemakers, electronic nerve stimulators, other implanted medical devices, or electronic watches that function only as timekeeping devices or chronographs.

Bring a couple of pens or pencils with you. An extra pencil is affordable fault tolerance. If you want to splurge, the Palomino Blackwing 602 is considered by many to be one of the finest pencils available. The company advertises its key virtue as "half the pressure, twice the speed." If that claim is really true, using this product might help you complete the exam quicker. If you do not choose bring several extra pencils, you may want to bring a pencil sharpener. Palamino makes a companion Blackwing Long Point Sharpener. This, too, is pricey at $11.00. For a bit less money, you can get what looks like a clone: the Alvin Kum Long Point Pencil Sharpener. Both of these feature two-step sharpening: one for the wood case and another for the graphite core of the pencil. A truly beautiful sharpener is the El Casco Pencil Sharpener, but bringing this to class is really overkill, as is spending over $300 on a pencil sharpener. If you would like to learn the craft of pencil sharpening, there are several books available. The best of these may be How to Sharpen Pencils: A Practical & Theoretical Treatise on the Artisanal Craft of Pencil Sharpening for Writers, Artists, Contractors, Flange Turners, Anglesmiths, & Civil Servants by David Rees. Do not be intimidated by the omission of "students" in the title. You can read more about it at artisinalpencilsharpening.com. A video by David Rees is here. You are welcome to bring a full pencil sharpening travel kit to the exam but be aware that a proper sharpening routine may consume too much time during the exam and may be messy.

Past exams

You can use my recent exams as a guide to what this exam may look like. Expect a few short-answer questions and a bunch of multiple-choice questions.

Get past 417 exams here.

Study guide

You are responsible for the material from since exam 1, weeks 5 through 9.

I've prepared a study guide that attempts to cover most of the material you should know. The guide is not a substitute for the lectures, lecture material, and other reading matter. My goal is to put most of the information you need to know in as concise a form as possible, with more elaboration in areas where textbook coverage may be lacking.

Topics

Topics that you should know and may be on the exam include:

Mutual exclusion

  • Types: centralized, token-based, contention-based
  • central-server algorithm
  • distributed mutual exclusion algorithms:
    • Token ring
    • Lamport
    • Ricart & Agrawala
    • Understand pros and cons of each

Election algorithms

  • Bully algorithm
  • Ring algorithm
  • Chang and Roberts ring algorithm optimizations
  • network partitioning problem (split-brain/segmentation)

Consensus

  • Understand consensus requirements
    • Validity
    • Uniform agreement
    • Integrity
    • Termination (= progress)
  • Paxos
    • Goal: fault tolerant consensus algorithm
    • Quorum
    • Roles: proposer, acceptor, learner
    • Why does electing a leader simplify the role of proposers?
  • Raft (see homework)
    • Leader election
    • Committing log entries
  • Leasing vs. locking
  • Hierarchical leases

Distributed transactions

  • properties of transactions (ACID)
  • nested transactions: private work spaces, write-ahead logs
  • two-phase commit protocol
    • understand the need for a log
    • what happens in each of the phases?
  • three-phase commit protocol
    • precommit
    • coordinator recovery
    • understand the conditions of what happens to the protocol when the coordinator crashes (slide 33)
    • problems with 3PC
  • I will not ask you about Paxos commit
  • Scaling via replication and Brewer's CAP Theorem
    • Consistency
    • Availability
    • Fault tolerance
  • Eventual consistency: BASE approach vs. ACID

Concurrency control

  • schedules
  • lock manager
  • two-phase locking
    • What is the purpose? Preserves serializability ("Isolated" in ACID)
    • cascading aborts
  • strict two-phase locking
  • separating read and write locks
    • read locks, write locks, commit locks
    • two-version (multiversion) concurrency control
  • optimistic concurrency control
    • validation phase
    • update phase
    • timestamp ordering

Distributed deadlock

  • conditions for deadlock, deadlock resolution
  • centralized detection
    • Wait-for graph, Global wait-for graph
    • false deadlock (what's meant by this?)
  • detection: Chandy-Misra-Haas probing algorihm
    • Just the high level: a probe message circulates back to the originator
  • prevention: wait-die, wound-wait
    • Just the high level: cyclies are prevented because transactions can only wait on older (or younger) resources. I will not ask you to remember which is wait-die (old can only wait on young) or which is wound-wait (young can only wait on old)

Network Attached Storage

  • File service, file server
  • Service models: remote access vs. upload/downloa
  • Consistency: sequential semantics, session semantics, ambiguous semantics
  • Stateful vs. stateless design
  • Caching considerations
  • Case studies
    • NFS
      • Goals
      • Mounting filesystems, goals of automounter
      • UDP vs TCP transport
      • Directory and file access protocol (don't memorize the list of RPC calls but understand why they exist, what lookup does, why there's no open)
      • Caching
      • Inconsistencies because of caching & validation
      • Know that a user-level lock manager was added to NFS but otherwise don't bother with features of successive versions.
      • Don't memorize features of NFS v4 but know:
        • the protocol is now stateful
        • supports better caching (similar to oplocks)
        • compound RPC support added
    • AFS (versions 1,2)
      • Goals
      • Caching
      • Service model, semantics
      • Cache coherence (callbacks)
    • Coda
      • Goals
      • Replicated volumes
      • Accessible Volume Storage Group
      • Client modification log
      • Resolution
      • Disconnected operation
      • Reintegration
    • DFS (AFS v.3)
      • Goals
      • Token mechanism: don't memorize tokens but understand the goals
    • SMB/CIFS
      • Goals
      • High-level protocol (RPC-like)
      • Caching model: oplocks (don't memorize oplocks)
      • Microsoft Dfs: just know that it adds consisten naming, similar to AFS
      • SMB2 – just understand a few concepts that improved performance
        • pipelining
        • credit-based flow control
        • compounding

Distributed file systems

  • Chubby
    • Purpose: lock manager, name server, and simple file system
    • Architecture of a Chubby cell: master and replicas
    • Named locks
    • Event notification
    • Whole file reads and writes
    • Don't try to memorize the API
    • Client cache consistency
    • Use of Paxos
  • Dropbox
    • Goal
    • How does server traffic load differ from mosts other sites that store data (facebook, twitter, ...)?
    • What's a metadata server (metaserver)?
    • Why was a notification server added? Polling vs. notifications
  • What is a parallel file system?
  • GFS
    • Goal
    • chunkservers - what do they do?
    • master - what does it do?
    • Google cluster environment: colocate computation and data on same set of machines
    • chunk handle
    • operation log
    • why large chunks?
    • replication: data flow vs. control flow
  • HDFS
    • HDFS is practically a clone of GFS. Focus on GFS terminology. If I use any HDFS terminoloyg, I will identify the GFS equivalent
    • NameNode = Master
    • DataNode = Chunkserver
    • blocks = chunks
    • Transaction log = Operation log
    • You do not have to know any of the differences between GFS and HDFS (they're minor)
    • Rack-aware distribution - client tries to read from the closest replica

Distributed Lookup (Distributed Hash Tables)

  • Purpose: distributed, highly available key-value store
  • Central coordinator
  • Flooding
  • What is an overlay network?
  • Back propagation, time to live
  • Distributed hash table (DHT)
  • What is consistent hashing?
  • Content-Addressable Network
    • Multi-dimensional hashes, zone, node insertion
  • Chord
    • Ring, successor nodes
    • How do finger tables make query routing more efficient?
  • Amazon Dynamo
    • Understand the goal: distributed, highly available key-value store
    • Functional interface: get, put
    • Consistency model for replication (eventually consistent)
    • Goals: incremental scalability, symmetry, decentralization, heterogenous systems
    • Conflict resolution
      • Who does it?
      • Use of vector timestamps for conflict detection
    • Partitioning among multiple nodes
    • What is a virtual node and why are they a good idea?