CS417 Exam 3
Fall 2017
Paul Krzyzanowski
100 points - 3 points each (1 point free). For each statement, select the most appropriate answer.
- The core task of the user's map function within a map worker in a MapReduce framework is to:
(a) Determine which reduce worker should process which key.
(b) Split the input data into shards.
(c) Parse input data and create key, value tuples.
(d) All of the above. - In MapReduce, partitioning refers to
(a) Determining the ratio of map workers to reduce workers.
(b) Determining which reduce worker will process a specific key.
(c) Splitting the input data into shards.
(d) Assigning input shards to map workers. - Reduce workers in MapReduce can start working:
(a) In parallel when the map workers start.
(b) When at least one map worker starts to generate data.
(c) When at least one map worker has processed all its input.
(d) When every single map worker has completed its task. - Bigtable's multidimensional property refers to the fact that:
(a) Bigtable stores versioned data within rows and columns.
(b) A table is actually composed of an arbitrary number of tablets.
(c) A multi-level storage structure is used: memtable, SSTable, tablet, and table.
(d) Each cell in a table can also be a table and, recursively, cells within that table can be tables. - As new rows are added to a Bigtable, they are:
(a) Added to an arbitrary tablet in the table that has free space.
(b) Appended to the end of the entire table.
(c) Appended to the end of the entire table but an index file with sorted keys enables rapid lookup.
(d) Added in a way to make sure the table remains sorted by a single key. - In Bigtable, what is the unit of distribution and load balancing?
(a) A set of adjacent rows.
(b) A set of adjacent columns.
(c) Each column family.
(d) Timestamped versions of data. - To coordinate transaction commits across multiple servers, Spanner uses:
(a) A two-phase commit protocol.
(b) A three-phase commit protocol.
(c) Distributed consensus based on Paxos.
(d) Optimistic concurrency control, checking for problems after the commit. - To provide isolation of transactions, Spanner:
(a) Restricts execution to one transaction at a time.
(b) Uses two-phase locking.
(c) Uses strict two-phase locking.
(d) Requires transactions to specify the data they plan to access ahead of time. - TrueTime provides:
(a) A means of synchronizing clocks across multiple data centers.
(b) A bounded time interval that contains the actual time of day within the interval.
(c) The exact time of day obtained from local time servers.
(d) A vector clock to enable each transaction to obtain a unique time stamp. - Spanner addresses the problem of global time ordering by:
(a) Allowing each transaction to get the precise time of day.
(b) Using consistent (total) ordering instead of global time ordering.
(c) Using an eventual consistency model where time of day does not matter.
(d) Forcing commit operations to wait. - Spanner allows transactions to use lock-free reads by:
(a) Using optimistic concurrency control mechanisms and not using write locks.
(b) Letting them read from replicas instead of the main servers.
(c) Using write locks but no read locks
(d) Letting them read older versions of data. - Messages sent by a process during execution of a superstep in BSP:
(a) Must be delivered before the start of the next superstep.
(b) Are delivered only at the start of the next superstep.
(c) Can be delivered to any programmer-specified future superstep.
(d) Are multicast to the entire group and acknowledged at the end of the superstep. - In Pregel, a function is executed for:
(a) Each vertex of a graph.
(b) Each edge of a graph.
(c) A graph cluster, representing a connected set of vertices and their edges.
(d) Each subgraph that is allocated to a distinct server. - Pregel's combiners:
(a) Reduce the number messages from the same processor that are targeted to the same destination.
(b) Manage global state.
(c) Merge multiple vertices into one vertex.
(d) Merge multiple edges into one edge. - In Spark, a Resilient Distributed Dataset, or RDD, is:
(a) A distributed collection of objects that is modified by each transformation.
(b) An immutable distributed collection of objects representing original data or the output of a transformation.
(c) The original input data that will be processed by Spark and is replicated onto multiple servers.
(d) The output data generated by a Spark action. - Spark's fault tolerance is based on:
(a) Checkpointing the output of each transformation and action.
(b) Running replicated transformation servers.
(c) Keeping track of the sequence of transformations that created the needed data.
(d) Restarting the entire sequence of transformations from the user's original data. - Multihoming means:
(a) A process migrates between multiple servers.
(b) Content is cached in multiple places close to the user.
(c) A system is connected to more than one network.
(d) The same content may be generated from multiple sources. - Akamai's dynamic DNS (domain name service):
(a) Locates the most suitable edge server based on a client's URL request.
(b) Locates the most suitable edge server based on a client's domain name query.
(c) Locates the shortest path to the origin server from a specific client.
(d) Locates the set of edge servers that should cache content for a specific host. - A system area network is typically designed to:
(a) Eliminate the overhead of TCP while providing reliable communication.
(b) Be a dedicated network for storage components.
(c) Act as a heartbeat network to allow detection of network failures.
(d) Connect hardware elements within a computer system. - A clustered file system differs from a distributed file system in that:
(a) Multiple computers access the same physical storage device.
(b) Data may be distributed among multiple computers.
(c) Data is replicated across storage devices on multiple computers for fault tolerance.
(d) It provides services only over a local area network. - A clustered file system does NOT:
(a) Require a distributed lock manager.
(b) Access data on a device block level rather than a file level.
(c) Enable multiple systems to share files.
(d) Distribute a file's data among multiple servers. - Fencing is used to:
(a) Provide a trusted path for nodes to communicate on a LAN.
(b) Isolate a computing node from other nodes.
(c) Monitor whether cluster members are alive.
(d) Establish a quorum among cluster members. - Unlike a public key algorithm, a symmetric algorithm:
(a) Uses the same function for encryption as decryption.
(b) Uses the same key for encryption and decryption.
(c) Produces ciphertext that is the same length as the plaintext.
(d) Cannot be used for message authentication. - For Alice to send an encrypted signed message to Bob, she creates a hash of the message and sends Bob:
(a) The message encrypted with Alice's private key and the hash encrypted with Bob's public key.
(b) The message encrypted with Alice's public key and the hash encrypted with Alice's private key.
(c) The message encrypted with Bob's public key and the hash encrypted with Alice's private key.
(d) The message encrypted with Bob's public key and the hash encrypted with Alice's public key. - A cryptographic hash function is an example of a:
(a) One-way function.
(b) Message authentication code.
(c) Symmetric algorithm.
(d) Session key. - The Diffie-Hellman algorithm most directly solves the problem of:
(a) Alice being able to send authenticated messages to Bob.
(b) Alice being able to validate Bob's identity.
(c) Alice and Bob generating public keys.
(d) Alice and Bob getting a shared secret key. - The Diffie-Hellman algorithm is not needed if you have:
(a) Hash functions.
(b) Message authentication codes.
(c) Symmetric cryptography.
(d) Public key cryptography. - Salt in a password hash is used to:
(a) Implement single-use (one-time) passwords.
(b) Add a layer of protection against bad hash functions.
(c) Encrypt the password before generating the hash.
(d) Make attacks using precomputed hash tables ineffective. - An advantage of the Challenge-Handshake Authentication Protocol (CHAP) is:
(a) The user or client does not need to know any secret information.
(b) It is a time-based protocol and the password is invalid after a short time.
(c) It does not require the use of one-way functions.
(d) No secret information is sent on the network. - Kerberos is designed to allow Alice and Bob to communicate using:
(a) A public key algorithm.
(b) A symmetric cryptography algorithm.
(c) A hybrid cryptosystem.
(d) A restricted cipher. - Secure Sockets Layer (SSL, or Transport Layer Security, TLS) uses:
(a) A public key algorithm.
(b) A symmetric cryptography algorithm.
(c) A hybrid cryptosystem.
(d) A restricted cipher. - OAuth was designed to:
(a) Allow a user to grant one service specific access rights from another service.
(b) Authenticate users using X.509 digital certificates.
(c) Enable an administrator to authorize user access to services.
(d) Support multi-factor authentication protocols. - OAuth relies on:
(a) HTTP URL redirection.
(b) Public key cryptography.
(c) A trusted third party that stores all the keys.
(d) Kerberos to authenticate and authorize users.