4 POINTS EACH - For each statement, select the most appropriate answer.
- Which of the following best describes the data model of Bigtable?
(a) Key-value store.
(b) Wide-column store.
(c) Document store.
(d) Multi-table database.
- Bigtable allows:
(a) Splitting the contents of a large row across more than one tablet server.
(b) Performing atomic, isolated transactions on a group of rows.
(c) Dynamically adding a new column to a row.
(d) Nesting a table within a table.
- Unlike Bigtable, a Cassandra table:
(a) Supports ACID semantics for transactions.
(b) Uses an SSTable (Sorted Strings Table) as the underlying structure for a table.
(c) Is a wide-column data store.
(d) Is split and distributed based on a user-defined key.
- Cassandra's architecture is most closely inspired by:
(a) Amazon Dynamo & Google Bigtable.
(b) Google Bigtable & GFS.
(c) Google Bigtable & Spanner.
(d) Amazon Dynamo & Google Spanner.
- Spanner's TrueTime API:
(a) Enables timestamp-based detection of conflicts due to concurrent updates.
(b) Enables a transaction to wait until it is certain that its commit timestamp is in the past.
(c) Provides clients with a highly accurate timestamp via the use of atomic clocks and GPS.
(d) Enables any computer in the network to convert a globally consistent time value into the local timezone.
- Spanner does not require locks for a read-only transaction because:
(a) Transactions are serialized via two-phase locking.
(b) Optimistic concurrency control is used at commit time to detect modifications.
(c) The transaction does not modify data.
(d) The transaction will not read data newer than its timestamp.
- What is the process of shuffling in MapReduce?
(a) Transferring data from the map workers to the reduce workers.
(b) Distributing input data to the nodes in the cluster.
(c) Reducing the size of the input data.
(d) Converting map workers into reduce workers.
- The map step in MapReduce is responsible for:
(a) Aggregating results.
(b) Processing and transforming input data.
(c) Sorting data.
(d) Distributing tasks to workers.
- In BSP, barriers are used to:
(a) Prevent any process from generating too much data.
(b) Optimize network traffic.
(c) Synchronize processes at the end of each superstep.
(d) Force processes to stop execution.
- Fault tolerance in Pregel is achieved through:
(a) Checkpointing.
(b) Replication of processing nodes.
(c) Data mirroring: replicating stored data.
(d) Automatic retries if a process fails to complete.
- In Pregel, vertex-centric programming means:
(a) Each vertex has its own processor.
(b) The program is written from the perspective of a single vertex.
(c) Vertices in a graph are processed sequentially.
(d) Each vertex is considered an independent entity and unaware of others.
- In Spark, a Resilient Distributed Dataset, or RDD, is:
(a) A set of data that is replicated across multiple servers for fault tolerance.
(b) A distributed collection of objects representing either original data or the output of a transformation.
(c) The original input data that will be processed by Spark.
(d) Output data generated by a Spark action.
- What is one benefit of lazy evaluation in Apache Spark?
(a) It allows a transformation to be distributed across multiple executors.
(b) It ensures that transformations are executed exactly in the order specified by the programmer.
(c) It allows the framework to optimize the execution of transformations.
(d) It enables even load distribution across all executors (processing nodes).
- How does Kafka ensure message durability?
(a) By replicating messages across multiple nodes.
(b) By affixing Message Authentication Codes (MACs) to each message.
(c) By allowing multiple consumers to read the same message.
(d) By being implemented on top of a fault-tolerant distributed file system like HDFS or GFS.
- What is the purpose of a consumer group in Kafka?
(a) To enable multiple consumers in the group to receive the same stream of messages.
(b) To enable members of the group to subscribe to different topics.
(c) To create a pipeline where one group member can receive a message and forward results to the next member.
(d) To distribute the processing of messages across multiple consumers in the group.
- How does Akamai use DNS in optimizing content delivery?
(a) To resolve domain names to the IP addresses of the closest edge server.
(b) To find the most efficient route from the user making the query to the origin server.
(c) To use the worldwide set of DNS servers to cache content that will be returned with a DNS query.
(d) To make sure that access to content is secure.
- Akamai's overlay network enhances content delivery by:
(a) Bypassing the Internet.
(b) Improving routing.
(c) Routing traffic to a central storage server.
(d) Redirecting user traffic to caching servers.
- In BitTorrent, what is a torrent file?
(a) The original version of a large file that is being shared on BitTorrent.
(b) An incomplete version of the file that is being downloaded.
(c) A file that contains information about a file that is being shared.
(d) One or more complete copies of files that are being shared on BitTorrent.
- How does BitTorrent achieve high data transfer speeds?
(a) By compressing a file before downloading it.
(b) By having a peer test and select the lowest latency server that has a copy of the file.
(c) By deploying a large number of caching servers.
(d) By having peers make their partially-downloaded content available to other peers that need it.
- For Alice to send data that only Bob can read, Alice would encrypt it with:
(a) Alice's public key.
(b) Alice's private key.
(c) Bob's public key.
(d) Bob's private key.
- To prevent intruders from generating a new hash of an altered message, a Message Authentication Code (MAC):
(a) Is encrypted with the sender's private key.
(b) Is encrypted with the sender's public key.
(c) Uses a secret key as part of the input to the hash function.
(d) Is sent separately from the message.
- The Diffie-Hellman Key Exchange algorithm:
(a) Allows one party to send an encryption key securely to another party.
(b) Provides a secure mechanism so that two parties can get each other's secret keys.
(c) Allows two parties to come up with a common key that nobody else can compute.
(d) Allows a system to update its encryption key periodically.
- A digital signature differs from a Message Authentication Code (MAC) because:
(a) It contains the message and the authentication code.
(b) It identifies the origin but does not detect that the corresponding content has been modified.
(c) It does not rely on a hash function.
(d) It can only be created by one party.
- A digital certificate:
(a) Binds information about someone together with their public key.
(b) Is a tamperproof data structure that stores a user's identity and their private key.
(c) Is a secure packaging of a file, its MAC, and a digital signature.
(d) Is a hash of the content encrypted by the owner of the content.
- Passkey authentication makes use of:
(a) Diffie-Hellman Key Exchange.
(b) Symmetric cryptography.
(c) Message authentication codes.
(d) Public key cryptography.