# Distributed Programming in Java Quiz


## Week 1

### Module 1 Quiz

1.
Question 1
Assume that MapReduce(f, g, data) performs the map function f on the input data, followed by the reduce function g, as introduced in Lecture 1.1. What does the function mystery, provided below in pseudocode notation, compute?

Hint: Determining the omitted type information might be helpful.

1 point

A. For each input value, the average corresponding input key.

B. For each input key, the average corresponding input value.

C. For each input value, the average position in the ordering of input keys.

D. For each position in the ordering of input keys, the average of the previous input values.

2.
Question 2
Consider the following pseudo-code using MapReduce. What does mystery2 print when its input is a collection with a single key/value pair entry?

Hint: Determining the omitted type information might be helpful.

1 point

A. For each input value, all the words that occur in each line.

B. For each line in the input value, the average length of the words it contains.

C. For each line in the input value, the number of words it contains.

D. For each input value, the words and the line numbers they appear in.

E. For each input value, the words and the line numbers they appear in, in sorted order.

3.
Question 3
Which of the following statements are true?

1 point

A. Hadoop is able to provide strong fault tolerance because map-reduce is a functional operation.

B. Hadoop is used to speed up tasks executing on a single computer.

C. The user of Hadoop is responsible for specifying how the computers in the network communicate with each other.

4.
Question 4
Suppose we want to use Hadoop to perform a word count on the following words, given as a sequence of key-value pairs: (the, 1), (dog, 1), (chased, 1), (the, 1), and (cat, 1). Which of the following could be the output of the “group” step of the computation? Assume no reduction operation can occur during the “group” step.

1 point

A. Group 1: (the, 1). Group 2: (dog, 1). Group 3: (chased, 1). Group 4: (the, 1). Group 5: (cat, 1).

B. Group 1: (the, 2). Group 2: (dog, 1). Group 3: (chased, 1). Group 4: (cat, 1).

C. Group 1: (the, 1) and (the, 1). Group 2: (dog, 1). Group 3: (chased, 1). Group 4: (cat, 1).
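To make the separation between the group and reduce steps concrete, here is a minimal plain-Java sketch. It is not Hadoop code: the class and method names are hypothetical, and `Collectors.groupingBy` merely stands in for the framework's group step, which gathers the values for each key without combining them.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration in plain Java; this is not Hadoop's API.
class GroupStepSketch {

    // Group step: collect every value that shares a key, without reducing.
    static Map<String, List<Integer>> group(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                LinkedHashMap::new, // keep first-seen key order for readability
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    // Reduce step: sum the grouped values for each key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        grouped.forEach((word, ones) ->
                counts.put(word, ones.stream().mapToInt(Integer::intValue).sum()));
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("the", 1), Map.entry("dog", 1), Map.entry("chased", 1),
                Map.entry("the", 1), Map.entry("cat", 1));
        Map<String, List<Integer>> grouped = group(pairs);
        System.out.println(grouped);         // {the=[1, 1], dog=[1], chased=[1], cat=[1]}
        System.out.println(reduce(grouped)); // {the=2, dog=1, chased=1, cat=1}
    }
}
```

Keeping the two phases as separate methods mirrors the question's assumption that no reduction happens while grouping.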

5.
Question 5
Which of the following statements are true?

1 point

A. Hadoop is a generalization of Spark.

B. Spark benefits from using nodes with large memory.

C. Spark only supports eager or strict evaluation. (It does not use lazy evaluation.)

6.
Question 6
Which of the following are terminal operations in Spark?

1 point

A. Map

B. Reduce

C. Filter

D. Collect
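Spark's API is not part of the JDK, so the following sketch uses `java.util.stream` purely as an analogy: like Spark, Java streams distinguish lazy intermediate operations from terminal operations that trigger execution. The class name and the call counter are assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Analogy only: java.util.stream, not Spark's RDD API.
class LazyPipelineSketch {

    // Counts how many times the map function has actually run.
    static final AtomicInteger mapCalls = new AtomicInteger();

    // Builds a pipeline of intermediate operations; nothing executes yet.
    static Stream<Integer> buildPipeline(List<Integer> data) {
        return data.stream()
                .map(x -> { mapCalls.incrementAndGet(); return x * 2; })
                .filter(x -> x > 2);
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = buildPipeline(List.of(1, 2, 3));
        System.out.println("map calls before terminal op: " + mapCalls.get()); // 0

        // collect is a terminal operation, so it triggers the whole pipeline.
        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println("result: " + result);                              // [4, 6]
        System.out.println("map calls after terminal op: " + mapCalls.get()); // 3
    }
}
```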

7.
Question 7
Why might the notion of an “inverse document frequency” be important in determining the similarity between two arbitrary documents in a corpus?

1 point

A. The inverse document frequency ensures that words appearing across many documents are discounted in comparison to words that are unique to a few documents.

B. The inverse document frequency adds prominence to words that are common to many documents, enabling a programmer to compute similarity metrics more precisely.

C. The inverse document frequency allows the programmer to more easily apply the MapReduce model of computation to a large corpus of documents.
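As a rough illustration of how inverse document frequency behaves, the sketch below assumes the common variant idf(t) = ln(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t. The tiny corpus and the helper names are made up for this example.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper; assumes the idf(t) = ln(N / df(t)) variant.
class IdfSketch {

    // Each document is modelled as the set of distinct words it contains.
    static double idf(String term, List<Set<String>> corpus) {
        long df = corpus.stream().filter(doc -> doc.contains(term)).count();
        if (df == 0) {
            throw new IllegalArgumentException("term not in corpus: " + term);
        }
        return Math.log((double) corpus.size() / df);
    }

    public static void main(String[] args) {
        List<Set<String>> corpus = List.of(
                Set.of("the", "dog", "barked"),
                Set.of("the", "cat", "slept"),
                Set.of("the", "dog", "slept"));
        System.out.println("idf(the) = " + idf("the", corpus)); // appears in all 3 documents
        System.out.println("idf(dog) = " + idf("dog", corpus)); // appears in 2 documents
        System.out.println("idf(cat) = " + idf("cat", corpus)); // appears in 1 document
    }
}
```

Under this formula a word found in every document gets weight ln(1) = 0, while rarer words get progressively larger weights.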

8.
Question 8
For this question, you are encouraged to search the Internet and read about topics relevant to elementary text-mining and tf-idf.

Which of the following might serve as a valid means of computing which document is most similar to document $D_1$ in a corpus of text documents? For simplicity, assume all relevant tf-idf weights have been appropriately normalized and computed, and that resources (time, memory, secondary storage, etc.) are not issues of concern.

I. For each document in the corpus, use a vector representation of the appropriate tf-idf weights to compute its cosine similarity to $D_1$. Those documents yielding the highest cosine similarity to $D_1$ are considered “most similar”.

II. For each document in the corpus, find the sum of its tf-idf weights corresponding to the words that appear in $D_1$. Those documents yielding the highest sum of relevant tf-idf weights are considered “most similar”.

III. For each document in the corpus, compare its total word count to the word count of $D_1$. Those documents yielding the smallest absolute difference in word counts are considered “most similar”.

1 point

A. I and II

B. II and III

C. III only

D. I and III
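For readers unfamiliar with cosine similarity, here is a minimal sketch of the computation over sparse tf-idf vectors stored as maps from word to weight. The weights in `main` are invented for illustration and the class name is hypothetical.

```java
import java.util.Map;

// Hypothetical sketch: documents are sparse vectors of word -> tf-idf weight.
class CosineSimilaritySketch {

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Double> v) {
        double sumSquares = 0.0;
        for (double w : v.values()) {
            sumSquares += w * w;
        }
        return Math.sqrt(sumSquares);
    }

    // cos(a, b) = (a . b) / (|a| * |b|); defined as 0 if either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    public static void main(String[] args) {
        // Made-up tf-idf weights, for illustration only.
        Map<String, Double> d1 = Map.of("dog", 0.4, "barked", 1.1);
        Map<String, Double> d2 = Map.of("dog", 0.4, "slept", 0.4);
        Map<String, Double> d3 = Map.of("cat", 1.1, "slept", 0.4);
        System.out.println("cos(d1, d2) = " + cosine(d1, d2)); // share the word "dog"
        System.out.println("cos(d1, d3) = " + cosine(d1, d3)); // no shared words: 0.0
    }
}
```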

9.
Question 9
In Spark, what does the reduceByKey transformation do?

1 point

A. For a set of (key, value) pairs, groups all values that have the same key and then applies a reduction operator to collapse those values into a single value. A single (key, value) pair is then emitted per key.

B. For a set of (key, value) pairs, groups all values that have the same key. A single pair of a key and the list of all values is then emitted.

C. For a set of (key, value) pairs, counts the number of (key, value) pairs for each unique key. For each unique key, emits a (key, value) pair that is that key with the number of pairs that had it.

D. For a set of (key, value) pairs and a given key, removes all (key, value) pairs that have that same key.
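The following plain-Java sketch emulates the general idea of a by-key reduction using `Collectors.toMap` with a merge function. It is an emulation on a local list, not Spark's API, and the method name `reduceByKey` is reused here only as a hypothetical helper.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

// Local emulation in plain Java; not Spark's reduceByKey implementation.
class ReduceByKeySketch {

    // Combines all values that share a key using op, yielding one value per key.
    static <K extends Comparable<K>, V> Map<K, V> reduceByKey(
            List<Map.Entry<K, V>> pairs, BinaryOperator<V> op) {
        return pairs.stream().collect(Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                op,             // called whenever a key is seen again
                TreeMap::new)); // sorted keys, for predictable printing
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 5),
                Map.entry("a", 3), Map.entry("b", 2));
        System.out.println(reduceByKey(pairs, Integer::sum)); // {a=4, b=7}
        System.out.println(reduceByKey(pairs, Integer::max)); // {a=3, b=5}
    }
}
```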

10.
Question 10
Why is Page Rank an algorithm that fits well with Spark?

1 point

A. Spark offers Page Rank transformations specifically for supporting the Page Rank algorithm (e.g. join).

B. Spark is optimized for iterative, in-memory workloads. Page Rank is an example of one.

C. Page Rank was originally designed to run on top of Spark, and so algorithmically fits well with the provided transformations.

D. Very few distributed programming models could support a distributed join operation.

## Week 2

### Module 2 Quiz

1.
Question 1
When initializing sockets for the server and client, what type of Object should each side initialize?

(Note: some questions may have multiple correct answers)

1 point

A. Server initializes a Socket, Client initializes a ServerSocket.

B. Server initializes a ServerSocket, Client initializes a Socket.

C. The server and client must create both a Socket and ServerSocket or else they cannot both read and write.

D. None of the above.

2.
Question 2
Which of the following statements is correct about communication between a client and server?

1 point

A. The Server uses an InputStream to get data from the Client, and the Client uses an OutputStream to give data to the Server.

B. The Server and Client effectively have “two connections”, one for reading and the other for writing.

C. The Server uses an OutputStream to get data from the Client, and the Client uses an InputStream to give data to the Server.

D. The Server and Client each have an OutputStream and InputStream.

E. None of the above.
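The sketch below shows one way a client and server might exchange a single line over the loopback interface with `java.net.ServerSocket` and `java.net.Socket`. The class name, the `roundTrip` helper, and the "echo: " prefix are assumptions made for illustration, and running both ends in one process is only a convenience for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical one-shot echo exchange; both ends run in this process.
class EchoSocketSketch {

    static String roundTrip(String message) throws Exception {
        // Port 0 asks the OS for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                // accept() returns an ordinary Socket representing this one client.
                try (Socket conn = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(conn.getInputStream()));
                     PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            // The client connects with a plain Socket and uses its own pair of streams.
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                out.println(message);
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello")); // echo: hello
    }
}
```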

3.
Question 3
What must a class extend/implement if you want to make it serializable?

1 point

A. Implement Serializable and Deserializable

B. Implement Serializable

C. Implement Deserializable on the Client, implement Serializable on the Server

D. Extend Deserializable on the Client, extend Serializable on the Server

4.
Question 4
What does transient mean with respect to serializing objects?

1 point

A. It means we can now deserialize in any JVM, not just one.

B. It means that specific variable in the serialized object will not be initialized.

C. It means that we are sending a “generic” serialized object which the receiver can structure how they want.

D. Both B and C
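To see how `Serializable` and `transient` interact in practice, the sketch below serializes an object to an in-memory byte array and reads it back. The `Account` class and its fields are hypothetical; the stream classes are the standard `java.io` ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientSketch {

    // Hypothetical example class; only the marker interface is needed.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String user;
        transient String password; // skipped when the object is written

        Account(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Writes the object to a byte array and reads a fresh copy back.
    static Account roundTrip(Account original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account copy = roundTrip(new Account("alice", "secret"));
        System.out.println("user = " + copy.user);         // user = alice
        System.out.println("password = " + copy.password); // password = null
    }
}
```

The transient field comes back with its type's default value (`null` for a reference) because it was never written to the stream.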

5.
Question 5
In remote method invocation, where object x is located on the server and the client is executing the instruction y = x.foo(), which objects must be serializable?

1 point

A. None

B. Only x

C. Only y

D. Both x and y

6.
Question 6
What are the functions of the stub object in RMI?

1 point

A. Allows the client to remotely call methods on the server’s object.

B. It’s a local object on the client’s JVM created to represent the remote object that lives on the server’s JVM.

C. Stores the data that belongs to the skeleton object

D. Executes the code of the skeleton object’s methods

7.
Question 7
What is the main advantage of using Multicast Sockets?

1 point

A. Multicast Sockets are easier to implement than Broadcast & Unicast Sockets

B. It is generally more efficient to use one Multicast Socket than multiple Unicast Sockets

C. Multicast Sockets, unlike Broadcast Sockets, touch all nodes/destinations

D. Multicast Sockets use more bandwidth/resources than Broadcast and Unicast

8.
Question 8

1 point

A. The DatagramPacket message can have unbounded length

B. A DatagramPacket message can be sent to all members of a given group

C. DatagramPackets are used only for sending messages, not receiving

D. DatagramPackets can only be used by Multicast Sockets

9.
Question 9
What are the nodes in a distributed Publish-Subscribe system referred to as?

1 point

A. Workers.

B. Brokers.

C. Publishers.

D. Subscribers.

10.
Question 10
Which of the following are benefits of the Publish-Subscribe paradigm?

1 point

A. Efficient implementation due to message batching.

B. Higher resilience due to message replication.

C. Higher throughput due to topic partitioning.

D. All of the above.

## Week 3

### Module 3 Quiz

1.
Question 1
Say you have a logical 4-element array of data and 2 nodes to process that data with. The global view of this logical array is similarly a 4-element array storing the full dataset. How is node 0’s local view of that same array likely to be different from the global view, assuming that data is distributed as evenly as possible between the two nodes?

1 point

A. Node 0’s local view will also store the full 4 elements. By being local to node 0 performance will be improved when accessing that array.

B. Node 0’s local view will be half the size of the global view, and will only store 2 elements of the logical array. However, which two elements it will store is up to the programmer and is referred to as the data distribution.

C. Node 0’s local view will be half the size of the global view, and will only store 2 elements of the logical array. Node 0 must store the first 2 elements in the global logical array.

D. Node 0 will store zero elements from the global array because its rank is equal to zero.
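The relationship between a global view and a node's local view can be sketched in plain Java without any MPI library. The two methods below are hypothetical helpers showing two common data distributions, block and cyclic, for the same global array; a real SPMD program would typically derive `rank` and `numNodes` from the MPI runtime instead of passing them in.

```java
import java.util.Arrays;

// Plain-Java illustration of data distributions; no MPI involved.
class LocalViewSketch {

    // Block distribution: each rank owns one contiguous chunk of the global array.
    static int[] blockView(int[] global, int rank, int numNodes) {
        int base = global.length / numNodes;
        int extra = global.length % numNodes; // the first `extra` ranks get one more element
        int start = rank * base + Math.min(rank, extra);
        int length = base + (rank < extra ? 1 : 0);
        return Arrays.copyOfRange(global, start, start + length);
    }

    // Cyclic distribution: rank r owns elements r, r + numNodes, r + 2 * numNodes, ...
    static int[] cyclicView(int[] global, int rank, int numNodes) {
        int count = Math.max((global.length - rank + numNodes - 1) / numNodes, 0);
        int[] local = new int[count];
        for (int i = 0; i < count; i++) {
            local[i] = global[rank + i * numNodes];
        }
        return local;
    }

    public static void main(String[] args) {
        int[] global = {10, 20, 30, 40};
        System.out.println(Arrays.toString(blockView(global, 0, 2)));  // [10, 20]
        System.out.println(Arrays.toString(blockView(global, 1, 2)));  // [30, 40]
        System.out.println(Arrays.toString(cyclicView(global, 0, 2))); // [10, 30]
        System.out.println(Arrays.toString(cyclicView(global, 1, 2))); // [20, 40]
    }
}
```

In both cases each of the two nodes holds 2 of the 4 elements; only the choice of which elements differs.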

2.
Question 2
In the first lecture video, we see a global view XG split into two local views, each called XL and each stored in a separate node. True or false: Node 0 can directly access node 1’s XL, and vice versa?

1 point

True

False

3.
Question 3
In MPI programs, how would you normally select the logic for different SPMD nodes to run?

1 point

A. By looking up the hostname of the current node

B. Through a global negotiation with the other nodes in the SPMD program

C. By querying for the MPI rank of the current node

D. Based on MPI tags

4.
Question 4
Which of the communication patterns below are examples of point-to-point communication?

Choose all that apply

1 point

A. Send

D. Scatter

E. Gather

5.
Question 5
Given the following three nodes and their send/receive schedules, which will finish its sends/receives first?

P0: Send X to P1; Recv Y from P2;

P1: Recv X from P0; Recv Z from P2;

P2: Send Y to P0; Send Z to P1;

1 point

A. P0

B. P1

C. P2

D. It can’t be known because there’s no guarantee of message order

E. It can’t be known because this schedule will result in deadlock

6.
Question 6
In the above node schedule (in question five), which operations can be blocking simultaneously? Assume there are no network delays.

Choose all that apply

1 point

A. Send Y to P0 and Recv Y from P2

B. Send X to P1 and Send Y to P0

C. Recv X from P0 and Recv Z from P2

D. Recv Y from P2 and Send Z to P1

E. Recv Z from P2 and Send Y to P0

7.
Question 7
Which of the following are advantages to using ISend and IRecv (and Wait) instead of Send and Recv?

Choose all that apply

1 point

A. They’re less likely to produce data races

B. They require writing less code to achieve the same result

C. They reduce the possibility of deadlock

D. They can increase parallelism

8.
Question 8
Which of the following statements about non-blocking communication is correct?

1 point

A. It’s impossible to have deadlock if one only uses ISend, IRecv, and WaitAny

B. It’s impossible to have deadlock if one only uses ISend, IRecv, and WaitAll

C. Using the result of an IRecv before it has actually been received implicitly calls Wait on the request returned by IRecv

D. One can emulate blocking Send and Recv calls by immediately calling any variety of Wait after a call to ISend or IRecv

9.
Question 9
Which of the following is true for MPI’s broadcast and reduce collectives?

1. A broadcast sends data from one node to all nodes, while a reduce sends data from all nodes to one node.

2. Both broadcast and reduce apply some mathematical transformation to their inputs to produce an output.

3. A broadcast can transmit many integers at once, but a reduce can only be applied to one integer at a time.

4. In both, the root parameter specifies a main process that either sends or receives all data.

1 point

A. 1 and 4

B. 1, 3, and 4

C. 1 and 3

D. 2 and 3

10.
Question 10
What is one of the benefits of using MPI collectives?

1 point

A. The operations that MPI collectives implement can only be implemented in the operating system kernel, and so we must rely on lower levels of the stack for them.

B. They offer constant time cost for all collective operations and processor counts.

C. MPI collectives offer optimized and succinct implementations of common, distributed operations.

## Week 4

### Module 4 Quiz

1.
Question 1
Using multiple threads per process can help with:

1 point

A. Resource sharing

B. Performance

C. Responsiveness to JVM delays

D. Scalability

E. Responsiveness to network delays

F. Resource availability

2.
Question 2
True or false: on a node with 16 cores, running 16 processes with 1 thread each will always be faster than running one process with 16 threads?

1 point

True

False

3.
Question 3
The benefits of using a multithreaded server vs. a single-threaded one are:

1 point

A. Increased throughput of completed requests

B. Reduced time it takes to service an individual request

C. Reduced delay between request submission and processing of a request

D. Elimination of data races and contention between requests
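As a rough picture of what a multithreaded server does with incoming work, the sketch below hands simulated requests to a fixed-size thread pool from `java.util.concurrent`. The `handle` method, the 50 ms sleep standing in for I/O, and the response strings are all assumptions for illustration; no real network traffic is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of request handling with a worker pool.
class ThreadPoolServerSketch {

    // Stand-in for real request handling, such as reading a file.
    static String handle(int requestId) throws InterruptedException {
        Thread.sleep(50); // simulated I/O delay
        return "response-" + requestId;
    }

    // Submits every request to the pool, then collects responses in request order.
    static List<String> serveAll(int numRequests, int numThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int id = 0; id < numRequests; id++) {
                final int requestId = id;
                Callable<String> task = () -> handle(requestId);
                pending.add(pool.submit(task));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> f : pending) {
                responses.add(f.get()); // blocks until that request is done
            }
            return responses;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (int threads : new int[]{1, 4}) {
            long start = System.nanoTime();
            List<String> responses = serveAll(8, threads);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + responses.size()
                    + " responses in about " + millis + " ms");
        }
    }
}
```

Each individual request still sleeps for the same 50 ms regardless of pool size; what changes with more threads is how many requests can be in progress at once.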

4.
Question 4
In the following multithreaded file server pseudo-code:

Which of the operations in the algorithm have to ensure that the concurrent access to memory or resources is handled correctly?

1 point

A. None, the implementation does not have to worry about concurrency

B. All of them: A, B, C and D have to ensure a safe concurrent thread access

C. A and C

D. Only C

5.
Question 5
Which of the following is not a valid MPI mode?

1 point

A. Funneled

B. Multiple

C. Single

D. Serialized

6.
Question 6
I have a program with several threads, including $T_0$. I want all communication with MPI to go through $T_0$. Which of the MPI modes would I want to use?

1 point

A. Funneled

B. Multiple

C. Single

D. Serialized

7.
Question 7
Which of the following statements is false?

1 point

A. Remote actors residing on different nodes cannot exchange object references because they can only communicate through message passing.

B. All messages sent from an actor must be serialized and be passed by copy in a distributed actor program.

C. Multiple actors in an actor-based program can run on different physical nodes without change to the program logic.

D. In a distributed actor system, actors maintain a logical name that can be remotely referenced by other actors across the node boundaries.

8.
Question 8
Consider a distributed actor-based implementation of the Sieve of Eratosthenes as follows:

//create the next sieve actor at local node

Assuming there are two physical nodes in the network, with 32-bit node IDs of integer values 0 and 1, which of the following code fragments, when used to replace line 11, maximizes the number of messages crossing the node boundary?

1 point

A. `if (message.val >= 65536) { next = newRemoteActor(class:=SieveActor.class, arguments:=[message.val], node:=(localNodeId + 1) % 2) } else { next = newActor(class:=SieveActor.class, arguments:=[message.val]); }`

B. `next = newRemoteActor(class:=SieveActor.class, arguments:=[message.val], node:=(localNodeId ^ 1))`

C. `next = newRemoteActor(class:=SieveActor.class, arguments:=[message.val], node:=(localNodeId + 1) % 2)`

D. `next = newRemoteActor(class:=SieveActor.class, arguments:=[message.val], node:=(~localNodeId))`

9.
Question 9
Which of the following statements is true?

1 point

A. An advantage of the actor model is the ability of the actor to specify when to receive data.

B. A polling model where the consumer requests items periodically reduces delays in receiving information.

C. In reactive programming, producers propagate events to subscribers to trigger reactions.

D. In reactive programming, the subscriber has no way to specify how frequently it will receive data.
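The JDK's `java.util.concurrent.Flow` interfaces give a small, standard way to experiment with the publisher/subscriber relationship described above. In this sketch the `SummingSubscriber` class and the input values are hypothetical; `SubmissionPublisher` is the JDK's stock publisher. The subscriber asks for one item at a time through `Subscription.request`, which is how it signals demand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// Hypothetical reactive-streams sketch built on java.util.concurrent.Flow.
class ReactiveSketch {

    // Keeps a running sum of everything it receives, one item per request.
    static class SummingSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> runningSums = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;
        private int sum = 0;

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // the subscriber decides how much it wants
        }

        @Override
        public void onNext(Integer item) {
            sum += item;
            runningSums.add(sum);
            subscription.request(1); // ask for the next item
        }

        @Override
        public void onError(Throwable error) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    // Publishes the items, waits for completion, and returns the running sums.
    static List<Integer> run(int... items) throws InterruptedException {
        SummingSubscriber subscriber = new SummingSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int item : items) {
                publisher.submit(item); // the producer pushes events to subscribers
            }
        } // close() signals onComplete once submitted items have been delivered
        subscriber.done.await();
        return subscriber.runningSums;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(5, 7)); // [5, 12]
    }
}
```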

10.
Question 10
What is the expected output of the following piece of Java-based pseudocode?

ext(int item) {
x += item;
System.out.print(x + " ,");
}

1 point

A. There will be no output

B. 3, 30,

C. 0, 3,

D. 3, 33,
