Which of these accurately describes the relationship between Apache Beam and Cloud Dataflow?
1.
Question 1
Which of these accurately describes the relationship between Apache Beam and Cloud Dataflow?
1 / 1 point
They are the same.
Cloud Dataflow is the proprietary version of the Apache Beam API and the two are not compatible.
——————————————————————————————-
3.
Question 3
What is the purpose of a Cloud Dataflow connector?
.apply(TextIO.write().to("gs://…"));
1 / 1 point
Connectors allow you to output the results of a pipeline to a specific data sink such as Bigtable, Google Cloud Storage, a flat file, BigQuery, and more.
Connectors allow you to chain multiple data-processing steps together automatically so they process in parallel.
Connectors allow you to authenticate your pipeline as specific users who may have greater access to datasets.
——————————————————————————————-
6.
Question 6
Your development team is about to execute this code block. What is your team about to do?
1 / 1 point
We are compiling our Cloud Dataflow pipeline written in Java and are submitting it to the cloud for execution. Notice that we are calling mvn compile and passing in --runner=DataflowRunner.
We are compiling our Cloud Dataflow pipeline written in Python and are loading the outputs of the executed pipeline into Google Cloud Storage (gs://).
We are preparing a staging area in Google Cloud Storage for the output of our Cloud Dataflow pipeline and will be submitting our BigQuery job with a later command.
——————————————————————————————-
2.
Question 2
TRUE or FALSE: The Filter method can be carried out in parallel and autoscaled by the execution framework:
1 / 1 point
True: Anything in Map or FlatMap can be parallelized by the Beam execution framework.
False: Anything in Map or FlatMap can be parallelized by the Beam execution framework.
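The reason a filter step can be spread across workers is that its predicate looks at one element at a time and shares no state. A rough analogy in plain Java streams (this is not the Beam API; the class `FilterSketch` and the method `keepLongerThan` are hypothetical names chosen for illustration) might look like this:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FilterSketch {
    // Hypothetical helper: keeps the words longer than minLength.
    // The predicate inspects a single word and touches no shared state,
    // so the runtime is free to split the input across threads.
    static List<String> keepLongerThan(List<String> words, int minLength, boolean parallel) {
        Stream<String> stream = parallel ? words.parallelStream() : words.stream();
        return stream.filter(w -> w.length() > minLength).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("map", "filter", "flatmap", "do");
        System.out.println(keepLongerThan(words, 3, false)); // prints [filter, flatmap]
        System.out.println(keepLongerThan(words, 3, true));  // prints [filter, flatmap]
    }
}
```

Both calls return the same list, which is the property an execution framework relies on when it decides how many workers to use for such a step.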
——————————————————————————————-
4.
Question 4
Below you’ll find a Cloud Dataflow preprocessing graph. Correctly identify the terms for A, B, and C.
1 / 1 point
A is a data source, B are transformation steps, and C is a data sink.
A is a data stream, B are transformation steps, and C is a data sink.
A is a data stream, B are transformation steps, and C is a data source.
——————————————————————————————-
7.
Question 7
TRUE or FALSE: A ParDo acts on all items at once (like a Map in MapReduce).
1 / 1 point
True
False. A ParDo acts on one item at a time (like a Map in MapReduce).
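To make the "one item at a time" idea concrete, here is a minimal sketch in plain Java streams rather than Beam itself. The names `ParDoSketch`, `processElement`, and `applyToEach` are hypothetical stand-ins, not Beam classes. Each call receives exactly one input line and may emit zero or more outputs, much as a Map function does in MapReduce:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class ParDoSketch {
    // Hypothetical per-element function: one line in, zero or more words out.
    // It never sees any other element of the collection.
    static List<String> processElement(String line) {
        List<String> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    // Applies processElement to every line independently; because the calls
    // do not depend on each other, they can safely run in parallel.
    static List<String> applyToEach(List<String> lines) {
        return lines.parallelStream()
                .flatMap(line -> processElement(line).stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("apache beam", "", "cloud dataflow runner");
        System.out.println(applyToEach(lines)); // prints [apache, beam, cloud, dataflow, runner]
    }
}
```

The empty line produces no output at all, which shows that a per-element step is not limited to exactly one result per input.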
——————————————————————————————-
5.
Question 5
To run a pipeline you need something called a ________.