Data pipelines typically follow one of three paradigms: Extract-Load (EL), Extract-Load-Transform (ELT), or Extract-Transform-Load (ETL). The Building Batch Data Pipelines on Google Cloud course describes which paradigm to use for which batch-data scenario.
The course also covers several Google Cloud technologies for data transformation, including BigQuery, running Spark on Cloud Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Cloud Dataflow.
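As a rough illustration of how the three paradigms differ, here is a minimal Python sketch. The record set, field names, and `transform` helper are invented for illustration; the point is only *where* the transformation happens relative to the load.

```python
# Toy records as they arrive from a source system (invented for illustration).
raw = [{"name": " Alice ", "age": "30"}, {"name": "Bob ", "age": "41"}]

def transform(rows):
    # Clean whitespace and cast types.
    return [{"name": r["name"].strip(), "age": int(r["age"])} for r in rows]

# EL: data is loaded into the warehouse exactly as extracted.
warehouse_el = list(raw)

# ELT: data is loaded as-is, then transformed inside the warehouse
# (here simulated by transforming the loaded copy).
warehouse_elt = transform(list(raw))

# ETL: data is transformed in the pipeline before it is loaded.
warehouse_etl = transform(raw)
```

In practice the difference is about which system does the work: EL suits data that is already clean and query-ready, ELT pushes the transformation into the warehouse (e.g. BigQuery SQL), and ETL transforms in an intermediate engine (e.g. Dataflow or Dataproc) before loading.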
Q1. Which of the following is the ideal use case for Extract and Load (EL)?
Q1. Which of the following statements are true about Cloud Dataproc?
Q2. Match each of the terms with what they do when setting up clusters in Cloud Dataproc:
__ 1. Zone – A. Costs less but may not always be available
__ 2. Standard Cluster mode – B. Determines the Google data center where compute nodes will be
__ 3. Preemptible – C. Provides 1 master and N workers
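To make the three terms above concrete, here is a hypothetical cluster-creation command (cluster name, region, and zone are invented; the flags are from the `gcloud` CLI):

```shell
# Zone picks the Google data center where the compute nodes will run;
# standard cluster mode provides 1 master and N workers;
# secondary (preemptible) workers cost less but may not always be available.
gcloud dataproc clusters create example-cluster \
  --region=us-central1 \
  --zone=us-central1-a \
  --num-workers=2 \
  --num-secondary-workers=2
```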
Q3. Cloud Dataproc provides the ability for Spark programs to separate compute & storage by:
Q1. Cloud Data Fusion is the ideal solution when you need
Q1. Which of the following statements are true?
Q2. Match each of the Dataflow terms with what they do in the life of a Dataflow job:
__ 1. Transform – A. Output endpoint for your pipeline
__ 2. PCollection – B. A data processing operation or step in your pipeline
__ 3. Sink – C. A set of data in your pipeline
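The three Dataflow terms above can be sketched in plain Python. This is a conceptual mock only, not the Apache Beam API; the function names are invented to mirror the concepts:

```python
# Plain-Python mock of the three Dataflow concepts (not the Beam API).

def create(values):
    # PCollection: a set of data flowing through the pipeline.
    return list(values)

def apply_transform(pcollection, fn):
    # Transform: a data processing operation or step in the pipeline.
    return [fn(x) for x in pcollection]

sink = []  # Sink: the output endpoint the pipeline writes to.

pcoll = create(["spark", "dataflow"])
result = apply_transform(pcoll, str.upper)
sink.extend(result)
```

In a real Beam pipeline the same roles are played by `beam.Create`, transforms such as `beam.Map`, and I/O sinks such as `beam.io.WriteToText`.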