Module Review: Correct Answers for Google Cloud Platform Big Data and Machine Learning Fundamentals, Week 1

  1. What are the common big data challenges that you will be building solutions for in this course? (check all that apply)
  • Migrating existing on-premise workloads to the cloud
  • Analyzing large datasets at scale
  • Building streaming data pipelines
  • Applying machine learning to your datasets

Question 2

  1. You have a large enterprise that will likely have many teams using their own Google Cloud Platform projects and resources. What should you be sure to have to help manage and administer these resources? (check all that apply)
  • A defined Organization
  • Folders for teams and/or products
  • A defined access control policy with Cloud IAM

  1. Which of the following is NOT one of the advantages of Google Cloud security?
  • Google Cloud will automatically manage and curate your content and access policies to be safe for the public

Question 4

  1. If you don't have a large dataset of your own but still want to practice writing queries and building pipelines on Google Cloud Platform, what should you do?
  • Practice with the datasets in the Google Cloud Public Datasets program
  • Find other public datasets online and upload them into BigQuery
  • Work to create your own dataset and then upload it into BigQuery for analysis
  1. As you saw in the demo, Compute Engine nodes on GCP are:
  • Allocated on demand, and you pay for the time that they are up.
  1. You should feed your machine learning model your _______ and not your _______. It will learn those for itself!
  • data, rules
  1. True or False: Cloud SQL is a big data analytics warehouse
  • False
  1. True or False: If you are migrating your Hadoop workload to the cloud, you must first rewrite all your Spark jobs to be compliant with the cloud.
  • False
  1. You are thinking about migrating your Hadoop workloads to the cloud, and you have a few workloads that are fault-tolerant (they can handle interruptions of individual VMs gracefully). What are some architecture considerations you should explore in the cloud? (choose all that apply)
  • Use Preemptible Virtual Machines (PVMs)
  • Migrate your storage from on-cluster HDFS to off-cluster Google Cloud Storage (GCS)
  • Consider having multiple Cloud Dataproc instances for each priority workload and then turning them down when not in use
Question 5

  1. Google Cloud Storage is a good option for storing data that:
  • May be imported from a bucket into a Hadoop cluster for analysis
  • May be required to be read at some later time (e.g. to load a CSV file into BigQuery)
  1. Relational databases are a good choice when you need
  • Transactional updates on relatively small datasets
  1. Cloud SQL and Cloud Dataproc offer familiar tools (MySQL and Hadoop/Pig/Hive/Spark). What is the value-add provided by Google Cloud Platform?
  • Running it on Google infrastructure offers reliability and cost savings
  • Fully-managed versions of the software offer no-ops

You can enroll in this course on Coursera.

Lab: Recommending Products Using Cloud SQL and Spark 



  1. Which of the below are the core services that make up BigQuery? (choose the correct 2)
  • Query service
  • Storage service

  1. You want to know how many rows are in the BigQuery Public Dataset on San Francisco Bike Shares. What could you do?
  • Run the query below:

    SELECT
      COUNT(*) AS total_trips
    FROM
      `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips`

  • In the BigQuery Web UI, find the table and click the details tab and view the rows
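Both answers above can also be reached from code. The following is a minimal sketch, assuming the `google-cloud-bigquery` Python client: the helper names (`count_rows_sql`, `count_rows_with_query`, `count_rows_from_metadata`) are made up for illustration, and `client` is expected to behave like a `bigquery.Client`. The first helper runs the COUNT(*) query; the second reads the row count from the table metadata, which should be the same figure the Details tab reports.

```python
TABLE = "bigquery-public-data.san_francisco_bikeshare.bikeshare_trips"


def count_rows_sql(table: str) -> str:
    """Build the COUNT(*) query from the first answer for a fully qualified table."""
    return f"SELECT COUNT(*) AS total_trips FROM `{table}`"


def count_rows_with_query(client, table: str) -> int:
    """Run the COUNT(*) query as a query job and return the single result value."""
    rows = client.query(count_rows_sql(table)).result()
    # The result has exactly one row with one column named total_trips.
    return next(iter(rows)).total_trips


def count_rows_from_metadata(client, table: str) -> int:
    """Read the row count from the table's metadata without running a query."""
    return client.get_table(table).num_rows
```

To try it against the real service (assuming the library is installed and credentials are configured), create a client with `from google.cloud import bigquery` and `client = bigquery.Client()`, then pass it together with `TABLE` to either helper.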


  1. True or False: You can query a Google Spreadsheet directly from BigQuery without loading it in first.
  • True (this is a federated query)
  1. You have a taxi service data schema that has three columns: ride_id, ride_timestamp, and ride_status. You want to use BigQuery for reporting, but you don't want to split your table into multiple sub-tables. What native features of BigQuery data types should you explore? (check all that apply)

  • Consider making ride_timestamp an ARRAY of timestamp values so each ride_id row in your table could still be unique and easy to report off of.
  • Consider adding lat / long geographic data points as new columns and using GIS Functions to quickly plot the distances your fleet has travelled.
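To make the two answers above concrete, here is a rough Python sketch of the ideas rather than BigQuery itself. `nest_ride_events` collapses one-row-per-event data into one record per ride_id with array-valued columns, which is the shape an ARRAY column would give you. `haversine_km` is a plain great-circle formula standing in for a GIS distance function; it is not what BigQuery's GIS functions do internally. Both function names and the sample events are invented for illustration.

```python
import math
from collections import defaultdict


def nest_ride_events(events):
    """Collapse (ride_id, ride_timestamp, ride_status) tuples into one record per ride.

    Each record keeps its timestamps and statuses as parallel lists, so ride_id
    stays unique while the full event history is preserved.
    """
    rides = defaultdict(lambda: {"ride_timestamp": [], "ride_status": []})
    for ride_id, timestamp, status in events:
        rides[ride_id]["ride_timestamp"].append(timestamp)
        rides[ride_id]["ride_status"].append(status)
    return dict(rides)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical sample events: two rides, three events in total.
SAMPLE_EVENTS = [
    ("ride-1", "2019-01-01T08:00:00Z", "pickup"),
    ("ride-1", "2019-01-01T08:20:00Z", "dropoff"),
    ("ride-2", "2019-01-01T09:00:00Z", "pickup"),
]
```

Calling `nest_ride_events(SAMPLE_EVENTS)` yields two records, one per ride, with `ride-1` holding both of its timestamps in a single list.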

Complete the following

  1. In ML, a row of data is called a(n) ________ and a column of data is called a(n) _______. We mark one or more columns as ________ which we know for historical data and are trying to predict for future data.

  • instance or observation
  • feature
  • labels
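The terminology above maps directly onto how a table is prepared for training. As a small sketch with invented data (the column names `trip_km`, `pickup_hour`, and `fare` and the helper `split_features_and_labels` are assumptions for illustration), each row is an instance, every column except the chosen one is a feature, and the chosen column is the label:

```python
def split_features_and_labels(rows, label_column):
    """Split a list of row dicts into feature dicts and a list of label values.

    Each row is one instance (observation). The label column is known for
    historical rows and is what a model would be asked to predict for new ones.
    """
    features, labels = [], []
    for row in rows:
        labels.append(row[label_column])
        features.append({name: value for name, value in row.items() if name != label_column})
    return features, labels


# Hypothetical historical taxi rides: fare is the label we want to predict.
HISTORICAL_RIDES = [
    {"trip_km": 2.5, "pickup_hour": 8, "fare": 9.0},
    {"trip_km": 10.0, "pickup_hour": 23, "fare": 31.5},
]

FEATURES, LABELS = split_features_and_labels(HISTORICAL_RIDES, "fare")
```

Here `FEATURES` holds two instances with the features `trip_km` and `pickup_hour`, and `LABELS` holds the two known fares.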

 

