Course 3: Prepare Data for Exploration Quiz Answers

Week 1: Data Types and Structures

GOOGLE DATA ANALYTICS PROFESSIONAL CERTIFICATION

Complete Study Guide

Data Types and Structures introduction

Data collection is a vital part of analytics. Data can be collected in structured or unstructured forms, and it must be properly prepared for analysis. Data types describe the kinds of data that analysts collect and analyze, while data structures describe how the data is organized. For example, a spreadsheet or relational database table holds structured data, while an email message or social media post is unstructured.

Data types include categorical variables (e.g., gender), numerical variables (e.g., age), dates and times, strings, images, and audio files. Data formats such as CSV files allow analysts to store raw data in an easily digestible way for further analysis.

Learning Objectives

  • Explain how data is generated as a part of our daily activities with reference to the types of data generated
  • Explain factors that should be considered when making decisions about data collection
  • Explain the difference between structured and unstructured data
  • Discuss the difference between data and data types
  • Explain the relationship between data types, fields, and values
  • Discuss wide and long data formats with references to organization and purpose

Optional: Familiar with data analytics? Take our diagnostic quiz

1. A data analyst at a construction company is working on a report for a quickly approaching deadline. Why might they choose to analyze only historical data?

  • They enjoy historical references.
  • The project has a very short time frame. (Correct)
  • The data is constantly changing.
  • The data is difficult to predict.

Correct: The most likely reason for choosing to analyze only historical data is that the project has a very short time frame.

2. What are the benefits of data modeling? Select all that apply.

  • Secure data for future use
  • Keep data consistent (Correct)
  • Provide a map of how data is organized (Correct)
  • Make data easier to understand (Correct)

Correct: Data modeling keeps data consistent, provides a map of how data is organized, and makes data easier to understand. Data modeling is the process of creating a model that is used for organizing data elements and how they relate to one another.

3. A group of high school students take a survey that asks, “Are you on an athletic team? Please reply yes or no.” What kind of data is being collected?

  • Boolean (Correct)
  • String
  • Visual
  • Number

Correct: Boolean data would be collected. Boolean data has only two possible values, such as yes or no.

4. A data analyst is evaluating data to determine whether it is good or bad. Which qualities characterize good data? Select all that apply.

  • Cited (Correct)
  • Consequential
  • Comprehensive (Correct)
  • Current (Correct)

Correct: Good data is comprehensive, current, and cited.

5. Imagine that a company uses your personal data as part of a financial transaction. Before it occurs, you are not made aware of the nature and scale of this transaction. What concept of data ethics does this violate?

  • Transaction transparency
  • Openness
  • Consent
  • Currency (Correct)

Correct: This situation violates the concept of currency. The currency concept of data ethics states that individuals should be aware of financial transactions resulting from the use of their personal data and the scale of these transactions.

6. Which of the following are protections afforded by data privacy? Select all that apply.

  • Providing users the right to inspect, update, or correct their own data (Correct)
  • Providing users the right to free access, usage, and sharing of data
  • Preserving a data subject’s information and activity for all data transactions (Correct)
  • Applying standards of right and wrong to the management and usage of data

Correct: The protections of data privacy include preserving a data subject’s information and activity for all data transactions. They also include providing users the right to inspect, update, and correct their own data.

7. Which of the following are uses of relational databases? Select all that apply.

  • Organize numerical data based on relative scale
  • Keep data consistent regardless of where it’s accessed (Correct)
  • Contain and describe a series of tables that can be connected to form relationships (Correct)
  • Present the same information to each collaborator (Correct)

Correct: Relational databases are used to contain and describe a series of tables that can be connected to form relationships. They also present the same information to each collaborator by keeping data consistent regardless of where it’s accessed.

8. Which statements define primary keys and foreign keys and describe their relationship? Select all that apply.

  • A primary key is an identifier that references a column in which each value is unique. (Correct)
  • A foreign key is a field within a table that’s a primary key in another table. (Correct)
  • Primary and foreign keys are two connected identifiers within separate tables in a relational database. (Correct)
  • A primary key is a table containing observational data, and a foreign key is a table that contains the results of the primary key’s analysis.

Correct: A primary key is an identifier that references a column in which each value is unique. A foreign key is a field within a table that’s a primary key in another table. Primary and foreign keys are two connected identifiers within separate tables in a relational database.
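The relationship between primary and foreign keys can be sketched with Python's built-in sqlite3 module. The table names (customers, orders), column names, and rows below are hypothetical and chosen only for illustration; they do not come from the course.

```python
import sqlite3

# Hypothetical tables: customers (primary key: customer_id) and
# orders (foreign key: customer_id, which is the primary key of customers).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total REAL NOT NULL
    )
    """
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.5)],
)

# The shared customer_id column connects the two tables.
rows = conn.execute(
    """
    SELECT c.name, o.order_id, o.total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
    """
).fetchall()

# Every primary key value must be unique, so a duplicate is rejected.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Here customer_id is unique in customers (primary key) and may repeat in orders (foreign key), which is what lets one customer have many orders.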

9. What tasks can data analysts accomplish using metadata? Select all that apply.

  • Combine data from more than one source (Correct)
  • Perform data analyses
  • Evaluate the quality of data (Correct)
  • Interpret the contents of a database (Correct)

Correct: Data analysts use metadata to combine data, evaluate data, and interpret a database. Metadata is data about data; in database management, it helps data analysts understand the contents of the data within a database.

10. A data analyst reviews a spreadsheet of boat auction sales to find the last five sailboats sold in Kentucky. What steps would they take in order to narrow the scope? Select all that apply.

  • Sort by date in ascending order
  • Sort by date in descending order (Correct)
  • Filter out sales in Kentucky
  • Filter out sales outside of Kentucky (Correct)

Correct: The analyst can filter out sales outside of Kentucky and sort by date in descending order.
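The same filter-then-sort steps can be sketched outside a spreadsheet. This is a minimal Python illustration; the records and the field names (boat_type, state, sale_date) are made up for the example.

```python
from datetime import date

# Hypothetical auction records; field names and values are assumptions.
sales = [
    {"boat_type": "sailboat", "state": "Kentucky", "sale_date": date(2023, 1, 10)},
    {"boat_type": "sailboat", "state": "Ohio", "sale_date": date(2023, 6, 2)},
    {"boat_type": "sailboat", "state": "Kentucky", "sale_date": date(2023, 3, 15)},
    {"boat_type": "motorboat", "state": "Kentucky", "sale_date": date(2023, 7, 1)},
    {"boat_type": "sailboat", "state": "Kentucky", "sale_date": date(2023, 2, 20)},
    {"boat_type": "sailboat", "state": "Kentucky", "sale_date": date(2023, 5, 30)},
    {"boat_type": "sailboat", "state": "Kentucky", "sale_date": date(2023, 4, 8)},
    {"boat_type": "sailboat", "state": "Kentucky", "sale_date": date(2023, 6, 21)},
]

# Step 1: filter out everything that is not a sailboat sold in Kentucky.
kentucky_sailboats = [
    s for s in sales if s["state"] == "Kentucky" and s["boat_type"] == "sailboat"
]

# Step 2: sort by date in descending order so the newest sales come first.
newest_first = sorted(kentucky_sailboats, key=lambda s: s["sale_date"], reverse=True)

# Step 3: keep the first five rows, which are the last five sales.
last_five = newest_first[:5]
```

Sorting in ascending order would put the oldest sales at the top instead, which is why descending order is the correct choice in the question.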

11. You are writing a SQL query to filter data from a database that describes trees in Omaha, Nebraska. You want to only display entries for trees that have a diameter of 30 inches. The name of the table you’re using is Nebraska_trees and the name of the column that shows the diameters of the trees is trunk_diameter. What is the correct query syntax that will retrieve and filter data from this table?

  • SELECT Nebraska_trees WHERE trunk_diameter = 30
  • SELECT * FROM trunk_diameter WHERE Nebraska_trees = 30
  • SELECT trunk_diameter = 30 FROM Nebraska_trees
  • SELECT * FROM Nebraska_trees WHERE trunk_diameter = 30 (Correct)

Correct: The correct query is SELECT * FROM Nebraska_trees WHERE trunk_diameter = 30.
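To see the query run, the sketch below executes it against an in-memory SQLite database from Python. The table name and the trunk_diameter column come from the question; the tree_id and species columns and all row values are assumptions added so the example has something to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Nebraska_trees and trunk_diameter are named in the question;
# tree_id, species, and the rows themselves are hypothetical.
conn.execute(
    "CREATE TABLE Nebraska_trees ("
    "tree_id INTEGER PRIMARY KEY, species TEXT, trunk_diameter INTEGER)"
)
conn.executemany(
    "INSERT INTO Nebraska_trees VALUES (?, ?, ?)",
    [
        (1, "Bur oak", 30),
        (2, "Silver maple", 18),
        (3, "Cottonwood", 30),
        (4, "Hackberry", 42),
    ],
)

# SELECT * returns every column, FROM names the table,
# and WHERE keeps only the rows that meet the condition.
matches = conn.execute(
    "SELECT * FROM Nebraska_trees WHERE trunk_diameter = 30"
).fetchall()
```

Only the two rows with a trunk diameter of exactly 30 are returned; the other options in the question fail because they omit FROM, swap the table and column names, or put the condition in the SELECT clause.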

12. Consistent naming conventions describe which properties of a file? Select all that apply.

  • Version (Correct)
  • Content (Correct)
  • Creation date (Correct)
  • File location

Correct: Consistent naming conventions describe the content, creation date, and version of a file.
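One way to apply such a convention consistently is to build file names from the three properties. The pattern below (content, then date as YYYYMMDD, then a two-digit version) is only one possible convention and is an assumption, as is the helper name build_file_name.

```python
from datetime import date


def build_file_name(content: str, created: date, version: int, extension: str = "csv") -> str:
    """Combine content, creation date, and version into one file name.

    The pattern <content>_<YYYYMMDD>_v<NN>.<ext> is a hypothetical convention.
    """
    return f"{content}_{created:%Y%m%d}_v{version:02d}.{extension}"


example_name = build_file_name("SalesReport", date(2023, 1, 15), 2)
print(example_name)  # SalesReport_20230115_v02.csv
```

Note that the file location is not encoded in the name, which matches the quiz answer.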

Test your knowledge on collecting data

1. Which method of data collection is most commonly used by scientists?

  • Interviews
  • Observations (Correct)
  • Questionnaires
  • Surveys

Correct: Observation is the method of data collection most often used by scientists.

2. Organizations such as the U.S. Centers for Disease Control (CDC) often use data collected from hospitals. What kind of data is the CDC using if it is collected by hospitals, then sold to the CDC for its own analysis?

  • Multiple-party data
  • Second-party data (Correct)
  • First-party data
  • Third-party data

Correct: Data collected by hospitals and then sold to the CDC for its own analysis is an example of second-party data.

3. Fill in the blank: In data analytics, a _____ refers to all possible data values in a certain dataset.

  • representation
  • population (Correct)
  • sample
  • source

Correct: In data analytics, a population refers to all possible data values in a certain dataset.
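The difference between a population and a sample can be illustrated with a short Python sketch. The population of 1,000 ID numbers and the sample size of 50 are arbitrary assumptions.

```python
import random

# Hypothetical population: every possible value in the dataset (1,000 IDs).
population = list(range(1, 1001))

# A sample is a subset of the population. A seeded generator keeps the
# example repeatable; the seed value itself is arbitrary.
rng = random.Random(42)
sample = rng.sample(population, k=50)  # draws 50 values without replacement
```

Every value in the sample also appears in the population, but the sample is much smaller and therefore faster to collect and analyze.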

Test your knowledge on data formats and structures

1. Fill in the blank: The running time of a movie is an example of _____ data.

  • nominal
  • qualitative
  • discrete
  • continuous (Correct)

Correct: Running times of movies are an example of continuous data, which is measured and can have almost any numeric value.

2. What are the characteristics of unstructured data? Select all that apply.

  • Has a clearly identifiable structure
  • Is not organized (Correct)
  • May have an internal structure (Correct)
  • Fits neatly into rows and columns

Correct: Unstructured data is not organized, although it may have an internal structure.

3. Structured data enables data to be grouped together to form relations. This makes it easier for analysts to do what with the data? Select all that apply.

  • Rewrite
  • Store (Correct)
  • Search (Correct)
  • Analyze (Correct)

Correct: Structured data that is grouped together to form relations enables analysts to more easily store, search, and analyze the data.

4. Which of the following is an example of unstructured data?

  • Rating of a local favorite restaurant
  • GPS location
  • Contact saved on a phone
  • Email message (Correct)

Correct: An example of unstructured data is an email message. Other examples of unstructured data are video files and social media content.

5. How would you write a function to calculate February’s entertainment expenses for Cable TV, Video Streaming, and Movies in the example spreadsheet?

  • =SUM(B2:C4)
  • SUM(C2:C6)
  • SUM(B2:C6)
  • =SUM(C2:C4) (Correct)

Correct: The correct way to write a SUM function that calculates February’s entertainment expenses for Cable TV, Video Streaming, and Movies is =SUM(C2:C4). To write this function, you took the relevant range of cells and put them in the proper SUM function syntax. Going forward, you can use this knowledge of functions to interact with spreadsheet data and make dynamic sheets that will aid you in the future.
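The example spreadsheet is not reproduced here, so the sketch below assumes a layout: row 1 is a header, column B holds January, column C holds February, and rows 2 to 4 hold Cable TV, Video Streaming, and Movies. All cell values and the sum_range helper are hypothetical; the point is only to show what the range C2:C4 covers.

```python
# Hypothetical sheet contents keyed by cell reference.
sheet = {
    "A1": "Expense", "B1": "January", "C1": "February",
    "A2": "Cable TV", "B2": 50, "C2": 55,
    "A3": "Video Streaming", "B3": 15, "C3": 15,
    "A4": "Movies", "B4": 30, "C4": 20,
    "A5": "Groceries", "B5": 200, "C5": 210,
    "A6": "Utilities", "B6": 120, "C6": 115,
}


def sum_range(cells: dict, column: str, first_row: int, last_row: int) -> int:
    """Add the values in one column from first_row to last_row, inclusive."""
    return sum(cells[f"{column}{row}"] for row in range(first_row, last_row + 1))


# Equivalent of =SUM(C2:C4): February only, entertainment rows only.
february_entertainment = sum_range(sheet, "C", 2, 4)
```

Under this assumed layout, a range ending at C6 would also add the non-entertainment rows, and a range starting in column B would add January as well.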

6. Which statements are true about the two penguin datasets in the Dive into dplyr (tutorial #1) notebook? Select all that apply.

  • In penguins_lter.csv, the column Individual ID cannot be sorted.
  • penguins_size.csv has 7 columns. (Correct)
  • In penguins_lter.csv, the highest value in the column Sample Number is 152. (Correct)
  • In both datasets, the number of columns is the same.

Correct: The penguins_size.csv has 7 columns. In penguins_lter.csv, the highest value in the column Sample Number is 152. To learn about the penguin datasets, you used an interactive notebook’s data viewing feature. Going forward, you can use interactive notebooks to examine and describe data. This is an important skill that will help you complete data projects in the future.

Test your knowledge on data types, fields, and values

1. Fill in the blank: Internet search engines are an everyday example of how Boolean operators are used. The Boolean operator _____ expands the number of results when used in a keyword search.

  • OR (Correct)
  • AND
  • WITH
  • NOT

Correct: The Boolean operator OR expands the number of results when used in a keyword search.
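The effect of each Boolean operator on a keyword search can be sketched with Python sets. The four short documents and the matching_ids helper are invented for the example, and real search engines are far more sophisticated than this.

```python
# Hypothetical mini search index: document ID -> text.
documents = {
    1: "running shoes for women",
    2: "trail running tips",
    3: "hiking shoes on sale",
    4: "marathon training plan",
}


def matching_ids(keyword: str) -> set:
    """Return the IDs of documents that contain the keyword as a whole word."""
    return {doc_id for doc_id, text in documents.items() if keyword in text.split()}


running = matching_ids("running")
shoes = matching_ids("shoes")

either = running | shoes             # OR: matches either keyword, so results expand
both = running & shoes               # AND: must match both keywords, so results narrow
running_not_shoes = running - shoes  # NOT: excludes documents that mention shoes
```

OR returns three documents here, while AND and NOT each return only one, which is why OR is the operator that expands the result set.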

2. Which of the following statements accurately describes a key difference between wide and long data?

  • Wide data subjects can have data in multiple columns. Long data subjects can have multiple rows that hold the values of subject attributes. (Correct)
  • Wide data subjects can have multiple rows that hold the values of subject attributes. Long data subjects can have data in multiple columns.
  • Every wide data subject has multiple columns. Every long data subject has data in a single column.
  • Every wide data subject has a single column that holds the values of subject attributes. Every long data subject has multiple columns.

Correct: Wide data subjects can have data in multiple columns. Long data subjects can have multiple rows that hold the values of subject attributes.
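The difference is easier to see when the same records are shown in both formats. The following Python sketch converts wide rows to long rows; the student names, subjects, and scores are made up, and wide_to_long is a hypothetical helper rather than a library function.

```python
# Wide format: one row per subject (student), one column per attribute.
wide_rows = [
    {"student": "Ana", "math": 90, "science": 85},
    {"student": "Ben", "math": 72, "science": 88},
]


def wide_to_long(rows: list, id_field: str) -> list:
    """Turn each non-ID column into its own row with a variable and a value."""
    long_rows = []
    for row in rows:
        for field, value in row.items():
            if field != id_field:
                long_rows.append(
                    {id_field: row[id_field], "variable": field, "value": value}
                )
    return long_rows


# Long format: each student now appears in multiple rows,
# one per attribute, with the attribute name stored as data.
long_rows = wide_to_long(wide_rows, "student")
```

Two wide rows with two attribute columns become four long rows, each holding one value plus the context for that value.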

3. What does data transformation enable data analysts to accomplish?

  • Inspect the data for accuracy
  • Change the structure of the data (Correct)
  • Restore the data after it has been lost
  • Retrieve the data faster

Correct: Data transformation enables data analysts to change the structure of data.

GOOGLE DATA ANALYTICS COURSERA ANSWERS AND STUDY GUIDE

Prepare Data for Exploration Weekly Challenge 1

1. A data analyst is working on an urgent traffic study. As a result of the short time frame, which type of data are they most likely to use?

  • Unclean
  • Theoretical
  • Personal
  • Historical (Correct)

Correct: As a result of the short time frame, they are most likely to use historical data.

2. Which of the following is an example of continuous data?

  • Movie budget.
  • Movie run time. (Correct)
  • Leading actors in movie.
  • Box office returns.

Correct: Movie run time is an example of continuous data.

3. Nominal qualitative data has a set order or scale.

  • True
  • False (Correct)

Correct: Nominal qualitative data does not have a set order or scale.

4. Which of the following is a benefit of internal data?

  • Internal data is less likely to need cleaning.
  • Internal data is less vulnerable to biased collection.
  • Internal data is the only data relevant to the problem.
  • Internal data is more reliable and easier to collect. (Correct)

Correct: A benefit of internal data is that it’s more reliable and easier to collect than external data.

5. Structured data is likely to be found in which of the following formats? Select all that apply.

  • Audio file
  • Digital photo
  • Spreadsheet (Correct)
  • Table (Correct)

Correct: Structured data is likely to be found in a table or spreadsheet.

6. Which of the following values are examples of a Boolean data type? Select all that apply.

  • Yes, no, or unsure
  • Yes or no (Correct)
  • One, two, or three
  • True or false (Correct)

Correct: True or false and yes or no are examples of a Boolean data type.
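A rough check for whether a column of survey answers is Boolean can be written in a few lines of Python. The looks_boolean helper and the list of accepted value pairs are assumptions made for this sketch.

```python
# Hypothetical pairs of values that this sketch treats as Boolean.
BOOLEAN_PAIRS = [{"yes", "no"}, {"true", "false"}]


def looks_boolean(values) -> bool:
    """Return True when every value belongs to one two-value pair."""
    distinct = {str(v).strip().lower() for v in values}
    return bool(distinct) and any(distinct <= pair for pair in BOOLEAN_PAIRS)


yes_no = looks_boolean(["Yes", "no", "YES"])         # only two possible values
true_false = looks_boolean([True, False, True])      # only two possible values
with_unsure = looks_boolean(["yes", "no", "unsure"])  # three possible values
counts = looks_boolean(["one", "two", "three"])       # three possible values
```

The first two columns pass the check and the last two do not, matching the answer choices above.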

7. The following is a selection from a spreadsheet:

[Image: Table_Weekly_Challenge_1_Course_3 (spreadsheet selection not reproduced)]

What kind of data format does it contain?

  • Narrow
  • Wide (Correct)
  • Long
  • Short

Correct: The selection from the spreadsheet contains wide data.

8. Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.

  • True (Correct)
  • False

Correct: Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.

9. Which of the following questions collect nominal qualitative data? Select all that apply.

  • Have you heard of our frequent diner program? (Correct)
  • How likely are you to recommend this restaurant to a friend?
  • Is this your first time dining at this restaurant? (Correct)
  • Did anyone recommend our restaurant to you today? (Correct)

Correct: “Did anyone recommend our restaurant to you today?”, “Have you heard of our frequent diner program?”, and “Is this your first time dining at this restaurant?” are questions that collect nominal qualitative data.

10. A social media post is an example of structured data.

  • True
  • False (Correct)

Correct: A social media post is an example of unstructured data.

11. A Boolean data type must have a numeric value.

  • True
  • False (Correct)

Correct: A Boolean data type does not need numeric values. It can be expressed in different ways, such as true or false, or yes or no, but it can only have two possible values.

12. In long data, separate columns contain the values and the context for the values, respectively. What does each column contain in wide data?

  • A specific data type
  • A unique data variable (Correct)
  • A specific constraint
  • A unique format

Correct: In wide data, each column contains a unique data variable. In long data, separate columns contain the values and the context for the values, respectively.

13. A data analyst is working in a spreadsheet application. They use Save As to change the file type from .XLS to .CSV. This is an example of a data transformation.

  • True (Correct)
  • False

Correct: A data analyst using Save As to change a file type from .XLS to .CSV is an example of a data transformation.
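Converting between file formats can also be done in code. Reading .XLS files requires a third-party library, so this sketch substitutes a CSV-to-JSON conversion using only Python's standard library. The rows reuse two entries from the spreadsheet selection shown later in this guide; the lowercase header names are an assumption.

```python
import csv
import io
import json

# CSV text standing in for a file on disk.
csv_text = (
    "name,age,occupation\n"
    "Agnes Shipton,44,Entrepreneur\n"
    "Henry Sing,36,Editor\n"
)

# Read each CSV row into a dictionary keyed by the header names.
records = list(csv.DictReader(io.StringIO(csv_text)))

# Write the same records in a different format (JSON).
json_text = json.dumps(records, indent=2)
```

The values are unchanged, but the structure is different: rows and columns become a list of key-value objects. Note that csv.DictReader reads every value as a string, so the ages stay text unless they are converted explicitly.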

14. If you have a short time frame for data collection and need an answer immediately, you likely will have to use historical data.

  • True (CORRECT)
  • False

Correct: If you have a short time frame for data collection and need an answer immediately, you likely will have to use historical data.

15. Continuous data is measured and has a limited number of values.

  • True
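  • False (CORRECT)

Correct: Continuous data is measured and can have almost any numeric value, so it is not limited to a fixed number of values.
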
16. Internal data is more reliable because it’s clean.

  • True
  • False (CORRECT)

Correct: Internal data is more reliable because it lives within a company’s own systems.

17. A social media post is an example of structured data.

  • True
  • False (CORRECT)

Correct: A social media post is an example of unstructured data.

18. A data analyst at a book publisher is working on an urgent report for executives. They are using only historical data. What is the most likely reason for choosing to analyze only historical data?

  • The data is constantly changing
  • There is plenty of time to research historical data
  • The project has a very short time frame (CORRECT)
  • The data is unknown

Correct: The most likely reason for choosing to analyze only historical data is that a project has a very short time frame.

19. Which of the following is an example of continuous data?

  • Box office returns
  • Movie run time (CORRECT)
  • Movie budget
  • Leading actors in movie

Correct: Movie run time is an example of continuous data.

20. Why is internal data considered more reliable and easier to collect than external data?

  • Internal data circumvents privacy restrictions.
  • Internal data has much larger sample sizes.
  • Internal data lives within a company’s own systems. (CORRECT)
  • Internal data comes from people you know.

Correct: Internal data is considered more reliable and easier to collect than external data because it lives within a company’s own systems.

21. Which of the following is an example of structured data?

  • Digital photo
  • Relational database (CORRECT)
  • Audio file
  • Video file

Correct: A relational database is an example of structured data.

22. In long data, separate columns contain the values and the context for the values, respectively. What does each column contain in wide data?

  • A specific data type
  • A unique format
  • A unique data variable (CORRECT)
  • A specific constraint

Correct: In wide data, each column contains a unique data variable. In long data, separate columns contain the values and the context for the values, respectively.

23. Which of the following questions collects nominal qualitative data?

  • On a scale of 1-10, how would you rate your service today?
  • Is this your first time dining at this restaurant? (CORRECT)
  • How many times have you dined at this restaurant?
  • How many people do you usually dine with?

Correct: “Is this your first time dining at this restaurant?” is a question that collects nominal qualitative data.

24. Nominal qualitative data has a set order or scale.

  • True
  • False (CORRECT)

Correct: Nominal qualitative data does not have a set order or scale.

25. Structured data is likely to be found in which of the following formats? Select all that apply.

  • Audio file
  • Digital photo
  • Table (CORRECT)
  • Spreadsheet (CORRECT)

Correct: Structured data is likely to be found in a table or spreadsheet.

26. Which of the following are examples of discrete data? Select all that apply.

  • Movie running time
  • Number of actors in movie (CORRECT)
  • Box office returns (CORRECT)
  • Movie budget (CORRECT)

Correct: The number of actors in a movie, box office returns, and the movie budget are examples of discrete data.

27. Fill in the blank: Data transformation enables data analysts to change the _____ of the data.

  • value
  • structure (CORRECT)
  • accuracy
  • meaning

Correct: Data transformation enables data analysts to change the structure of the data.

28. Why is internal data considered more reliable and easier to collect than external data?

  • Internal data has much larger sample sizes.
  • Internal data lives within a company’s own systems. (CORRECT)
  • Internal data comes from people you know.
  • Internal data circumvents privacy restrictions.

Correct: Internal data is considered more reliable and easier to collect than external data because it lives within a company’s own systems.

29. Fill in the blank: A Boolean data type can have _____ possible values.

  • 10
  • three
  • two (CORRECT)
  • infinite

Correct: A Boolean data type can have two possible values.

30. The following is a selection from a spreadsheet:

Name              Age   Occupation
Agnes Shipton     44    Entrepreneur
Ronaldo Vincent   23    Accountant
Henry Sing        36    Editor
Krishna Bowling   62    Graphic designer

What kind of data format does it contain?

  • Long
  • Wide (CORRECT)
  • Short
  • Narrow

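Correct: The selection from the spreadsheet contains wide data. Each person appears in a single row, and each column holds a different attribute.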

31. Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.

  • True (CORRECT)
  • False

Correct: Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.

Data Types and Structures Conclusion

In summary, data collection is a vital part of analytics. Data can be collected in structured or unstructured forms, and it must be properly prepared for analysis. Data types are used to describe the kinds of data that analysts collect and analyze, while data structures describe how the data is organized.

This material belongs to Course 3, Prepare Data for Exploration, of the Google Data Analytics Professional Certificate, which is offered on Coursera. To learn more about how data is collected and prepared for analysis, we suggest working through the rest of the course.