data analyst is working on a dataset

Course 5 – Analyze Data to Answer Questions

Week 1: Organizing Data to Begin Analysis Quiz Answers

GOOGLE DATA ANALYTICS PROFESSIONAL CERTIFICATION

Complete Study Guide

Organizing Data to Begin Analysis INTRODUCTION

Data analyst is working on a dataset must organize data is an essential step in any analysis process. Through sorting and filtering, you can arrange your information to make it easier to identify patterns, spot trends, and draw conclusions. This type of organization of data is required for the Google Data Analytics Professional Certificate offered through Coursera.

In this certification course, you’ll learn how to use spreadsheet and SQL tools to sort and filter your data so that it’s better organized for further exploration. By organizing your data with these techniques, you can more easily explore datasets and understand the relationships between different pieces of information within them. Organizing data at the outset will help make your analysis much more successful.

Learning Objectives

  • Describe what is involved in the data analysis process with reference to goals and key tasks
  • Discuss the importance of organizing data before analysis with references to sorts and filters
  • Describe sorting as it relates to data in a spreadsheet or database with reference to functionality and benefits
  • Demonstrate an understanding of the steps involved in sorting and filtering data through the use of SQL queries

Test your knowledge on understanding data analysis

1. You ask volunteers at a theater production which tasks they have already completed and add that data to a spreadsheet containing all required tasks. You will use the information provided by the volunteers to figure out which tasks still need to be done. This is an example of which phase of analysis?

  • Transform data
  • Get input from others (Correct)
  • Format and adjust data
  • Organize data (into a dataset)

Correct: This is an example of getting input from others. Getting input involves soliciting information from other sources to inform your decisions.

2. You are working with three datasets about voter turnout in your county. First, you identify relationships and patterns between the datasets. Then, you use formulas and functions to make calculations based on your data. This is an example of which phase of analysis?

  • Get input from others
  • Format and adjust data
  • Organize data (into a dataset)
  • Transform data (Correct)

Correct: This is an example of transforming data, which involves identifying relationships and patterns in the data, and making calculations.

3. You are working with a dataset from a local community college. You sort the students alphabetically by last name. This is an example of which phase of analysis?

  • Format and adjust data (Correct)
  • Transform data
  • Organize data (into a dataset)
  • Get input from others

Correct: Sorting a list of students alphabetically is an example of formatting and adjusting data. This is a step analysts take to rearrange the data to make it easier to work with.

Test your knowledge on organizing data

1. Fill in the blank: A data analyst uses _____ to decide which data is relevant to their analysis and which data types and variables are appropriate.

  • database normalization
  • database references
  • database organization (Correct)
  • database relationships

Correct: It is important to document the evolution of a dataset in order to recover data-cleaning errors, inform other users of changes, and determine the quality of the data.

2. A data analyst wants to organize a database to show only the 100 most recent real estate sales in Stamford, Connecticut. How can they do that?

  • The data analyst should add a filter to return only sales in Stamford, Connecticut, then sort the most recent sales at the top of their list. (Correct)
  • The data analyst should filter out sales in Stamford, Connecticut, then sort the most recent sales at the top of their list.
  • The data analyst should filter out sales in Stamford, Connecticut, then sort the least recent sales at the top of their list.
  • The data analyst should add a filter to return only sales in Stamford, Connecticut, then sort the least recent sales at the top of their list.

Correct: The data analyst should add a filter for only sales in Stamford, Connecticut, then sort the most recent sales at the top of their list.

3. You are working with a database table that contains customer data. The country column designates the country where each customer is located. You want to find out which customers are located in Brazil.

You write the SQL query below. Add a WHERE clause that will return only customers located in Brazil.

Course_5_Quiz_2

How many customers are located in Brazil?

  • 5 (Correct)
  • 3
  • 9
  • 7

Correct: The clause WHERE country = “Brazil” will return only customers located in Brazil. The complete query is SELECT * FROM customer WHERE country = “Brazil”. The WHERE clause filters results that meet certain conditions. The WHERE clause includes the name of the column, an equals sign, and the value(s) in the column to include.

There are 5 customers located in Brazil.

Test your knowledge on Sorting in spreadsheets

1. Which spreadsheet menu function is used to sort all data in a spreadsheet by the ranking of a specific sorted column?

  • Sort Range
  • Sort By Rank
  • Sort Data
  • Sort Sheet (Correct)

Correct: Sort Sheet is used to sort all data in a spreadsheet by the ranking of a specific sorted column.

2. In spreadsheets, data analysts can sort a range from the Data tab in the menu or by typing a function directly into an empty cell.

  • True (Correct)
  • False

Correct: Sorting a range and sorting a sheet can both be done from the menu and written as a function. Analysts can work from the Data tab in the menu or type a function directly into an empty cell.

3. An analyst uses =SORT to sort spreadsheet data in descending order. What do they type at the end of their sort function?

  • Z-A
  • DESCEND
  • TRUE
  • FALSE (Correct)

Correct: To sort a spreadsheet in descending order using the SORT function, the analyst types FALSE at the end of their sort function.

4. The last query you ran returned the top 10 counties with the highest birth counts for 2018 only. Remove the LIMIT statement and run the query again. What is the county with the 11th highest birth count?

  • Orange County, CA (Correct)
  • Dallas County, TX
  • Unidentified Counties, KY
  • Miami-Dade County, FL

Correct: The county with the 11th highest birth count in 2018 is Orange County, CA. To find this answer, you ran a query with an ORDER BY clause and a WHERE clause. Going forward, you can use this knowledge of SQL to better organize and structure your data.

5. What was the average temperature at JFK and La Guardia stations between June 1, 2020 and June 30, 2020?

  • 87.671
  • 92.099
  • 72.883 (Correct)
  • 74.909

Correct: The average was 72.883. To find out the average temperature during this time period, you successfully created a new table using a query and ran another query against that table. Going forward, you will be able to use this skill to create tables with specific subsets of your data to query. This will help you draw insights from multiple data sources in the future.

Test your knowledge on sorting in SQL

1. A data analyst wants to sort a list of greenhouse shrubs by price from least expensive to most expensive. Which statement should they use?

  • ORDER BY shrub_price (Correct)
  • WHERE shrub_price ASC
  • WHERE shrub_price
  • ORDER BY shrub_price DESC

Correct: To sort a list of greenhouse shrubs by price from least expensive to most expensive, they should use ORDER BY shrub_price.

2. You are working with a database table that contains data about music genres. You want to sort the genres by name in ascending order. The genres are listed in the genre_name column.

You write the SQL query below. Add an ORDER BY clause that will sort the genres by name in ascending order.

Test knowledge on sorting SQL 1

What genre appears in row 3 of your query result?

  • Easy Listening
  • Classical
  • Alternative
  • Blues (Correct)

Correct: The clause ORDER BY genre_name will sort the genres by name in ascending order. The complete query is SELECT * FROM genre ORDER BY genre_name. The ORDER BY clause tells the database how to organize the data it returns. The ORDER BY clause sorts data in ascending order by default.

The Blues genre appears in row 3 of your query result.

3. You are working with a database table that contains employee data. You want to sort the employees by hire date in descending order. The hire dates are listed in the hire_date column.

You write the SQL query below. Add an ORDER BY clause that will sort the employees by hire date in descending order.

Test knowledge on sorting SQL 2

What employee appears in row 1 of your query result?

  • Nancy Edwards
  • Margaret Park
  • Laura Callahan (Correct)
  • Robert King

Correct: The clause ORDER BY hire_date DESC will sort the employees by hire date in descending order. The complete query is SELECT * FROM employee ORDER BY hire_date DESC. The ORDER BY clause tells the database how to organize the data it returns. The ORDER BY clause sorts data in ascending order by default. The DESC command is used to sort data in descending order.

The employee Laura Callahan appears in row 1 of your query result.

GOOGLE DATA ANALYTICS COURSERA ANSWERS AND STUDY GUIDE

Liking our content? Then don’t forget to add us to your bookmarks so you can find us easily!

Weekly Breakdown | Google Study Guides | Back to Top

Analyze Data to Answer Questions Weekly Challenge 1

1. In the data analysis process, which of the following refers to a phase of analysis? Select all that apply.

  • Organize data into understandable sections (Correct)
  • Visualize the data
  • Format data using sorts and filters (Correct)
  • Get input from others (Correct)

Correct: There are four phases of analysis: organize data, format and adjust data, get input from others, and transform data by observing relationships between data points and making calculations.

2. During which of the four phases of analysis do you compare your data to external sources?

  • Transform data
  • Format and adjust data
  • Get input from others (Correct)
  • Organize data

Correct: Comparing your data to external sources occurs while getting input from others.

3. You are performing a calculation during your analysis of a dataset. Which phase of analysis are you in?

  • Organize data
  • Format and adjust data
  • Transform data (Correct)
  • Get input from others

Correct: TRIM is a function that removes leading, trailing, and repeated spaces in data.

4. Fill in the blank: Filtering involves showing only the data that meets a specific _____ while hiding the rest.

  • model
  • measure
  • criteria (Correct)
  • observation

Correct: Filtering involves showing only the data that meets a specific criteria while hiding the rest.

5. A data analyst is sorting spreadsheet data. They want to make sure that, when they rearrange the data, data across rows is kept together. What technique should they use to sort the data?

  • Sort Together
  • Sort Rows
  • Sort Column
  • Sort Sheet (Correct)

Correct: Sort sheet sorts all of the data in a spreadsheet by a specific sorted column. Data across rows is kept together during the sort.

6. A data analyst uses a function to sort a spreadsheet range between cells H1 and K65. They sort in ascending order by the first column, Column H. What is the syntax they are using?

  • =SORT(H1:K65, 1, FALSE)
  • =SORT(H1:K65, A, FALSE)
  • =SORT(H1:K65, 1, TRUE) (Correct)
  • =SORT(H1:K65, A, TRUE)

Correct: The syntax is =SORT(H1:K65, 1, TRUE). The first part of the function sorts the data in the specified range. The 1 represents the first column. And a TRUE statement sorts in ascending order.

7. You are querying a database that contains data about music. Each album is given an ID number. You are only interested in data related to the album with ID number 6. The album IDs are listed in the album_id column.

You write the SQL query below. Add a WHERE clause that will return only data about the album with ID number 6.

 SELECT 
 *
 FROM 
 track
Course_5_Week_Challenge_1.1

How many tracks are on the album with ID number 6?

  • 20
  • 13 (Correct)
  • 5
  • 8

Correct: The clause WHERE album_id = 6 will return only data about the album with ID number 6. The complete query is SELECT * FROM track WHERE album_ID = 6. The WHERE clause filters results that meet certain conditions. The WHERE clause includes the name of the column, an equals sign, and the value(s) in the column to include.

13 tracks are on the album with ID number 6.

8. You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Chicago. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column.

You write the SQL query below. Add an ORDER BY clause that will sort the invoices by order total in ascending order.

Course_5_Week_Challenge_1.2

What total appears in row 2 of your query result?

  • 1.98 (Correct)
  • 7.96
  • 15.86
  • 5.94

Correct: The clause ORDER BY total will sort the invoices by order total in ascending order. The complete query is SELECT * FROM invoice WHERE billing_city = “Chicago” ORDER BY total. The ORDER BY clause tells the database how to organize the data it returns. The ORDER BY clause sorts data in ascending order by default.

The total 1.98 appears in row 2 of your query result.

9. During which of the four phases of analysis can you find a correlation between two variables?

  • Format and adjust data
  • Organize data
  • Transform data (Correct)
  • Get input from others

Correct: Finding a correlation between two variables occurs while transforming data.

10. Typically, a data analyst uses filters when they want to expand the amount of data they are working with.

  • True
  • False (Correct)

Correct: Typically, a data analyst uses filters when they want to narrow down the amount of data they are working with.

11. A data analyst sorts a spreadsheet range between cells F19 and G82. They sort in ascending order by the second column, Column G. What is the syntax they are using?

  • =SORT(F19:G82, 2, FALSE)
  • =SORT(F19:G82, B, TRUE)
  • =SORT(F19:G82, 2, TRUE) (Correct)
  • =SORT(F19:G82, B, FALSE)

Correct: The correct syntax is =SORT(F19:G82, 2, TRUE). The first part of the function sorts the data in the specified range. The 2 represents the second column. And a TRUE statement sorts in ascending order.

12. Which phase of the data analysis process has the goal of identifying trends and relationships?

  • Analyze (Correct)
  • Process
  • Act
  • Prepare

Correct: The goal of analysis is to identify trends and relationships within that data so you can accurately answer the question you’re asking.

13. Which of the following actions might occur when transforming data? Select all that apply.

  • Recognize relationships in your data (Correct)
  • Eliminate irrelevant info from your data
  • Make calculations based on your data (Correct)
  • Identify a pattern in your data (Correct)

Correct: Transforming data means identifying relationships and patterns between the data, and making calculations based on the data you have.

14. Fill in the blank: Sorting ranks data based on a specific _____ that you select.

  • model
  • calculation
  • observation
  • metric (Correct)

Correct: Sorting ranks data based on a specific metric that you select. This involves arranging the data into a meaningful order to make it easier to understand, analyze, and visualize.

15. A data analyst is sorting data in a spreadsheet. Which tool are they using if all of the data is sorted by the ranking of a specific sorted column and data across rows is kept together?

  • Sort Sheet (Correct)
  • Sort Together
  • Sort Document
  • Sort Rank

Correct: Sort sheet sorts all of the data in a spreadsheet by the ranking of a specific sorted column. Also, data across rows is kept together.

16. You are querying a database that contains data about music. You are only interested in data related to the jazz musician Miles Davis. The names of the musicians are listed in the composer column.

You write the SQL query below. Add a WHERE clause that will return only data about music by Miles Davis.

What track by Miles Davis appears in row 1 of your query result? 

  • So What
  • Summertime
  • Compulsion
  • Now’s The Time (Correct)

Correct: The clause WHERE composer = “Miles Davis” will return only data about music by Miles Davis. The complete query is SELECT * FROM track WHERE composer = “Miles Davis”. The WHERE clause filters results that meet certain conditions. The WHERE clause includes the name of the column, an equals sign, and the value(s) in the column to include.

The track Now’s The Time by Miles Davis appears in row 1 of your query result.

17. A data analyst at a high-tech manufacturer sorts inventory data in a spreadsheet. They sort all data by ranking in the Order Frequency column, keeping together all data across rows. What spreadsheet tool are they using? 

  • Sort Rows
  • Sort Column
  • Sort Together
  • Sort Sheet (CORRECT)

18. Fill in the blank: To filter for all students in the Sophomore table who live in Fairfield County, a data professional uses the _____ clause in SQL. 

  • LIMIT
  • EXCEPT
  • FILTER
  • WHERE (CORRECT)

19. A junior data analyst performs several calculations on a dataset. What phase of analysis is the analyst in? 

  • Get input from others
  • Format and adjust data
  • Organize data
  • Transform data (CORRECT)

20. Which of the following statements accurately describe sorting and filtering? Select all that apply. 

  • Filtering can be performed in spreadsheets, but not SQL databases.
  • Filtering enables data professionals to view the data that is most important. (CORRECT) 
  • Sorting involves arranging data into a meaningful order. (CORRECT)
  • Sorting can be performed in both spreadsheets and SQL databases. (CORRECT)

21. Fill in the blank: During an analysis project, _____ might involve creating new columns in order to prepare the dataset for analysis. 

  • formatting and adjusting data (CORRECT)
  • organizing data
  • getting input from others
  • transforming data

22. Which query will return a list of all construction businesses that have made more than $8 million, in order from the largest number of employees to the fewest? 

  • 1 SELECT * 2 FROM ‘Company_data’ 3 WHERE Business = ‘Construction’, Revenue < 8000000 4 ORDER BY number_of_employees ASC
  • 1 SELECT * 2 FROM ‘Company_data’ 3 WHERE Business = ‘Construction’ 4 AND Revenue > 8000000 5 ORDER BY number_of_employees DSC
  • 1 SELECT * 2 FROM ‘Company_data’ 3 WHERE Business = ‘Construction’ 4 WHERE Revenue < 8000000 5 ORDER BY number_of_employees DSC
  • 1 SELECT * 2 FROM ‘Company_data’ 3 WHERE Business = ‘Construction’ 4 AND Revenue > 8000000 5 ORDER BY number_of_employees ASC (CORRECT)

23. A data professional at a manufacturing company is tasked with identifying which machines are most likely to need repairs. In the analyze phase of the data analysis process, what activities might this involve? Select all that apply.  

  • Prepare a report for the stakeholders
  • Organize a dataset by machine type and performance levels (CORRECT)
  • Get input from colleagues on the data team (CORRECT)
  • Format the data to filter for machines that need the most maintenance (CORRECT)

24. Which function sorts a spreadsheet range between cells K1 and L80 in ascending order by the first column, Column K? 

  • =SORT(K1:L80, A, TRUE)
  • =SORT(K1:L80, A, FALSE)
  • =SORT(K1:L80, 1, TRUE) (CORRECT)
  • =SORT(K1:L80, 1, FALSE)

25. A data analyst determines whether there are any patterns in a dataset. What phase of analysis is the analyst in? 

  • Format and adjust data
  • Organize data
  • Transform data (CORRECT)
  • Get input from others

26. Which function sorts a spreadsheet range between cells C1 and D70 in ascending order by the first column, Column C? 

  • =SORT(C1:D70, A, TRUE)
  • =SORT(C1:D70, 1, FALSE)
  • =SORT(C1:D70, 1, TRUE)(CORRECT)
  • =SORT(C1:D70, A, FALSE)

27. A data analyst at a retail company sorts customer data in a spreadsheet. They sort all data by ranking in the Purchase Amount column, keeping together all data across rows. What spreadsheet tool are they using? 

  • Sort Document
  • Sort Sheet (CORRECT)
  • Sort Together
  • Sort Rank

28. Fill in the blank: To filter for all items in the Products table that are currently in stock, a data professional uses the _____ clause in SQL. 

  • LIMIT
  • FILTER
  • EXCEPT
  • WHERE (CORRECT)

29. Fill in the blank: During an analysis project, _____ might involve converting dates to a consistent format in order to prepare the dataset for analysis. 

  • transforming data
  • formatting and adjusting data
  • getting input from others (CORRECT)

30. Which function sorts a spreadsheet range between cells C1 and D70 in ascending order by the first column, Column C? 

  • =SORT(C1:D70, A, TRUE)
  • =SORT(C1:D70, 1, TRUE)(CORRECT)
  • =SORT(C1:D70, A, FALSE)
  • =SORT(C1:D70, 1, FALSE)

31. A data professional in human resources is tasked with identifying appropriate staff members to manage upcoming projects. In the analyze phase of the data analysis process, what activities might this involve? Select all that apply. 

  • Prepare a report for the stakeholders
  • Get input from other HR data professionals (CORRECT)
  • Format the data to filter for keywords relevant to the upcoming projects (CORRECT)
  • Organize an employee dataset by skills and experience (CORRECT)

32. A data professional at a finance company sorts spreadsheet data. They sort all data by ranking in the Financial Performance column, keeping together all data across rows. What spreadsheet tool are they using? 

  • Sort Rows
  • Sort Together
  • Sort Column
  • Sort Sheet (CORRECT)

33. A data team investigates possible relationships in a dataset. What phase of analysis is the analyst in? 

  • Get input from others
  • Organize data
  • Transform data (CORRECT)
  • Format and adjust data

34.  Which query will return a list of all corn farms that have made more than $4 million, in order from the oldest to the newest business? 

  • SELECT * (CORRECT) FROM ‘farms_directory’ WHERE Crop = ‘Corn’, Revenue < 4000000 ORDER BY Years_in_business ASC
  • SELECT * FROM ‘farms_directory’ WHERE Crop = ‘Corn’ AND Revenue > 4000000 ORDER BY Years_in_business ASC
  • SELECT * FROM ‘farms_directory’ WHERE Crop = ‘Corn’ AND Revenue > 4000000 ORDER BY Years_in_business DSC
  • SELECT * FROM ‘farms_directory’ WHERE Crop = ‘Corn’ WHERE Revenue > 4000000 ORDER BY Years_in_business DSC

35. A data professional in customer service is tasked with identifying customers who are at risk for taking their business to a competitor. In the analyze phase of the data analysis process, what activities might this involve? Select all that apply. 

  • Prepare a report for the stakeholders
  • Request input from other customer service data professionals (CORRECT)
  • Format the data to filter for low customer satisfaction scores (CORRECT)
  • Organize a dataset by customer and purchase history (CORRECT)

36.  Which function sorts a spreadsheet range between cells G1 and H60 in ascending order by the first column, Column G? 

  • =SORT(G1:H60, 1, FALSE)
  • =SORT(G1:H60, A, FALSE)
  • =SORT(G1:H60, 1, TRUE) (CORRECT)
  • =SORT(G1:H60, A, TRUE)

37. Which function sorts a spreadsheet range between cells K1 and L80 in ascending order by the first column, Column K? 

  • =SORT(K1:L80, 1, FALSE)
  • =SORT(K1:L80, A, FALSE)
  • =SORT(K1:L80, 1, TRUE) (CORRECT)
  • =SORT(K1:L80, A, TRUE)

Organizing Data to Begin Analysis CONCLUSION

In this part of the course, you learned about the importance of organizing your data. Sorting and filtering your data can help make it more useful in your analysis.

You explored these processes in both spreadsheets and SQL as you continued to prepare your data for analysis. Now that you understand the importance of Organizing your data, it’s time to join the learning experience on Coursera and continue preparing your dataset for further analysis.