# Course 5 – Analyze Data to Answer Questions

## Aggregating Data for Analysis INTRODUCTION

Aggregating data for analysis is an essential part of the Google Data Analytics Professional Certificate on Coursera. This part of the course explores how to combine data from multiple sources, whether they are cells in a spreadsheet or multiple database tables.

You’ll learn the functions, procedures, and syntax needed to aggregate data in order to gain insights and complete business objectives. Aggregation summarizes data from several places into one view, which helps you make better-informed decisions and turn analysis into actionable plans.

## Learning Objectives

• Demonstrate an understanding of functions and procedures that may be used to combine data from multiple cells in a spreadsheet
• Demonstrate an understanding of functions and syntax to create SQL queries for combining data from multiple database tables
• Use VLOOKUP to query data, trim data, convert text data to numeric data, and create a summary table from queried information

## Hands-On Activity: Combine multiple pieces of data

### 1. Imagine the employee Anika Patel asks you to confirm her pay rate. Without using the pivot table, which VLOOKUP function would return her pay rate based on the imported data on Sheet1?

• =VLOOKUP(B19, B15:J19, 9, false) (Correct)
• =VLOOKUP(B20, B15:J20, 9, false)
• =VLOOKUP(B19, B15:J19, 9, true)
• =VLOOKUP(B19, B15:J19, 8, false)

Correct: The function =VLOOKUP(B19, B15:J19, 9, false) would return her pay rate. B19 is the search key, B15:J19 is the table array, 9 is the number of the column within that array from which the value is returned, and false instructs the function to return an exact match.

## Test your knowledge on VLOOKUP

### 1. To change a text string in spreadsheet cell F8 to a numerical value, what is the correct function?

• =MATCH(F8)
• =NUM(F8)
• =CONVERT(F8)
• =VALUE(F8) (Correct)

Correct: To change the text string in spreadsheet cell F8 to a numerical value, the correct syntax is =VALUE(F8). Within the parentheses, the VALUE syntax must include a reference to the specific cell whose value the function should convert.
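To illustrate why the conversion matters, here is a minimal Python sketch. The `value` helper is a hypothetical stand-in for the spreadsheet function, not its actual implementation, and the cell contents are invented for the example.

```python
def value(text):
    """Rough stand-in for spreadsheet VALUE: parse a text string as a number."""
    return float(text.strip())

# Hypothetical cells that were imported as text rather than numbers.
cells = ["1200", "350.5", " 49 "]

# Joining text just concatenates the characters; it does not add anything up.
text_joined = "".join(c.strip() for c in cells)

# After conversion, the cells behave like numbers and can be summed.
numeric_total = sum(value(c) for c in cells)
```

Here `text_joined` is the string "1200350.549", while `numeric_total` is the number 1599.5.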

### 2. What is the purpose of an absolute reference within a function, such as “\$C\$3”?

• To remove unnecessary instructions from a formula or function
• To make formulas and functions unconditional
• To represent missing values in a formula or function
• To lock rows and columns so they won’t change when a function is copied (Correct)

Correct: The purpose of an absolute reference is to lock the reference to a row or column so values won’t change when a function is copied.
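The effect of the dollar signs can be sketched in Python. This is a simplified model of what happens when a formula is copied to another cell, under the assumption that references look like A2 or \$F\$2; the `copy_formula` helper and the sample formulas are hypothetical and do not cover every reference style a real spreadsheet supports.

```python
import re

def col_to_num(col):
    """Convert a column label such as 'A' or 'AB' to a 1-based number."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def num_to_col(n):
    """Convert a 1-based column number back to a label."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

# Optional $, column letters, optional $, row digits.
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def copy_formula(formula, rows_down, cols_right):
    """Shift relative references as if the formula were copied elsewhere."""
    def shift(match):
        col_lock, col, row_lock, row = match.groups()
        if not col_lock:  # column is relative, so it moves
            col = num_to_col(col_to_num(col) + cols_right)
        if not row_lock:  # row is relative, so it moves
            row = str(int(row) + rows_down)
        return f"{col_lock}{col}{row_lock}{row}"
    return REF.sub(shift, formula)

# Copy each formula one row down.
relative = copy_formula("=VLOOKUP(A2, F2:G10, 2, FALSE)", 1, 0)
absolute = copy_formula("=VLOOKUP(A2, $F$2:$G$10, 2, FALSE)", 1, 0)
```

In `relative`, the table array drifts to F3:G11 along with the search key. In `absolute`, only the search key moves to A3 and the locked table array stays where it was.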

### 3. In VLOOKUP, TRUE tells the function to search for exact matches, and FALSE tells the function to look for approximate matches.

• True
• False (Correct)

Correct: In VLOOKUP, TRUE tells the function to search for approximate matches, and FALSE tells the function to look for exact matches.
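The difference between the two modes can be sketched in Python. The `vlookup` function below is a hypothetical emulation written for this example, and the grading table is invented; it assumes that approximate matching picks the last row whose key is less than or equal to the search key in a first column sorted in ascending order.

```python
from bisect import bisect_right

def vlookup(search_key, table, index, is_sorted=True):
    """Rough emulation of VLOOKUP over a list of rows.

    table: list of rows; the first item of each row is the lookup column.
    index: 1-based column number to return, as in a spreadsheet.
    is_sorted: False means exact match, True means approximate match.
    """
    if not is_sorted:
        # Exact match: return from the first row whose key equals search_key.
        for row in table:
            if row[0] == search_key:
                return row[index - 1]
        return "#N/A"
    # Approximate match: use the last row whose key is <= search_key.
    keys = [row[0] for row in table]
    pos = bisect_right(keys, search_key)
    if pos == 0:
        return "#N/A"
    return table[pos - 1][index - 1]

# Hypothetical table: minimum score and the letter grade it earns.
grades = [[0, "F"], [60, "D"], [70, "C"], [80, "B"], [90, "A"]]

exact_hit = vlookup(80, grades, 2, is_sorted=False)
exact_miss = vlookup(85, grades, 2, is_sorted=False)
approx = vlookup(85, grades, 2, is_sorted=True)
```

With FALSE, a score of 85 finds nothing because no row has exactly that key. With TRUE, the same score falls back to the 80 row and returns "B".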

### 4.1. To search for the population of Nigeria, what is the correct VLOOKUP syntax?

• =VLOOKUP(Nigeria, A2:C10, 3, true)
• =VLOOKUP(Nigeria, A2:C10, 3, false)
• =VLOOKUP("Nigeria", A2:C10, 2, false) (Correct)
• =VLOOKUP(Nigeria, A2,C10, 2, true)

Correct: To search for the population of Nigeria, the syntax is =VLOOKUP("Nigeria", A2:C10, 2, false). "Nigeria" is the reference. A2:C10 is the table array. The 2 indicates the position of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 4.2. To search for the height of the building in Mecca, what is the correct VLOOKUP syntax?

• =VLOOKUP(Mecca, A2:D7, 2, true)
• =VLOOKUP(Mecca, A2:D7, 2, false)
• =VLOOKUP(Mecca, A2,D7, 3, true)
• =VLOOKUP("Mecca", A2:D7, 3, false) (Correct)

Correct: To search for the height of the building in Mecca, the correct syntax is =VLOOKUP("Mecca", A2:D7, 3, false). "Mecca" is the reference. A2:D7 is the table array. The 3 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

## Hands-on activity: Queries for joins

### 1. In the last query, you use a LEFT JOIN instead of an INNER JOIN to find the correct information. Beneath the query results, you’ll find that the number of rows in your joined table is 281. If you rerun the query with an INNER JOIN instead of a LEFT JOIN, how many rows would it return?

• 274 (Correct)
• 281
• 301
• 324

Correct: The number of rows returned by an INNER JOIN is 274. When you run the query with an INNER JOIN instead of a LEFT JOIN, you exclude universities without mascots and return fewer rows of data. Knowing which JOIN to use is very important for analyzing data. Going forward, you can use your knowledge of JOINs to properly combine data from multiple tables.
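The row-count difference can be reproduced on a small scale with Python's built-in `sqlite3` module. The table names, column names, and rows below are invented for this sketch and are not the activity's dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE universities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE mascots (university_id INTEGER, mascot TEXT);
    INSERT INTO universities VALUES (1, 'Alpha U'), (2, 'Beta U'), (3, 'Gamma U');
    -- Beta U deliberately has no mascot row.
    INSERT INTO mascots VALUES (1, 'Owl'), (3, 'Bear');
""")

# LEFT JOIN keeps every university, filling missing mascots with NULL.
left_rows = conn.execute("""
    SELECT u.name, m.mascot
    FROM universities AS u
    LEFT JOIN mascots AS m ON u.id = m.university_id
    ORDER BY u.id
""").fetchall()

# INNER JOIN keeps only universities that have a matching mascot row.
inner_rows = conn.execute("""
    SELECT u.name, m.mascot
    FROM universities AS u
    INNER JOIN mascots AS m ON u.id = m.university_id
    ORDER BY u.id
""").fetchall()
```

`left_rows` has three rows, including `('Beta U', None)`, while `inner_rows` has two. The gap between the two counts is the number of universities without a mascot, which mirrors the 281 versus 274 result in the activity.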

## Test your knowledge on using joins to aggregate data

### 1. A data analyst wants to retrieve only records from a database that have matching values in two different tables. Which JOIN function should they use?

• INNER JOIN (Correct)
• RIGHT JOIN
• LEFT JOIN
• OUTER JOIN

Correct: To retrieve only records from a database that have matching values in two different tables, the analyst should use INNER JOIN.

### 2. You are writing a SQL query to instruct a database to count values in a specified range. You only want to count each value once, even if it appears multiple times. Which function should you include in your query?

• COUNT DISTINCT (Correct)
• COUNT VALUES
• COUNT
• COUNT RANGE

Correct: To tell a database to count each distinct value in a specified range only once, the analyst should use COUNT DISTINCT in their query.
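A short sketch with Python's `sqlite3` module shows the difference between counting all values and counting distinct ones. The `orders` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT);
    INSERT INTO orders VALUES
        (1, 'Ana'), (2, 'Ben'), (3, 'Ana'), (4, 'Cy'), (5, 'Ben');
""")

# COUNT counts every non-NULL value; COUNT(DISTINCT ...) counts each value once.
total_orders, unique_customers = conn.execute(
    "SELECT COUNT(customer), COUNT(DISTINCT customer) FROM orders"
).fetchone()
```

Five orders were placed, but only three different customers placed them, so `total_orders` is 5 and `unique_customers` is 3.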

### 3. A data analyst wants to temporarily name a column in their query to make it easier to read and write. What technique should they use?

• Filtering
• Aliasing (Correct)
• Tagging
• Naming

Correct: To temporarily name a column in a query to make it easier to read and write, the analyst should use aliasing.
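As a small illustration, the sketch below aliases both a table and a column using Python's `sqlite3` module. The `operations` table, its columns, and its rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operations (operation_id INTEGER, operation_name TEXT);
    INSERT INTO operations VALUES (1, 'Packing'), (2, 'Shipping');
""")

# "ops" is a temporary name for the table; "name" is a temporary column name.
cursor = conn.execute("""
    SELECT ops.operation_name AS name
    FROM operations AS ops
    ORDER BY ops.operation_id
""")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()
```

The result column is reported as `name`, but the alias lasts only for this query. The table and column keep their original names in the database.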

## Test your knowledge on working with subqueries

### 1. Which of the following queries contain subqueries? Select all that apply.

Correct: The three queries with statements in parentheses contain subqueries.

### 2. Fill in the blank: A data analyst uses aliasing to make it easier to read and write a query. Aliasing involves temporarily _____ a table or column in a query.

• Naming (Correct)
• removing
• copying
• hiding

Correct: Aliasing involves temporarily naming a table or column in a query.

### 3. When working with subqueries, the outer query executes first.

• True
• False (Correct)

Correct: The inner query executes first, then the results are passed on to the outer query to use.
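The order of evaluation can be pictured with a sketch in Python's `sqlite3` module. The `sales` table and its values are hypothetical; the inner query is also run on its own here only to show the value it hands to the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (store TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('North', 100), ('South', 250), ('East', 400), ('West', 50);
""")

# The inner query on its own: the average sale amount.
average_amount = conn.execute("SELECT AVG(amount) FROM sales").fetchone()[0]

# The outer query compares each row against the inner query's result.
above_average = conn.execute("""
    SELECT store
    FROM sales
    WHERE amount > (SELECT AVG(amount) FROM sales)
    ORDER BY store
""").fetchall()
```

The average is 200.0, so the outer query returns only the East and South stores.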


## Analyze Data to Answer Questions Weekly Challenge 3

### 1. In data analytics, what is data aggregation?

• The process of modifying data in order to make it suitable for analysis.
• The process of ensuring a company’s data is properly stored, managed, and maintained.
• The process of moving certain data points to a higher rank or position.
• The process of gathering data from multiple sources and combining it into a single, summarized collection. (Correct)

Correct: Data aggregation is the process of gathering data from multiple sources and combining it into a single, summarized collection.

### 2. A data analyst wants to be sure all of the numbers in a spreadsheet are numeric. What function should they use to convert text to numeric values?

• PROCESS
• CONVERT
• VALUE (Correct)
• EXCHANGE

Correct: The analyst should use the VALUE function to convert text to numeric values.

### 3. When using VLOOKUP, there are some common limitations that data analysts should be aware of. Identify these limitations. Select all that apply.

• VLOOKUP can only return a value from the data to the right of the column of the matched value. (Correct)
• VLOOKUP only returns the first match it finds, even if there are many possible matches. (Correct)
• VLOOKUP only returns matches it finds while searching through a row.
• VLOOKUP can only return a value from the data to the left of the column it’s typed into.

Correct: One limitation of VLOOKUP is that it only returns the first match it finds, even if there are many possible matches. Another is that it is only able to return a value from the data to the right of the column of the matched value.
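Both limitations can be seen in a minimal Python sketch. The `vlookup_exact` function is a hypothetical emulation of exact-match VLOOKUP, and the staff rows are invented for the example.

```python
def vlookup_exact(search_key, table, index):
    """Rough emulation of VLOOKUP(search_key, table, index, FALSE)."""
    for row in table:
        # Only the first (leftmost) column of the table is ever searched.
        if row[0] == search_key:
            # The scan stops at the first matching row.
            return row[index - 1]
    return "#N/A"

# Hypothetical rows: name, department, pay rate. "Patel" appears twice.
staff = [
    ["Patel", "Sales", 31.50],
    ["Kim", "Finance", 40.00],
    ["Patel", "Support", 28.00],
]

# First match only: the second Patel row is never reached.
first_only = vlookup_exact("Patel", staff, 3)

# Looking up by department fails because the name sits to its left.
by_department = vlookup_exact("Finance", staff, 1)
```

`first_only` is 31.5, not 28.0, and `by_department` is "#N/A" because "Finance" never appears in the leftmost column.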

### 4. Fill in the blank: When writing a function, a data analyst wraps a table array in dollar signs. This is an _____, which is used to lock the array so rows and columns don’t change if the function is copied.

• absolute reference (Correct)
• accurate reference
• authentic reference
• arbitrary reference

Correct: Wrapping a table array in dollar signs creates an absolute reference, which locks the array so rows and columns don’t change if the function is copied.

### 5.1. The following is a selection from a spreadsheet:

To search for the population of Brazil, what is the correct VLOOKUP syntax?

• =VLOOKUP(Brazil, A2,B10, 3, false)
• =VLOOKUP("Brazil", A2:B10, 2, false) (Correct)
• =VLOOKUP(Brazil, A2:B10, 3, false)
• =VLOOKUP(Brazil, A2:B10, 2, false)

Correct: To search for the population of Brazil, the syntax is =VLOOKUP("Brazil", A2:B10, 2, false). "Brazil" is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 5.2. The following is a selection from a spreadsheet:

To search for the population of Pakistan, what is the correct VLOOKUP syntax?

• =VLOOKUP(Pakistan, A2,B10, 3, false)
• =VLOOKUP("Pakistan", A2:B10, 2, false) (Correct)
• =VLOOKUP(Pakistan, A2:B10, 3, false)
• =VLOOKUP(Pakistan, A2:B10, 2, false)

Correct: To search for the population of Pakistan, the syntax is =VLOOKUP("Pakistan", A2:B10, 2, false). "Pakistan" is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 5.3. The following is a selection from a spreadsheet:

To search for the population of Nigeria, what is the correct VLOOKUP syntax?

• =VLOOKUP(Nigeria, A2,B10, 3, false)
• =VLOOKUP("Nigeria", A2:B10, 2, false) (Correct)
• =VLOOKUP(Nigeria, A2:B10, 3, false)
• =VLOOKUP(Nigeria, A2:B10, 2, false)

Correct: To search for the population of Nigeria, the syntax is =VLOOKUP("Nigeria", A2:B10, 2, false). "Nigeria" is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 5.4. The following is a selection from a spreadsheet:

To search for the population of Indonesia, what is the correct VLOOKUP syntax?

• =VLOOKUP(Indonesia, A2,B10, 3, false)
• =VLOOKUP("Indonesia", A2:B10, 2, false) (Correct)
• =VLOOKUP(Indonesia, A2:B10, 3, false)
• =VLOOKUP(Indonesia, A2:B10, 2, false)

Correct: To search for the population of Indonesia, the syntax is =VLOOKUP("Indonesia", A2:B10, 2, false). "Indonesia" is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 6. An INNER JOIN is a function that returns records with matching values in two or more tables. An OUTER JOIN is a function that combines RIGHT and LEFT JOIN to return all matching records in both tables.

• True (Correct)
• False

Correct: An INNER JOIN is a function that returns records with matching values in two or more tables. An OUTER JOIN is a function that combines RIGHT and LEFT JOIN to return all matching records in both tables.

### 7. The COUNT DISTINCT function includes repeating values when returning values in a specified range.

• True
• False (Correct)

Correct: The COUNT DISTINCT function does not include repeating values when returning values in a specified range.

### 8. Which of the following terms describe a subquery? Select all that apply.

• Inner select (Correct)
• Inner query (Correct)
• Small query
• Nested query (Correct)

Correct: A subquery can also be called an inner query, inner select, or nested query.

### 9. While using VLOOKUP, you encounter an error because some of your spreadsheet values have leading and trailing spaces. What function should you use to eliminate these spaces?

• TRIM (CORRECT)
• NOSPACE
• CUT
• VALUE
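To show why stray spaces break an exact-match lookup, here is a minimal Python sketch. The `trim` helper is a rough stand-in for the spreadsheet function, and the pay rates are invented for the example.

```python
def trim(text):
    """Rough stand-in for spreadsheet TRIM: drop leading and trailing spaces
    and collapse repeated spaces between words into a single space."""
    return " ".join(text.split())

# Hypothetical lookup table keyed by clean names.
pay_rates = {"Liza Campbell": 27.5, "Anika Patel": 31.0}

# A key with leading, trailing, and repeated spaces.
raw_key = "  Liza  Campbell "

found_raw = raw_key in pay_rates      # exact match fails on the raw text
clean_key = trim(raw_key)
found_clean = clean_key in pay_rates  # exact match succeeds after trimming
```

The raw key is not found because an exact match compares every character, spaces included. After trimming, the key becomes "Liza Campbell" and the lookup succeeds.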

### 10. Fill in the blank: The spreadsheet function _____ can be used to tally the number of cells in a range that are not empty.

• RETURN
• COUNT (CORRECT)
• COUNT DISTINCT
• RANGE

### 11. A data analyst writes the following formula: =MAX(\$E\$5:\$E\$500). What are the purposes of the dollar signs (\$)? Select all that apply.

• Perform the calculation more efficiently.
• Ensure rows and columns do not change. (CORRECT)
• Create an absolute reference. (CORRECT)
• Find the maximum value in the range E5 to E500 regardless of whether the formula is copied. (CORRECT)

### 12. What will this query return?

SELECT *

FROM Equipment_table

LEFT JOIN Computer_table

• All records in the computer table and any matching rows from the equipment table
• All rows from the equipment table joined together with the computer table
• All records in the equipment table and any matching rows from the computer table (CORRECT)
• All records in both the equipment table and the computer table

### 13. In this spreadsheet, which function will search for the surface area of Lake Huron?

• =VLOOKUP(Huron, A2:C10, false)
• =VLOOKUP("Huron", A2:B10, 2, false) (CORRECT)
• =VLOOKUP("Huron", B2:C10, 2, false)
• =VLOOKUP(Huron, A2:B10, 3, false)

### 14. Fill in the blank: A SQL clause containing HAVING adds a _____ to a query instead of the underlying table.

• Subquery
• Join
• Filter (CORRECT)
• Limit
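The difference between filtering the underlying table and filtering the query's grouped result can be sketched with Python's `sqlite3` module. The `sales` table and the threshold of 60 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('East', 100), ('East', 300), ('West', 50), ('West', 20), ('North', 500);
""")

# WHERE filters individual rows of the table before they are grouped.
where_rows = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 60
    GROUP BY region
    ORDER BY region
""").fetchall()

# HAVING filters the aggregated groups after the totals are computed.
having_rows = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 60
    ORDER BY region
""").fetchall()
```

With WHERE, the West region disappears because neither of its rows exceeds 60. With HAVING, West stays because its total of 70 passes the filter applied to the aggregated result.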

### 15. A data analyst at a retail store works with a spreadsheet containing sales data. In order to calculate sales tax correctly for customer orders, the analyst ensures all amounts are converted to numeric values. What function do they use?

• EXCHANGE
• PROCESS
• CONVERT
• VALUE (CORRECT)

### 16. Which query will select all columns from the operations table and alias them to ops?

• SELECT * FROM operations NEW ops
• SELECT * FROM operations TO ops
• SELECT * FROM operations AS ops (CORRECT)
• SELECT * FROM operations ALIAS ops

### 17. A junior data analyst writes the following formula: =AVERAGE(\$C\$1:\$C\$100). What are the purposes of the dollar signs (\$)? Select all that apply.

• Perform the calculation more efficiently.
• Create an absolute reference. (CORRECT)
• Average the values in cells C1 to C100 regardless of whether the formula is copied. (CORRECT)
• Ensure rows and columns do not change. (CORRECT)

### 18. What will this query return?

SELECT *

FROM Inventory_table

LEFT JOIN Scrap_table

• All records in both the inventory table and the scrap table
• All records in the inventory table and any matching rows from the scrap table (CORRECT)
• All records in the scrap table and any matching rows from the inventory table
• All rows from the inventory table joined together with the scrap table

### 19. Fill in the blank: A SQL clause containing HAVING can only be used with _____ functions.

• join
• ORDER BY
• Aggregate (CORRECT)
• GROUP BY

### 20. Which query will select all columns from the customers table and alias them to cust?

• SELECT * FROM customer TO cust
• SELECT * FROM customer AS cust (CORRECT)
• SELECT * FROM customer NEW cust
• SELECT * FROM customer ALIAS cust

### 21. You use VLOOKUP in a spreadsheet containing weather data. While searching for rainfall levels in Chicago, you encounter an error because your spreadsheet value has a trailing space after the city name. What function should you use to eliminate this space?

• VALUE
• NOSPACE
• TRIM (CORRECT)
• CUT

### 22.  Fill in the blank: The spreadsheet function _____ can be used to add up the number of times a range of cells contains the value “paid.”

• COUNT (CORRECT)
• RANGE
• RETURN
• COUNT DISTINCT

### 23. A data professional writes the following formula: =SUM(\$A\$6:\$A\$60). What are the purposes of the dollar signs (\$)? Select all that apply.

• Perform the calculation more efficiently
• Sum the values in cells A6 to A60 regardless of whether the formula is copied. (CORRECT)
• Ensure rows and columns do not change. (CORRECT)
• Create an absolute reference. (CORRECT)

### 24. What will this query return?

SELECT *

FROM Books_table

LEFT JOIN Biography_table

• All records in the biography table and any matching rows from the books table
• All rows from the books table joined together with the biography table
• All records in the books table and any matching rows from the biography table (CORRECT)
• All records in both the books table and the biography table

### 25.  In this spreadsheet, which function will search for the surface area of Lake Victoria?

• =VLOOKUP("Victoria", B2:C10, 2, false)
• =VLOOKUP(Victoria, A2:B10, 3, false)
• =VLOOKUP("Victoria", A2:B10, 2, false) (CORRECT)
• =VLOOKUP(Victoria, A2:C10, false)

### 26. Which query will select all columns from the highways table and alias them to hwys?

• SELECT * FROM highways ALIAS hwys
• SELECT * FROM highways TO hwys
• SELECT * FROM highways AS hwys (CORRECT)
• SELECT * FROM highways NEW hwys

### 27. You use VLOOKUP to search for the name “Liza Campbell.” However, the function doesn’t work properly because your spreadsheet has a repeated space between the first and last name. What function should you use to eliminate this space?

• TRIM (CORRECT)
• CUT
• NOSPACE
• VALUE

### 28. In this spreadsheet, which function will search for the water type of Lake Urmia?

• =VLOOKUP(Urmia, A2:C10, 2, false)
• =VLOOKUP("Urmia", A2:C10, 3, false) (CORRECT)
• =VLOOKUP(Urmia, A2:B10, false)
• =VLOOKUP("Urmia", B2:C10, 2, false)

### 29.  Fill in the blank: To find out how many times a specific error occurs in a range of cells, the spreadsheet function _____ can be used.

• COUNT (CORRECT)
• COUNT DISTINCT
• RANGE
• RETURN

### 30. Fill in the blank: A SQL clause containing HAVING adds a filter to a _____ instead of the underlying table.

• statement
• column
• row
• query (CORRECT)

### 31.  A junior data analyst in a marketing department works with a spreadsheet containing email click-through data. To calculate the average click-through rate for a campaign, the analyst uses a function to convert the number of clicks to numeric values. What function do they use?

• EXCHANGE
• PROCESS
• VALUE (CORRECT)
• NUM

## Aggregating Data for Analysis CONCLUSION

Now that you’ve completed this module, you should have a good understanding of the functions, procedures, and syntax involved in combining data. You should also be able to do this from multiple cells in spreadsheets and from multiple database tables using SQL queries. If you want to continue learning about data analysis, consider joining the Coursera community.

Here, you can take courses from some of the world’s top universities and institutions and learn at your own pace. Joining Coursera is a great way to further your education and advance your career.
