# Course 5 – Analyze Data to Answer Questions

## Complete Study Guide


## Aggregating Data for Analysis INTRODUCTION

Aggregating data for analysis is an essential part of the Google Data Analytics Professional Certificate from Coursera. This part of the course explores how to combine data from multiple sources, whether they are cells in a spreadsheet or multiple database tables.

You’ll learn the functions, procedures, and syntax needed to aggregate data in order to gain insights and meet business objectives. Aggregation turns scattered records into a summary you can analyze and act on, which makes it a core skill for a data analyst.

## Learning Objectives

• Demonstrate an understanding of functions and procedures that may be used to combine data from multiple cells in a spreadsheet
• Demonstrate an understanding of functions and syntax to create SQL queries for combining data from multiple database tables
• Use VLOOKUP to query data, trim data, convert text data to numeric data, and create a summary table from queried information

## Hands-On Activity: Combine multiple pieces of data

### 1. Imagine the employee Anika Patel asks you to confirm her pay rate. Without using the pivot table, which VLOOKUP function would return her pay rate based off of the imported data on Sheet1?

• =VLOOKUP(B19, B15:J19, 9, false) (Correct)
• =VLOOKUP(B20, B15:J20, 9, false)
• =VLOOKUP(B19, B15:J19, 9, true)
• =VLOOKUP(B19, B15:J19, 8, false)

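Correct: The function =VLOOKUP(B19, B15:J19, 9, false) returns her pay rate. B19 is the search key, B15:J19 is the range to search, 9 is the position of the column to return, counted from the first column of the range (column J), and false requests an exact match.

A related point from the same activity: the function =CONCATENATE(A7, “ ”, B7) would return “John Quincy Adams.” CONCATENATE is the function to use when combining several strings. Using CONCAT with these arguments would resolve correctly in Microsoft Excel, but return an error in Google Sheets. Going forward, you can use this distinction to write proper spreadsheet functions.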

## Test your knowledge on VLOOKUP

### 1. To change a text string in spreadsheet cell F8 to a numerical value, what is the correct function?

• =MATCH(F8)
• =NUM(F8)
• =CONVERT(F8)
• =VALUE(F8) (Correct)

Correct: To change the text string in spreadsheet cell F8 to a numerical value, the correct syntax is =VALUE(F8). Within the parentheses, the VALUE syntax must include a reference to the specific cell whose value the function should convert.
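
The same idea can be sketched outside a spreadsheet. The Python snippet below is a minimal illustration and not part of the course material: `to_number` is a hypothetical helper that mimics the basic behavior of VALUE by removing surrounding spaces and parsing the text as a number. It makes no attempt to interpret dates, currency symbols, or percent signs, which a spreadsheet may handle.

```python
def to_number(text: str) -> float:
    """Hypothetical stand-in for =VALUE(cell): parse text as a number."""
    # Remove leading and trailing spaces before parsing.
    cleaned = text.strip()
    # float() raises ValueError for non-numeric text, similar in spirit to
    # the #VALUE! error a spreadsheet shows.
    return float(cleaned)


# Placeholder cell contents, invented for illustration.
cell_f8 = " 1250.50 "
converted = to_number(cell_f8)
print(converted + 1)  # arithmetic works because the value is now numeric
```

Until the conversion happens, the cell holds text, so sums and comparisons on it would not behave like numeric operations.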

### 2. What is the purpose of an absolute reference within a function, such as “\$C\$3”?

• To remove unnecessary instructions from a formula or function
• To make formulas and functions unconditional
• To represent missing values in a formula or function
• To lock rows and columns so they won’t change when a function is copied (Correct)

Correct: The purpose of an absolute reference is to lock the row and column of a reference so it keeps pointing to the same cell when a function is copied.

### 3. In VLOOKUP, TRUE tells the function to search for exact matches, and FALSE tells the function to look for approximate matches.

• True
• False (Correct)

Correct: In VLOOKUP, TRUE tells the function to search for approximate matches, and FALSE tells the function to look for exact matches.

### 4.1. To search for the population of Nigeria, what is the correct VLOOKUP syntax?

• =VLOOKUP(Nigeria, A2:C10, 3, true)
• =VLOOKUP(Nigeria, A2:C10, 3, false)
• =VLOOKUP(“Nigeria”, A2:C10, 2, false) (Correct)
• =VLOOKUP(Nigeria, A2,C10, 2, true)

Correct: To search for the population of Nigeria, the syntax is =VLOOKUP(“Nigeria”, A2:C10, 2, false). “Nigeria” is the reference. A2:C10 is the table array. The 2 indicates the position of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 4.2. To search for the height of the building in Mecca, what is the correct VLOOKUP syntax?

• =VLOOKUP(Mecca, A2:D7, 2, true)
• =VLOOKUP(Mecca, A2:D7, 2, false)
• =VLOOKUP(Mecca, A2,D7, 3, true)
• =VLOOKUP(“Mecca”, A2:D7, 3, false) (Correct)

Correct: To search for the height of the building in Mecca, the correct syntax is =VLOOKUP(“Mecca”, A2:D7, 3, false). “Mecca” is the reference. A2:D7 is the table array. The 3 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.
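
To make the four arguments concrete, here is a minimal Python sketch of how VLOOKUP behaves. It is an illustration under stated assumptions, not how any spreadsheet actually implements the function: `vlookup` is a hypothetical helper, the rows are placeholder data with made-up numbers, and text comparison here is case-sensitive, unlike in a spreadsheet.

```python
from bisect import bisect_right


def vlookup(search_key, table, index, is_sorted=True):
    """Hypothetical emulation of =VLOOKUP(search_key, range, index, is_sorted).

    `table` is a list of rows; the first item of each row is the lookup column.
    `index` is 1-based, as in a spreadsheet.
    """
    first_column = [row[0] for row in table]
    if not is_sorted:
        # FALSE: exact match only; return from the first row whose key matches.
        for row in table:
            if row[0] == search_key:
                return row[index - 1]
        return "#N/A"
    # TRUE: approximate match; assumes the first column is sorted ascending
    # and picks the last row whose key is less than or equal to search_key.
    position = bisect_right(first_column, search_key)
    if position == 0:
        return "#N/A"
    return table[position - 1][index - 1]


# Placeholder rows (city, made-up id, made-up height); not real data.
buildings = [
    ("Dubai", 1, 100),
    ("Mecca", 2, 200),
    ("Shanghai", 3, 300),
]

exact = vlookup("Mecca", buildings, 3, is_sorted=False)
missing = vlookup("Mexico", buildings, 3, is_sorted=False)
approximate = vlookup("Mexico", buildings, 3, is_sorted=True)
print(exact, missing, approximate)
```

With an exact match, the key "Mexico" is reported as not found. With an approximate match, the same key quietly returns the value from the "Mecca" row, which is why false is the safer choice when looking up names.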

## Hands-on activity: Queries for joins

### 1. In the last query, you use a LEFT JOIN instead of an INNER JOIN to find the correct information. Beneath the query results, you’ll find that the number of rows in your joined table is 281. If you rerun the query with an INNER JOIN instead of a LEFT JOIN, how many rows would it return?

• 274 (Correct)
• 281
• 301
• 324

Correct: The number of rows returned by an INNER JOIN is 274. When you run the query with an INNER JOIN instead of a LEFT JOIN, you exclude universities without mascots and return fewer rows of data. Knowing which JOIN to use is very important for analyzing data. Going forward, you can use your knowledge of JOINs to properly combine data from multiple tables.
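
The difference in row counts can be reproduced on a tiny scale. The sketch below uses Python's built-in `sqlite3` module with hypothetical `universities` and `mascots` tables and invented rows; the table names, column names, and data are assumptions for illustration and are not the dataset used in the activity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE universities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE mascots (university_id INTEGER, mascot TEXT);
    INSERT INTO universities VALUES (1, 'University A'), (2, 'University B'),
                                    (3, 'University C');
    -- University C deliberately has no mascot row.
    INSERT INTO mascots VALUES (1, 'Owl'), (2, 'Bear');
    """
)

# LEFT JOIN keeps every university, filling missing mascots with NULL.
left_rows = conn.execute(
    """
    SELECT universities.name, mascots.mascot
    FROM universities
    LEFT JOIN mascots ON universities.id = mascots.university_id
    ORDER BY universities.id
    """
).fetchall()

# INNER JOIN keeps only universities that have a matching mascot row.
inner_rows = conn.execute(
    """
    SELECT universities.name, mascots.mascot
    FROM universities
    INNER JOIN mascots ON universities.id = mascots.university_id
    ORDER BY universities.id
    """
).fetchall()
conn.close()

print(len(left_rows), len(inner_rows))
```

The LEFT JOIN result has one more row than the INNER JOIN result, and that extra row is the university whose mascot is missing (returned as `None` in Python).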

## Test your knowledge on using joins to aggregate data

### 1. A data analyst wants to retrieve only records from a database that have matching values in two different tables. Which JOIN function should they use?

• INNER JOIN (Correct)
• RIGHT JOIN
• LEFT JOIN
• OUTER JOIN

Correct: To retrieve only records from a database that have matching values in two different tables, the analyst should use INNER JOIN.

### 2. You are writing a SQL query to instruct a database to count values in a specified range. You only want to count each value once, even if it appears multiple times. Which function should you include in your query?

• COUNT DISTINCT (Correct)
• COUNT VALUES
• COUNT
• COUNT RANGE

Correct: To count each value only once, even if it appears multiple times in a specified range, the analyst should use COUNT DISTINCT in their query.
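
As a quick check of the behavior, the sketch below compares COUNT with COUNT DISTINCT using Python's built-in `sqlite3` module. The `orders` table and its rows are invented for illustration. Note that in a query the function is written as `COUNT(DISTINCT column_name)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer TEXT);
    INSERT INTO orders VALUES (1, 'ana'), (2, 'ben'), (3, 'ana'), (4, 'ana');
    """
)
# COUNT counts every non-null value; COUNT(DISTINCT ...) counts each value once.
total, unique = conn.execute(
    "SELECT COUNT(customer), COUNT(DISTINCT customer) FROM orders"
).fetchone()
conn.close()

print(total, unique)
```

Here the customer 'ana' appears three times but contributes only once to the distinct count.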

### 3. A data analyst wants to temporarily name a column in their query to make it easier to read and write. What technique should they use?

• Filtering
• Aliasing (Correct)
• Tagging
• Naming

Correct: To temporarily name a column in a query to make it easier to read and write, the analyst should use aliasing.
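
A short sketch of aliasing, again using Python's built-in `sqlite3` module. The long table name, the column names, and the single row are hypothetical and exist only to show how `AS` shortens what you have to type and changes the column labels in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee_compensation_records (
        employee_name TEXT,
        hourly_pay_rate REAL
    );
    INSERT INTO employee_compensation_records VALUES ('Sample Person', 20.0);
    """
)
# The table is aliased as e, and each column gets a shorter temporary name.
cursor = conn.execute(
    """
    SELECT e.employee_name AS name, e.hourly_pay_rate AS rate
    FROM employee_compensation_records AS e
    """
)
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()
conn.close()

print(column_names, rows)
```

The aliases last only for this query; the table and its columns keep their original names in the database.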

## Test your knowledge on working with subqueries

### 1. Which of the following queries contain subqueries? Select all that apply.

Correct: The three queries with statements in parentheses contain subqueries.

### 2. Fill in the blank: A data analyst uses aliasing to make it easier to read and write a query. Aliasing involves temporarily _____ a table or column in a query.

• naming (Correct)
• removing
• copying
• hiding

Correct: Aliasing involves temporarily naming a table or column in a query.

### 3. When working with subqueries, the outer query executes first.

• True
• False (Correct)

Correct: The inner query executes first, then the results are passed onto the outer query to use.
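
The order of evaluation can be illustrated with a small example using Python's built-in `sqlite3` module. The `sales` table and its values are invented. The sketch runs the inner query on its own first, then runs the full query that nests it. This shows the logical order described above; a real database engine may optimize the actual execution differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (store TEXT, amount REAL);
    INSERT INTO sales VALUES ('north', 10), ('south', 30), ('east', 50);
    """
)

# The inner query on its own: the average amount across all stores.
inner_result = conn.execute("SELECT AVG(amount) FROM sales").fetchone()[0]

# The outer query uses the inner query's result as its filter threshold.
above_average = conn.execute(
    """
    SELECT store
    FROM sales
    WHERE amount > (SELECT AVG(amount) FROM sales)
    ORDER BY store
    """
).fetchall()
conn.close()

print(inner_result, above_average)
```

The outer query cannot filter until the average is known, which is why the statement in parentheses is evaluated first.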


## Analyze Data to Answer Questions Weekly Challenge 3

### 1. In data analytics, what is data aggregation?

• The process of modifying data in order to make it suitable for analysis.
• The process of ensuring a company’s data is properly stored, managed, and maintained.
• The process of moving certain data points to a higher rank or position.
• The process of gathering data from multiple sources and combining it into a single, summarized collection. (Correct)

Correct: Data aggregation is the process of gathering data from multiple sources and combining it into a single, summarized collection.

### 2. A data analyst wants to be sure all of the numbers in a spreadsheet are numeric. What function should they use to convert text to numeric values?

• PROCESS
• CONVERT
• VALUE (Correct)
• EXCHANGE

Correct: The analyst should use the VALUE function to convert text to numeric values.

### 3. When using VLOOKUP, there are some common limitations that data analysts should be aware of. Identify these limitations. Select all that apply.

• VLOOKUP can only return a value from the data to the right of the column of the matched value. (Correct)
• VLOOKUP only returns the first match it finds, even if there are many possible matches. (Correct)
• VLOOKUP only returns matches it finds while searching through a row.
• VLOOKUP can only return a value from the data to the left of the column it’s typed into.

Correct: One limitation of VLOOKUP is that it only returns the first match it finds, even if there are many possible matches. Another is that it is only able to return a value from the data to the right of the column of the matched value.

### 4. Fill in the blank: When writing a function, a data analyst wraps a table array in dollar signs. This is an _____, which is used to lock the array so rows and columns don’t change if the function is copied.

• absolute reference (Correct)
• accurate reference
• authentic reference
• arbitrary reference

Correct: Wrapping a table array in dollar signs creates an absolute reference, which locks the array so rows and columns don’t change if the function is copied.

### 5.1. The following is a selection from a spreadsheet:

To search for the population of Brazil, what is the correct VLOOKUP syntax?

• =VLOOKUP(Brazil, A2,B10, 3, false)
• =VLOOKUP(“Brazil”, A2:B10, 2, false) (Correct)
• =VLOOKUP(Brazil, A2:B10, 3, false)
• =VLOOKUP(Brazil, A2:B10, 2, false)

Correct: To search for the population of Brazil, the syntax is =VLOOKUP(“Brazil”, A2:B10, 2, false). “Brazil” is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 5.2. The following is a selection from a spreadsheet:

To search for the population of Pakistan, what is the correct VLOOKUP syntax?

• =VLOOKUP(Pakistan, A2,B10, 3, false)
• =VLOOKUP(“Pakistan”, A2:B10, 2, false) (Correct)
• =VLOOKUP(Pakistan, A2:B10, 3, false)
• =VLOOKUP(Pakistan, A2:B10, 2, false)

Correct: To search for the population of Pakistan, the syntax is =VLOOKUP(“Pakistan”, A2:B10, 2, false). “Pakistan” is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 5.3. The following is a selection from a spreadsheet:

To search for the population of Nigeria, what is the correct VLOOKUP syntax?

• =VLOOKUP(Nigeria, A2,B10, 3, false)
• =VLOOKUP(“Nigeria”, A2:B10, 2, false) (Correct)
• =VLOOKUP(Nigeria, A2:B10, 3, false)
• =VLOOKUP(Nigeria, A2:B10, 2, false)

Correct: To search for the population of Nigeria, the syntax is =VLOOKUP(“Nigeria”, A2:B10, 2, false). “Nigeria” is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 5.4. The following is a selection from a spreadsheet:

To search for the population of Indonesia, what is the correct VLOOKUP syntax?

• =VLOOKUP(Indonesia, A2,B10, 3, false)
• =VLOOKUP(“Indonesia”, A2:B10, 2, false) (Correct)
• =VLOOKUP(Indonesia, A2:B10, 3, false)
• =VLOOKUP(Indonesia, A2:B10, 2, false)

Correct: To search for the population of Indonesia, the syntax is =VLOOKUP(“Indonesia”, A2:B10, 2, false). “Indonesia” is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

### 6. An INNER JOIN is a function that returns records with matching values in two or more tables. An OUTER JOIN is a function that combines RIGHT and LEFT JOIN to return all matching records in both tables.

• True (Correct)
• False

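Correct: An INNER JOIN is a function that returns records with matching values in two or more tables. An OUTER JOIN is a function that combines RIGHT and LEFT JOIN to return all matching records in both tables. In practice, a full outer join also keeps the unmatched rows from either table and fills the missing side with nulls.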

### 7. The COUNT DISTINCT function includes repeating values when returning values in a specified range.

• True
• False (Correct)

Correct: The COUNT DISTINCT function does not include repeating values when returning values in a specified range.

### 8. Which of the following terms describe a subquery? Select all that apply.

• Inner select (Correct)
• Inner query (Correct)
• Small query
• Nested query (Correct)

Correct: A subquery can also be called an inner query, inner select, or nested query.

## Aggregating Data for Analysis CONCLUSION

Now that you’ve completed this module, you should have a good understanding of the functions, procedures, and syntax involved in combining data. You should also be able to do this from multiple cells in spreadsheets and from multiple database tables using SQL queries. If you want to continue learning about data analysis, consider joining the Coursera community.

Here, you can take courses from some of the world’s top universities and institutions and learn at your own pace. Joining Coursera is a great way to further your education and advance your career.
