What steps do data analysts take to ensure fairness when collecting data?

Course 1 – Foundations: Data, Data, Everywhere Quiz Answers

Week 5: Endless Career Possibilities

GOOGLE DATA ANALYTICS PROFESSIONAL CERTIFICATION

Complete Study Guide

Endless Career Possibilities INTRODUCTION

In the world of business, data analysts are essential for understanding customer insights and discovering trends to inform decisions. Career possibilities exist in organizations across finance, healthcare, IT, marketing, and many other fields. As a data analyst, you will be responsible for collecting, analyzing, and interpreting large amounts of data to help formulate strategies for the organization’s success.

With a Google Data Analytics Professional Certificate from Coursera, you have a credential that shows you can be an asset to an organization. This certificate is evidence of your skills in working with data analysis tools such as spreadsheets, SQL, Tableau, and R. With these qualifications under your belt, employers can see that you are prepared to analyze complex datasets accurately and efficiently.

Learning Objectives

  • Describe the role of a data analyst with specific reference to job roles
  • Discuss how the Google Data Analytics Certificate can help a candidate meet the requirements of a given job
  • Explain how a business task may be appropriate for a data analyst, with reference to fairness and the value of the data analyst
  • Identify companies that would potentially hire data analysts
  • Describe how one’s prior experiences may be applied to a career as a data analyst
  • Determine whether the use of data constitutes fair or unfair practices
  • Understand the different ways organizations use data
  • Explain the concept of data-driven decision-making including specific examples

Test your knowledge on making fair business decisions

1. What steps do data analysts take to ensure fairness when collecting data? Select all that apply.

  • Clean the data provided
  • Understand the social context (Correct)
  • Include data self-reported by individuals (Correct)
  • Use an inclusive sample population (Correct)

Correct: Considering inclusive sample populations, social context, and self-reported data enables fairness in data collection.

2. Avens Engineering needs more engineers, so they purchase ads on a job search website. The website’s data reveals that 86% of engineers are men. Based on that number, an analyst decides that men are more likely to be successful applicants, so they target the ads to male job seekers. What should the analyst have done instead?

  • Decline to accept ads from Avens Engineering because of fairness concerns.
  • Make sure their recommendation doesn’t create or reinforce bias (Correct)
  • Let Avens Engineering decide which type of applicants to target ads to.
  • Only show ads for the engineering jobs to women.

Correct: They should make sure their recommendation doesn’t create or reinforce bias. As a data analyst, it’s important to help create systems that are fair and inclusive to everyone.

3. On a railway line, peak ridership occurs between 7:00 AM and 5:00 PM. The fairness of a passenger survey could be improved by over-sampling data from which group?

  • Female passengers
  • Male passengers
  • Nighttime riders (Correct)
  • Daytime riders

Correct: Over-sampling the data from nighttime riders, an under-represented group of passengers, could improve the fairness of the survey.
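
One way to picture over-sampling is as sampling each group at its own rate. The Python sketch below is illustrative only: the rider counts, the sampling rates, and the function name `oversample` are hypothetical and do not come from the course.

```python
import random

def oversample(riders, rates, seed=0):
    """Sample each group at its own rate so a small group is not drowned out.

    riders: dict mapping group name -> list of rider IDs (hypothetical data)
    rates:  dict mapping group name -> fraction of that group to survey
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = {}
    for group, ids in riders.items():
        k = round(len(ids) * rates[group])  # number of riders to draw from this group
        sample[group] = rng.sample(ids, k)  # draw without replacement
    return sample

# Hypothetical ridership: far more daytime riders than nighttime riders.
riders = {
    "daytime": [f"D{i}" for i in range(900)],
    "nighttime": [f"N{i}" for i in range(100)],
}

# Survey 5% of daytime riders but 30% of nighttime riders.
sample = oversample(riders, {"daytime": 0.05, "nighttime": 0.30})
counts = {group: len(ids) for group, ids in sample.items()}
print(counts)
```

With these made-up numbers, nighttime riders are 10% of all riders but 40% of the survey sample, so their answers are less likely to be lost among daytime responses.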

4. A real estate company needs to hire a human resources assistant. The owner asks a data analyst to help them decide where to advertise the job opening. The analyst learns that the majority of human resources professionals are women, validates this finding with research, and targets ads to a women’s community college. This is fair because the analyst conducted research to make sure the information about gender breakdown of human resources professionals was accurate.

  • True
  • False (Correct)

Correct: This is not fair. Fairness means ensuring that analysis doesn’t create or reinforce bias. As a data analyst, it’s important to help create systems that are fair and inclusive to everyone.

Foundations: Data, Data, Everywhere Weekly Challenge 5

1. An online gardening magazine wants to understand why its subscriber numbers have been increasing. What kind of reports can a data analyst provide to help answer that question? Select all that apply.

  • Reports that examine how a recent 50%-off sale affected the number of subscription purchases (Correct)
  • Reports that describe how many customers shared positive comments about the gardening magazine on social media in the past year (Correct)
  • Reports that compare past weather patterns to the number of people asking gardening questions on the magazine’s social media (Correct)
  • Reports that predict the success of sales leads to secure future subscribers

Correct: Analyzing historical data such as weather patterns, social media comments, and past sales would provide useful insights into the increase in subscription numbers.

2. A doctor’s office discovers that patients are waiting 20 minutes longer for their appointments than in past years. In what ways could a data analyst help solve this problem? Select all that apply.

  • Analyze the number of patients seen per day compared to past years. (Correct)
  • Analyze how many doctors and nurses are on staff at a given time compared to the number of patients with appointments. (Correct)
  • Analyze the average length of an appointment this year compared to past years. (Correct)
  • Analyze a recent change in the average rating for the doctor’s office on social media.

Correct: Analyzing appointment length, staffing numbers, and patient numbers is likely to provide useful insights into why this is happening and help solve the problem.

3. A problem is an obstacle to be solved, an issue is a topic to investigate, and a question is designed to discover information.

  • True (Correct)
  • False

Correct: A problem is an obstacle or complication to be solved, whereas a question is designed to discover information. These two things are the foundation of business tasks.

4. Data analysts answer questions and solve problems. These are called business tasks.

  • True (Correct)
  • False

Correct: Data analysts answer questions and solve problems, which are called business tasks.

5. Data-driven decision-making is using facts to guide business strategy. The benefits include which of the following? Select all that apply.

  • Using data analytics to find the best possible solution to a problem (Correct)
  • Getting a complete picture of a problem and its causes (Correct)
  • Combining observation with objective data (Correct)
  • Making the most of intuition and gut instinct

Correct: Data-driven decision-making enables companies to use data analytics to find the best possible solution to a problem, complement observation with objective data, and get a complete picture of a problem and its causes.

6. Which of the following examples describe fairness in data analysis? Select all that apply.

  • Making sure a sample population represents all groups (Correct)
  • Picking and choosing which data to include from a dataset
  • Considering systematic factors that may influence data (Correct)
  • Factoring in social contexts that could create bias in conclusions (Correct)

Correct: Considering systematic factors that may influence your data, factoring in social contexts that could create bias in your conclusions, and making sure your sample population represents all groups are effective ways to ensure that your analysis is fair and doesn’t create or reinforce bias.

7. A data analyst is analyzing fruit and vegetable sales at a grocery store. They’re able to find data on everything except red onions. What’s the best course of action?

  • Exclude all onion varieties from the analysis.
  • Exclude red onions from the analysis.
  • Use the data on white onions instead, as they’re both onion varieties.
  • Ask a teammate for help finding data on red onions. (Correct)

Correct: If a data analyst were to analyze all fruits and vegetables except for onions, the outcomes would not be fair because the data is not representative of all fruits and vegetables sold in grocery stores.

8. Collaborating with a social scientist to provide insights into human bias and social contexts is an effective way to avoid bias in your data.

  • True (Correct)
  • False

Correct: Collaborating with a social scientist to provide insights into human bias and social contexts is an effective way to avoid bias in your data.

9. Data analysts ensure their analysis is fair for what reason?

  • Fairness helps them stay organized.
  • Fairness helps them pick and choose which data to include from a dataset.
  • Fairness helps them avoid biased conclusions. (CORRECT)
  • Fairness helps them communicate with stakeholders.

Correct: Data analysts ensure their analysis is fair so that it doesn’t create or reinforce bias.

10. A data analyst is analyzing fruit and vegetable sales at a grocery store. They’re able to find data on everything except red onions. If they exclude red onions from the analysis, this would be an example of creating or reinforcing bias.

  • True (CORRECT)
  • False

Correct: Fairness means ensuring that your analysis doesn’t create or reinforce bias. Being inclusive, not exclusive, is an important part of fairness.

11. Fill in the blank: A doctor’s office has discovered that patients are waiting 20 minutes longer for their appointments than in past years. To help solve this problem, a data analyst could investigate how many nurses are on staff at a given time compared to the number of _____.

  • patients with appointments (CORRECT)
  • negative comments about the wait times on social media
  • doctors seeing new patients
  • doctors on staff at the same time

Correct: Analyzing staffing and patient numbers would likely provide useful insights into why patients are waiting longer for their appointments and help solve this problem.

12. Fill in the blank: A problem is an obstacle to be solved, an issue is a topic to investigate, and a _____ is designed to discover information.

  • business task
  • theme
  • breakthrough
  • question (CORRECT)

Correct: A problem is an obstacle or complication to be solved, whereas a question is designed to discover information. These two things are the foundation of business tasks.

13. Fill in the blank: A business task is described as the problem or _____ a data analyst answers for a business.

  • complaint
  • comment
  • question (CORRECT)
  • solution

Correct: A business task is described as the problem or question a data analyst answers for a business.

14. What is the process of using facts to guide business strategy?

  • Data visualization
  • Data-driven decision-making (CORRECT)
  • Data programming
  • Data ethics

Correct: Data-driven decision-making is using facts to guide business strategy.

15. Fill in the blank: Fairness is achieved when data analysis doesn’t create or _____ bias.

  • reinforce (CORRECT)
  • resolve
  • constrain
  • highlight

Correct: Fairness is achieved when data analysis doesn’t create or reinforce bias.

16. A gym wants to start offering exercise classes. A data analyst plans to survey 10 people to determine which classes would be most popular. To ensure the data collected is fair, what steps should they take? Select all that apply.

  • Survey only people who don’t currently go to the gym.
  • Increase the number of participants. (CORRECT)
  • Collect data anonymously. (CORRECT)
  • Ensure participants represent a variety of profiles and backgrounds. (CORRECT)

Correct: Ensuring participants represent a variety of profiles and backgrounds, collecting data anonymously, and surveying more than just 10 people would all help ensure the data analysis is fair.

17. A magazine wants to understand why its subscribers have been increasing. A data analyst could help answer that question with a report that predicts the result of a half-price sale on future subscription rates.

  • True
  • False (CORRECT)

Correct: Predicting the effect of future sales will not answer the question of why there’s been an increase in subscribers. This type of question requires historical data to provide useful insights.

18. Describe the difference between a question and a problem in data analytics.

  • A question is designed to discover information, whereas a problem is an obstacle or complication that needs to be solved. (CORRECT)
  • A question is a topic to investigate, whereas a problem is a subject to investigate.
  • A question can have many answers, whereas a problem only has one solution.
  • A question is uncertain, whereas a problem is clearly specified.

Correct: A question is designed to discover information, whereas a problem is an obstacle or complication to be solved. These two things are the foundation of business tasks.

19. A doctor’s office has discovered that patients are waiting 20 minutes longer for their appointments than in past years. A data analyst could help solve this problem by analyzing how many doctors and nurses are on staff at a given time compared to the number of patients with appointments.

  • True (CORRECT)
  • False

Correct: Analyzing staffing and patient numbers would likely provide useful insights into why patients are waiting longer for their appointments and help solve this problem.

20. What is a question or problem that a data analyst answers for a business?

  • Complaint
  • Hypothesis
  • Mission statement
  • Business task (CORRECT)

Correct: A business task is a question or problem that a data analyst answers for a business.

21. Fill in the blank: Data-driven decision-making is described as using _____ to guide business strategy.

  • gut instinct
  • visualizations
  • facts (CORRECT)
  • intuition

Correct: Data-driven decision-making is using facts to guide business strategy.

22. An online gardening magazine wants to understand why its subscriber numbers have been increasing. A data analyst discovers that significantly more people subscribe when the magazine has its annual 50%-off sale. This is an example of what?

  • Analyzing customer buying behaviors (CORRECT)
  • Analyzing the number of customers by calculating daily foot traffic
  • Analyzing consumer preferences using artificial intelligence
  • Analyzing social media engagement

Correct: Data analysts help companies learn from historical data in order to make predictions. A sale’s effect on subscription purchases is an example of customer buying behavior analysis.

23. It’s possible for conclusions drawn from data analysis to be both true and unfair.

  • True (CORRECT)
  • False

Correct: Sometimes, a conclusion may be true, but it’s unfair because it doesn’t represent all groups or it ignores social context and other systemic factors.

24. A large hotel chain sees about 500 customers per week. A data analyst working there is gathering data through customer satisfaction surveys. They are anxious to begin analysis, so they start analyzing the data as soon as they receive 50 survey responses. This is an example of what? Select all that apply.

  • Failing to reward customers for participating in the survey
  • Failing to collect data anonymously
  • Failing to have a large enough sample size (CORRECT)
  • Failing to include diverse perspectives in data collection (CORRECT)

Correct: This is an example of failing to include diverse perspectives and failing to have a large enough sample size. The first 50 survey responses are unlikely to represent the general population and may produce biased results.

Foundations: Data, Data, Everywhere Course Challenge

1. Scenario 1, questions 1-5

Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains five columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products).

Considering the size of your dataset, what’s the best way to proceed with the process and analyze steps?

  • Download the data, then use a spreadsheet to process and analyze it. (Correct)
  • Use SQL to process and analyze the data.
  • Upload the data, then process and analyze it using Tableau.
  • Continue using the company database to process and analyze the data.

Correct: Spreadsheets work well for processing and analyzing a small dataset, such as the one you’re using.

2. Scenario 1 continued

Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice there’s missing data in one of the rows. What might you do to fix this problem? Select all that apply.

  • Ask your supervisor for guidance (Correct)
  • Delete the row with the missing data point
  • Ask a colleague on your team how they’ve handled similar issues in the past (Correct)
  • Sort the spreadsheet so the row with missing data is at the bottom

Correct: You could ask your supervisor or a colleague for guidance. Asking questions helps you learn and avoid mistakes.

3. Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset.

During analysis, you create a new column F. At the top of the column, you add the attribute Average Percentage of Total Sales – Splashtastic. Select the correct definition for an attribute.

  • All of the characteristics of something contained in a table
  • An observation of data within a column
  • A headline or subhead
  • A characteristic or quality of data used to label a column (Correct)

Correct: An attribute is a characteristic or quality of data used to label a column.
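
Column F in this scenario holds each store’s Splashtastic dollars as a percentage of its total daily sales. A minimal Python sketch of that calculation is shown below. The two rows of numbers are hypothetical; only the column labels come from the scenario.

```python
# Hypothetical rows; the scenario's dataset has 38 stores with these column labels.
stores = [
    {
        "Store Number": 1,
        "Average Daily Splashtastic Sales (Dollars)": 50.0,
        "Average Total Daily Sales (All Products)": 2000.0,
    },
    {
        "Store Number": 2,
        "Average Daily Splashtastic Sales (Dollars)": 30.0,
        "Average Total Daily Sales (All Products)": 1500.0,
    },
]

# The attribute is the label at the top of the new column.
NEW_ATTRIBUTE = "Average Percentage of Total Sales – Splashtastic"

for row in stores:
    share = (
        row["Average Daily Splashtastic Sales (Dollars)"]
        / row["Average Total Daily Sales (All Products)"]
    )
    row[NEW_ATTRIBUTE] = round(share * 100, 2)  # store the share as a percentage

for row in stores:
    print(row["Store Number"], row[NEW_ATTRIBUTE])
```

Each value stored under the new attribute is one observation for one store, which is the distinction the question draws between an attribute and an observation.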

4. Scenario 1 continued

Next, you determine the average total daily sales over the past 12 months at all stores. The range that contains these sales is E2:E39. Identify the correct way to write your function.

  • =AVERAGE(E2,E39)
  • =AVERAGE(E2+E39)
  • =AVERAGE(E2-E39)
  • =AVERAGE(E2:E39) (Correct)

Correct: The function begins with an equal sign (=), and the range is E2 through E39.
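
To see why the colon matters, the four answer choices can be mirrored in Python. This is a sketch under assumptions: the list `column_e` is a hypothetical stand-in for cells E2 through E39, and the values are made up.

```python
from statistics import mean

# Hypothetical stand-in for cells E2..E39: one value per store, 38 in total.
column_e = [1000.0 + (i % 5) * 100 for i in range(38)]

# =AVERAGE(E2:E39) averages every cell in the range.
range_average = mean(column_e)

# =AVERAGE(E2,E39) averages only the two named cells.
endpoints_average = mean([column_e[0], column_e[-1]])

# =AVERAGE(E2+E39) and =AVERAGE(E2-E39) each pass a single number,
# so the "average" is just that sum or difference.
sum_only = mean([column_e[0] + column_e[-1]])
difference_only = mean([column_e[0] - column_e[-1]])

print(range_average, endpoints_average, sum_only, difference_only)
```

Only the first calculation uses all 38 values, which is what the question asks for.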

5. Scenario 1 continued

You’ve reached the share phase of the data analysis process. It involves which of the following? Select all that apply.

  • Stop selling Splashtastic because it doesn’t represent a large percentage of total sales.
  • Create a data visualization to highlight the Splashtastic sales insights you’ve discovered. (Correct)
  • Present your findings about Splashtastic to stakeholders. (Correct)
  • Prepare a slideshow about Splashtastic’s sales and practice your presentation. (Correct)

Correct: The share phase involves creating data visualizations, preparing your presentation, and communicating your findings to stakeholders.

6. Scenario 2, questions 6-10

You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff.

The table is dental_data_table, and the column name is zip_code. How do you complete the following query?

  • zip_code = 81137
  • WHERE zip_code = 81137 (Correct)
  • WHERE_zip_code = 81137
  • WHERE = 81137

Correct: The correct syntax is WHERE zip_code = 81137. WHERE indicates where to look for information. The column name is zip_code. And the database is being asked to return only records matching zip code 81137.
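
The clause can be tried out with Python’s built-in `sqlite3` module. This is a sketch under assumptions: the `SELECT * FROM` portion is assumed because the original query is not reproduced here, and the `patient_id` column and the three rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical version of the table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dental_data_table (patient_id INTEGER, zip_code INTEGER)")
conn.executemany(
    "INSERT INTO dental_data_table VALUES (?, ?)",
    [(1, 81137), (2, 80202), (3, 81137)],  # made-up rows
)

# WHERE names the column, then the comparison, then the value.
rows = conn.execute(
    "SELECT * FROM dental_data_table WHERE zip_code = 81137 ORDER BY patient_id"
).fetchall()
print(rows)

# A clause with no column name, such as "WHERE = 81137", is a syntax error.
try:
    conn.execute("SELECT * FROM dental_data_table WHERE = 81137")
    malformed_ok = True
except sqlite3.OperationalError:
    malformed_ok = False

conn.close()
```

Only the rows whose zip_code equals 81137 come back, and the malformed clause is rejected before any rows are read.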

7. Scenario 2 continued

The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment. To use the dataset for this scenario, click the link below and select “Use Template.”

The patient demographic information includes data such as age and gender. As you’re learning, it’s your responsibility as a data analyst to make sure your analysis is fair. The fact that the dataset includes people who all live in the same zip code might get in the way of fairness.

  • True (Correct)
  • False

Correct: It’s your responsibility as a data analyst to make sure your analysis is fair. Although many zip codes do reflect diverse populations, a better choice would be to include data about people who live in multiple zip codes.

8. Scenario 2 continued

As you’re reviewing the dataset, you notice that there are a disproportionate number of senior citizens. So, you investigate further and find out that this zip code represents a rural community in Colorado with about 800 residents. In addition, there’s a large assisted-living facility in the area. Nearly 300 of the residents in the 81137 zip code live in the facility.

Now, the NDS campaign will be about educating dental offices on the challenges faced by senior citizens and finding ways to help them access quality dental care.

Fill in the blank: Changing the business task involves defining a new _____.

  • data-cleaning strategy
  • question or problem to be solved (Correct)
  • gap analysis plan
  • graphical representation of the data

Correct: A business task is the question or problem data analysis answers for a business.

9. Scenario 2 continued

You continue with your analysis. In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits.

But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem. They’ll help verify the results of your data analysis.

The people who are familiar with a problem and help verify the results of data analysis include customers and competitors.

  • True
  • False (Correct)

Correct: Subject-matter experts look at the results of data analysis to identify any inconsistencies, make sense of the gray areas, and eventually validate the choices being made.

10. Scenario 2 continued

The subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments. This will help them create an effective campaign for members.

It’s time to create your presentation to stakeholders. It will include a data visualization that demonstrates the trend of people being less likely to attend follow-up appointments as they get older. Which type of chart will be most effective?

  • A doughnut chart
  • A table
  • A pie chart
  • A line chart (Correct)

Correct: A line chart is effective for tracking trends over time, such as people attending fewer follow-up appointments as they get older.

1. Scenario 1, questions 1-11

You’ve just started a new job as a data analyst. You’re working for a midsized pharmacy chain with 38 stores in the American Southwest. Your supervisor shares a new data analysis project with you.

She explains that the pharmacy is considering discontinuing a bubble bath product called Splashtastic. Your supervisor wants you to analyze sales data and determine what percentage of each store’s total daily sales come from that product. Then, you’ll present your findings to leadership.

You know that it’s important to follow each step of the data analysis process: ask, prepare, process, analyze, share, and act. So, you begin by defining the problem and making sure you fully understand stakeholder expectations.

One of the questions you ask is where to find the dataset you’ll be working with. Your supervisor explains that the company database has all the information you need.

Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains five columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products).

Considering the size of your dataset, you decide a spreadsheet will be the best tool for your project. You proceed by downloading the data from the database. Describe why this is the best choice.

  • Spreadsheets work well for processing and analyzing a small dataset, like the one you’re using. (CORRECT)
  • Spreadsheets are most effective when working with queries.
  • Only spreadsheets let you download and upload data.
  • Databases can’t be used for analysis.

Correct: A spreadsheet is a smart choice when working with a dataset of 38 rows and five columns.

2. Scenario 1 continued

You’ve downloaded the data from your company database and imported it into a spreadsheet. To use the dataset for this scenario, click the link below and select “Use Template.”

Link to template: Course Challenge – Scenario 1

OR

If you don’t have a Google account, you can download the template directly from the attachment below.

Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice there’s missing data in one of the rows. What might you do to fix this problem? Select all that apply.

  • Delete the row with the missing data point
  • Sort the spreadsheet so the row with missing data is at the bottom
  • Ask your supervisor for guidance (CORRECT)
  • Ask a colleague on your team how they’ve handled similar issues in the past (CORRECT)

Correct: You could ask your supervisor or a colleague for guidance. Asking questions helps you learn and avoid mistakes.

3. Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset.

During analysis, you create a new column F. At the top of the column, you add: Average Percentage of Total Sales – Splashtastic. What is this column label called?

  • An attribute (CORRECT)
  • A headline
  • A reference
  • A title

Correct: An attribute is a characteristic or quality of data used to label a column.

4. Scenario 1 continued

Next, you determine the average total daily sales over the past 12 months at all stores. The range that contains these sales is E2:E39. The correct syntax is =AVERAGE(E2:E39).

  • True (CORRECT)
  • False

Correct: The correct syntax is =AVERAGE(E2:E39). The function begins with an equal sign (=), then the word AVERAGE. The range is E2 through E39.

6. Scenario 1 continued

You’ve downloaded the data from your company database and imported it into a spreadsheet. To use the dataset for this scenario, click the link below and select “Use Template.”

Course Challenge – Scenario 1

OR

If you don’t have a Google account, you can download the template directly from the attachment below.

Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice that information about Splashtastic is missing in one of the rows. You are unsure of how to proceed, so the best course of action is to ask your supervisor for guidance.

  • True (CORRECT)
  • False

Correct: The best course of action is to ask your supervisor for guidance. Asking questions helps you learn and avoid mistakes.

8. Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset.

During analysis, you create a new column F. At the top of the column, you add: Average Percentage of Total Sales – Splashtastic. In data analytics, this column label is called an attribute.

  • True (CORRECT)
  • False

Correct: This column label is an attribute, which is a characteristic or quality of data used to label a column.

9. Scenario 1 continued

You’ve reached the share phase of the data analysis process. It involves creating a data visualization to highlight the Splashtastic sales insights you’ve discovered.

  • True (CORRECT)
  • False

Correct: The share phase involves creating data visualizations, preparing your presentation, and communicating your findings to stakeholders.

10. You know that spreadsheets work well for processing and analyzing a small dataset, like the one you’re using. To get the data from the database into a spreadsheet, what should you do?

  • Email a copy of the dataset to your company email address.
  • Copy and paste the data into a spreadsheet.
  • Download the data as a .CSV file, then import it into a spreadsheet. (CORRECT)
  • Use Tableau to convert the data into a spreadsheet.

Correct: Downloading data from a database into a .CSV file, then importing it into a spreadsheet, will enable you to process and analyze the small dataset effectively.
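
The same .CSV hand-off can be sketched in Python with the standard `csv` module. The file contents below are hypothetical, and the text is read from a string so the example is self-contained; a real export would be opened from disk instead.

```python
import csv
import io

# Hypothetical CSV export; a real file would be opened with open(path, newline="").
csv_text = (
    "Store Number,Average Daily Customers\n"
    "1,320\n"
    "2,275\n"
)

# DictReader uses the first line as column labels for every following row.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# CSV values arrive as text, so convert numeric columns before analysis.
customers = [int(row["Average Daily Customers"]) for row in rows]
print(customers)
```

Spreadsheet imports perform a similar conversion automatically, which is one reason a .CSV file is a convenient format for moving a small dataset.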

11. Scenario 1 continued

You’ve downloaded the data from your company database and imported it into a spreadsheet. To use the dataset for this scenario, click the link below and select “Use Template.”

Link to template: Course Challenge – Scenario 1

OR

If you don’t have a Google account, you can download the template directly from the attachment below.

Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice that information about Splashtastic is missing in row 16. The best course of action is to delete the row with missing data from your dataset so it doesn’t get in the way of your results.

  • True
  • False (CORRECT)

Correct: Deleting the row would discard data your analysis needs. The best course of action is to ask your supervisor or a colleague for guidance.

12. Scenario 2, questions 12-21

You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff.

An NDS member with three dental offices in Colorado offers to share its data on missed appointments. So, your supervisor uses a database query to access the dataset from the dental group. The query instructs the database to retrieve all patient information from the member’s three dental offices, located in zip code 81137.

The table is dental_data_table, and the column name is zip_code. You have written the following query, but received an error when it ran. What is the clause that will correct this query?

  • WHERE zip_code = 81137 (CORRECT)
  • WHERE_zip_code = 81137
  • WHERE zip_code 81137
  • WHERE = 81137

Correct: The correct syntax is WHERE zip_code = 81137. WHERE indicates where to look for information. The column name is zip_code. And the database is being asked to return only records matching zip code 81137.

13. Scenario 2 continued

As you’re reviewing the dataset, you notice that there are a disproportionate number of senior citizens. So, you investigate further and find out that this zip code represents a rural community in Colorado with about 800 residents. In addition, there’s a large assisted-living facility in the area. Nearly 300 of the residents in the 81137 zip code live in the facility.

With this new knowledge, you write an email to your supervisor expressing your concerns about the dataset. He agrees with your concerns, but he’s also impressed with what you’ve learned and thinks your findings could be very important to the project. He asks you to change the business task. Now, the NDS campaign will be about educating dental offices on the challenges faced by senior citizens and finding ways to help them access quality dental care.

Fill in the blank: Changing the business task involves defining a new _____.

  • gap analysis plan
  • question or problem to be solved (CORRECT)
  • graphical representation of the data
  • data-cleaning strategy

Correct: A business task is the question or problem data analysis answers for a business.

14. Scenario 2 continued

You continue with your analysis. In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits.

But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem. They’ll help verify the results of your data analysis.

The people who are familiar with a problem and help verify the results of data analysis include customers and competitors.

  • True
  • False (CORRECT)

Correct: Subject-matter experts look at the results of data analysis to identify any inconsistencies, make sense of the gray areas, and eventually validate the choices being made.

15. Scenario 2 continued

The subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments. This will help them create an effective campaign for members.

It’s time to create your presentation to stakeholders. It will include a data visualization that demonstrates the trend of people being less likely to attend follow-up appointments as they get older. Which type of chart will be most effective?

  • A doughnut chart
  • A line chart (CORRECT)
  • A pie chart
  • A table

Correct: A line chart is effective for tracking trends over time, such as people attending fewer follow-up appointments as they get older.
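
As a rough illustration of the kind of series a line chart would connect, the Python sketch below groups patients into age brackets and computes an attendance rate per bracket. The records, the 20-year bracket width, and the helper name age_bracket are all made up for this example and do not come from the NDS dataset.

```python
from collections import defaultdict

# Made-up (age, attended_follow_up) records, for illustration only.
records = [
    (25, True), (28, True),
    (45, True), (47, False),
    (70, False), (74, True), (78, False),
]


def age_bracket(age: int) -> str:
    """Label an age with its 20-year bracket, e.g. 45 -> '40-59'."""
    low = (age // 20) * 20
    return f"{low}-{low + 19}"


totals = defaultdict(int)
attended = defaultdict(int)
for age, showed_up in records:
    bracket = age_bracket(age)
    totals[bracket] += 1
    attended[bracket] += int(showed_up)

# Ordered (bracket, attendance rate) points: the series a line chart would connect.
series = [(b, attended[b] / totals[b]) for b in sorted(totals)]
for bracket, rate in series:
    print(bracket, round(rate, 2))
```

Because the brackets have a natural order, plotting the points in sequence makes a rising or falling pattern easy to see, which a pie or doughnut chart would hide.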

16. The table is dental_data_table, and the column name is zip_code. How do you complete the following query?

[Image: SQL query]
  • WHERE = 81137
  • zip_code = 81137
  • WHERE_zip_code = 81137
  • WHERE zip_code = 81137 (CORRECT)

Correct: The correct syntax is WHERE zip_code = 81137. WHERE indicates where to look for information. The column name is zip_code. And the database is being asked to return only records matching zip code 81137.

17. Scenario 2 continued

The patient demographic information includes data such as age and gender. As you’re learning, it’s your responsibility as a data analyst to make sure your analysis is fair. The fact that the dataset includes people who all live in the same zip code might get in the way of fairness.

  • True (CORRECT)
  • False

Correct: It’s your responsibility as a data analyst to make sure your analysis is fair. Although many zip codes do reflect diverse populations, a better choice would be to include data about people who live in multiple zip codes.
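
One quick way to spot this kind of issue is to count how many distinct zip codes a dataset covers. The Python sketch below does that on a few made-up patient records; the field names and values are assumptions for illustration, not the actual NDS data.

```python
from collections import Counter

# Made-up patient records, for illustration only.
patients = [
    {"age": 72, "zip_code": 81137},
    {"age": 34, "zip_code": 81137},
    {"age": 68, "zip_code": 81137},
]

# Count how many patients fall in each zip code.
zip_counts = Counter(p["zip_code"] for p in patients)

# A single distinct zip code is a warning sign that the sample may not
# represent the wider population.
single_zip = len(zip_counts) == 1
print(zip_counts, single_zip)
```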

18. Scenario 2 continued

With this new knowledge, you write an email to your supervisor expressing your concerns about the dataset. He agrees with your concerns, but he’s also impressed with what you’ve learned and thinks your findings could be very important to the project. He asks you to change the business task. Now, the NDS campaign will be about educating dental offices on the challenges faced by senior citizens and finding ways to help them access quality dental care.

Changing the business task involves which of the following?

  • Conducting a gap analysis
  • Creating a graphical representation of the data
  • Defining the new question or problem to be solved (CORRECT)
  • Using a database instead of a spreadsheet

Correct: A business task is the question or problem data analysis answers for a business.

19. Scenario 2 continued

It’s time to create your presentation to stakeholders. It will include a data visualization that demonstrates the trend of people being less likely to attend follow-up appointments as they get older. For this, a pie chart will be most effective.

  • True
  • False (CORRECT)

Correct: A pie chart is used to represent the proportions of certain data categories compared to the whole. A line chart would be effective for tracking trends over time, such as people attending fewer appointments as they get older.

20. Scenario 2 continued

The patient demographic information includes data such as age and gender. As you’re learning, it’s your responsibility as a data analyst to make sure your analysis is fair. Which aspect of patient demographics might get in the way of fairness?

  • The dataset contains patient identification numbers.
  • The dataset represents people who are single.
  • The dataset includes people who all live in the same zip code. (CORRECT)
  • The dataset indicates which dental procedure the patients had performed

Correct: It’s your responsibility as a data analyst to make sure your analysis is fair. Although many zip codes do reflect diverse populations, a better choice would be to include data about people who live in multiple zip codes.

21. Scenario 2 continued

The people who are familiar with a problem and help verify the results of data analysis are called subject-matter experts. What are their roles in the process? Select all that apply.

  • Collect, transform, and organize data
  • Identify inconsistencies in the analysis (CORRECT)
  • Validate the choices being made (CORRECT)
  • Offer insights into the business problem (CORRECT)

Correct: Subject-matter experts can offer insights into the business problem, identify inconsistencies in the analysis, and validate the choices being made.

22. Scenario 1, questions 1-5

You’ve just started a new job as a data analyst. You’re working for a midsized pharmacy chain with 38 stores in the American Southwest. Your supervisor shares a new data analysis project with you.

She explains that the pharmacy is considering discontinuing a bubble bath product called Splashtastic. Your supervisor wants you to analyze sales data and determine what percentage of each store’s total daily sales come from that product. Then, you’ll present your findings to leadership.

You know that it’s important to follow each step of the data analysis process: ask, prepare, process, analyze, share, and act. So, you begin by defining the problem and making sure you fully understand stakeholder expectations.

One of the questions you ask is where to find the dataset you’ll be working with. Your supervisor explains that the company database has all the information you need.

Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains five columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products).

Considering the size of your dataset, what’s the best way to proceed with the process and analyze steps?

  • Download the data, then use a spreadsheet to process and analyze it. (CORRECT)
  • Use SQL to process and analyze the data.
  • Continue using the company database to process and analyze the data.
  • Upload the data, then process and analyze it using Tableau.

Correct!

23. Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset. During analysis, you create a new column F. At the top of the column, you add the attribute Average Percentage of Total Sales – Splashtastic.

Fill in the blank: An attribute is a _______ or quality of data used to label a column.

  • response
  • characteristic (CORRECT)
  • headline
  • number

Correct!
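
Column F holds each store's Splashtastic sales as a percentage of its total daily sales. A minimal Python sketch of that calculation follows, assuming column D holds Splashtastic sales in dollars and column E holds total sales, as the column order in the scenario suggests. The function name and the figures are hypothetical.

```python
def percent_of_total(product_sales: float, total_sales: float) -> float:
    """Return product sales as a percentage of total sales."""
    if total_sales == 0:
        raise ValueError("total_sales must be nonzero")
    return 100 * product_sales / total_sales


# Made-up figures for one store: column D (Splashtastic dollars)
# and column E (total daily sales, all products).
splashtastic_dollars = 125.0
total_daily_sales = 5000.0

share = percent_of_total(splashtastic_dollars, total_daily_sales)
print(share)  # 2.5
```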

24. Scenario 1 continued

Next, you determine the average total daily sales over the past 12 months at all stores. The entire range of cells that contains these sales is E2:E39. Identify the correct way to write your formula.

  • =AVERAGE(E2+E39)
  • =AVERAGE(E2-E39)
  • =AVERAGE(E2:E39) (CORRECT)
  • =AVERAGE(E2,E39)

Correct!
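
To see why the colon matters, the Python sketch below mimics two of the options on a short, made-up stand-in for column E. It is only an analogy to the spreadsheet formulas, not spreadsheet code.

```python
from statistics import mean

# Made-up stand-in for cells E2:E39; a real sheet would hold one value per store.
column_e = [5200.0, 4800.0, 5100.0, 4900.0]

# =AVERAGE(E2:E39) averages every cell in the range.
range_average = mean(column_e)

# =AVERAGE(E2,E39) would average only the first and last cells.
endpoints_average = mean([column_e[0], column_e[-1]])

print(range_average, endpoints_average)  # 5000.0 5050.0
```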

25. Scenario 1 continued

You’ve reached the share phase of the data analysis process. What can you do in this phase to share the Splashtastic sales insights you’ve discovered?

  • Present your findings to stakeholders. (CORRECT)
  • Revisit the analyze phase.
  • Establish a repository for the data.
  • Present your findings to customers.

Correct!

26. Scenario 2 continued

The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment.

To use the dataset for this scenario, click the link below and select “Use Template.” If you don’t have a Google account, you can download the template directly from the attachment below.

The patient demographic information includes data such as age and gender. As you’re learning, it’s your responsibility as a data analyst to make sure your analysis is fair. Looking at the geographic data, you notice that all the patients live in the same zip code. How might this negatively impact the analysis?

  • It could cause the analysis to be fair.
  • It could cause the analysis to be biased. (CORRECT)
  • It could cause the analysis to be useless.
  • It could cause the analysis to be unbiased.

Correct!

27. Scenario 2 continued

As you’re reviewing the dataset, you notice that there are a disproportionate number of senior citizens. So, you investigate further and find out that this zip code represents a rural community in Colorado with about 800 residents. In addition, there’s a large assisted-living facility in the area. Nearly 300 of the residents in the 81137 zip code live in the facility.

You recognize that’s a sizable number, so you want to find out if age has an effect on a patient’s likelihood to attend a follow-up dental appointment. You analyze the data, and your analysis reveals that older people tend to miss follow-ups more than younger people.

So, you do some research online and discover that people over the age of 60 are 50% more likely to miss dentist appointments. Sometimes this is because they’re on a fixed income. Also, many senior citizens lack transportation to get to and from appointments.

With this new knowledge, you write an email to your supervisor expressing your concerns about the dataset. He agrees with your concerns, but he’s also impressed with what you’ve learned and thinks your findings could be very important to the project. He asks you to change the business task. Now, the NDS campaign will be about educating dental offices on the challenges faced by senior citizens and finding ways to help them access quality dental care.

Changing the business task involves defining the new question or problem to be solved.

  • True (CORRECT)
  • False

Correct!

28. Scenario 2 continued

The subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments than younger people. This will help them create an effective campaign for members.

It’s time to create your presentation to stakeholders. It will include a data visualization that demonstrates the lifetime trend of people being less likely to attend follow-up appointments as they get older.

Why would a line chart be the most effective in representing this?

  • Line charts represent data values as proportionally sized wedges.
  • Line charts arrange data values into rows.
  • Line charts arrange data values into columns.
  • Line charts are effective in displaying points in series. (CORRECT)

Correct!

29. Scenario 1 continued

Next, you determine the average total daily sales over the past 12 months at all stores. The entire range of cells that contains these sales is E2:E39. To do this, you type a formula. Fill in the blank to complete the formula correctly: =AVERAGE_____.

  • E2:E39
  • E2-E39
  • (E2:E39) (CORRECT)
  • (E2-E39)

Correct!

30. Scenario 2, questions 6-10

You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff.

The NDS is passionate about patient health. Part of this involves automatically scheduling follow-up appointments after crown replacement, emergency dental surgery, and extraction procedures. NDS believes the follow-up is an important step to ensure patient recovery and minimize infection.

Unfortunately, many patients don’t show up for these appointments, so the NDS wants to create a campaign to help its members learn how to encourage their patients to take follow-up appointments seriously. If successful, this will help the NDS achieve its mission of advancing the oral health of all patients.

Your supervisor has just sent you an email saying that you’re doing very well on the team, and he wants to give you some additional responsibility. He describes the issue of many missed follow-up appointments. You are tasked with analyzing data about this problem and presenting your findings using data visualizations.

An NDS member with three dental offices in Colorado offers to share its data on missed appointments. So, your supervisor uses a database query to access the dataset from the dental group. The query instructs the database to retrieve all patient information from the member’s three dental offices, located in zip code 81137.

The table is dental_data_table, and the column name is zip_code. You write the following query, but get an error. What statement will correct the problem?

SELECT *
FROM dental_data_table
WHERE zip code = 81137

  • WHERE zip_code = 81137 (CORRECT)
  • WHERE 81137
  • zip_code = 81137
  • WHERE_zip code = 81137

Correct!
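
To try the corrected query without access to a real database, the Python sketch below runs it against an in-memory SQLite table. The table and column names come from the question. The patient_id column and the sample rows are made up, and other database systems may word the error differently.

```python
import sqlite3

# In-memory database with made-up rows, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dental_data_table (patient_id INTEGER, zip_code INTEGER)"
)
conn.executemany(
    "INSERT INTO dental_data_table VALUES (?, ?)",
    [(1, 81137), (2, 81137), (3, 80202)],
)

# The corrected query: WHERE names the column, then compares it to a value.
query = """
SELECT *
FROM dental_data_table
WHERE zip_code = 81137
"""
rows = conn.execute(query).fetchall()
print(rows)  # only the rows whose zip_code is 81137

# The original version, with a space in the column name, is rejected.
try:
    conn.execute("SELECT * FROM dental_data_table WHERE zip code = 81137")
    bad_query_failed = False
except sqlite3.OperationalError as error:
    bad_query_failed = True
    print("Query failed:", error)
```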

31. Scenario 2 continued

The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment. To use the dataset for this scenario, click the link below and select “Use Template.”

Fill in the blank: The fact that the dataset includes people who all live in the same zip code might get in the way of ______.

  • spreadsheet formulas or functions
  • data visualization
  • accuracy
  • fairness (CORRECT)

Correct!

32. Scenario 2 continued

You continue with your analysis. In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits.

But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem professionally. They’ll help verify the results of your data analysis.

Fill in the blank: Subject matter experts are people who are familiar with a problem. They can help by identifying inconsistencies in the analysis, _____, and validating the choices being made.

  • creating a presentation with the data
  • offering insights into the business problem (CORRECT)
  • redefining the business problem
  • collecting data relevant to the business problem

Correct!

33. Scenario 1, questions 1-5

You’ve just started a new job as a data analyst for a midsized pharmacy chain with 38 stores in the American Southwest. Your supervisor shares a new data analysis project with you.

She explains that the pharmacy is considering discontinuing a bubble bath product called Splashtastic. Your supervisor wants you to analyze sales data and determine what percentage of each store’s total daily sales come from that product. Then, you’ll present your findings to leadership.

You know that it’s important to follow each step of the data analysis process: ask, prepare, process, analyze, share, and act. So, you begin by defining the problem and making sure you fully understand stakeholder expectations.

One of the questions you ask is where to find the dataset you’ll be working with. Your supervisor explains that the company database has all the information you need.

Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains five columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products). You decide to use a spreadsheet to work with the data because you know that spreadsheets work well for processing and analyzing a small dataset, like the one you’re using.

Fill in the blank: To get the data from the database into a spreadsheet, you would first download the data as a .CSV file, then _____ it into a spreadsheet.

  • paste
  • export
  • email
  • import (CORRECT)

Correct!
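
For readers curious what an import does with a .CSV file, the Python sketch below reads a small CSV and treats the first row as column labels. The text stands in for a downloaded file, and the rows are made up for illustration. A spreadsheet's import feature does the equivalent through its menus.

```python
import csv
import io

# Stand-in for the downloaded .CSV file; the rows below are made up.
csv_text = """Store Number,Average Daily Customers,Average Total Daily Sales (All Products)
1,410,5200.00
2,365,4875.50
"""

# csv.DictReader uses the first row as the column labels.
reader = csv.DictReader(io.StringIO(csv_text))
stores = list(reader)

print(reader.fieldnames)
print(len(stores), "rows imported")
```

Note that every value arrives as text, so numeric columns need converting before any calculation.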

34. Scenario 1 continued

You’ve downloaded the data from your company database and imported it into a spreadsheet. IMPORTANT: To answer questions using this dataset for the scenario, click the link below and select the “Use Template” button before answering the questions.

Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice that information about Splashtastic is missing for Store Number 15 in Row 16. Which of the following would be an appropriate course of action?

  • Replace the row with the average values of the other data points.
  • Sort the spreadsheet so the row with missing data is at the bottom.
  • Investigate previous projects and see how this was dealt with there. (CORRECT)
  • Delete the row with the missing data point.

Correct!
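
Before deciding what to do about missing values, it helps to list exactly which rows are affected. The Python sketch below flags incomplete rows instead of deleting them. The rows and the shortened column name are made up for illustration, with None standing in for an empty cell.

```python
# Made-up rows standing in for the spreadsheet; None marks an empty cell.
rows = [
    {"Store Number": 14, "Splashtastic Sales (Dollars)": 118.40},
    {"Store Number": 15, "Splashtastic Sales (Dollars)": None},
    {"Store Number": 16, "Splashtastic Sales (Dollars)": 97.25},
]

# Collect the stores that need follow-up rather than dropping them.
stores_to_investigate = [
    row["Store Number"]
    for row in rows
    if any(value is None for value in row.values())
]

print(stores_to_investigate)  # [15]
```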

35. Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset. During analysis, you create a new column F. You label the top of the column Average Percentage of Total Sales – Splashtastic.

Fill in the blank: The column label you add to column F is known as ______.

  • an attribute (CORRECT)
  • a title
  • a reference
  • an observation

Correct!

36. Scenario 1 continued

Fill in the blank: You’ve reached the share phase of the data analysis process. One of the things that you can do in this phase is to prepare a _____ about Splashtastic’s sales and practice your presentation.

  • prediction
  • finding
  • record
  • slideshow (CORRECT)

Correct!

37. Scenario 1, questions 1-5

You’ve just started a new job as a data analyst for a midsized pharmacy chain with 38 stores in the American Southwest. Your supervisor shares a new data analysis project with you.

She explains that the pharmacy is considering discontinuing a bubble bath product called Splashtastic. Your supervisor wants you to analyze sales data and determine what percentage of each store’s total daily sales come from that product. Then, you’ll present your findings to leadership.

You know that it’s important to follow each step of the data analysis process: ask, prepare, process, analyze, share, and act. So, you begin by defining the problem and making sure you fully understand stakeholder expectations.

One of the questions you ask is where to find the dataset you’ll be working with. Your supervisor explains that the company database has all the information you need.

Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains five columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products). You decide to use a spreadsheet to work with the data because you know that spreadsheets work well for processing and analyzing a small dataset, like the one you’re using.

Fill in the blank: To get the data from the database into a spreadsheet, you would first _____ the data as a .CSV file, then import it into a spreadsheet.

  • download (CORRECT)
  • print
  • email
  • copy and paste

Correct!

38. Scenario 2, questions 6-10

You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff.

The NDS is passionate about patient health. Part of this involves automatically scheduling follow-up appointments after crown replacement, emergency dental surgery, and extraction procedures. NDS believes the follow-up is an important step to ensure patient recovery and minimize infection.

Unfortunately, many patients don’t show up for these appointments, so the NDS wants to create a campaign to help its members learn how to encourage their patients to take follow-up appointments seriously. If successful, this will help the NDS achieve its mission of advancing the oral health of all patients.

Your supervisor has just sent you an email saying that you’re doing very well on the team, and he wants to give you some additional responsibility. He describes the issue of many missed follow-up appointments. You are tasked with analyzing data about this problem and presenting your findings using data visualizations.

An NDS member with three dental offices in Colorado offers to share its data on missed appointments. So, your supervisor uses a database query to access the dataset from the dental group. The query instructs the database to retrieve all patient information from the member’s three dental offices, located in zip code 81137.

The table is dental_data_table, and the column name is zip_code. The following query is incomplete.

SELECT *
FROM dental_data_table

Which of the following is the missing third line from this query?

  • WHERE = 81137
  • zip_code = 81137
  • WHERE zip_code = 81137 (CORRECT)
  • WHERE_zip_code = 81137

Correct!

39. Scenario 2 continued

The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment.

The patient demographic information includes data such as age, gender, and home address. You review the demographic data, paying particular attention to geography. What geographic aspect of the data may negatively impact fairness?

  • The patients all live in the same city.
  • The patients all live in houses.
  • The patients all live in the same country.
  • The patients all live in the same zip code. (CORRECT)

Correct!

40. Scenario 2 continued

You continue with your analysis. In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits.

But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem professionally. They’ll help verify the results of your data analysis.

Fill in the blank: The people who are familiar with a problem and help verify the results of data analysis are _____.

  • subject-matter experts (CORRECT)
  • customers
  • data scientists
  • stakeholders

Correct!

41. Scenario 2 continued

The subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments.

This will help them create an effective campaign for members. It’s time to create your presentation to stakeholders. It will include a data visualization that depicts the relationship between age and follow-up dental appointment attendance rates over a lifetime.

A doughnut chart will be the most effective visualization of this data.

  • True
  • False (CORRECT)

Correct!

42. Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset. During analysis, you create a new column F. At the top of the column, you add the attribute Average Percentage of Total Sales – Splashtastic.

Fill in the blank: An attribute is a characteristic or _____ of data used to label a column.

  • observation
  • number
  • quality (CORRECT)
  • collection

Correct!

Endless Career Possibilities CONCLUSION

Businesses across all industries greatly benefit from the work of data analysts. In this course, you have learned about the different types of businesses that analysts work for and the specific jobs and tasks they perform.

You have also seen how your data analyst certificate can help you meet many of the requirements for a position with these businesses. Join Coursera today to gain access to more courses like this one so that you can continue learning and expanding your skill set!