Question 1

A data analyst is working with a data frame called salary_data. They want to create a new column named hourly_salary that includes data from the wages column divided by 40. What code chunk lets the analyst create the hourly_salary column?

Accepted Answer

mutate(salary_data, hourly_salary = wages / 40) (CORRECT)
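As a minimal sketch of this answer in action, the block below builds a small, made-up salary_data data frame (the employee names and wage values are assumptions for illustration only) and adds the hourly_salary column with mutate() from dplyr. Note that R is case-sensitive, so the function must be written as mutate(), not Mutate().

```r
library(dplyr)

# Hypothetical data: weekly wages for three employees (values are made up)
salary_data <- data.frame(
  employee = c("A", "B", "C"),
  wages = c(800, 1000, 1200)
)

# mutate() returns a new data frame with the extra column;
# salary_data itself is left unchanged unless the result is assigned back
with_hourly <- mutate(salary_data, hourly_salary = wages / 40)
print(with_hourly)
```

To keep the new column, assign the result to a variable, as done with with_hourly here.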

Question 2

A data analyst creates a data frame with data that has more than 50,000 observations in it. When they print their data frame, it slows down their console. To avoid this, they decide to switch to a tibble. Why would a tibble be more useful in this situation?

Accepted Answer

Tibbles won't overload the console because they automatically only print the first 10 rows of data and as many variables as will fit on the screen (CORRECT)
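The printing behavior can be seen with a short sketch. The data frame big_df and its columns are hypothetical; the conversion uses as_tibble() from the tibble package.

```r
library(tibble)

# Hypothetical data frame with 50,000 rows (column names are made up)
big_df <- data.frame(id = 1:50000, value = seq(0, 1, length.out = 50000))

# Convert to a tibble; the data are the same, only the class and print method differ
big_tbl <- as_tibble(big_df)

# Printing a tibble shows a short preview instead of every row
print(big_tbl)
```

Printing big_df directly would instead try to write rows to the console up to R's max.print limit, which is what slows the console down.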

Question 3

A data analyst is working with a data frame named cars. The analyst notices that all the column names in the data frame are capitalized. What code chunk lets the analyst change all the column names to lowercase?

Accepted Answer

rename_with(cars, tolower) (Correct)
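A small sketch of rename_with() follows. It uses a made-up data frame called cars_df (an assumption, chosen so the example does not clash with R's built-in cars dataset, whose column names are already lowercase).

```r
library(dplyr)

# Hypothetical data frame with capitalized column names
cars_df <- data.frame(
  MAKE = c("Toyota", "Ford"),
  MODEL = c("Corolla", "Focus"),
  YEAR = c(2018, 2020)
)

# rename_with() applies a function (here tolower) to every column name
lower_cars <- rename_with(cars_df, tolower)
colnames(lower_cars)
```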

Question 4

You are working with the ToothGrowth dataset. You want to use the head() function to get a preview of the dataset. Write the code chunk that will give you this preview. What are the names of the columns in the ToothGrowth dataset?

Accepted Answer

len, supp, dose (CORRECT)
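The preview itself can be produced as sketched below. ToothGrowth ships with base R's datasets package, so no extra packages are needed; the variable name preview is only for illustration.

```r
# head() returns the first six rows by default
preview <- head(ToothGrowth)
print(preview)

# The column names are visible in the preview header and via colnames()
colnames(preview)
```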

Question 5

Which R function can be used to make changes to a data frame?

Accepted Answer

mutate() (CORRECT)

Question 6

A data analyst is considering using tibbles instead of basic data frames. What are some of the limitations of tibbles? Select all that apply.

Accepted Answer

Tibbles won’t automatically change the names of variables, Tibbles can never create row names, Tibbles can never change the input type of the data (CORRECT)

Question 7

A data analyst is working with a data frame called salary_data. They want to create a new column named hourly_salary that includes data from the wages column divided by 40. What code chunk lets the analyst create the hourly_salary column?

Accepted Answer

mutate(salary_data, hourly_salary = wages / 40) (CORRECT)

Question 8

A data analyst is working with a data frame named weather. It has separate columns for temperatures (temp) and measurement units (unit). The analyst wants to combine the two columns into a single column called display_temp, with the temperature and unit separated by the string " Degrees ". What code chunk lets the analyst create the display_temp column?

Accepted Answer

unite(weather, "display_temp", temp, unit, sep = " Degrees ") (CORRECT)
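A minimal sketch of unite() from tidyr, assuming a made-up weather data frame (the city names and readings are illustrative only). The columns to combine are temp and unit, the two columns named in the question.

```r
library(tidyr)

# Hypothetical weather data (values are made up)
weather <- data.frame(
  city = c("Oslo", "Cairo"),
  temp = c(5, 31),
  unit = c("Celsius", "Celsius")
)

# unite() pastes temp and unit together with the separator
# and, by default, drops the two input columns
combined <- unite(weather, "display_temp", temp, unit, sep = " Degrees ")
print(combined)
```

The same pattern covers the other unite() questions in this guide; only the column names and the sep string change.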

Question 9

In R, which statistical measure can help you understand the spread of values in a dataset and describe how far each value is from the mean?

Accepted Answer

Standard deviation (CORRECT)
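For a concrete feel, the sketch below computes the standard deviation of a small made-up vector with base R's sd(), and checks it against the sample formula (sum of squared deviations from the mean, divided by n - 1, then square-rooted).

```r
# Hypothetical values
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Built-in sample standard deviation
spread <- sd(x)

# The same quantity computed by hand
by_hand <- sqrt(sum((x - mean(x))^2) / (length(x) - 1))

print(spread)
print(by_hand)
```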

Question 10

A data scientist is trying to print a data frame, but the console output produces too many rows and columns to be readable. What could they use instead of a data frame to make printing more readable?

Accepted Answer

A tibble (CORRECT)

Question 11

A data analyst is working with a large data frame. It contains so many columns that they don’t all fit on the screen at once. The analyst wants a quick list of all of the column names to get a better idea of what is in their data. What function should they use?

Accepted Answer

colnames() (CORRECT)

Question 12

A data analyst is working with the penguins dataset. What code chunk does the analyst write to make sure all the column names are unique and consistent and contain only letters, numbers, and underscores?

Accepted Answer

clean_names(penguins) (CORRECT)

Question 13

A data analyst is working with the penguins data. The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. The analyst wants to create a data frame that only includes the Adelie species. The analyst receives an error message when they run the following code:

Accepted Answer

filter(species == "Adelie") (CORRECT)

Question 14

A data analyst is working with a data frame named retail. It has separate columns for dollars (price_dollars) and cents (price_cents). The analyst wants to combine the two columns into a single column named price, with the dollars and cents separated by a decimal point. For example, if the value in the price_dollars column is 10, and the value in the price_cents column is 50, the value in the price column will be 10.50. What code chunk lets the analyst create the price column?

Accepted Answer

unite(retail, "price", price_dollars, price_cents, sep = ".") (CORRECT)

Question 15

A data analyst is using statistical measures to get a better understanding of their data. What function can they use to determine how strongly related two of the variables are?

Accepted Answer

cor() (CORRECT)

Question 16

A data analyst creates two different predictive models for the same dataset. They use the bias() function on both models. The first model has a bias of 20. The second model has a bias of 0.1. Which model is less biased?

Accepted Answer

The second model (CORRECT)

Question 17

You are cleaning a data frame with improperly formatted column names. In order to clean the data frame you want to use the clean_names() function. Which column names will be changed using the clean_names() with default parameters? Select all that apply.

Accepted Answer

column.1, column 2 (CORRECT)
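The effect of clean_names() with default parameters can be sketched as follows. The data frame messy and its values are made up; clean_names() comes from the janitor package. check.names = FALSE is used so that data.frame() keeps the messy names as written.

```r
library(janitor)

# Hypothetical data frame: two badly formatted names and one already-clean name
messy <- data.frame(
  `column.1` = 1:2,
  `column 2` = 3:4,
  column_3 = 5:6,
  check.names = FALSE
)

# Default clean_names() converts names to lowercase snake_case
cleaned <- clean_names(messy)
colnames(cleaned)
```

The period and the space are replaced with underscores, while column_3 already meets the convention and is left as it is.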

Question 18

A data analyst is working with the penguins dataset. The variable island represents the island on which the sample was collected. The analyst wants to create a data frame that excludes records from the island named “Torgersen”. What code chunk will allow them to create this data frame?

Accepted Answer

penguins %>% filter(island != "Torgersen") (CORRECT)
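A short sketch of the filtering step, using a small made-up stand-in for the penguins data (penguins_sample and its rows are assumptions, so the example runs without the palmerpenguins package).

```r
library(dplyr)

# Hypothetical subset with the same column names as the penguins data
penguins_sample <- data.frame(
  species = c("Adelie", "Adelie", "Chinstrap", "Gentoo"),
  island = c("Torgersen", "Biscoe", "Dream", "Biscoe")
)

# Keep every row whose island is not "Torgersen"
no_torgersen <- penguins_sample %>%
  filter(island != "Torgersen")

# The same verb with == keeps only one species
adelie_only <- penguins_sample %>%
  filter(species == "Adelie")

print(no_torgersen)
print(adelie_only)
```

Note the straight double quotes around the string values; curly quotes copied from a web page cause a syntax error in R.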

Question 19

A data analyst is working with a data frame called zoo_records. They want to create a new column named is_large_animal that signifies if an animal has a weight of more than 199 kilograms. What code chunk lets the analyst create the is_large_animal column?

Accepted Answer

zoo_records %>% mutate(is_large_animal = weight > 199) (CORRECT)

Question 20

You are compiling an analysis of the average monthly costs for your company. What summary statistic function should you use to calculate the average?

Accepted Answer

mean() (CORRECT)

Question 21

A data analyst creates a data frame with data that has more than 50,000 observations in it. When they print their data frame, it slows down their console. To avoid this, they decide to switch to a tibble. Why would a tibble be more useful in this situation?

Accepted Answer

Tibbles won’t overload the console because they automatically only print the first 10 rows of data and as many variables as will fit on the screen (CORRECT)

Question 22

You have a data frame named employees with a column named Last_NAME. What will the name of the employees column be in the results of the function rename_with(employees, tolower)?

Accepted Answer

last_name (CORRECT)

Question 23

The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. What code chunk does the analyst add to create a data frame that only includes the Gentoo species?

Accepted Answer

filter(species == "Gentoo") (CORRECT)

Question 24

A data analyst is working with a data frame named users. It has separate columns for first name (first_name) and last name (last_name). The analyst wants to combine the two columns into a single column called full_name, with the first name and last name separated by a space. What code chunk lets the analyst create the full_name column?

Accepted Answer

unite(users, "full_name", first_name, last_name, sep = " ") (CORRECT)

Question 25

A data analyst is working with a large data frame. It contains so many columns that they don’t all fit on the screen at once. The analyst wants a quick list of all of the column names to get a better idea of what is in their data. What function should they use?

Accepted Answer

colnames() (CORRECT)

Question 26

A data analyst is working with the penguins dataset. What code chunk does the analyst write to make sure all the column names are unique and consistent and contain only letters, numbers, and underscores?

Accepted Answer

clean_names(penguins) (CORRECT)

Question 27

A data analyst is working with a data frame named salary_data. They want to create a new column named wages that includes data from the rate column multiplied by 40. What code chunk lets the analyst create the wages column?

Accepted Answer

mutate(salary_data, wages = rate * 40) (CORRECT)

Question 28

A data analyst writes the following code chunk to return a statistical summary of their dataset: quartet %>% group_by(set) %>% summarize(mean(x), sd(x), mean(y), sd(y), cor(x, y)). Which function will return the average value of the y column?

Accepted Answer

mean(y) (CORRECT)
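The grouped summary can be sketched on a tiny made-up dataset. quartet_sample below is a hypothetical stand-in for the quartet data used in the course; the named summary columns (mean_y, sd_y, cor_xy) are also an illustrative choice.

```r
library(dplyr)

# Hypothetical data with two sets of (x, y) pairs
quartet_sample <- data.frame(
  set = c(1, 1, 1, 2, 2, 2),
  x = c(1, 2, 3, 1, 2, 3),
  y = c(2, 4, 6, 6, 4, 2)
)

# One summary row per set: mean and sd of y, and the x-y correlation
stats <- quartet_sample %>%
  group_by(set) %>%
  summarize(mean_y = mean(y), sd_y = sd(y), cor_xy = cor(x, y))

print(stats)
```

In this sample both sets share the same mean and standard deviation of y, yet their correlations have opposite signs, which is why summary statistics alone can hide differences between datasets.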

Question 29

A data analyst is working with a dataset in R that has more than 50,000 observations. Why might they choose to use a tibble instead of the standard data frame? Select all that apply.

Accepted Answer

Tibbles automatically only preview the first 10 rows of data, Tibbles automatically only preview as many columns as fit on screen (CORRECT)

Question 30

A data analyst is examining a new dataset for the first time. They load the dataset into a data frame to learn more about it. What function(s) will allow them to review the names of all of the columns in the data frame? Select all that apply.

Accepted Answer

head(), str(), colnames() (CORRECT)
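The sketch below runs the structure-oriented functions on the built-in ToothGrowth dataset to show how each one exposes the column names. The variable str_lines is introduced only so the str() output can be inspected.

```r
# colnames() returns just the names as a character vector
colnames(ToothGrowth)

# str() lists each column with its type and a few values,
# plus the number of rows (observations) and columns (variables)
str(ToothGrowth)

# Capture the str() output to look at its first line programmatically
str_lines <- capture.output(str(ToothGrowth))
print(str_lines[1])
```

head(ToothGrowth) also shows the names as the header of its six-row preview.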

Question 31

A data analyst is working with a data frame named customers. It has separate columns for area code (area_code) and phone number (phone_num). The analyst wants to combine the two columns into a single column called phone_number, with the area code and phone number separated by a hyphen. What code chunk lets the analyst create the phone_number column?

Accepted Answer

unite(customers, "phone_number", area_code, phone_num, sep = "-") (CORRECT)

Question 32

A data analyst wants a high level summary of the structure of their data frame, including the column names, the number of rows and variables, and type of data within a given column. What function should they use?

Accepted Answer

str() (CORRECT)

Question 33

You are working with the ToothGrowth dataset. You want to use the select() function to view all columns except the supp column. Write the code chunk that will give you this view. How many columns does the resulting data frame contain?

Accepted Answer

2 (CORRECT)
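One way to write the requested chunk is sketched below, using select() from dplyr with a minus sign to exclude a column. The variable name no_supp is only for illustration.

```r
library(dplyr)

# A minus sign in select() drops the named column and keeps the rest
no_supp <- ToothGrowth %>%
  select(-supp)

colnames(no_supp)
ncol(no_supp)
```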

Question 34

What is the minimum bill depth in mm for the Chinstrap species?

Accepted Answer

16.4 (CORRECT)

Question 35

A data analyst wants to find out how much the predicted outcome and the actual outcome of their data model differ. What function can they use to quickly measure this?

Accepted Answer

bias() (CORRECT)
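The course uses bias() from the SimDesign package for this. As a hedged illustration of the underlying idea, the sketch below computes the average difference between actual and predicted values by hand in base R; the temperature vectors are made-up values, and the hand computation is an assumption about what the function reports for two plain vectors, not a copy of the package's implementation.

```r
# Hypothetical actual and predicted temperatures
actual_temp <- c(68.3, 70, 72.4, 71, 67, 70)
predicted_temp <- c(67.9, 69, 71.5, 70, 67, 69)

# Average difference between actual and predicted values;
# a result close to zero suggests the predictions are not systematically off
mean_diff <- mean(actual_temp - predicted_temp)
print(mean_diff)
```

A positive result here means the predictions tend to run below the actual values.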

Question 36

What is the average value of the len column?

Accepted Answer

18.8 (CORRECT)
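Assuming this question refers to the ToothGrowth dataset, whose len column appears elsewhere in this guide, the value can be reproduced with a short sketch (avg_len is an illustrative name).

```r
# Average of the len column in the built-in ToothGrowth dataset
avg_len <- mean(ToothGrowth$len)

# Rounded to one decimal place for comparison with the quiz answer
round(avg_len, 1)
```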

Question 37

What can the data analyst do to fix their code?

Accepted Answer

Save the results of arrange() to a variable that gets passed to head() (CORRECT)

Question 38

When you run the code in the code box, how many separate observational rows are returned by this code chunk?

Accepted Answer

3 (CORRECT)

Question 39

A data analyst is working with a data frame called athletes. The data frame contains a column named record that represents an athlete's wins and losses separated by a hyphen (-). They want to turn this single column into individual columns for wins and losses. Which code chunk lets the analyst split the record column?

Accepted Answer

separate(athletes, record, into = c("wins", "losses"), sep = "-") (CORRECT)
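A minimal sketch of separate() from tidyr, the inverse of unite(). The athletes data frame below is made up for illustration.

```r
library(tidyr)

# Hypothetical records in "wins-losses" form
athletes <- data.frame(
  name = c("Ana", "Ben"),
  record = c("10-2", "7-5")
)

# Split record at the hyphen into two new columns
split_records <- separate(athletes, record, into = c("wins", "losses"), sep = "-")
print(split_records)
```

By default the new columns are character vectors; passing convert = TRUE asks separate() to convert them to numbers where possible.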

Question 40

A data analyst is working with a data frame named stores. It has separate columns for city (city) and state (state). The analyst wants to combine the two columns into a single column named location, with the city and state separated by a comma. What code chunk lets the analyst create the location column?

Accepted Answer

unite(stores, "location", city, state, sep = ",") (CORRECT)

Question 41

A data analyst is working with the penguins dataset in R. What code chunk will allow them to sort the penguins data by the variable bill_length_mm?

Accepted Answer

arrange(penguins, bill_length_mm) (CORRECT)
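Sorting can be sketched on a small made-up stand-in for the penguins data (penguins_sample and its values are assumptions). arrange() sorts in ascending order by default, and desc() reverses the order.

```r
library(dplyr)

# Hypothetical subset with the same column names as the penguins data
penguins_sample <- data.frame(
  species = c("Gentoo", "Adelie", "Chinstrap"),
  bill_length_mm = c(46.1, 39.1, 50.0)
)

# Ascending sort; the result must be assigned to a variable to be reused
sorted <- arrange(penguins_sample, bill_length_mm)

# Descending sort with desc()
sorted_desc <- arrange(penguins_sample, desc(bill_length_mm))

# The saved, sorted result can then be passed to other functions such as head()
head(sorted)
```

Because arrange() does not modify its input, calling head() on the original data frame afterwards still shows the unsorted rows.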

Question 42

A data analyst is working with a data frame called salary_data. They want to create a new column named total_wages that adds together data in the standard_wages and overtime_wages columns. What code chunk lets the analyst create the total_wages column?

Accepted Answer

mutate(salary_data, total_wages = standard_wages + overtime_wages) (CORRECT)

Question 43

What scenarios would prevent you from being able to use a tibble?

Accepted Answer

You need to create row names, You need to change the data types of inputs (CORRECT)

Question 44

How many rows does the ToothGrowth dataset contain?

Accepted Answer

60 (CORRECT)

Question 45

In R, which statistical measure demonstrates how strong the relationship is between two variables?

Accepted Answer

Correlation (CORRECT)

Question 46

What will this code chunk calculate?

Accepted Answer

The average difference between the actual and predicted values (CORRECT)

Question 47

A data analyst wants to learn more about a specific data frame. Which function will allow them to review the data types of each column in the data frame?

Accepted Answer

str() (CORRECT)

Question 48

How many different data types are used for the column data types?

Accepted Answer

2 (CORRECT)

Question 49

What is the mean body mass in g for the Adelie species?

Accepted Answer

3706.164 (CORRECT)

Question 50

A data analyst is working with a data frame called sales. In the data frame, a column named location represents data in the format “city, state”. The analyst wants to split the city into an individual city column and state into a new country column. What code chunk lets the analyst split the location column?

Accepted Answer

separate(sales, location, into=c("city", "country"), sep=", ") (CORRECT)

Question 51

What is an advantage of using data frames instead of tibbles?

Accepted Answer

Data frames allow you to create row names (CORRECT)
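The row-name difference can be sketched as follows. The scores data frame and the name column are hypothetical; as_tibble() and has_rownames() come from the tibble package.

```r
library(tibble)

# A base data frame can carry meaningful row names
scores <- data.frame(score = c(90, 85), row.names = c("alice", "bob"))
rownames(scores)

# Converting to a tibble drops the row names by default
scores_tbl <- as_tibble(scores)
has_rownames(scores_tbl)

# To keep the information, move the row names into a regular column
scores_named <- as_tibble(scores, rownames = "name")
print(scores_named)
```

Storing the labels in an ordinary column is the usual workaround when a tibble is preferred for its printing behavior.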

Question 52

A data analyst is checking a script for one of their peers. They want to learn more about a specific data frame. What function(s) will allow them to see a subset of data values in the data frame? Select all that apply.

Accepted Answer

head(), str() (CORRECT)

Course 7 – Data Analysis with R Programming Quiz Answers

Week 3: Working with Data in R

GOOGLE DATA ANALYTICS PROFESSIONAL CERTIFICATE

Coursera RStudio Answers Study Guide

TABLE OF CONTENTS

Working with Data in R INTRODUCTION

Learning Objectives

Test your knowledge on R data frames

1. Which of the following are best practices for creating data frames? Select all that apply.

2. Why are tibbles a useful variation of data frames?

3. Tidy data is a way of standardizing the organization of data within R.

4. Which R function can be used to make changes to a data frame?

Test your knowledge on cleaning data

1. A data analyst is cleaning their data in R. They want to be sure that their column names are unique and consistent to avoid any errors in their analysis. What R function can they use to do this automatically?

2. You are working with the penguins dataset. You want to use the arrange() function to sort the data for the column bill_length_mm in ascending order. You write the following code:

penguins %>%

Add a code chunk to sort the column bill_length_mm in ascending order.

3. A data analyst is working with customer information from their company’s sales data. The first and last names are in separate columns, but they want to create one column with both names instead. Which of the following functions can they use?

Test your knowledge on R functions

1. Which of the following functions can a data analyst use to get a statistical summary of their dataset? Select all that apply.

2. A data analyst inputs the following command:

quartet %>% group_by(set) %>% summarize(mean(x), sd(x), mean(y), sd(y), cor(x, y)).

Which of the functions in this command can help them determine how strongly related their variables are?

3. Fill in the blank: The bias function compares the actual outcome of the data with the _____ outcome to determine whether or not the model is biased.

GOOGLE DATA ANALYTICS COURSERA ANSWERS AND STUDY GUIDE

Data Analysis with R Programming Weekly Challenge 3

1. A data analyst creates a data frame with data that has more than 50,000 observations in it. When they print their data frame, it slows down their console. To avoid this, they decide to switch to a tibble. Why would a tibble be more useful in this situation?

2. A data analyst is exploring their data to get more familiar with it. They want a preview of just the first six rows to get a better idea of how the data frame is laid out. What function should they use?

3. You are working with the ToothGrowth dataset. You want to use the head() function to get a preview of the dataset. Write the code chunk that will give you this preview.

What are the names of the columns in the ToothGrowth dataset?

4. A data analyst is working with a data frame named cars. The analyst notices that all the column names in the data frame are capitalized. What code chunk lets the analyst change all the column names to lowercase?

5. A data analyst is working with the penguins dataset in R. What code chunk will allow them to sort the penguins data by the variable bill_length_mm?

6. You are working with the penguins dataset. You want to use the summarize() and max() functions to find the maximum value for the variable flipper_length_mm. You write the following code:

Add the code chunk that lets you find the maximum value for the variable flipper_length_mm.

What is the maximum flipper length in mm for the Gentoo species?

7. A data analyst is working with a data frame called salary_data. They want to create a new column named total_wages that adds together data in the standard_wages and overtime_wages columns. What code chunk lets the analyst create the total_wages column?

9. In R, which statistical measure demonstrates how strong the relationship is between two variables?

10. A data analyst is studying weather data. They write the following code chunk:

What will this code chunk calculate?

11. A data analyst is working with a data frame called salary_data. They want to create a new column named hourly_salary that includes data from the wages column divided by 40. What code chunk lets the analyst create the hourly_salary column?

12. A data analyst wants a high level summary of the structure of their data frame, including the column names, the number of rows and variables, and type of data within a given column. What function should they use?

13. You are working with the ToothGrowth dataset. You want to use the select() function to view all columns except the supp column. Write the code chunk that will give you this view.

How many columns does the resulting data frame contain?

14. You are working with the penguins dataset. You want to use the summarize() and min() functions to find the minimum value for the variable bill_depth_mm. At this point, the following code has already been written into the script:

15. A data analyst wants to find out how much the predicted outcome and the actual outcome of their data model differ. What function can they use to quickly measure this?

16. You are working with the ToothGrowth dataset. You want to use the skim_without_charts() function to get a comprehensive view of the dataset. Write the code chunk that will give you this view.

17. A data analyst is working with the penguins dataset and wants to sort the penguins by body_mass_g from least to greatest. When they run the following code the penguin body mass data is not displayed in the correct order.

18. You are working with the penguins dataset and want to understand the year of data collection for all combinations of species, island, and sex. At this point, the following code has already been written into your script:

21. A data analyst is working with the penguins dataset in R. What code chunk will allow them to sort the penguins data by the variable bill_length_mm?

22. A data analyst is working with a data frame called salary_data. They want to create a new column named total_wages that adds together data in the standard_wages and overtime_wages columns. What code chunk lets the analyst create the total_wages column?

23. What scenarios would prevent you from being able to use a tibble?

24. You are working with the ToothGrowth dataset. You want to use the skim_without_charts() function to get a comprehensive view of the dataset. Write the code chunk that will give you this view.

25. In R, which statistical measure demonstrates how strong the relationship is between two variables?

26. A data analyst is studying weather data. They write the following code chunk:

27. A data analyst wants to learn more about a specific data frame. Which function will allow them to review the data types of each column in the data frame?

28. You are working with the ToothGrowth dataset. You want to use the glimpse() function to get a quick summary of the dataset. Write the code chunk that will give you this summary.

29. You are working with the penguins dataset. You want to use the summarize() and mean() functions to find the mean value for the variable body_mass_g. At this point, the following code has already been written into your script:

penguins %>%

drop_na() %>%

group_by(species) %>%

Add the code chunk that lets you find the mean value for the variable body_mass_g.

(Note: do not type the above code into the code block editor, as it has already been inputted. Simply add a single line of code based on the prompt.)

What is the mean body mass in g for the Adelie species?

31. What is an advantage of using data frames instead of tibbles?

32. A data analyst is checking a script for one of their peers. They want to learn more about a specific data frame. What function(s) will allow them to see a subset of data values in the data frame? Select all that apply.

Working with Data in R CONCLUSION
