COURSE 6: THE NUTS AND BOLTS OF MACHINE LEARNING

Module 4: Tree-Based Modeling

GOOGLE ADVANCED DATA ANALYTICS PROFESSIONAL CERTIFICATE

Complete Coursera Study Guide

INTRODUCTION – Tree-Based Modeling

Throughout this section, the focus will be on supervised learning, a pivotal aspect of machine learning. Participants will delve into the intricacies of testing and validating the performance of various supervised machine learning models, including decision trees, random forests, and gradient boosting.

The comprehensive exploration of these models equips learners with the skills to understand, implement, and evaluate their effectiveness in solving real-world problems. By the end of this segment, participants will have a robust understanding of supervised learning methodologies, enabling them to make informed decisions and leverage these powerful tools in practical applications.

Learning Objectives

  • Identify tunable model parameters and explain how they affect performance and evaluation metrics
  • Distinguish boosting in ML, specifically for XGBoost models
  • Characterize bagging in ML, specifically for random forest models
  • Explore decision tree models, how they work, and their advantages over other types of supervised ML
  • Determine the different types of supervised learning models

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: ADDITIONAL SUPERVISED LEARNING TECHNIQUES

1. Tree-based learning is a type of unsupervised machine learning that performs classification and regression tasks.

  • True
  • False (CORRECT)

Correct: Tree-based learning is a type of supervised machine learning. It is supervised because it uses labeled datasets to train algorithms to classify or predict outcomes. Unsupervised machine learning uses algorithms to analyze unlabeled data and find underlying structures.

2. Fill in the blank: Similar to a flow chart, a _____ is a classification model that represents various solutions available to solve a given problem based on the possible outcomes of each solution.

  • linear regression
  • decision tree (CORRECT)
  • Poisson distribution
  • binary logistic regression

Correct: A decision tree is a classification model that represents various solutions available to solve a given problem based on the possible outcomes of each solution. Decision trees enable data professionals to make predictions about future events based on currently available information. Binary logistic regression models the probability of an event that has two possible outcomes.
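
For a concrete picture, the short sketch below trains a decision tree with scikit-learn and prints its flow-chart-like structure. The dataset and settings are illustrative only, not part of the quiz.

  from sklearn.datasets import load_iris
  from sklearn.tree import DecisionTreeClassifier, export_text

  # Fit a small classification tree on a built-in example dataset.
  X, y = load_iris(return_X_y=True)
  tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

  # Each printed line is one decision; the indentation mirrors the flow chart.
  print(export_text(tree, feature_names=load_iris().feature_names))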

3. In a decision tree, which node is the location where the first decision is made?

  • Leaf
  • Branch
  • Decision
  • Root (CORRECT)

Correct: In a decision tree, the root node is where the first decision is made. It is the first node in the tree, and all decisions needed to make the prediction stem from it.

4. In tree-based learning, how is a split determined?

  • By which variables and cut-off values offer the most predictive power (CORRECT)
  • By the number of decisions required before arriving at a final prediction
  • By the amount of leaves present
  • By the level of balance present among the predictions made by the model

Correct: In tree-based learning, a split is determined by which variables and cut-off values offer the most predictive power. Repeated splitting eventually leaves groups of data that are mostly or entirely of the same class.
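
As a hedged illustration of "predictive power," the sketch below computes Gini impurity, one common criterion for scoring candidate splits. The gini helper and the toy labels are hypothetical.

  from collections import Counter

  def gini(labels):
      # Gini impurity: 0.0 means the group is entirely one class.
      n = len(labels)
      return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

  parent = ["yes", "yes", "no", "no"]         # maximally mixed: gini = 0.5
  left, right = ["yes", "yes"], ["no", "no"]  # one candidate split

  # A good split drives the impurity of the child groups toward zero.
  print(gini(parent), gini(left), gini(right))  # 0.5 0.0 0.0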

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: TUNE TREE-BASED MODELS

1. Fill in the blank: The hyperparameter max depth is used to limit the depth of a decision tree, which is the number of levels between the _____ and the farthest node away from it.

  • leaf node
  • root node (CORRECT)
  • first split
  • decision node

Correct: The hyperparameter max depth is used to limit the depth of a decision tree, which is the number of levels between the root node and the farthest node away from it. Hyperparameters are parameters that can be set before a model is trained. They can be tuned to improve performance, directly affecting how the model is fit to the data.
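
A minimal sketch of setting this hyperparameter in scikit-learn; the depth value of 2 is an illustrative choice.

  from sklearn.datasets import load_iris
  from sklearn.tree import DecisionTreeClassifier

  X, y = load_iris(return_X_y=True)

  # max_depth is fixed before training and caps the levels below the root node.
  shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
  print(shallow.get_depth())  # 2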

2. What tuning technique can a data professional use to confirm that a model achieves its intended purpose?

  • Min samples leaf
  • Classifier
  • Grid search (CORRECT)
  • Decision tree

Correct: Grid search is a tool that confirms that a model achieves its intended purpose. It does this by systematically checking every combination of hyperparameters to identify which set produces the best results, based on the selected metric.
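
A sketch of this process in scikit-learn, where GridSearchCV fits and scores every combination in the grid; the parameter grid and scoring metric are illustrative choices.

  from sklearn.datasets import load_iris
  from sklearn.model_selection import GridSearchCV
  from sklearn.tree import DecisionTreeClassifier

  X, y = load_iris(return_X_y=True)
  grid = {"max_depth": [2, 4, 6], "min_samples_leaf": [1, 5, 10]}

  # Every hyperparameter combination is fit and scored with cross-validation.
  search = GridSearchCV(DecisionTreeClassifier(random_state=0), grid,
                        scoring="f1_macro", cv=5).fit(X, y)
  print(search.best_params_, round(search.best_score_, 3))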

3. During model validation, the validation dataset must be combined with test data in order to function properly.

  • True
  • False (CORRECT)

Correct: During model validation, the validation dataset must be kept separate from the test data. It is a sample of data that is held back during model training and used to compare candidate models before the final test.
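
One hedged way to carve out a separate validation set is two successive calls to scikit-learn's train_test_split; the 60/20/20 proportions here are an illustrative choice.

  from sklearn.datasets import load_iris
  from sklearn.model_selection import train_test_split

  X, y = load_iris(return_X_y=True)

  # First hold out the final test set...
  X_tr, X_test, y_tr, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
  # ...then split what remains into training and validation sets.
  X_train, X_val, y_train, y_val = train_test_split(X_tr, y_tr, test_size=0.25, random_state=0)
  # Candidate models are compared on (X_val, y_val);
  # (X_test, y_test) stays unseen until the very end.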

4. Fill in the blank: Cross validation involves splitting training data into different combinations of _____, on which the model is trained.

  • parcels
  • banks
  • tiers
  • folds (CORRECT)

Correct: Cross validation involves splitting training data into different combinations of folds, on which the model is trained. The process uses these different portions of the data to test and train a model across several iterations.
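
A minimal sketch of 5-fold cross-validation with scikit-learn; each fold takes one turn as the evaluation set while the other four are used for training.

  from sklearn.datasets import load_iris
  from sklearn.model_selection import cross_val_score
  from sklearn.tree import DecisionTreeClassifier

  X, y = load_iris(return_X_y=True)
  scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
  print(scores)         # one score per fold
  print(scores.mean())  # the usual summary of performance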

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: BAGGING

1. Ensemble learning is most effective when the outputs are aggregated from models that follow the exact same methodology all using the same dataset.

  • True
  • False (CORRECT)

Correct: Ensemble learning is most effective when the outputs are aggregated from models that follow different methodologies—for instance, a logistic regression, a Naive Bayes model, and a decision tree classifier. In this way, any errors will be uncorrelated.

2. What are some of the benefits of ensemble learning? Select all that apply.

  • It requires few base learners trained on the same dataset.
  • The predictions have less bias than other standalone models. (CORRECT)
  • It combines the results of many models to help make more reliable predictions. (CORRECT)
  • The predictions have lower variance than other standalone models. (CORRECT)

Correct: Ensemble learning combines the results of many models to help make more reliable predictions. Also, these predictions have less bias and lower variance than other standalone models. In order to work, ensemble learning requires numerous base learners, all trained on a random subset of the training data.
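
As an illustrative sketch, scikit-learn's VotingClassifier can aggregate exactly the kinds of dissimilar models mentioned above; the dataset and model settings are illustrative choices.

  from sklearn.datasets import load_iris
  from sklearn.ensemble import VotingClassifier
  from sklearn.linear_model import LogisticRegression
  from sklearn.naive_bayes import GaussianNB
  from sklearn.tree import DecisionTreeClassifier

  X, y = load_iris(return_X_y=True)

  # Majority vote across three models whose errors should be uncorrelated.
  ensemble = VotingClassifier([
      ("lr", LogisticRegression(max_iter=1000)),
      ("nb", GaussianNB()),
      ("dt", DecisionTreeClassifier(random_state=0)),
  ])
  print(ensemble.fit(X, y).score(X, y))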

3. In a random forest, what type of data is used to train the ensemble of decision-tree base learners?

  • Sampled
  • Unstructured
  • Bootstrapped (CORRECT)
  • Duplicated

Correct: In a random forest, bootstrapped data is used to train the ensemble of decision-tree base learners. Bootstrapping refers to sampling with replacement. So, a random forest model will grow each of its trees by taking a random subset of the available features in the training data, then splitting each node at the best feature available to that tree.
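
A minimal sketch in scikit-learn: bootstrap=True (the default) trains each tree on a sample drawn with replacement, and max_features limits the random feature subset each split may consider. The settings are illustrative.

  from sklearn.datasets import load_iris
  from sklearn.ensemble import RandomForestClassifier

  X, y = load_iris(return_X_y=True)

  forest = RandomForestClassifier(n_estimators=100, bootstrap=True,
                                  max_features="sqrt", random_state=0).fit(X, y)
  print(len(forest.estimators_))  # 100 decision-tree base learners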

4. Fill in the blank: When using a decision tree model, a data professional can use _____ to control the threshold below which nodes become leaves.

  • min_samples_split (CORRECT)
  • max_features
  • min_samples_leaf
  • max_depth

Correct: When using a decision tree model, a data professional can use min_samples_split to control the threshold below which nodes become leaves.
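
A short sketch of this hyperparameter in scikit-learn; the threshold of 40 is an illustrative value.

  from sklearn.datasets import load_iris
  from sklearn.tree import DecisionTreeClassifier

  X, y = load_iris(return_X_y=True)

  # A node holding fewer than 40 samples is not split further; it becomes a leaf.
  tree = DecisionTreeClassifier(min_samples_split=40, random_state=0).fit(X, y)
  print(tree.get_n_leaves())  # far coarser tree than with the default of 2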

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: BOOSTING

1. When using the hyperparameter min_child_weight, a tree will not split a node if it results in any child node with less weight than what is specified. What happens to the node instead?

  • It gets deleted.
  • It becomes a root.
  • It becomes a leaf. (CORRECT)
  • It duplicates itself to become another node.

Correct: When using the hyperparameter min_child_weight, a tree will not split a node if it results in any child node with less weight than what is specified. Instead, the node becomes a leaf.
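
A hedged sketch using the xgboost library's scikit-learn-style wrapper; the min_child_weight value and the dataset are illustrative choices.

  from sklearn.datasets import load_breast_cancer
  from xgboost import XGBClassifier

  X, y = load_breast_cancer(return_X_y=True)

  # A split is rejected if it would create a child whose total weight falls
  # below min_child_weight; that node is kept as a leaf instead.
  model = XGBClassifier(min_child_weight=10, n_estimators=50).fit(X, y)
  print(model.score(X, y))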

2. Fill in the blank: The supervised learning technique boosting builds an ensemble of weak learners _____, then aggregates their predictions.

  • repeatedly
  • in parallel
  • sequentially (CORRECT)
  • randomly

Correct: The supervised learning technique boosting builds an ensemble of weak learners sequentially, then aggregates their predictions.
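
A sketch of this sequential construction with scikit-learn's GradientBoostingClassifier, whose staged_predict exposes the ensemble's prediction after each added weak learner; the settings are illustrative.

  from sklearn.datasets import load_breast_cancer
  from sklearn.ensemble import GradientBoostingClassifier

  X, y = load_breast_cancer(return_X_y=True)
  gbm = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

  # Training accuracy typically improves as each weak learner is added.
  for i, pred in enumerate(gbm.staged_predict(X), start=1):
      if i % 10 == 0:
          print(i, (pred == y).mean())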

3. When using a gradient boosting machine (GBM) modeling technique, which term describes a model’s ability to predict new values that fall outside of the range of values in the training data?

  • Grid search
  • Learning rate
  • Extrapolation (CORRECT)
  • Cross validation

Correct: When using a gradient boosting machine (GBM) modeling technique, extrapolation describes a model’s ability to predict new values that fall outside of the range of values in the training data.
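
The short sketch below shows why extrapolation is a weakness of tree-based GBMs: trained on a linear trend over x in [0, 10], the model's predictions flatten for inputs outside that range. The synthetic data is illustrative.

  import numpy as np
  from sklearn.ensemble import GradientBoostingRegressor

  rng = np.random.default_rng(0)
  X = rng.uniform(0, 10, size=(200, 1))
  y = 3 * X.ravel()  # a simple linear trend

  gbm = GradientBoostingRegressor(random_state=0).fit(X, y)
  print(gbm.predict([[5.0], [20.0], [100.0]]))
  # ~15 at x=5, but x=20 and x=100 both predict near 30: the trees
  # cannot produce values beyond the range seen in training.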

QUIZ: MODULE 4 CHALLENGE

1. A junior data analyst uses tree-based learning for a sales and marketing project. Currently, they are interested in the section of the tree that represents where the first decision is made. What are they examining?

  • Branches
  • Leaves
  • Roots (CORRECT)
  • Splits

2. What are some disadvantages of decision trees? Select all that apply.

  • Preparing data to train a decision tree is a complex process involving significant preprocessing.
  • Decision trees require assumptions regarding the distribution of underlying data.
  • Decision trees can be particularly susceptible to overfitting. (CORRECT)
  • When new data is introduced, decision trees can be less effective at prediction. (CORRECT)

3. Which section of a decision tree is where the final prediction is made?

  • Decision node
  • Split
  • Leaf node (CORRECT)
  • Root node

4. In a decision tree ensemble model, which hyperparameter controls how many decision trees the model will build for its ensemble?

  • max_features
  • max_depth
  • n_trees
  • n_estimators (CORRECT)

5. What process uses different “folds” (portions) of the data to train and evaluate a model across several iterations?

  • Grid search
  • Model validation
  • Cross validation (CORRECT)
  • Proportional verification

6. Which of the following statements correctly describe ensemble learning? Select all that apply.

  • When building an ensemble using different types of models, each should be trained on completely different data.
  • Predictions using an ensemble of models can be accurate even when the individual models are barely more accurate than a random guess. (CORRECT)
  • Ensemble learning involves aggregating the outputs of multiple models to make a final prediction. (CORRECT)
  • If a base learner’s prediction is only slightly better than a random guess, it is called a “weak learner.” (CORRECT)

7. Fill in the blank: A random forest is an ensemble of decision-tree _____ that are trained on bootstrapped data.

  • statements
  • observations
  • base learners (CORRECT)
  • variables

8. What are some benefits of boosting? Select all that apply.

  • Boosting is the most interpretable model methodology.
  • Boosting is a powerful predictive methodology. (CORRECT)
  • Boosting can handle both numeric and categorical features. (CORRECT)
  • Boosting does not require the data to be scaled. (CORRECT)

9. Which of the following statements correctly describe gradient boosting? Select all that apply.

  • Gradient boosting machines cannot perform classification tasks.
  • Gradient boosting machines have many hyperparameters. (CORRECT)
  • Gradient boosting machines do not give coefficients or directionality for their individual features. (CORRECT)
  • Gradient boosting machines are often called black-box models because their predictions can be difficult to explain. (CORRECT)

10. A data professional uses tree-based learning for an operations project. Currently, they are interested in the nodes at which the trees split. What type of nodes do they examine?

  • Decision (CORRECT)
  • Branch
  • Leaf
  • Root

11. What are some benefits of decision trees? Select all that apply.

  • When working with decision trees, overfitting is unlikely.
  • When preparing data to train a decision tree, very little preprocessing is required. (CORRECT)
  • Decision trees enable data professionals to make predictions about future events based on currently available information. (CORRECT)
  • Decision trees require no assumptions regarding the distribution of underlying data. (CORRECT)

12. In a decision tree, what type(s) of nodes can decision nodes point to? Select all that apply.

  • Split
  • Root node
  • Leaf node (CORRECT)
  • Decision node (CORRECT)

13. In a decision tree model, which hyperparameter sets the threshold below which nodes become leaves?

  • Min child weight
  • Min samples tree
  • Min samples split (CORRECT)
  • Min samples leaf

14. When might you use a separate validation dataset? Select all that apply.

  • If you have very little data.
  • If you want to choose the specific samples used to validate the model. (CORRECT)
  • If you have a very large amount of data. (CORRECT)
  • If you want to compare different model scores to choose a champion before predicting on test holdout data. (CORRECT)

15. What tool is used to confirm that a model achieves its intended purpose by systematically checking combinations of hyperparameters to identify which set produces the best results, based on the selected metric?

  • GridSearchCV (CORRECT)
  • Model validation
  • Cross validation
  • Hyperparameter verification

16. Which of the following statements correctly describe ensemble learning? Select all that apply.

  • If a base learner’s prediction is equally effective as a random guess, it is a strong learner.
  • It’s possible to use the same methodology for each contributing model, as long as there are numerous base learners. (CORRECT)
  • Ensemble learning involves building multiple models. (CORRECT)
  • It’s possible to use very different methodologies for each contributing model. (CORRECT)

17. Which of the following statements correctly describe gradient boosting? Select all that apply.

  • Gradient boosting machines build models in parallel.
  • Gradient boosting machines tell you the coefficients for each feature.
  • Gradient boosting machines work well with missing data. (CORRECT)
  • Gradient boosting machines do not require the data to be scaled. (CORRECT)

18. Which of the following statements accurately describe decision trees? Select all that apply.

  • Decision trees are equally effective at predicting both existing and new data.
  • Decision trees work by sorting data. (CORRECT)
  • Decision trees require no assumptions regarding the distribution of underlying data. (CORRECT)
  • Decision trees are susceptible to overfitting. (CORRECT)

19. What is the only section of a decision tree that contains no predecessors?

  • Leaf node
  • Root node (CORRECT)
  • Decision node
  • Split

20. In a decision tree, nodes are where decisions are made, and they are connected by edges.

  • True (CORRECT)
  • False

Correct: In a decision tree, nodes are where decisions are made, and they are connected by edges. At each node, a single feature of the data is considered and decided on. Edges direct from one node to the next during this process. Eventually, all relevant features will have been resolved, resulting in the classification prediction.

21. Fill in the blank: Each base learner in a random forest model has different combinations of features available to it, which helps prevent correlated errors among _____ in the ensemble.

  • nodes
  • roots
  • learners (CORRECT)
  • splits

22. What are some benefits of boosting? Select all that apply.

  • The models used in boosting can be trained in parallel across many different servers.
  • Boosting reduces bias. (CORRECT)
  • Because no single tree weighs too heavily in the ensemble, boosting reduces the problem of high variance. (CORRECT)
  • Boosting can improve model accuracy. (CORRECT)

23. Which of the following statements correctly describe gradient boosting? Select all that apply.

  • Gradient boosting models can be trained in parallel.
  • Each base learner in the sequence is built to predict the residual errors of the model that preceded it. (CORRECT)
  • Gradient boosting machines can be difficult to interpret. (CORRECT)
  • Gradient boosting machines have difficulty with extrapolation. (CORRECT)

24. A data analytics team uses tree-based learning for a research and development project. Currently, they are interested in the parts of the decision tree that represent an item’s target value. What are they examining?

  • Roots
  • Branches
  • Leaves (CORRECT)
  • Splits

25. In a decision tree model, which hyperparameter specifies the number of attributes that each tree selects randomly from the training data to determine its splits?

  • Learning rate
  • Max features (CORRECT)
  • Number of estimators
  • Max depth

26. AdaBoost is a tree-based boosting methodology in which each consecutive base learner assigns greater weight to the observations that were correctly predicted by the preceding learner.

  • True
  • False (CORRECT)

Correct: AdaBoost is a tree-based boosting methodology in which each consecutive base learner assigns greater weight to the observations that were incorrectly predicted by the preceding learner. It builds its first tree on training data that gives equal weight to each observation. Then, it evaluates which observations were incorrectly predicted by the first tree, increasing the weights for those while decreasing the weights for correct observations. The process repeats until a tree makes a perfect prediction or the ensemble reaches the maximum number of trees.
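
A minimal sketch of this reweighting loop with scikit-learn's AdaBoostClassifier, whose default base learner is a depth-1 decision tree; the dataset and number of estimators are illustrative choices.

  from sklearn.datasets import load_breast_cancer
  from sklearn.ensemble import AdaBoostClassifier

  X, y = load_breast_cancer(return_X_y=True)

  # Each consecutive stump upweights the observations the previous stump
  # misclassified, so later learners focus on the hard cases.
  ada = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
  print(ada.score(X, y))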

27. Why might a GBM, or gradient-boosting machine, be inappropriate for use in the health care or financial fields?

  • Its predictions cannot be precisely explained. (CORRECT)
  • It doesn’t perform well with missing data.
  • It requires the data to be scaled.
  • It is inaccurate.

Correct: A GBM may be inappropriate for use in the health care or financial fields because its predictions cannot be precisely explained. These are often called black-box models.

CONCLUSION – Tree-Based Modeling

In conclusion, this section has provided an in-depth exploration of supervised learning within the realm of machine learning. Participants have gained valuable insights into testing and validating the performance of key models, including decision trees, random forests, and gradient boosting.

The acquired knowledge empowers learners to navigate and apply these supervised learning techniques effectively, fostering their ability to address complex challenges and make data-driven decisions. This comprehensive overview sets the stage for participants to confidently integrate supervised learning methodologies into their repertoire, contributing to their proficiency in the dynamic field of machine learning.