COURSE 6: THE NUTS AND BOLTS OF MACHINE LEARNING

Module 1: The Different Types of Machine Learning

GOOGLE ADVANCED DATA ANALYTICS PROFESSIONAL CERTIFICATE

Complete Coursera Study Guide

INTRODUCTION – The Different Types of Machine Learning

This module introduces the fundamental concepts of machine learning and its role in data science. It begins with the basic principles behind machine learning, which provide a foundation for the applications covered later in the course.

The module then covers four primary types of machine learning: supervised, unsupervised, reinforcement, and deep learning. For each type, it pairs the theoretical basis with examples of its applications and implications.

Learning Objectives

  • Recognize the most common online resources in the data science field for ML
  • Explore Python packages available for ML development, what they’re used for, and their important differences
  • Identify questions to ask during each stage of PACE to prevent and recognize unfair or unethical models
  • Understand two approaches to recommendation systems
  • Review examples of continuous and categorical variables, as well as each ML type
  • Recognize common IDEs, resources, and libraries
  • Identify the major characteristics of the three types of ML

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: INTRODUCTION TO MACHINE LEARNING

1. Fill in the blank: Machine learning involves using algorithms and _____ to teach computer systems to analyze and discover patterns in data.

  • dynamic reports
  • statistical models (CORRECT)
  • decision-support systems
  • computer software

Correct: Machine learning involves using algorithms and statistical models to teach computer systems to analyze and discover patterns in data.

2. A data professional using an unsupervised machine learning technique will ask a model to provide information based on a specified outcome.

  • True
  • False (CORRECT)

Correct: A data professional using an unsupervised machine learning technique will ask a model to provide information without telling the model what the outcome should be.
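
The contrast between supervised and unsupervised learning can be sketched in a few lines. The example below is an illustration only, not part of the course material: the data values, the function names `predict_1nn` and `two_means`, and the choice of a nearest-neighbor rule and a two-group clustering loop are all assumptions made for this sketch.

```python
# Supervised: every training example carries a label (the specified outcome).
labeled = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (8.5, "large")]


def predict_1nn(x, examples):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    nearest = min(examples, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


# Unsupervised: the same measurements, but no labels are given.
unlabeled = [1.0, 1.2, 8.0, 8.5]


def two_means(values, iterations=10):
    """Split values into two groups by repeatedly re-centering (a tiny 2-means)."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


supervised_prediction = predict_1nn(7.0, labeled)  # uses the known labels
clusters = two_means(unlabeled)                    # discovers groups on its own
print(supervised_prediction, clusters)
```

In the supervised half, the model is told what the outcome looks like through the labels. In the unsupervised half, it only receives the values and is left to find structure in them.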

3. Which approach to machine learning involves rewarding or punishing a computer’s behaviors?

  • Reinforcement learning (CORRECT)
  • Supervised machine learning
  • Deep learning
  • Artificial intelligence

Correct: Reinforcement learning involves rewarding or punishing a computer’s behaviors. Based on which one it receives, the computer will update its policy, trying to optimize for rewards.
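
As a rough sketch of the reward-and-punishment idea, the example below keeps a value estimate per action and nudges it toward each reward it receives. The action names, reward values, learning rate, and the `train` function are hypothetical, and the round-robin exploration is a simplification chosen to keep the example deterministic; real reinforcement learning setups are considerably more involved.

```python
def train(rewards, episodes=20, learning_rate=0.5):
    """Estimate the value of each action from the rewards it produces."""
    values = {action: 0.0 for action in rewards}
    actions = list(rewards)
    for episode in range(episodes):
        # Try each action in turn (simple round-robin exploration).
        action = actions[episode % len(actions)]
        reward = rewards[action]  # positive = reward, negative = punishment
        # Move the estimate part of the way toward the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values


# Hypothetical environment: "right" is rewarded, "left" is punished.
rewards = {"left": -1.0, "right": 1.0}
values = train(rewards)

# The policy simply prefers the action with the highest estimated value.
policy = max(values, key=values.get)
print(policy, values)
```

The policy ends up favoring the rewarded action because its estimate rises with each reward, while the punished action's estimate falls.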

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: CATEGORICAL VERSUS CONTINUOUS DATA TYPES AND MODELS

1. The weight of a surfboard is a continuous variable, whereas the number of surfboards currently at Bondi Beach is a discrete variable.

  • True (CORRECT)
  • False

Correct: The weight of a surfboard is a continuous variable because a weight, expressed in a unit such as kilograms, can take any of infinitely many (uncountable) values. The number of surfboards currently at Bondi Beach is a discrete variable because the surfboards can be counted.

2. A data professional is working on a project that involves labeling thousands of books by their various book genres. What type of variable should they use when working with this dataset?

  • Quantitative
  • Categorical (CORRECT)
  • Continuous
  • Discrete

Correct: They should use a categorical variable. Categorical variables contain a finite number of groups or categories.
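
The variable types from this quiz can be seen side by side in a small record. The book records, field names, and the `one_hot` helper below are made up for illustration. One-hot encoding is shown only as one common way to turn a categorical variable into numbers; it is an assumption of this sketch, not something the quiz prescribes.

```python
# Hypothetical book records mixing the three kinds of variables.
books = [
    {"genre": "mystery", "copies_in_stock": 3, "weight_kg": 0.42},
    {"genre": "fantasy", "copies_in_stock": 7, "weight_kg": 0.615},
    {"genre": "mystery", "copies_in_stock": 1, "weight_kg": 0.38},
]
# genre            -> categorical (a finite set of groups)
# copies_in_stock  -> discrete (countable whole numbers)
# weight_kg        -> continuous (any value in a range)

# The finite set of categories present in the data.
genres = sorted({book["genre"] for book in books})


def one_hot(genre):
    """Encode a genre as a list with a 1 in the position of its category."""
    return [1 if genre == g else 0 for g in genres]


print(genres)
print(one_hot("mystery"))
```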

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: MACHINE LEARNING IN EVERYDAY LIFE

1. What term describes the subclass of machine learning algorithms that offers relevant suggestions to users?

  • Sensor techniques
  • Suggestion maps
  • Recommendation systems (CORRECT)
  • Data models

Correct: Recommendation systems are a subclass of machine learning algorithms that offer relevant suggestions to users.

2. Content-based systems are very effective at making recommendations across content types.

  • True
  • False (CORRECT)

Correct: Content-based systems are ineffective at making recommendations across content types. This is because different content types rarely share the same features, so user preferences about one content type often cannot be applied to another.
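
A minimal sketch of the attribute comparison behind content-based filtering follows. The catalog, the attribute tags, and the choice of Jaccard similarity (shared attributes divided by all attributes) are assumptions for illustration; production systems use richer features and scoring.

```python
# Hypothetical catalog: each item is described by a set of attributes.
catalog = {
    "Movie A": {"sci-fi", "space", "adventure"},
    "Movie B": {"sci-fi", "space", "drama"},
    "Movie C": {"romance", "comedy"},
}


def jaccard(a, b):
    """Share of attributes two items have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)


def recommend_similar(liked_title, items):
    """Return the other item whose attributes best match the liked item."""
    liked = items[liked_title]
    scores = {
        title: jaccard(liked, attrs)
        for title, attrs in items.items()
        if title != liked_title
    }
    return max(scores, key=scores.get)


best = recommend_similar("Movie A", catalog)
print(best)
```

The sketch also hints at the limitation described above: a song or a book would be tagged with different attributes, so a movie's attribute set gives the comparison little to work with across content types.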

3. Fill in the blank: When several users actively like or dislike content by rating it or giving it a review, this enables _____ filtering.

  • collaborative (CORRECT)
  • preferential
  • crowdsourced
  • merit-based

Correct: When several users actively like or dislike content by rating it or giving it a review, this enables collaborative filtering. A recommendation system using collaborative filtering makes comparisons based on who else likes a piece of content. Then, it will suggest that same content to others with similar preferences. 
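
The idea of suggesting content liked by users with similar preferences can be sketched as follows. The user names, ratings, and the similarity formula (based on the average rating gap over items both users rated) are hypothetical choices for this example, not the method used by any particular system.

```python
# Hypothetical ratings on a 1-5 scale. Most user/item pairs are missing.
ratings = {
    "ana": {"Book 1": 5, "Book 2": 4, "Song 1": 5},
    "ben": {"Book 1": 5, "Book 2": 4},
    "cara": {"Book 1": 1, "Book 2": 2, "Song 2": 5},
}


def similarity(a, b):
    """Score two users by how closely they rated the items they share."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[item] - b[item]) for item in shared) / len(shared)
    return 1.0 / (1.0 + mean_gap)


def recommend(user, all_ratings):
    """Suggest the closest user's best-rated item that `user` has not rated."""
    others = [name for name in all_ratings if name != user]
    closest = max(others, key=lambda n: similarity(all_ratings[user], all_ratings[n]))
    unseen = {
        item: score
        for item, score in all_ratings[closest].items()
        if item not in all_ratings[user]
    }
    return max(unseen, key=unseen.get) if unseen else None


suggestion = recommend("ben", ratings)
print(suggestion)
```

Here a user who has only rated books is offered a song, because the comparison is between users rather than between item attributes.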

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: ETHICS IN MACHINE LEARNING

1. In recommendation systems, what term describes the phenomenon of more well-known items being recommended too frequently?

  • Non-objectivity
  • Popularity bias (CORRECT)
  • Trend partiality
  • Fame factor

Correct: Popularity bias describes the phenomenon of more well-known items being recommended too frequently. This leaves other items, which might be just as pleasing to users, not getting the attention they deserve.
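
One simple way to look for popularity bias is to measure how large a share of all recommendations goes to each item. The recommendation log below is invented for illustration, and what share counts as "too frequent" is a judgment call that this sketch does not settle.

```python
from collections import Counter

# Hypothetical log of items a system recommended, one entry per recommendation.
recommendation_log = [
    "Hit Song", "Hit Song", "Indie Song",
    "Hit Song", "Hit Song", "Folk Song",
]

counts = Counter(recommendation_log)
total = len(recommendation_log)

# Fraction of all recommendations that each item received.
shares = {item: n / total for item, n in counts.items()}

most_recommended, top_count = counts.most_common(1)[0]
print(most_recommended, round(shares[most_recommended], 2))
```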

2. A data professional has just begun considering the intended purpose of a model and how harmful or significant its effects could be. Which PACE stage of model development does this scenario describe?

  • Construct
  • Analyze
  • Execute
  • Plan (CORRECT)

Correct: This scenario describes the plan stage of model development. Other planning questions include how the predictions will be used, who is affected by the model, and several key issues surrounding the use of personal information.

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: UTILIZE THE PYTHON TOOLBELT FOR MACHINE LEARNING

1. What is the term for a software application that includes an interface for writing, running, and testing a piece of code?

  • HTML
  • CSV
  • VIF
  • IDE (CORRECT)

2. Code completion automatically finishes what a data professional is typing based on the functions and variables that are present in their code.  

  • True (CORRECT)
  • False

Correct: Code completion automatically finishes what a data professional is typing based on the functions and variables that are present in their code.

3. What types of packages are used to load, structure, and prepare a dataset for further analysis?

  • Visualization
  • Operational (CORRECT)
  • Machine learning
  • Processing

Correct: Operational packages are used to load, structure, and prepare a dataset for further analysis.

PRACTICE QUIZ: TEST YOUR KNOWLEDGE: MACHINE LEARNING RESOURCES FOR DATA PROFESSIONALS

1. Fill in the blank: Documentation is a _____ written by developers that includes specific information about various functions and features of a package.

  • workbook
  • code
  • guide (CORRECT)
  • checklist

Correct: Documentation is a guide written by developers that includes specific information about various functions and features of a package.

2. If a data professional requires guidance regarding a particular piece of hardware, which team should they reach out to?

  • Information technology (CORRECT)
  • Product management
  • Marketing
  • Business intelligence

Correct: If a data professional requires guidance regarding a particular piece of hardware, they should reach out to the information technology department.

QUIZ: MODULE 1 CHALLENGE

1. Which of the following statements correctly describe supervised and unsupervised machine learning? Select all that apply. 

  • Unsupervised machine learning uses labeled datasets to train algorithms to classify or predict outcomes.  
  • Supervised machine learning uses labeled datasets to train algorithms to classify or predict outcomes. (CORRECT)  
  • In unsupervised machine learning, data professionals ask the model to give them information without telling the model what the answer should be. (CORRECT)
  • Unsupervised machine learning involves data professionals asking a model to give them information without specifying a desired outcome. (CORRECT)

2. Fill in the blank: The terms machine learning and _____ both refer to training a computer to detect patterns in data without being explicitly programmed to do so. 

  • coding
  • artificial intelligence (CORRECT)
  • reinforcement learning
  • quality assurance 

3. An analytics team at a college works on a task involving categorical variables. Which of the following variables might be part of the project dataset? Select all that apply.

  • Number of books in a classroom
  • Languages spoken at the college (CORRECT)
  • Student nationalities (CORRECT)
  • Teacher subject area expertise (CORRECT)

4. Which of the following statements accurately describe content-based filtering? Select all that apply.

  • Content-based filtering effectively makes recommendations across content types.
  • Content-based filtering does not require information from other users to work properly. (CORRECT)
  • Content-based filtering properties often have to be selected and mapped manually. (CORRECT)
  • Content-based filtering recommends more of what a user likes. (CORRECT)

5. Fill in the blank: A key benefit of collaborative filtering is that it finds hidden _____ in the data.

  • duplicates 
  • correlations (CORRECT)
  • contradictions
  • errors

6. A data professional is considering whether the data they are using to build a model is well-sourced. Which PACE stage does this scenario describe?

  • Plan
  • Analyze (CORRECT)
  • Construct
  • Execute

7. Which of the following statements accurately describe Python notebooks and scripts? Select all that apply.

  • Python scripts are useful for pairing code with human-readable descriptions and outputs.
  • Python notebooks are executed by a computer without the need for human supervision.
  • Data professionals often alternate between Python notebooks and scripts. (CORRECT)
  • Data professionals can use both Python notebooks and scripts to execute code. (CORRECT)

8. Fill in the blank: The data visualization package _____ is designed primarily for statistical visualization.

  • Tableau
  • Plotly
  • Matplotlib
  • Seaborn (CORRECT)

9. Fill in the blank: In a typical business, a data professional is most likely to request assistance from the _____ department to obtain preliminary information about a dataset.

  • sales
  • information technology
  • business intelligence (CORRECT)
  • marketing 

10. A data analytics team at a household goods manufacturer works on a task involving discrete variables. Which of the following variables might be part of the project dataset? Select all that apply.

  • Type of most popular toaster
  • Total days a sale lasts in March (CORRECT)
  • Number of appliances for sale at a retail store (CORRECT)
  • Number of people in a household (CORRECT)

11. Which of the following statements accurately describe content-based filtering? Select all that apply.

  • Content-based filtering properties never have to be selected and mapped manually.
  • Content-based filtering does not require information from other users to work properly. (CORRECT)
  • Content-based filtering is ineffective at making recommendations across content types. (CORRECT)
  • Content-based filtering can go beyond comparing items to recommending other things that match a user’s preferences. (CORRECT)

12. A data professional is considering whether the data they are using to build a model is appropriate. Which PACE stage does this scenario describe? 

  • Construct
  • Execute
  • Analyze (CORRECT)
  • Plan

13. What are some advantages of Python notebooks? Select all that apply.

  • They automatically choose the best machine learning model to use for a data project. 
  • They are useful for pairing code with human-readable descriptions and outputs. (CORRECT)
  • Noncode elements can be embedded directly into the file. (CORRECT)
  • They offer functional advantages, such as the ability to export PDF files. (CORRECT)

14. Fill in the blank: A data professional may request assistance from the _____ department to find out what hardware and software are available for a data project.

  • sales
  • business intelligence
  • marketing
  • information technology (CORRECT)

15. Fill in the blank: In the process of _____, policies will change depending on whether a reward or punishment is received. 

  • quality assurance
  • artificial intelligence 
  • deep learning
  • reinforcement learning (CORRECT)

16. A data professional at a construction company works on a task involving continuous variables. Which of the following variables might be part of the project dataset? Select all that apply.

  • The number of pallets on a truck
  • The age of a building (CORRECT)
  • The height of a skyscraper (CORRECT)
  • The weight of a concrete block (CORRECT)

17. Fill in the blank: One benefit of collaborative filtering is that it can effectively _____ across content types.

  • make recommendations (CORRECT)
  • produce metadata
  • eliminate outliers
  • visualize data

18. Which of the following applications would be well-suited to the use of Python scripts? Select all that apply.

  • A task that pairs code with human-readable descriptions
  • A task that requires a human-readable output (CORRECT)
  • A program that incorporates several files (CORRECT)
  • A program that contains errors in need of debugging (CORRECT)

19. Fill in the blank: The data visualization package _____ is effective when creating presentations, such as designing a data visualization for an interactive dashboard.

  • Matplotlib
  • HTML
  • Tableau
  • Plotly (CORRECT)

20. Fill in the blank: A data professional working on an email campaign may request assistance from the _____ department to understand the purpose of their data work and confirm they are working toward a clear target.

  • business intelligence
  • information technology
  • finance
  • marketing (CORRECT)

21. Which of the following statements correctly describe supervised and unsupervised machine learning? Select all that apply. 

  • Supervised machine learning uses algorithms to analyze and cluster unlabeled datasets. 
  • In unsupervised machine learning, data professionals ask the model to give them information without telling the model what the answer should be. (CORRECT)
  • Supervised machine learning uses labeled datasets to train algorithms to classify or predict outcomes. (CORRECT)  
  • Data professionals use supervised machine learning for prediction. (CORRECT) 

22. Fill in the blank: Matplotlib is a type of _____, which enables data professionals to create plots and graphs for data projects.

  • data visualization package (CORRECT)
  • machine learning package
  • operational package
  • mathematical package

23. Fill in the blank: The process of _____ involves models made of layers of interconnected nodes. Each layer receives signals from its preceding layer, and nodes that are activated pass transformed signals to another layer or a final output. 

  • deep learning (CORRECT)
  • reinforcement learning
  • artificial intelligence
  • quality assurance
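
The layered structure described in this question can be sketched as a tiny forward pass. The weights, biases, layer sizes, and the `dense_layer` helper below are arbitrary values and names chosen for illustration, and ReLU is assumed as the activation; a real deep learning model learns its weights from data and has far more nodes.

```python
def relu(x):
    """A node is 'activated' only when its input is positive."""
    return max(0.0, x)


def dense_layer(inputs, weights, biases, activation):
    """Each node sums its weighted inputs, adds a bias, then applies the activation."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs


inputs = [1.0, 2.0]

# Hidden layer with two nodes; it receives signals from the input layer.
hidden = dense_layer(
    inputs,
    weights=[[1.0, -1.0], [0.5, 0.5]],
    biases=[0.0, 0.0],
    activation=relu,
)

# Output layer with one node; it receives the transformed signals from the hidden layer.
output = dense_layer(hidden, weights=[[1.0, 2.0]], biases=[0.5], activation=lambda x: x)

print(hidden, output)
```

The first hidden node receives a negative total and stays inactive (it outputs 0.0), so only the second node's signal reaches the output layer.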

24. Fill in the blank: One drawback of collaborative filtering is that the data has a lot of _____ values.

  • missing (CORRECT)
  • inaccurate
  • conflicting
  • redundant

25. Fill in the blank: Supervised machine learning uses labeled datasets to train _____ to classify or predict outcomes.

  • clusters
  • algorithms (CORRECT)
  • dashboards
  • networks

Correct: Supervised machine learning uses labeled datasets to train algorithms to classify or predict outcomes. Data professionals use supervised machine learning for prediction.

26. What type of variables would a data professional use to classify types of homes, such as apartment, single-family, or townhouse? 

  • Categorical (CORRECT)
  • Numeric
  • Continuous
  • Discrete

Correct: A data professional would use categorical variables to classify types of homes. Categorical variables contain a finite number of groups or categories. 

27. Fill in the blank: Content-based filtering is a recommendation system in which the recommendations are made based on _____ of the attributes of the content.

  • improvements
  • comparisons (CORRECT)
  • differentiations
  • segmentations

Correct: Content-based filtering is a recommendation system in which recommendations are made based on comparisons of the attributes of the content. Such a system takes the list of attributes describing a piece of content and compares it with related content to find items with the same attributes.

28. Fill in the blank: An integrated _____ environment, or IDE, is a software application that has an interface for writing, running, and testing a piece of code.

  • design
  • development (CORRECT)
  • dynamic
  • data

Correct: An integrated development environment, or IDE, is a software application that has an interface for writing, running, and testing a piece of code.

CONCLUSION – The Different Types of Machine Learning

This module covered the essential concepts of machine learning and their applications in data science. Participants examined supervised, unsupervised, reinforcement, and deep learning, and saw how each approach helps solve complex problems and extract meaningful patterns from data.

With this foundation, participants are prepared to begin applying machine learning techniques to real-world scenarios and to build on these concepts throughout the rest of the course.