COURSE 5: PREPARE FOR DP-100: DATA SCIENCE ON MICROSOFT AZURE EXAM
Module 2: Exam Preparation Course 1
MICROSOFT AZURE DATA SCIENTIST ASSOCIATE (DP-100) PROFESSIONAL CERTIFICATE
Complete Coursera Study Guide
INTRODUCTION – EXAM PREPARATION COURSE 1
In this module, you will review the content from Course 1 of the Microsoft Azure Data Scientist Associate specialization. This course covers foundational concepts and practical skills necessary for data science using Azure. You will revisit key topics such as data exploration, data preparation, and the basics of machine learning.
Additionally, you will refresh your understanding of how to use Azure Machine Learning service to build, train, and deploy machine learning models. This review will reinforce your knowledge and prepare you for more advanced topics in subsequent courses of the specialization.
Learning Objectives
- Outline the key points covered in the Microsoft Azure Data Scientist Associate specialization
- Recap the main topics in Course 1: Create machine learning models
- Assess knowledge and skills in creating machine learning models
Quiz: Create machine learning models
1. Your manager has asked you to create a binary classification model to predict whether a person has a disease. You need to detect possible classification errors.
Which error type should you choose for the following description?
“A person has a disease. The model classifies the case as having a disease”.
- False positives
- False negatives
- True negatives
- True positives (CORRECT)
Correct: A true positive is an outcome where the model correctly predicts the positive class.
2. Your manager has asked you to create a binary classification model to predict whether a person has a disease. You need to detect possible classification errors.
Which error type should you choose for the following description?
“A person does not have a disease. The model classifies the case as having a disease”.
- False negatives
- True positives
- False positives (CORRECT)
- True negatives
Correct: A false positive is an outcome where the model incorrectly predicts the positive class.
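The four outcomes used in these disease questions can be counted directly from actual and predicted labels. The following is a minimal sketch, assuming labels are encoded as 1 (has the disease) and 0 (does not have the disease); the function name and the sample data are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for binary labels where 1 = has the disease."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1  # has disease, predicted disease
        elif a == 0 and p == 1:
            counts["FP"] += 1  # no disease, predicted disease
        elif a == 0 and p == 0:
            counts["TN"] += 1  # no disease, predicted no disease
        else:
            counts["FN"] += 1  # has disease, predicted no disease
    return counts


# Hypothetical labels for six patients (illustrative data only).
actual = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0]

counts = confusion_counts(actual, predicted)
print(counts)
```

"True" or "false" says whether the prediction matched reality; "positive" or "negative" says which class the model predicted.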
3. You are a senior data scientist in the company and you are tasked with evaluating a completed binary classification machine learning model.
You need to use the precision as the evaluation metric. Which visualization should you use?
- Scatter plot
- Gradient descent
- Receiver Operating Characteristic (ROC) curve (CORRECT)
- Violin plot
Correct: A receiver operating characteristic (ROC) curve plots the true positive rate against the false positive rate at each classification threshold for a particular model.
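To make the ROC curve concrete, the points on it can be computed by sweeping a decision threshold over the model's predicted scores. This is a rough sketch in plain Python rather than a library implementation; the function name, the scores, and the thresholds are all illustrative assumptions.

```python
def roc_points(actual, scores, thresholds):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = []
    for t in thresholds:
        # Predict the positive class when the score reaches the threshold.
        predicted = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
        fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical true labels and predicted probabilities.
actual = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(actual, scores, thresholds=[0.9, 0.5, 0.3, 0.0])
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold moves along the curve from (0, 0), where nothing is predicted positive, to (1, 1), where everything is.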
4. You are a data scientist of a company and you are tasked with building a deep convolutional neural network (CNN) for image classification. The CNN model you built shows signs of overfitting. You need to reduce overfitting and converge the model to an optimal fit.
Which two actions should you perform?
- Add an additional dense layer with 64 input units
- Reduce the amount of training data
- Add an additional dense layer with 512 input units
- Add L1/L2 regularization (CORRECT)
- Use training data augmentation (CORRECT)
Correct: Weight regularization reduces the overfitting of a deep learning neural network model on the training data and improves the performance of the model on new data, such as the holdout test set. Combined L1/L2 regularization penalizes the sum of the absolute weights (L1) plus the sum of the squared weights (L2).
Correct: Data augmentation adds more varied training records, which should decrease the overfitting.
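The L1/L2 penalty mentioned above is a term added to the training loss. The sketch below shows only the arithmetic, assuming hypothetical coefficient names `l1` and `l2` and a made-up weight vector; deep learning frameworks apply the same idea per layer.

```python
def l1_l2_penalty(weights, l1=0.01, l2=0.01):
    """Combined penalty: l1 * sum(|w|) + l2 * sum(w^2)."""
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return l1_term + l2_term


# Hypothetical layer weights and data loss (illustrative values only).
weights = [0.5, -1.0, 2.0]
data_loss = 0.30

penalty = l1_l2_penalty(weights, l1=0.1, l2=0.1)
regularized_loss = data_loss + penalty
print(round(penalty, 3), round(regularized_loss, 3))
```

Because large weights increase the penalty, the optimizer is pushed toward smaller weights, which tends to produce a simpler model that generalizes better.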
5. Your manager has provided you with a dataset created for multiclass classification tasks that contains a normalized numerical feature set with 10,000 data points and 150 features. You use 75 percent of the data points for training and 25 percent for testing.
| Name | Description |
|---|---|
| X_train | Training feature set |
| Y_train | Training class labels |
| x_test | Testing feature set |
| y_test | Testing class labels |
You need to apply the Principal Component Analysis (PCA) method to reduce the dimensionality of the feature set to 10 features in both training and testing sets.
You are using the scikit-learn machine learning library in Python.
You use X to denote the feature set and Y to denote class labels.
You create the following Python data frames:
from sklearn.decomposition import PCA
pca = […]
x_train = […].fit_transform(X_train)
x_test = pca.[…]
How should you complete the code segment?
- Box1: PCA(n_components=10000); Box2: pca; Box3: X_train
- Box1: PCA(n_components=10); Box2: model; Box3: transform(x_test)
- Box1: PCA(n_components=150); Box2: pca; Box3: x_test
- Box1: PCA(n_components=10); Box2: pca; Box3: transform(x_test) (CORRECT)
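Correct: PCA(n_components=10) reduces the feature set to 10 components. The same pca object is used for both sets: fit_transform learns the components from the training set and applies them, and transform applies those same components to the testing set.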
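The PCA workflow in this question can be tried end to end on synthetic data. This is a sketch under stated assumptions: the random data stands in for the real dataset, and the names `X_train_reduced` and `X_test_reduced` are hypothetical; only `PCA`, `fit_transform`, `transform`, and `train_test_split` are scikit-learn calls.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: 10,000 data points, 150 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 150))
Y = rng.integers(0, 3, size=10_000)

# 75 percent for training, 25 percent for testing.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

pca = PCA(n_components=10)
X_train_reduced = pca.fit_transform(X_train)  # learn components on training data
X_test_reduced = pca.transform(X_test)        # reuse the same components on test data

print(X_train_reduced.shape, X_test_reduced.shape)
```

Calling `fit_transform` on the testing set instead would learn a different projection from the test data, so the two reduced feature sets would no longer be comparable.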
6. You are creating a model to predict the price of a student’s artwork depending on the following variables: the student’s length of education, degree type, and art form.
You start by creating a linear regression model. You need to evaluate the linear regression model.
Solution: Use the following metrics: Mean Absolute Error, Root Mean Absolute Error, Relative Absolute Error, Accuracy, Precision, Recall, F1 score, and AUC.
Does the solution meet the goal?
- Yes
- No (CORRECT)
Correct: Accuracy, Precision, Recall, F1 score, and AUC are metrics for evaluating classification models. Only Mean Absolute Error, Root Mean Absolute Error, and Relative Absolute Error are appropriate for the linear regression model.
7. What happens when a NumPy array is multiplied by 5?
- Array stays the same size, but each element is multiplied by 5. (CORRECT)
- The new array will be 5 times longer, with the sequence repeated 5 times.
- The new array will be 5 times longer, with the sequence repeated 5 times and also all the elements are multiplied by 5.
Correct: This is how a NumPy array behaves when multiplied. The multiplication is applied to each element, so the array keeps its size.
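Element-wise multiplication of a NumPy array is easy to check directly. A minimal sketch with a made-up three-element array:

```python
import numpy as np

values = np.array([1, 2, 3])
scaled = values * 5  # multiplies each element; the shape is unchanged

print(scaled)
print(values.shape == scaled.shape)
```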
8. You are creating a model and you want to evaluate it. One metric yields an absolute metric in the same unit as the label.
Which metric is described?
- Root Mean Square Error (RMSE) (CORRECT)
- Coefficient of Determination (known as R-squared or R2)
- Mean Square Error (MSE)
Correct: This is the described metric. This means that the smaller the value, the better the model.
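The difference between MSE and RMSE comes down to units: MSE is in squared label units, while taking the square root brings RMSE back to the unit of the label. A small sketch in plain Python, with hypothetical helper names and made-up price values:

```python
import math


def mse(actual, predicted):
    """Mean of the squared differences (squared label units)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of MSE (same unit as the label)."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical actual and predicted prices.
actual = [10.0, 12.0, 15.0, 20.0]
predicted = [12.0, 10.0, 15.0, 24.0]

mse_value = mse(actual, predicted)
rmse_value = rmse(actual, predicted)
print(mse_value, round(rmse_value, 3))
```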
9. Complete the sentence:
The Support Vector Machine algorithm is an example of machine learning __________ type model.
- Regression
- Classification (CORRECT)
- Clustering
Correct: Support Vector Machine is a well-established algorithm for classification.
10. It is well known that Python provides extensive functionality with powerful statistical and numerical libraries. What is TensorFlow useful for?
- Offering simple and effective predictive data analysis
- Supplying machine learning and deep learning capabilities (CORRECT)
- Providing attractive data visualizations
- Analyzing and manipulating data
Correct: TensorFlow supplies machine learning and deep learning capabilities. Simple and effective predictive data analysis describes Scikit-learn.
11. You are creating a binary classification by using a two-class logistic regression model. You need to evaluate the model results for imbalance. Which evaluation metric should you use?
- Relative Squared Error
- Mean Absolute Error
- Relative Absolute Error
- AUC Curve (CORRECT)
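Correct: AUC, the area under the ROC curve, summarizes how well the model separates the two classes across all classification thresholds, so it stays informative when the classes are imbalanced.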
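One way to see why AUC helps with imbalanced data is to compare it with accuracy on a skewed sample. The sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, which is equivalent to the area under the ROC curve. The function names and the data are illustrative assumptions.

```python
def auc(actual, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(actual, scores, threshold=0.5):
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(1 for a, p in zip(actual, predicted) if a == p) / len(actual)


# Hypothetical imbalanced sample: 8 negatives, 2 positives.
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.05, 0.4, 0.35, 0.45, 0.28]

acc_value = accuracy(actual, scores)
auc_value = auc(actual, scores)
print(acc_value, auc_value)
```

At the 0.5 threshold this model never predicts the positive class, yet accuracy is still 0.8 because most cases are negative. AUC instead measures how well the scores rank positives above negatives, independent of any single threshold.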
12. What happens when a list is multiplied by 5?
- The new list remains the same size, but the elements are multiplied by 5.
- The new list created has the length 5 times the original length with the sequence repeated 5 times and also all the elements are also multiplied by 5.
- The new list created has the length 5 times the original length with the sequence repeated 5 times. (CORRECT)
Correct: This is how a list behaves when multiplied.
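The list behavior can be confirmed in a few lines. A minimal sketch with a made-up list; the list comprehension at the end shows one way to get element-wise multiplication without NumPy.

```python
values = [1, 2, 3]

repeated = values * 5               # repeats the sequence 5 times
scaled = [v * 5 for v in values]    # multiplies each element instead

print(len(repeated), repeated[:6])
print(scaled)
```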
13. You are tasked to analyze a dataset containing historical data from a local taxi company. You are developing a regression model for this. Your goal is to predict the fare of a taxi trip. You need to select performance metrics to correctly evaluate the regression model.
Which two metrics can you use?
- An F1 score that is low
- An R-Squared value close to 0
- An R-Squared value close to 1 (CORRECT)
- A Root Mean Square Error value that is low (CORRECT)
Correct: RMSE and R2 are both metrics for regression models. Coefficient of determination, often referred to as R2, represents the predictive power of the model as a value between 0 and 1. Zero means the model is random (explains nothing); 1 means there is a perfect fit.
Correct: RMSE and R2 are both metrics for regression models. Root mean squared error (RMSE) creates a single value that summarizes the error in the model.
14. You are creating a model and you want to evaluate it. For this, you look at a specific metric that is directly proportional to how well the model fits.
Which evaluation metric is described?
- Mean Square Error (MSE)
- Coefficient of Determination (known as R-squared or R2) (CORRECT)
- Root Mean Square Error (RMSE)
Correct: This is the evaluation metric described. In essence, this metric represents how much of the variance between predicted and actual label values the model is able to explain.
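The coefficient of determination can be computed from two sums of squares: the residual error of the model and the total variance of the labels around their mean. The following sketch uses a hypothetical helper and made-up values to show a near-perfect fit next to a model that always predicts the mean.

```python
def r_squared(actual, predicted):
    """R2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical label values and two sets of predictions.
actual = [2.0, 4.0, 6.0, 8.0]
close_fit = [2.5, 3.5, 6.5, 7.5]   # small errors on every point
mean_only = [5.0, 5.0, 5.0, 5.0]   # always predicts the mean of the labels

r2_close = r_squared(actual, close_fit)
r2_mean = r_squared(actual, mean_only)
print(round(r2_close, 2), r2_mean)
```

The better the predictions track the labels, the closer the value gets to 1; predicting the mean explains none of the variance and scores 0.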
15. Your manager has asked you to create a binary classification model to predict whether a person has a disease. You need to detect possible classification errors.
Which error type should you choose for the following description?
“A person does not have a disease. The model classifies the case as having no disease”.
- True positives
- True negatives (CORRECT)
- False negatives
- False positives
Correct: A true negative is an outcome where the model correctly predicts the negative class.
16. Your manager has asked you to create a binary classification model to predict whether a person has a disease. You need to detect possible classification errors.
Which error type should you choose for the following description?
“A person has a disease. The model classifies the case as having no disease”.
- False positives
- False negatives (CORRECT)
- True positives
- True negatives
Correct: A false negative is an outcome where the model incorrectly predicts the negative class.
17. You are creating a model to predict the price of a student’s artwork depending on the following variables: the student’s length of education, degree type, and art form.
You start by creating a linear regression model. You need to evaluate the linear regression model.
Solution: Use the following metrics: Relative Squared Error, Coefficient of Determination, Accuracy, Precision, Recall, F1 score, and AUC.
Does the solution meet the goal?
- Yes
- No (CORRECT)
Correct: Accuracy, Precision, Recall, F1 score, and AUC are metrics for evaluating classification models. Only Relative Squared Error and Coefficient of Determination are appropriate for the linear regression model.
CONCLUSION – EXAM PREPARATION COURSE 1
By the end of this module, you will have a solid understanding of the foundational concepts and practical skills necessary for data science using Azure, as covered in Course 1 of the Microsoft Azure Data Scientist Associate specialization. You will be well-prepared to advance to more complex topics, having refreshed your knowledge on data exploration, data preparation, and the basics of machine learning, as well as how to use the Azure Machine Learning service for building, training, and deploying models.

