COURSE 2: USING PYTHON TO INTERACT WITH THE OPERATING SYSTEM

Module 4: Managing Data and Processes

GOOGLE IT AUTOMATION WITH PYTHON PROFESSIONAL CERTIFICATE

Complete Coursera Study Guide

INTRODUCTION – Managing Data and Processes

In this module, you’ll learn about reading and writing to data files based on an interaction with the user. Along the way, we’ll dive into standard streams, environment variables, and command line arguments. Next, we’ll jump into Python subprocesses, including system commands and how they can be used. We’ll review how to obtain output from a system command, and dive into subprocess management, including how to check exit values and manipulate the normal versus error exit values.

Finally, we’ll run through processing log files: what a log file is, how to filter log files using regular expressions, and how to understand the output captured from them.

Learning Objectives

  • Interact with log files using regular expressions
  • Utilize Python to interact with a user to attain certain values
  • Use the input() function to interact with the user
  • Describe how subprocess.run works and interacts with system commands like ping
  • Explain what a log file is
  • Use the dictionary get() method to work with data pulled from log files

PRACTICE QUIZ: DATA STREAMS

1. Which command will print out the exit value of a script that just ran successfully?

  • echo $? (CORRECT)
  • import sys
  • echo $PATH
  • wc variables.py

Great work! The echo $? command prints the exit value (stored in the question mark variable) of the command that just ran; a successful run gives 0.
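The same check can be made from Python instead of the shell. The following is a minimal sketch, assuming two throwaway child commands run with the current interpreter (sys.executable); the value the shell exposes as $? shows up here as the returncode attribute.

```python
import subprocess
import sys

# Run a tiny child script that exits successfully (exit value 0).
ok = subprocess.run([sys.executable, "-c", "import sys; sys.exit(0)"])

# Run a second child that exits with a non-zero value to signal failure.
failed = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

print(ok.returncode)      # what `echo $?` would show after the successful run
print(failed.returncode)  # what `echo $?` would show after the failing run
```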

2. Which command will create a new environment variable?

  • export (CORRECT)
  • env
  • input
  • wc

Right on! This command will create a new environment variable, and give it a value.

3. When a child process is run using the subprocess module, which of the following are true? (check all that apply)

  • The child process is run in a secondary environment. (CORRECT)
  • The parent process is blocked while the child process finishes. (CORRECT)
  • The parent process and child process both run simultaneously. 
  • Control is returned to the parent process when the child process ends. (CORRECT)

Nice work! To run the external command, a secondary environment is created for the child subprocess, where the command is executed.

Excellent! While the parent process is waiting on the subprocess to finish, it’s blocked, meaning the parent can’t do any work until the child finishes.

Right on! After the external command completes its work, the child process exits, and the flow of control returns to the parent.
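The blocking behavior can be observed by timing a call. This is only a sketch: the child command (a half-second sleep run with the current interpreter) is a stand-in chosen for illustration.

```python
import subprocess
import sys
import time

start = time.monotonic()

# The child sleeps for half a second; run() does not return until it exits,
# so the parent is blocked on this line for the whole time.
result = subprocess.run([sys.executable, "-c", "import time; time.sleep(0.5)"])

# Control is back in the parent only after the child has finished.
elapsed = time.monotonic() - start
print(f"child exited with {result.returncode} after {elapsed:.2f}s")
```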

4. When using the run command of the subprocess module, what parameter, when set to True, allows us to store the output of a system command?

  • cwd
  • capture_output (CORRECT)
  • timeout
  • shell

Right on! When set to True, the capture_output parameter stores the standard output and standard error of the command in the object that run returns.
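A minimal sketch of capturing output, assuming a stand-in child command that prints one line. The captured output arrives as bytes, so it is decoded with UTF-8 before use.

```python
import subprocess
import sys

# capture_output=True stores the child's stdout and stderr on the result
# instead of letting them print straight to the terminal.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    capture_output=True,
)

# The captured output is a bytes object, so decode it to get a string.
text = result.stdout.decode("utf-8").strip()
print(text)
```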

5. What does the copy method of os.environ do?

  • Creates a new dictionary of environment variables (CORRECT)
  • Runs a second instance of an environment
  • Joins two strings
  • Removes a file from a directory

Nice work! The copy method of os.environ makes a new copy of the dictionary containing the environment variables, making modification easier.
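One way this might look in practice is sketched below. The variable name GREETING and the child command are hypothetical, chosen only to show that the change reaches the child through the env parameter.

```python
import os
import subprocess
import sys

# Copy the current environment so the parent's own variables stay untouched.
my_env = os.environ.copy()
my_env["GREETING"] = "hello"  # hypothetical variable, for illustration only

# Pass the modified copy to the child through the env parameter.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['GREETING'])"],
    env=my_env,
    capture_output=True,
)

child_value = result.stdout.decode("utf-8").strip()
print(child_value)
```

Because my_env is a separate dictionary, adding GREETING to it does not change os.environ in the parent.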

6. A system command that sends ICMP packets can be executed within a script by using which of the following?

  • subprocess.run (CORRECT)
  • Ping
  • CompletedProcess
  • Arguments

Right on! This function will execute a system command such as ping.

7. Which of the following is a Unicode standard used to convert an array of bytes into a string?

  • UTF-8 (CORRECT)
  • stdout
  • capture_output
  • Host

Woohoo! This encoding is part of the Unicode standard that can transform an array of bytes into a string.

8. Which method do you use to prepare a new environment to modify environment variables?

  • join
  • env
  • copy (CORRECT)
  • cwd

Awesome! Calling this method of the os.environ dictionary will copy the current environment variables to store and prepare a new environment.

PRACTICE QUIZ: PYTHON SUBPROCESSES

1. What type of object does a run function return?

  • CompletedProcess (CORRECT)
  • returncode
  • stdout
  • capture_output

Awesome! This object includes information related to the execution of a command.
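As a rough illustration, the fields of the returned object can be inspected directly. The child command here is a stand-in that writes to both streams and exits with the value 2.

```python
import subprocess
import sys

# Stand-in child: writes to stdout and stderr, then exits with value 2.
child_code = "import sys; print('out'); print('err', file=sys.stderr); sys.exit(2)"

result = subprocess.run([sys.executable, "-c", child_code], capture_output=True)

print(type(result).__name__)  # the class of the returned object
print(result.returncode)      # exit value of the child
print(result.stdout)          # captured standard output, as bytes
print(result.stderr)          # captured standard error, as bytes
```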

2. How can you change the current working directory where a command will be executed?

  • Use the capture_output parameter. 
  • Use the shell parameter.
  • Use the env parameter.
  • Use the cwd parameter. (CORRECT)

Right on! This will change the current working directory where the command will be executed.
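A small sketch of the cwd parameter, assuming a temporary directory as the target and a stand-in child that reports its own working directory. The parent's working directory is not affected.

```python
import os
import subprocess
import sys
import tempfile

parent_cwd = os.getcwd()

with tempfile.TemporaryDirectory() as workdir:
    # cwd changes the working directory for the child only.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=workdir,
        capture_output=True,
    )
    child_cwd = result.stdout.decode("utf-8").strip()
    # samefile() compares the real locations, so symlinked temp paths still match.
    same_dir = os.path.samefile(child_cwd, workdir)

print(same_dir)
print(os.getcwd() == parent_cwd)
```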

3. When a child process is run using the subprocess module, which of the following are true? (check all that apply)

  • The child process is run in a secondary environment. (CORRECT)
  • The parent process is blocked while the child process finishes. (CORRECT)
  • The parent process and child process both run simultaneously.
  • Control is returned to the parent process when the child process ends. (CORRECT)

Nice work! To run the external command, a secondary environment is created for the child subprocess, where the command is executed.

Excellent! While the parent process is waiting on the subprocess to finish, it’s blocked, meaning the parent can’t do any work until the child finishes.

Right on! After the external command completes its work, the child process exits, and the flow of control returns to the parent.

4. When using the run command of the subprocess module, what parameter, when set to True, allows us to store the output of a system command?

  • cwd
  • capture_output (CORRECT)
  • timeout
  • shell

Right on! When set to True, the capture_output parameter stores the standard output and standard error of the command in the object that run returns.

5. What does the copy method of os.environ do?

  • Creates a new dictionary of environment variables (CORRECT)
  • Runs a second instance of an environment
  • Joins two strings
  • Removes a file from a directory

Nice work! The copy method of os.environ makes a new copy of the dictionary containing the environment variables, making modification easier.

6. A system command that sends ICMP packets can be executed within a script by using which of the following?

  • subprocess.run (CORRECT)
  • Ping
  • CompletedProcess
  • Arguments

Right on! This function will execute a system command such as ping.

7. Which of the following is a Unicode standard used to convert an array of bytes into a string?

  • UTF-8 (CORRECT)
  • stdout
  • capture_output
  • Host

Woohoo! This encoding is part of the Unicode standard that can transform an array of bytes into a string.

8. Which method do you use to prepare a new environment to modify environment variables?

  • join
  • env
  • copy (CORRECT)
  • cwd

Awesome! Calling this method of the os.environ dictionary will copy the current environment variables to store and prepare a new environment.

PRACTICE QUIZ: PROCESSING LOG FILES

1. You have created a Python script to read a log of users running CRON jobs. The script needs to accept a command line argument for the path to the log file. Which line of code accomplishes this?

  • import sys
  • syslog=sys.argv[1] (CORRECT)
  • print(line.strip())
  • usernames = {}

Right on! This will assign the script’s first command line argument to the variable “syslog”.
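A minimal sketch of reading that argument defensively. The helper name get_log_path and the example script and log names are hypothetical; a real script would pass sys.argv itself.

```python
import sys


def get_log_path(argv):
    """Return the first command line argument, or None if it is missing."""
    # argv[0] is the script name; argv[1] is the first real argument.
    if len(argv) < 2:
        return None
    return argv[1]


# In a real script this would be called as get_log_path(sys.argv).
syslog = get_log_path(["check_cron.py", "/var/log/syslog"])
print(syslog)
```

Checking the length first avoids an IndexError when the script is run without arguments.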

2. Which of the following is a data structure that can be used to count how many times a specific error appears in a log?

  • Search
  • Continue
  • Dictionary (CORRECT)
  • Get

Great work! A dictionary is useful to count appearances of strings.
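Counting with a dictionary might look like the sketch below. The list of error messages is made up for illustration; in practice the strings would come from the log being parsed.

```python
# Hypothetical error messages, as they might be extracted from a log.
errors = ["timeout", "disk full", "timeout", "timeout", "disk full"]

counts = {}
for error in errors:
    # get() returns the current count, or 0 the first time a key is seen.
    counts[error] = counts.get(error, 0) + 1

print(counts)
```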

3. Which keyword will return control back to the top of a loop when iterating through logs?

  • Continue (CORRECT)
  • Get
  • With
  • Search

Excellent! The continue statement is used to return control back to the top of a loop.
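A short sketch of continue in a log loop, using three invented log lines. Lines that do not mention CRON are skipped before any further work is done on them.

```python
# Invented sample lines standing in for a real log file.
lines = [
    "Jul 6 14:01:23 host CRON[29440]: USER (good_user)",
    "Jul 6 14:02:08 host jam_tag=psim[29187]: (UUID:006)",
    "Jul 6 14:03:01 host CRON[29440]: USER (naughty_user)",
]

cron_lines = []
for line in lines:
    if "CRON" not in line:
        # Jump straight back to the top of the loop for the next line.
        continue
    cron_lines.append(line)

print(len(cron_lines))
```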

4. When searching log files using regex, which regex statement will search for the alphanumeric word “IP” followed by one or more digits wrapped in parentheses using a capturing group?

  • r"IP \(\d+\)$"
  • b"IP \((\w+)\)$"
  • r"IP \((\d+)\)$" (CORRECT)
  • r"IP \((\D+)\)$"

Awesome! This expression will search for the word “IP” followed by a space and parentheses. It uses a capture group and \d+ to capture any digit characters found in the parentheses.
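The correct pattern can be tried out with re.search. The two sample strings below are invented for the sketch; only the pattern itself comes from the question.

```python
import re

pattern = r"IP \((\d+)\)$"

# The escaped parentheses match literal "(" and ")"; the inner pair is the capture group.
match = re.search(pattern, "Connection refused from IP (4021)")
print(match.group(1))  # the digits captured inside the parentheses

# Letters inside the parentheses do not satisfy \d+, so there is no match.
no_match = re.search(pattern, "Connection refused from IP (abc)")
print(no_match)
```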

5. Which of the following are true about parsing log files? (Select all that apply.)

  • Load the entire log files into memory.
  • You should parse log files line by line. (CORRECT)
  • It is efficient to ignore lines that don’t contain the information we need. (CORRECT)
  • We have to open() the log files first. (CORRECT)

Well done! Since log files can get pretty large, it’s a good idea to parse them one line at a time instead of loading the entire file into memory at once.

Right on! We can save a lot of time by not parsing lines that don’t contain what we need.

Nice job! Before we can parse our log file, we have to open it with open(), typically inside a with block so the file is closed automatically.
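These points can be sketched together as follows. The log contents are invented and written to a temporary file only so the example is self-contained.

```python
import os
import tempfile

# Write a small invented log so the sketch can run on its own.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("INFO service started\nERROR disk full\nINFO heartbeat\nERROR timeout\n")
    log_path = tmp.name

error_lines = []
with open(log_path) as log_file:
    # Iterating over the file object yields one line at a time,
    # so the whole file never has to sit in memory at once.
    for line in log_file:
        # Only lines with the information we need get any further processing.
        if "ERROR" in line:
            error_lines.append(line.strip())

os.remove(log_path)  # clean up the temporary file
print(error_lines)
```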

6. Which of the following are correct printouts of a dictionary? (Select all that apply.)

  • {'carrots': 100, 'potatoes': 50, 'cucumbers': 65} (CORRECT)
  • {50: 'apples', 55: 'peaches', 15: 'banana'} (CORRECT)
  • {55:apples, 55:peaches, 15:banana}
  • {carrots:100, potatoes:50, cucumbers: 65}

You got it! A dictionary stores key:value pairs.

WORKING WITH LOG FILES

1. What should Windows users do to connect to their VM in the provided lab environment?

  • Use the local Terminal application to connect using a PEM key.
  • Download the PPK key file from the Qwiklabs Start Lab page and use PuTTY for SSH connection. (CORRECT)
  • Open the Terminal application in Linux and enter the VM’s IP address.
  • Add Secure Shell to the Chrome browser and enter the VM’s hostname.

Correct

2. What is the primary purpose of the os module?

  • It allows you to perform mathematical operations and calculations efficiently.
  • It provides a portable way to interact with the underlying operating system. (CORRECT)
  • It enables you to create graphical user interfaces (GUIs) for Python applications.
  • It is used for managing data serialization and deserialization tasks.

Correct

3. In the lab’s Python script, what is the primary role of the error_search function when working with regular expressions (RegEx)?

  • To compile a fixed RegEx pattern for matching specific error codes in the log file
  • To use RegEx for splitting the log file into individual logs
  • To encrypt and secure log file data using RegEx patterns
  • To create and search for RegEx patterns based on user input to identify errors in the log file (CORRECT)

Correct

4. Which file did you use that contained the system log?

  • import sys
  • error_search
  • fishy.log (CORRECT)
  • find_error.py

Correct

5. What is the purpose of defining the main function in the script, and why is it significant for the script’s execution?

  • The main function defines the file paths for the log file and the output file and is executed at the beginning of the script’s execution. 
  • The main function is used to display error messages to the console and is executed only if errors are encountered during script execution.
  • The main function serves as the entry point of the script and is executed when the script is run as the main program. (CORRECT)
  • The main function is used to define custom functions within the script and is executed at the end of the script’s execution.

Correct

6. In the lab’s Python script find_error.py, what happens when the script is executed with a log file like fishy.log?

  • It encrypts fishy.log and stores it in a secure location.
  • The script prompts the user for a type of error, searches fishy.log for that error, and writes the found errors to errors_found.log. (CORRECT)
  • The script merges fishy.log with other log files to create a comprehensive error report.
  • It compiles fishy.log into a Python executable file for faster error analysis.

Correct
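The flow described in these questions could be sketched roughly as below. This is not the lab's actual find_error.py: the function bodies, the matching rule, and the sample log lines are assumptions, and the error type is passed as a parameter rather than read with input() so the sketch can run unattended. Only the names error_search, file_output, error_patterns, returned_errors, fishy.log, and errors_found.log come from the quiz text.

```python
import os
import re
import tempfile


def error_search(log_file, error):
    """Return the log lines that contain "error" and every word in `error`."""
    # Start from the base pattern and add one pattern per user-supplied word.
    error_patterns = ["error"] + [re.escape(word) for word in error.lower().split()]
    returned_errors = []
    with open(log_file, encoding="utf-8") as file:
        for log in file:
            # Keep the line only if every pattern is found in it (assumed rule).
            if all(re.search(pattern, log.lower()) for pattern in error_patterns):
                returned_errors.append(log)
    return returned_errors


def file_output(returned_errors, output_path):
    """Write the matching lines to the output file."""
    with open(output_path, "w", encoding="utf-8") as file:
        file.writelines(returned_errors)


# Demonstration with an invented log, written to a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "fishy.log")
    out_path = os.path.join(workdir, "errors_found.log")
    with open(log_path, "w", encoding="utf-8") as f:
        f.write("July 31 00:06:21 host kernel: ERROR CRON job failed\n")
        f.write("July 31 00:07:00 host kernel: INFO all good\n")
        f.write("July 31 00:08:10 host kernel: ERROR disk full\n")

    # Search first, then hand the results to the output function.
    returned_errors = error_search(log_path, "CRON")
    file_output(returned_errors, out_path)

    with open(out_path, encoding="utf-8") as f:
        written = f.read()

print(written)
```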

7. Apply what you’ve learned from this lab to answer this question. You are tasked with enhancing the find_error.py script to also search for warning messages in addition to errors in the fishy.log file. How would you modify the script to accomplish this?

  • Create a separate script specifically for searching warning messages in the log file.
  • Modify the error_patterns list initialization to include both “error” and “WARN” as base patterns. (CORRECT)
  • Add a new input prompt to the script for the user to specify if they want to search for “WARN” messages.
  • Replace the error_patterns list with a new list containing only “WARN” patterns.

Correct

8. Which of the following does the sys module provide information about in the Python interpreter? Select all that apply.

  • Constants (CORRECT)
  • Methods (CORRECT)
  • Operating system
  • Functions (CORRECT)

Correct

9. What is the function that takes the errors returned by another function as a formal parameter? 

  • returned_errors
  • Either file_output or error_search are used for this task.
  • file_output (CORRECT)
  • error_search

Correct

10. What is the primary purpose of the re module in Python?

  • To enhance graphical capabilities and user interface design
  • For data encryption and cybersecurity purposes
  • To enable network connectivity and communication over the internet
  • To provide support for working with Regular Expressions for pattern matching in strings (CORRECT)

Correct

11. Apply what you’ve learned from this lab to answer this question. What is the purpose of using regular expressions when you interact with log files in Python?

  • To make code more readable
  • To speed up script execution
  • To filter and extract information (CORRECT)
  • To modify the log files

Correct

12. In the process of connecting to a virtual machine using SSH and PuTTY on Windows, which of the following steps is necessary?

  • Entering the username and external IP address in the Host Name (or IP address) box.
  • Entering a password for authentication
  • Opening the PuTTY Secure Shell (SSH) client
  • Downloading the PPK key file from the Qwiklabs Start Lab page (CORRECT)

Correct

13. What is the primary function of regular expressions (RegEx) in Python programming?

  • To speed up the execution of code by optimizing algorithm performance
  • To act as a programming language for creating complex software applications
  • To define a sequence of characters that form a search pattern for text processing (CORRECT)
  • To serve as a method for encrypting and securing data within a program

Correct

14. What is the role of fishy.log in the provided Python script for log file analysis?

  • This is a configuration file that dictates how the script should process logs.
  • It is the name of the script that contains the regular expressions for error analysis.
  • It refers to a function within the script that generates log files for testing.
  • It is the log file that is being analyzed for specific error patterns. (CORRECT)

Correct

15. What is the step-by-step process of how errors are searched for and processed in the script within the lab?

  • Set the log_file variable, call the error_search() function with the log_file parameter to search for errors, and store the matching errors in the returned_errors list. (CORRECT)
  • Define the main function, call the error_search() function with the log file path, and display the errors to the console.
  • Define the file_output() function, read the log file, search for a specific error type, and write the errors to an errors_found.log file.
  • Start by defining the error_search() and file_output() functions, and then read the log file specified by the user.

Correct

16. Which of the following statements is the best definition of a log file? Select the best answer. 

  • A file that stores user data
  • A file that contains an application’s source code
  • A file that stores machine code
  • A file that keeps track of events in an operating system (CORRECT)

Correct

17. In the script’s execution process described, what are the main functions called, and in what order?

  • error_search() and file_output(), called in that order (CORRECT)
  • file_output() and error_search(), called in that order

Correct

18. How does the find_error.py script process the fishy.log file according to the provided content?

  • It compresses fishy.log to reduce its size for storage efficiency.
  • The script first searches fishy.log for user-specified errors and then generates a new file, errors_found.log, containing these errors. (CORRECT)
  • It automatically detects and fixes syntax errors within fishy.log.
  • The script translates the contents of fishy.log into another programming language for cross-platform compatibility.

Correct

19. What is the primary function of the os module in Python?

  • To handle machine learning algorithms and data analysis
  • For managing and manipulating file paths and directory structures (CORRECT)
  • It provides functions for creating and managing graphical interfaces.
  • The os module is used exclusively for web development purposes.

Correct

20. What is the purpose of the sys.exit(0) statement in the script, and how does it affect the execution of the Python script?

  • The sys.exit(0) statement is used to pause the script’s execution and wait for user input before continuing.
  • The sys.exit(0) statement is used to indicate successful termination of the script, and it has no impact on the script’s execution. (CORRECT)
  • The sys.exit(0) statement is used to forcibly terminate the script, even if there are errors in the code.
  • The sys.exit(0) statement is used to display an error message to the user and halt the script’s execution if any errors are encountered.

Correct

21. In the lab’s script, what is the purpose of the if __name__ == "__main__": block, and why is it important for the execution of the script?

  • The if __name__ == "__main__": block is used to define custom functions for the script, and it doesn’t impact the script’s execution.
  • The if __name__ == "__main__": block is used to display an error message to the user if any errors are encountered during script execution.
  • The if __name__ == "__main__": block is the main entry point of the script, where the script’s execution begins when it is run as the main program. (CORRECT)
  • The if __name__ == "__main__": block is used to specify the author’s name and copyright information for the script.

Correct
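The pattern might look like the sketch below. The main function shown is hypothetical and does not reproduce the lab's script; it only illustrates where the entry point check sits.

```python
import sys


def main(argv):
    """Hypothetical entry point: report the arguments and return an exit value."""
    print(f"running with arguments: {argv[1:]}")
    return 0


# __name__ equals "__main__" only when the file is run directly,
# not when it is imported as a module by another script.
if __name__ == "__main__":
    exit_value = main(sys.argv)
    # A real script would typically finish with sys.exit(exit_value).
```

Keeping the work inside main means another script can import the file and reuse its functions without triggering a run.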

22. Apply what you’ve learned from this lab to answer this question. You are working on the find_error.py script to search for specific errors in the fishy.log file. If you need to find all instances of a network connection failure, which of the following steps would you take to modify the script accordingly?

  • Change the error_patterns list to include only “network” and “failure”, and then run the script with fishy.log.
  • Rewrite the Regular Expression in the script to only match logs with the word “network”.
  • Modify the user input line to specifically ask for “network connection failure” errors, then process fishy.log. (CORRECT)
  • Edit the file_output function to filter out all logs except those containing the word “network”.

Correct

23. What role does the if __name__ == "__main__": block play in the execution of the lab’s script, and at what point in the script’s execution does it come into play?

  • The if __name__ == "__main__": block is responsible for defining custom functions within the script, and it runs at the beginning of the script’s execution.
  • The if __name__ == "__main__": block serves as the main entry point of the script, and it is where the script’s execution begins when run as the main program. (CORRECT)
  • The if __name__ == "__main__": block handles syntax errors and runs only if an error is encountered during the script’s execution.
  • The if __name__ == "__main__": block is used to specify the author’s name and copyright information for the script, and it runs at the end of the script.

Correct

24. Which of the following statements about log files is true? Select all that apply.

  • They can help in identifying and fixing issues. (CORRECT)
  • They can be used to monitor system performance. (CORRECT)
  • They are created only when an error occurs.
  • They can be programmed to record specific events. (CORRECT)

Correct

25. Which term describes a program that provides a text-based interface for typing commands? 

  • An IP address
  • A download
  • A console
  • A terminal (CORRECT)

Correct

26. What is the primary purpose of the sys module in Python?

  • Provides functions and variables to interact with the Python interpreter and the runtime environment (CORRECT)
  • Performing system-level operations like managing files and directories
  • Creating graphical user interfaces (GUIs) in Python
  • Mathematical and numerical computations in Python

Correct

27. In the lab’s Python script, what is the role of the error_search function in relation to processing log files with regular expressions (RegEx)?

  • To interactively receive an error type from the user and use RegEx to find corresponding logs (CORRECT)
  • The function uses RegEx to compress log files for efficient storage
  • To convert all log file data into a single regular expression pattern for bulk processing
  • To apply a standard RegEx pattern to every log file for general error detection

Correct

CONCLUSION – Managing Data and Processes

In conclusion, this module has provided a comprehensive exploration of essential concepts related to reading and writing data files through user interaction. We navigated through key components such as standard streams, environment variables, and command line arguments, building a foundational understanding. The exploration of Python subprocesses, focusing on system commands and their applications, broadened our knowledge. Further, we delved into obtaining output from system commands and effectively managing subprocesses, learning techniques to assess and manipulate exit values.

The final segment covered the intricacies of handling log files, encompassing their definition, filtering using regular expressions, and deciphering the captured output. Armed with these skills, you are well-equipped to navigate and manipulate data files and subprocesses effectively in your Python projects.