Intermediate Python for Data Engineers (English)
Description
In the "Intermediate Python for Data Engineers" training you will learn how to perform common Data Engineering tasks in Python: from loading common file formats to unlocking APIs and saving and later loading Python objects (such as trained Machine Learning models). Afterwards you can effectively use Python to write scripts for data processing at random, for example in Databricks or Azure Functions.
The "Intermediate Python for Data Engineers" training is aimed at Data Engineers, Data Analysts and Data Scientists who want to be able to process data effectively. In terms of cloud use, we focus on Azure, but the ways of working are not Azure-specific: participants who work more on-premises, in private clouds or on other public clouds (e.g. AWS, GCP or Oracle Cloud) also benefit greatly from this training.
After the course you will be able to write Python scripts to access and process data from various sources. The focus here is on loading, storing and transferring more complex resources, APIs and file formats.
Do you want to learn more about actually transforming and analysing the data itself? Then take a look at our training Python Fundamentals for Data Engineers.
We will work with many hands-on assignments in Python over the course of two days. At the end of the course you will have achieved the following learning objectives:
- Handling (more) complex files, such as nested JSON, XML and Parquet
- Understanding how file systems differ between Windows and Linux environments
- Copying and moving files
- Knowing when to do something in Python and when a shell environment is the better choice
- Using pickle to store Python objects, such as trained ML models or processed DataFrames, on a data lake or on disk
- Reading from and writing to an Azure Data Lake using the Azure modules
- Accessing APIs and knowing smart ways to do this at a larger scale (a minimal example follows this list)
- Applying logging to monitor progress in a structured way while your code runs and to connect to existing logging solutions
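To give an idea of the kind of code you will write, here is a minimal sketch combining two of these objectives: calling a JSON API with the requests module and reporting progress with Python's standard logging module. The URL and the idea of an "orders" endpoint are purely illustrative and not part of the course material.

```python
import logging

import requests

# Basic logging setup: timestamp, level and message on every line.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)

# Illustrative endpoint; in the course we work with real (often paginated) APIs.
URL = "https://api.example.com/orders"


def fetch_orders(url: str) -> list:
    """Fetch a JSON list from an API and log how many records were received."""
    logger.info("Requesting %s", url)
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail fast on HTTP errors
    records = response.json()
    logger.info("Received %d records", len(records))
    return records


if __name__ == "__main__":
    orders = fetch_orders(URL)
```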
Prerequisites for the training "Intermediate Python for Data Engineers"
Experience with Python is a requirement for this training. We expect you to have mastered at least the following:
- Reading simple CSV files
- Loading and using modules in Python
- Performing simple data operations on DataFrames, for example in Pandas, Koalas or PySpark (see the short example after this list)
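As an indication of the expected starting level, you should be comfortable reading a snippet like the one below, which loads a simple CSV file with Pandas and performs a basic aggregation. The file name and column names are made up for this example.

```python
import pandas as pd

# Read a simple CSV file; "sales.csv" and its columns are fictitious.
df = pd.read_csv("sales.csv")

# A simple DataFrame operation: total revenue per region.
totals = df.groupby("region")["revenue"].sum()
print(totals)
```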
A good preparation for this training is the training Python Fundamentals for Data Engineers.
Outline
- Accessing APIs in Python
- The requests module
- Processing nested JSON
- Processing XML
- Dealing with Azure Data Lake Storage
- Pickle: saving objects, DataFrames and ML models to files (see the sketch after this outline)
- File system operations: glob, os, pathlib and file copy
- Logging
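As a small illustration of the pickle and file-system topics in this outline, the sketch below stores a Python object (a plain dictionary standing in for a trained ML model) with pickle and then moves the resulting file to an output folder using pathlib and shutil. All file and folder names are illustrative.

```python
import pickle
import shutil
from pathlib import Path

# Stand-in for a trained ML model or processed DataFrame.
model = {"coefficients": [0.4, 1.7], "intercept": 0.1}

# Serialize the object to a local file with pickle.
local_file = Path("model.pkl")
with local_file.open("wb") as f:
    pickle.dump(model, f)

# Move the file to an (illustrative) output directory with pathlib + shutil.
output_dir = Path("output")
output_dir.mkdir(exist_ok=True)
shutil.move(str(local_file), str(output_dir / local_file.name))

# Later: load the object back from disk.
with (output_dir / local_file.name).open("rb") as f:
    restored = pickle.load(f)
print(restored)
```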
Study material
In the training "Intermediate Python for Data Engineers" we use material that we have developed ourselves at Wortell Smart Learning. We make sure that you receive all the necessary material on time.
Available dates
Title | Date |
---|---|
Intermediate Python voor Data Engineers dag 1 | |
Intermediate Python voor Data Engineers dag 2 | |
Intermediate Python for Data Engineers (English) (EN) day 1 | |
Intermediate Python for Data Engineers (English) (EN) day 2 | |