* Cantinho Satkeys

Refresh History
  • j.s.: dgtgtr a todos  4tj97u<z
    07 de Julho de 2025, 13:50
  • FELISCUNHA: Votos de um santo domingo para todo o auditório  4tj97u<z
    06 de Julho de 2025, 11:43
  • j.s.: [link]
    05 de Julho de 2025, 16:31
  • j.s.: dgtgtr a todos  4tj97u<z
    05 de Julho de 2025, 16:31
  • j.s.: h7t45 ao convidado de Honra batatinha pela sua ajuda
    05 de Julho de 2025, 16:30
  • FELISCUNHA: ghyt74  pessoal   4tj97u<z
    04 de Julho de 2025, 11:58
  • JPratas: dgtgtr Pessoal  101041 Vamos Todos Ajudar na Manutenção do Forum, Basta 1 Euro a Cada Um  43e5r6
    03 de Julho de 2025, 19:02
  • cereal killa: Todos os anos e preciso sempre a pedir esmolas e um simples gesto de nem que seja 1€ que fosse dividido por alguns ajudava, uma coisa e certa mesmo continuando isto vai levar volta a como se tem acesso aos tópicos, nunca se quis implementar esta ideia mas quem não contribuir e basta 1 € por ano não terá acesso a sacar nada, vamos ver desenrolar disto mais ate dia 7,finalmente um agradecimento em nome do satkeys a quem já fez a sua doação, obrigada
    03 de Julho de 2025, 15:07
  • m1957: Por favor! Uma pequena ajuda, não deixem que o fórum ecerre. Obrigado!
    03 de Julho de 2025, 01:10
  • j.s.: [link]
    02 de Julho de 2025, 21:09
  • j.s.: h7t45 ao membro anónimo pela sua ajuda  49E09B4F
    02 de Julho de 2025, 21:09
  • j.s.: dgtgtr a todos  4tj97u<z
    01 de Julho de 2025, 17:18
  • FELISCUNHA: Votos de um santo domingo para todo o auditório  4tj97u<z
    29 de Junho de 2025, 11:59
  • m1957: Foi de boa vontade!
    28 de Junho de 2025, 00:39
  • j.s.: passem f.v. por aqui [link]    h7t45
    27 de Junho de 2025, 17:20
  • j.s.: renovamos o nosso pedido para uma pequena ajuda para pagemento  do nosso forum
    27 de Junho de 2025, 17:19
  • j.s.: h7t45 aos convidados de honra Felizcunha e M1957 pela ajuda
    27 de Junho de 2025, 17:15
  • j.s.: dgtgtr a todos  4tj97u<z
    27 de Junho de 2025, 17:13
  • FELISCUNHA: ghyt74  pessoal  4tj97u<z
    27 de Junho de 2025, 11:51
  • JPratas: try65hytr A Todos  classic k7y8j0
    27 de Junho de 2025, 04:35

Autor Tópico: Writing production-ready ETL pipelines in Python / Pandas  (Lida 109 vezes)

0 Membros e 1 Visitante estão a ver este tópico.

Offline mitsumi

  • Sub-Administrador
  • ****
  • Mensagens: 121842
  • Karma: +0/-0
Writing production-ready ETL pipelines in Python / Pandas
« em: 17 de Julho de 2021, 18:27 »

MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz, 2 Ch
Genre: eLearning | Language: English + srt | Duration: 78 lectures (7h 3m) | Size: 2.43 GB
Learn how to write professional ETL pipelines using best practices in Python and Data Engineering

What you'll learn:
How to write professional ETL pipelines in Python.
Steps to write production level Python code.
How to apply functional programming in Data Engineering.
How to do a proper object oriented code design.
How to use a meta file for job control.
Coding best practices for Python in ETL/Data Engineering.
How to implement a pipeline in Python extracting data from an AWS S3 source, transforming and loading the data to another AWS S3 target.

Requirements
Basic Python and Pandas knowledge is desirable.
Basic ETL and AWS S3 knowledge is desirable.

Description
This course will show each step to write an ETL pipeline in Python from scratch to production using the necessary tools such as Python 3.9, Jupyter Notebook, Git and Github, Visual Studio Code, Docker and Docker Hub and the Python packages Pandas, boto3, pyyaml, awscli, jupyter, pylint, moto, coverage and the memory-profiler.

Two different approaches how to code in the Data Engineering field will be introduced and applied - functional and object oriented programming.

Best practices in developing Python code will be introduced and applied:

design principles

clean coding

virtual environments

project/folder setup

configuration

logging

exeption handling

linting

dependency management

performance tuning with profiling

unit testing

integration testing

dockerization

What is the goal of this course?

In the course we are going to use the Xetra dataset. Xetra stands for Exchange Electronic Trading and it is the trading platform of the Deutsche Börse Group. This dataset is derived near-time on a minute-by-minute basis from Deutsche Börse's trading system and saved in an AWS S3 bucket available to the public for free.

The ETL Pipeline we are going to create will extract the Xetra dataset from the AWS S3 source bucket on a scheduled basis, create a report using transformations and load the transformed data to another AWS S3 target bucket.

The pipeline will be written in a way that it can be deployed easily to almost any production environment that can handle containerized applications. The production environment we are going to write the ETL pipeline for consists of a GitHub Code repository, a DockerHub Image Repository, an execution platform such as Kubernetes and an Orchestration tool such as the container-native Kubernetes workflow engine Argo Workflows or Apache Airflow.

So what can you expect in the course?

You will receive primarily practical interactive lessons where you have to code and implement the pipeline and theory lessons when needed. Furthermore you will get the python code for each lesson in the course material, the whole project on GitHub and the ready to use docker image with the application code on Docker Hub.

There will be power point slides for download for each theoretical lesson and useful links for each topic and step where you find more information and can even dive deeper.

Who this course is for
Data engineers, scientists and developers who want to write professional production-ready data pipelines in Python.
Everyone who is interested in writing data pipelines in Python that are ready for production.


Download link:
Só visivel para registados e com resposta ao tópico.

Only visible to registered and with a reply to the topic.

Links are Interchangeable - No Password - Single Extraction