Topic: Scrapy masterclass: Python web scraping and data pipelines

Posted by mitsumi (Global Moderator)
« on: 18 November 2022, 15:12 »

Published 11/2022
Created by Ahmed Elfakharany
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz, 2 Ch
Genre: eLearning | Language: English | Duration: 40 Lectures (5h 44m) | Size: 2.75 GB
Work on 7 real-world web-scraping projects using Scrapy, Splash, and Selenium. Build data pipelines locally and on AWS

What you'll learn
Extract data from the most difficult web sites using Scrapy
Build ETL pipelines and store data in CSV, JSON, MySQL, MongoDB, and S3
Avoid getting banned and evade bot-protection techniques
Use Splash for scraping JavaScript-powered websites
Harness the power of Selenium browser automation to scrape any website
Deploy your Scrapy bots in local and AWS environments
Requirements
Some Python background
All projects run on Python 3.10, so it must be installed
Familiarity with Linux is recommended but not strictly required
Familiarity with the HTTP protocol and HTML
Description
This is the era of data! Everyone is telling you what to do with the data that you already have. But how can you "have" this data?

Most of the Data Engineering / Data Science discussions today focus on how to analyze and process datasets to draw some useful information out of them. However, they all assume that those datasets are already available to you, that they've been collected somehow. They spend little time showing how you can obtain a dataset firsthand. This course fills this gap.

Scrapy for building powerful web scraping pipelines is all about walking you through the process of extracting data of interest from websites. True, there are a lot of datasets already available for you to consume, either for free or at some cost. However, what if those datasets are outdated? What if they don't address your specific needs? You'd better know how to build your own dataset from scratch, no matter how unstructured your data source is.

Scrapy is a Python web scraping framework. Thousands of companies and professionals use it to collect data and build datasets, which they can then sell or use in their own projects. Today, you can be one of those professionals. You can even build your own business around data harvesting!

Today, data scientists and data engineers are among the most highly paid in the industry. Yet, if they don't have enough data to work on, they can do nothing. In this class, I'll show you how to obtain, organize, and store unstructured data from within websites' HTML, CSS, and JavaScript. Having mastered that skill, you can start your data engineering/data science career with an extra skillset under your belt: web scraping.

You will also learn the next steps after you obtain your data. ETL (Extract, Transform, and Load) starts with Scrapy (Extract), but this course also covers the other two stages (Transform and Load). Using Scrapy pipelines, we'll see how to store our data in SQL and NoSQL databases, Elasticsearch clusters, event brokers like Kafka, object storage like S3, and message queues like AWS SQS.

Even if you know nothing about web scraping or data harvesting, even if all of this seems new to you, you've come to the right place. I've designed this class for total beginners. It will walk you from "What is web scraping? What is Scrapy? Why should I learn and use it?" all the way up to "Now I have several gigabytes of web-scraped data from dozens of websites. Let's figure out how we can put them to effective use".

Web scraping can be as easy as extracting some text from an HTML page, or as involved as going several levels deep across several websites, crawling each link, and hopping from one page to another. It can also get incredibly challenging when websites place blockers to keep web bots from accessing them. Don't worry, we'll address all use-cases and, together, figure out how we can overcome them.
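
To make the "Transform and Load" idea above a bit more concrete, here is a minimal sketch of what a Scrapy item pipeline can look like. It is not taken from the course. The class name, the output path, and the sample item are assumptions made up for illustration. It relies only on the fact that a Scrapy item pipeline is a plain Python class exposing process_item(item, spider), with optional open_spider and close_spider hooks, so the sketch runs without Scrapy installed.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    """Hypothetical pipeline: clean each item (Transform), append it to a file (Load)."""

    def __init__(self, path="items.jl"):
        self.path = path  # assumed output path, not from the course
        self.file = None

    def open_spider(self, spider):
        # Scrapy calls this hook once when the spider starts.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Scrapy calls this hook once when the spider finishes.
        self.file.close()

    def process_item(self, item, spider):
        # Transform: strip surrounding whitespace from string fields.
        row = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in dict(item).items()
        }
        # Load: write one JSON object per line.
        self.file.write(json.dumps(row, ensure_ascii=False) + "\n")
        return item  # hand the item on to the next pipeline stage


def run_demo():
    """Drive the pipeline by hand, without a real crawl, to show the call order."""
    path = os.path.join(tempfile.mkdtemp(), "items.jl")
    pipeline = JsonLinesPipeline(path)
    pipeline.open_spider(spider=None)
    pipeline.process_item({"title": "  Example page ", "price": 9.5}, spider=None)
    pipeline.close_spider(spider=None)
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh]
```

In a real project a class like this would be switched on through the ITEM_PIPELINES setting, for example ITEM_PIPELINES = {"myproject.pipelines.JsonLinesPipeline": 300}, where myproject is a placeholder module path. Writing to MySQL, MongoDB, or S3 would follow the same shape, with the file handle swapped for the relevant client.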
Who this course is for
Anyone who wants to automate data collection from websites (web scraping) using Scrapy
Anyone who wants to build a business around web scraping and data collection
Data engineers, data scientists, ML engineers who want to master web scraping for their data collection needs
Developers, DevOps engineers or IT professionals who want to switch careers to data engineering
Python programmers who want to know more about Scrapy or web scraping in general

Download link

rapidgator.net:
https://rapidgator.net/file/40e4eb581ee3d4349644179b4ca4ec6d/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part1.rar.html
https://rapidgator.net/file/0a8c260c7359201867d2836470f4d890/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part2.rar.html
https://rapidgator.net/file/faf7541cac967dba57c4b96baab2186a/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part3.rar.html

uploadgig.com:
https://uploadgig.com/file/download/AA5b6e174610b9B3/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part1.rar
https://uploadgig.com/file/download/A68f8c685D42fb32/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part2.rar
https://uploadgig.com/file/download/9ab00Ea8347f0258/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part3.rar

nitroflare.com:
https://nitroflare.com/view/E6B568A709DF124/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part1.rar
https://nitroflare.com/view/E8CDB70CD4D2FCC/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part2.rar
https://nitroflare.com/view/DCCE76BB1A3B5D3/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part3.rar

1dl.net:
https://1dl.net/0csj1zsmencr/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part1.rar.html
https://1dl.net/t9v83dnqje68/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part2.rar.html
https://1dl.net/exq75ymap3fx/jweci.Scrapy.masterclass.Python.web.scraping.and.data.pipelines.part3.rar.html