* Cantinho Satkeys

Refresh History
  • FELISCUNHA: ghyt74   49E09B4F  e bom feriado   4tj97u<z
    Hoje às 10:39
  • JPratas: try65hytr Pessoal  h7ft6l k7y8j0
    Hoje às 03:51
  • j.s.: try65hytr a todos  4tj97u<z
    30 de Outubro de 2024, 21:00
  • JPratas: dgtgtr Pessoal  4tj97u<z k7y8j0
    28 de Outubro de 2024, 17:35
  • FELISCUNHA: Votos de um santo domingo para todo o auditório  k8h9m
    27 de Outubro de 2024, 11:21
  • j.s.: bom fim de semana   49E09B4F 49E09B4F
    26 de Outubro de 2024, 17:06
  • j.s.: dgtgtr a todos  4tj97u<z
    26 de Outubro de 2024, 17:06
  • FELISCUNHA: ghyt74   49E09B4F  e bom fim de semana
    26 de Outubro de 2024, 11:49
  • JPratas: try65hytr Pessoal  101yd91 k7y8j0
    25 de Outubro de 2024, 03:53
  • JPratas: dgtgtr A Todos  4tj97u<z 2dgh8i k7y8j0
    23 de Outubro de 2024, 16:31
  • FELISCUNHA: ghyt74  pessoal   49E09B4F
    23 de Outubro de 2024, 10:59
  • j.s.: dgtgtr a todos  4tj97u<z
    22 de Outubro de 2024, 18:16
  • j.s.: dgtgtr a todos  4tj97u<z
    20 de Outubro de 2024, 15:04
  • FELISCUNHA: Votos de um santo domingo para todo o auditório  101041
    20 de Outubro de 2024, 11:37
  • axlpoa: hi
    19 de Outubro de 2024, 22:24
  • FELISCUNHA: ghyt74   49E09B4F  e bom fim de semana  4tj97u<z
    19 de Outubro de 2024, 11:31
  • j.s.: ghyt74 a todos  4tj97u<z
    18 de Outubro de 2024, 09:33
  • JPratas: try65hytr Pessoal  4tj97u<z classic k7y8j0
    18 de Outubro de 2024, 03:28
  • schmeagle: iheartradio
    17 de Outubro de 2024, 22:58
  • j.s.: dgtgtr a todos  4tj97u<z
    17 de Outubro de 2024, 18:09

Autor Tópico: Coursera - How to Win a Data Science Competition Learn from Top Kagglers  (Lida 117 vezes)

0 Membros e 1 Visitante estão a ver este tópico.

Online mitsumi

  • Moderador Global
  • ***
  • Mensagens: 115395
  • Karma: +0/-0

Coursera - How to Win a Data Science Competition: Learn from Top Kagglers
WEBRip | English | MP4 | 1280 x 720 | AVC ~358 kbps | 25 fps
AAC | 128 Kbps | 44.1 KHz | 2 channels | Subs: English (.srt) | ~10 hours | 2.02 GB
Genre: eLearning Video / Computer Science, Data Science
If you want to break into competitive data science, then this course is for you! Participating in predictive modelling competitions can help you gain practical experience, improve and harness your data modelling skills in various domains such as credit, insurance, marketing, natural language processing, sales' forecasting and computer vision to name a few.

At the same time you get to do it in a competitive context against thousands of participants where each one tries to build the most predictive algorithm. Pushing each other to the limit can result in better performance and smaller prediction errors. Being able to achieve high ranks consistently can help you accelerate your career in data science.

In this course, you will learn to analyse and solve competitively such predictive modelling tasks.

When you finish this class, you will:

- Understand how to solve predictive modelling competitions efficiently and learn which of the skills obtained can be applicable to real-world tasks.
- Learn how to preprocess the data and generate new features from various sources such as text and images.
- Be taught advanced feature engineering techniques like generating mean-encodings, using aggregated statistical measures or finding nearest neighbors as a means to improve your predictions.
- Be able to form reliable cross validation methodologies that help you benchmark your solutions and avoid overfitting or underfitting when tested with unobserved (test) data.
- Gain experience of analysing and interpreting the data. You will become aware of inconsistencies, high noise levels, errors and other data-related issues such as leakages and you will learn how to overcome them.
- Acquire knowledge of different algorithms and learn how to efficiently tune their hyperparameters and achieve top performance.
- Master the art of combining different machine learning models and learn how to ensemble.
- Get exposed to past (winning) solutions and codes and learn how to read them.

Disclaimer : This is not a machine learning course in the general sense. This course will teach you how to get high-rank solutions against thousands of competitors with focus on practical usage of machine learning methods rather than the theoretical underpinnings behind them.

Prerequisites:
- Python: work with DataFrames in pandas, Description figures in matDescriptionlib, import and train models from scikit-learn, XGBoost, LightGBM.
- Machine Learning: basic understanding of linear models, K-NN, random forest, gradient boosting and neural networks.

Syllabus

Introduction & Recap
-This week we will introduce you to competitive data science. You will learn about competitions' mechanics, the difference between competitions and a real life data science, hardware and software that people usually use in competitions. We will also briefly recap major ML models frequently used in competitions.

Feature Preprocessing and Generation with Respect to Models
-In this module we will summarize approaches to work with features: preprocessing, generation and extraction. We will see, that the choice of the machine learning model impacts both preprocessing we apply to the features and our approach to generation of new ones. We will also discuss feature extraction from text with Bag Of Words and Word2vec, and feature extraction from images with Convolution Neural Networks.

Final Project Description
-This is just a reminder, that the final project in this course is better to start soon! The final project is in fact a competition, in this module you can find an information about it.

Exploratory Data Analysis
-We will start this week with Exploratory Data Analysis (EDA). It is a very broad and exciting topic and an essential component of solving process. Besides regular videos you will find a walk through EDA process for Springleaf competition data and an example of prolific EDA for NumerAI competition with extraordinary findings.

Validation
-In this module we will discuss various validation strategies. We will see that the strategy we choose depends on the competition setup and that correct validation scheme is one of the bricks for any winning solution.

Data Leakages
-Finally, in this module we will cover something very unique to data science competitions. That is, we will see examples how it is sometimes possible to get a top position in a competition with a very little machine learning, just by exploiting a data leakage.

Metrics Optimization
-This week we will first study another component of the competitions: the evaluation metrics. We will recap the most prominent ones and then see, how we can efficiently optimize a metric given in a competition.

Advanced Feature Engineering I
-In this module we will study a very powerful technique for feature generation. It has a lot of names, but here we call it "mean encodings". We will see the intuition behind them, how to construct them, regularize and extend them.

Hyperparameter Optimization
-In this module we will talk about hyperparameter optimization process. We will also have a special video with practical tips and tricks, recorded by four instructors.

Advanced feature engineering II
-In this module we will learn about a few more advanced feature engineering techniques.

Ensembling
-Nowadays it is hard to find a competition won by a single model! Every winning solution incorporates ensembles of models. In this module we will talk about the main ensembling techniques in general, and, of course, how it is better to ensemble the models in practice.

Competitions go through
-For the 5th week we've prepared for you several "walk-through" videos. In these videos we discuss solutions to competitions we took prizes at. The video content is quite short this week to let you spend more time on the final project. Good luck!

Final Project
-Final project for the course.

        General
Complete name                            : 010. Numeric features.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso2/avc1/mp41)
File size                                : 48.3 MiB
Duration                                 : 13 min 41 s
Overall bit rate                         : 493 kb/s
Writing application                      : Lavf55.33.100

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : Main@L3.1
Format settings                          : CABAC / 4 Ref Frames
Format settings, CABAC                   : Yes
Format settings, RefFrames               : 4 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 13 min 41 s
Bit rate                                 : 358 kb/s
Width                                    : 1 280 pixels
Height                                   : 720 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 25.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.016
Stream size                              : 35.1 MiB (73%)
Writing library                          : x264 core 142
Encoding settings                        : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x1:0x111 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=12 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=25 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=24.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00
Language                                 : English

Audio
ID                                       : 2
Format                                   : AAC
Format/Info                              : Advanced Audio Codec
Format profile                           : LC
Codec ID                                 : mp4a-40-2
Duration                                 : 13 min 41 s
Duration_LastFrame                       : -6 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s
Channel(s)                               : 2 channels
Channel positions                        : Front: L R
Sampling rate                            : 44.1 kHz
Frame rate                               : 43.066 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 12.5 MiB (26%)
Language                                 : English
Default                                  : Yes
Alternate group                          : 1   

Screenshots
   

   
Download link:
Só visivel para registados e com resposta ao tópico.

Only visible to registered and with a reply to the topic.

Links are Interchangeable - No Password - Single Extraction