Author: mitsumi
Deep Learning for NLP - Part 6
« on: 13 August 2021, 14:54 »
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 1.38 GB | Duration: 2h 39m

What you'll learn
Deep Learning for Natural Language Processing
Popular Transformer encoder and decoder models
Multi-modal Transformer models
Large scale Transformer models
DL for NLP

Requirements
Basics of machine learning
Basic understanding of Transformer based models and word embeddings
Transformer Models like BERT and GPT

Description
This course is part of the "Deep Learning for NLP" series. In this course, I will talk about various popular Transformer models beyond the ones I have already covered in the previous sessions in this series. These include both encoder-based and decoder-based models, and they differ in aspects such as form of input, pretraining objectives, pretraining data, and architecture variations.

These Transformer models have all been proposed after 2019, and some of them are from early 2021. Thus, as of Aug 2021, these models are very recent and state of the art across multiple NLP tasks.

The course consists of three main sections as follows.

In the first section, I will talk about a few Transformer encoder and decoder models which extend the original Transformer framework. Specifically, I will cover SpanBERT, ELECTRA, DeBERTa and DialoGPT. SpanBERT, ELECTRA and DeBERTa are Transformer encoders, while DialoGPT is a Transformer decoder model. For each model, we will talk about how its architecture or pretraining differs from the standard Transformer. We will also talk about important results on various NLP tasks.
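
As a rough illustration of what a pretraining difference can look like: SpanBERT masks contiguous spans of tokens rather than isolated tokens. The sketch below only shows that one idea. The function `mask_span`, its parameters, and the example sentence are hypothetical and are not taken from the course or from any paper's code. How spans are sampled and how the training loss is computed are left out.

```python
def mask_span(tokens, start, length, mask_token="[MASK]"):
    """Mask one contiguous span of tokens.

    Returns the masked copy of the sequence and the (position, original_token)
    pairs that a model would be asked to predict.
    """
    end = min(start + length, len(tokens))  # clip the span at the sequence end
    masked = list(tokens)                   # leave the input list untouched
    targets = []
    for position in range(start, end):
        targets.append((position, tokens[position]))
        masked[position] = mask_token
    return masked, targets


# Made-up example sentence, split on whitespace instead of real subword tokens.
SENTENCE = "the quick brown fox jumps over the lazy dog".split()
MASKED, TARGETS = mask_span(SENTENCE, start=2, length=3)
```

Here `MASKED` has positions 2 to 4 replaced by the mask token, and `TARGETS` records the three original words at those positions.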

In the second section, I will talk about multi-modal Transformer models. Multimodal learning has gained a lot of momentum in recent years. Thus, there was a need to come up with Transformer models which could handle text and image data together. In this part, I will cover VisualBERT and ViLBERT, which both process multi-modal input very effectively. The two models have many similarities. We will discuss their similarities and differences in detail.

Lastly, in the third section, I will talk about large scale Transformer models. I will introduce the mixture of experts (MoE) architecture. Then I will talk about how GShard adapts the MoE architecture and shows great results on massive multilingual machine translation. Finally, I will discuss Switch Transformers, which simplify the MoE routing algorithm and also make several engineering optimizations to reduce network communication and computation costs and to mitigate instabilities.
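
To make the routing idea more concrete, here is a minimal sketch of the simplified routing used by Switch Transformers, where each token is sent to a single expert (top-1) and the expert output is scaled by the router probability. Everything named here (`top1_route`, `moe_layer`, the weight layout, the toy experts) is an assumption for illustration only. Real implementations also handle expert capacity, load balancing, and distribution across devices, none of which is shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top1_route(token, router_weights):
    """Return (expert_index, gate_probability) for one token vector.

    router_weights holds one weight vector per expert (hypothetical layout).
    """
    logits = [sum(w * x for w, x in zip(weights, token)) for weights in router_weights]
    probs = softmax(logits)
    expert = max(range(len(probs)), key=probs.__getitem__)
    return expert, probs[expert]


def moe_layer(tokens, router_weights, experts):
    """Send each token to its single best expert and scale the output by the gate."""
    outputs = []
    for token in tokens:
        expert, gate = top1_route(token, router_weights)
        expert_out = experts[expert](token)
        outputs.append([gate * value for value in expert_out])
    return outputs


# Toy setup: two experts over 2-dimensional tokens (all values are made up).
ROUTER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]
EXPERTS = [
    lambda v: [2.0 * x for x in v],   # expert 0 doubles its input
    lambda v: [-1.0 * x for x in v],  # expert 1 negates its input
]
```

Calling `moe_layer([[3.0, 0.0], [0.0, 3.0]], ROUTER_WEIGHTS, EXPERTS)` routes the first token to expert 0 and the second to expert 1. Only one expert runs per token, which is what keeps the computation cost per token roughly constant as the number of experts grows.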

In general, each of these papers is pretty long, so understanding them is difficult and time consuming. In these sessions, I have tried to summarize them clearly, bringing out the intuitions and tying the important concepts across the papers into a coherent story. I hope you will find it useful for your work and understanding.

Who this course is for:
Beginners in deep learning
Python developers interested in data science concepts
Masters or PhD students who wish to learn deep learning concepts quickly
