Author Topic: Deep Learning for NLP - Part 6  (Read 62 times)

mitsumi

Deep Learning for NLP - Part 6
« on: 13 August 2021, 14:54 »
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 1.38 GB | Duration: 2h 39m

What you'll learn
Deep Learning for Natural Language Processing
Popular Transformer encoder and decoder models
Multi-modal Transformer models
Large scale Transformer models
DL for NLP

Requirements
Basics of machine learning
Basic understanding of Transformer based models and word embeddings
Transformer models like BERT and GPT

Description
This course is part of the "Deep Learning for NLP" series. In this course, I will talk about various popular Transformer models beyond the ones I have already covered in the previous sessions of this series. These include encoder-based as well as decoder-based models, and they differ in aspects such as the form of input, pretraining objectives, pretraining data and architecture variations.

These Transformer models were all proposed in 2019 or later, and some of them date from early 2021. Thus, as of Aug 2021, these models are very recent and state of the art across multiple NLP tasks.

The course consists of three main sections as follows.

In the first section, I will talk about a few Transformer encoder and decoder models which extend the original Transformer framework. Specifically, I will cover SpanBERT, Electra, DeBERTa and DialoGPT. SpanBERT, Electra and DeBERTa are Transformer encoders, while DialoGPT is a Transformer decoder model. For each model, we will talk about how its architecture or pretraining differs from the standard Transformer. We will also discuss important results on various NLP tasks.

In the second section, I will talk about multi-modal Transformer models. Multi-modal learning has gained a lot of momentum in recent years. Thus, there was a need to come up with Transformer models which could handle text and image data together. In this part, I will cover VisualBERT and ViLBERT, both of which process multi-modal input very effectively. The two models have many similarities. We will discuss their similarities and differences in detail.

Lastly, in the third section, I will talk about large-scale Transformer models. I will introduce the mixture of experts (MoE) architecture. Then I will talk about how GShard adapts the MoE architecture and shows great results on massively multilingual machine translation. Finally, I will discuss Switch Transformers, which simplify the MoE routing algorithm and also make several engineering optimizations to reduce network communication and computation costs and to mitigate training instabilities.
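
As a rough illustration of the routing idea mentioned above, here is a minimal sketch in plain Python. It is not taken from the course or from the GShard or Switch Transformer code. The function names (route_top_k, route_top_1, moe_forward) and the toy scalar "experts" are hypothetical, and the sketch assumes a simplified setting: expert capacity limits, load-balancing losses and GShard's random dispatch of the second expert are all left out. It only shows the difference between sending a token to two experts with renormalized gates and sending it to a single expert.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k most probable experts for one token.

    Returns [(expert_index, gate), ...]. The gates of the selected
    experts are renormalized to sum to 1 (simplified top-2 style gating).
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def route_top_1(router_logits):
    """Pick a single expert for one token (simplified Switch-style routing).

    The gate is the router probability of the chosen expert, not renormalized.
    """
    probs = softmax(router_logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return [(best, probs[best])]


def moe_forward(token, router_logits, experts, route):
    """Gate-weighted sum of the outputs of the selected experts."""
    return sum(gate * experts[i](token) for i, gate in route(router_logits))


# Toy setup (assumption): three scalar "experts" and hand-picked router logits.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
logits = [2.0, 1.0, 0.1]

top2 = route_top_k(logits, k=2)   # two experts, gates sum to 1
top1 = route_top_1(logits)        # one expert only
y_top2 = moe_forward(3.0, logits, experts, route_top_k)
y_top1 = moe_forward(3.0, logits, experts, route_top_1)
```

With top-1 routing only one expert runs per token, so the per-token expert computation is roughly halved compared with top-2 routing, and each token needs to be sent to only one device when experts are sharded.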

In general, each of these papers is pretty long, so understanding them is difficult and time consuming. In these sessions, I have tried to summarize them, bringing out the intuitions and tying the important concepts across the papers into a coherent story. I hope you will find it useful for your work and understanding.

Who this course is for:
Beginners in deep learning
Python developers interested in data science concepts
Masters or PhD students who wish to learn deep learning concepts quickly
