
Topic author: Deep Learning for NLP - Part 5  (Read 111 times)

Offline mitsumi

  • Sub-Administrator
  • ****
  • Posts: 124987
  • Karma: +0/-0
Deep Learning for NLP - Part 5
« on: 13 August 2021, 14:31 »
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 1.56 GB | Duration: 3h 31m

What you'll learn
Deep Learning for Natural Language Processing
Efficient Transformer Models: Star Transformers, Sparse Transformers, Reformer, Longformer, Linformer, Synthesizer
Efficient Transformer Models: ETC (Extended Transformer Construction), BigBird, Linear Attention Transformer, Performer, Sparse Sinkhorn Transformer, Routing Transformer
Efficient Transformer benchmark: Long Range Arena
Comparison of various efficient Transformer methods
DL for NLP
Requirements
Basics of machine learning
Basic understanding of Transformer based models and word embeddings
Description
This course is part of the "Deep Learning for NLP" series. In this course, I will talk about various design schemes for efficient Transformer models. These techniques will come in very handy for academic as well as industry participants. In industry use cases, Transformer models have been shown to reach very high accuracy across many NLP tasks, but their quadratic memory and computational complexity makes them very difficult to ship. Thus, this course, which focuses on methods to make Transformers efficient, is critical for anyone who wants to ship Transformer models as part of their products.

Time and activation memory in Transformers grow quadratically with the sequence length. This is because in every layer, every attention head computes a transformed representation for every position by "paying attention" to the tokens at every other position. Quadratic complexity means the maximum practical input size is rather limited, so we cannot extract semantic representations of long documents by passing them directly to Transformers. Hence, in this module we will talk about methods to address this challenge.
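The quadratic cost described above is easy to see in code: the attention score matrix has one entry per pair of positions. A minimal NumPy sketch of standard scaled dot-product attention (single head, no masking; not taken from the course materials):

```python
import numpy as np

def attention(Q, K, V):
    """Standard scaled dot-product attention.

    The score matrix Q @ K.T has shape (n, n), so both the memory for
    the scores and the cost of the matmul grow quadratically with the
    sequence length n.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # (n, n) -- the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                            # (n, d)

rng = np.random.default_rng(0)
for n in (128, 256, 512):
    Q = rng.standard_normal((n, 64))
    out = attention(Q, Q, Q)
    # the score matrix holds n*n floats: 16384, 65536, 262144 entries
    print(n, out.shape, n * n)
```

Doubling the sequence length quadruples the score-matrix size, which is exactly why long documents do not fit.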

The course consists of two main sections. Across them, I will talk about efficient Transformer models, an efficient-Transformer benchmark, and a comparison of the various efficient Transformer methods.

In the first section, I will talk about methods like Star Transformers, Sparse Transformers, Reformer, Longformer, Linformer, Synthesizer.
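One recurring idea in this first group (used, for example, by Longformer) is local windowed attention: each position attends only to a fixed window of neighbours, so cost is O(n·w) instead of O(n²). A toy per-token sketch of the idea (the papers use banded-matrix kernels, not a Python loop; this is illustrative only):

```python
import numpy as np

def sliding_window_attention(Q, K, V, w=4):
    """Each position i attends only to positions in [i-w, i+w],
    i.e. at most 2*w+1 scores per token instead of n."""
    n, d = Q.shape
    out = np.empty_like(V)
    for i in range(n):
        lo, hi = max(0, i - w), min(n, i + w + 1)
        scores = Q[i] @ K[lo:hi].T / np.sqrt(d)   # local scores only
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                  # softmax over the window
        out[i] = weights @ V[lo:hi]
    return out

rng = np.random.default_rng(0)
n, d = 256, 32
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = sliding_window_attention(Q, K, V, w=8)
print(out.shape)  # (256, 32)
```

With window size w fixed, memory and compute grow linearly with n, at the cost of losing direct long-range token pairs (which Longformer recovers via a few global-attention tokens).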

In the second section, I will talk about methods like ETC (Extended Transformer Construction), BigBird, Linear Attention Transformer, Performer, Sparse Sinkhorn Transformer, and Routing Transformer. Long Range Arena is a recent benchmark for evaluating models on long-sequence tasks with respect to accuracy, memory usage, and inference time. We will discuss the details of Long Range Arena and finally wrap up with a philosophical categorization of the various efficient Transformer methods.
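The linear-attention family in this second group (Linear Attention Transformer, Performer) replaces the softmax with a kernel feature map φ so that φ(Q)(φ(K)ᵀV) can be computed right-to-left without ever forming the (n, n) score matrix. A minimal sketch, using the simple elu(x)+1 feature map as an assumption (Performer proper uses random positive features):

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a simple positive feature map; an illustrative
    # stand-in for Performer's random-feature construction.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Kernelized attention: softmax(QK^T)V is approximated by
    phi(Q) (phi(K)^T V), computed right-to-left.

    phi(K).T @ V has shape (d, d), so cost and memory are linear in
    the sequence length n instead of quadratic.
    """
    Qf, Kf = feature_map(Q), feature_map(K)       # (n, d) each
    KV = Kf.T @ V                                 # (d, d) -- no (n, n) matrix
    Z = Qf @ Kf.sum(axis=0, keepdims=True).T      # (n, 1) normalizer
    return (Qf @ KV) / Z

rng = np.random.default_rng(0)
n, d = 1024, 64
Q, K, V = (rng.standard_normal((n, d)) * 0.1 for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (1024, 64)
```

Because the (d, d) summary KV is independent of n, the same trick also enables constant-memory autoregressive decoding, one of the selling points of this family.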

For each method, we will discuss the specific optimization scheme, the architecture, and the results obtained for pretraining as well as downstream tasks.

Who this course is for:
Beginners in deep learning
Python developers interested in data science concepts
Masters or PhD students who wish to learn deep learning concepts quickly
Folks wanting to ship their products across regions and languages (internationalization of their learning/predictive/generative models)

Download link:
Only visible to registered users who have replied to the topic.

Links are Interchangeable - No Password - Single Extraction