Topic: Modern Reinforcement Learning: Actor-Critic Methods  (read 142 times)


Posted by mitsumi (Sub-Administrator) on 19 October 2020, 12:23

Modern Reinforcement Learning: Actor-Critic Methods
Video: .mp4 (1280x720, 30 fps(r)) | Audio: aac, 48000 Hz, 2ch | Size: 3.47 GB
Genre: eLearning Video | Duration: 58 lectures (8 hours, 10 mins) | Language: English
How to Implement Cutting-Edge Artificial Intelligence Research Papers in the OpenAI Gym Using the PyTorch Framework

What you'll learn

    How to code policy gradient methods in PyTorch
    How to code Deep Deterministic Policy Gradients (DDPG) in PyTorch
    How to code Twin Delayed Deep Deterministic Policy Gradients (TD3) in PyTorch
    How to code actor-critic algorithms in PyTorch
    How to implement cutting-edge artificial intelligence research papers in Python

Requirements

    Understanding of college level calculus
    Prior courses in reinforcement learning
    Able to code deep neural networks independently

Description

In this advanced course on deep reinforcement learning, you will learn how to implement policy gradient, actor-critic, deep deterministic policy gradient (DDPG), and twin delayed deep deterministic policy gradient (TD3) algorithms in a variety of challenging environments from the OpenAI Gym.

The course begins with a practical review of the fundamentals of reinforcement learning, including topics such as:

    The Bellman Equation

    Markov Decision Processes

    Monte Carlo Prediction

    Monte Carlo Control

    Temporal Difference Prediction TD(0)

    Temporal Difference Control with Q-Learning
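
As a rough illustration of the last two items in the list above, here is a minimal tabular sketch of a TD(0) prediction step and a Q-learning control step. It is not taken from the course; the function names, the dictionary-based tables, and the default `alpha` and `gamma` values are assumptions chosen for the example.

```python
from collections import defaultdict


def td0_update(V, state, reward, next_state, done, alpha=0.1, gamma=0.99):
    """One TD(0) prediction step: nudge V[state] toward the bootstrapped target."""
    # Terminal transitions have no future value to bootstrap from.
    target = reward + (0.0 if done else gamma * V[next_state])
    V[state] += alpha * (target - V[state])
    return V[state]


def q_learning_update(Q, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.99):
    """One Q-learning step: bootstrap from the greedy action in next_state."""
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical usage: unseen states and state-action pairs default to 0.0.
V = defaultdict(float)
Q = defaultdict(float)
td0_update(V, "s0", reward=1.0, next_state="s1", done=False)
q_learning_update(Q, "s0", 0, reward=1.0, next_state="s1", done=False, n_actions=2)
```

The only difference between the two targets is what they bootstrap from: TD(0) uses the value of the next state under the current policy, while Q-learning uses the maximum over next actions regardless of which action the agent actually takes.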

It then moves straight into coding our first agent: a blackjack-playing artificial intelligence. From there we progress to teaching an agent to balance the cart pole using Q-learning.

After mastering the fundamentals, the pace quickens, and we move straight into an introduction to policy gradient methods. We cover the REINFORCE algorithm and use it to teach an artificial intelligence to land on the moon in the lunar lander environment from the OpenAI Gym. Next we progress to coding the one-step actor-critic algorithm, to again beat the lunar lander.
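
To make the difference between the two algorithms concrete, here is a framework-independent sketch of the quantities each one weights its log-probabilities by. It is an illustration under stated assumptions, not the course's code: the function names are hypothetical, and plain Python floats stand in for the PyTorch tensors a real implementation would use.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, rewards, gamma=0.99):
    """REINFORCE objective for one episode: -sum_t log pi(a_t|s_t) * G_t."""
    returns = discounted_returns(rewards, gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))


def one_step_td_error(reward, value, next_value, done, gamma=0.99):
    """TD error delta = r + gamma * V(s') - V(s), with V(s') = 0 at episode end."""
    return reward + (0.0 if done else gamma * next_value) - value
```

REINFORCE has to wait for the episode to finish before it can compute any return. In the one-step actor-critic formulation, the TD error is available after every transition; the actor is typically trained on `-delta * log_prob` and the critic on `delta ** 2`.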

With the fundamentals out of the way, we move on to our harder projects: implementing deep reinforcement learning research papers. We will start with Deep Deterministic Policy Gradients, which is an algorithm for teaching robots to excel at a variety of continuous control tasks.
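
One ingredient of DDPG that is easy to show in isolation is the soft (Polyak) update of the target networks. The sketch below is an assumption-laden illustration: parameters are held in plain dictionaries of float lists rather than PyTorch modules, and the names `soft_update` and `tau` are chosen for the example.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Blend online weights into the target weights in place.

    target <- tau * online + (1 - tau) * target
    """
    for name, online in online_params.items():
        target = target_params[name]
        for i, w in enumerate(online):
            target[i] = tau * w + (1.0 - tau) * target[i]
    return target_params


# Hypothetical usage with two tiny "layers".
online = {"fc1": [1.0, 2.0], "fc2": [3.0]}
target = {"fc1": [0.0, 0.0], "fc2": [0.0]}
soft_update(target, online, tau=0.1)
```

A small `tau` makes the target networks trail the online networks slowly, which keeps the bootstrapped critic targets from shifting abruptly between updates.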

Finally, we implement a state-of-the-art artificial intelligence algorithm: Twin Delayed Deep Deterministic Policy Gradients. This algorithm sets a new benchmark for performance in robotic control tasks, and we will demonstrate world-class performance in the Bipedal Walker environment from the OpenAI Gym.
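
Two of the ideas TD3 adds on top of DDPG, target policy smoothing and taking the minimum of two critics, both show up in how the critic target is computed. The following is a minimal sketch for a one-dimensional action, assuming the target actor and the two target critics are plain Python callables (hypothetical stand-ins for neural networks). The default noise settings are commonly used values, not something stated in the post.

```python
import random


def clip(x, low, high):
    """Clamp x to the interval [low, high]."""
    return max(low, min(high, x))


def smoothed_target_action(target_action, max_action=1.0,
                           noise_std=0.2, noise_clip=0.5, rng=random):
    """Target policy smoothing: add clipped Gaussian noise to the target action."""
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    return clip(target_action + noise, -max_action, max_action)


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, **smoothing_kwargs):
    """Critic target y = r + gamma * min(Q1'(s', a~), Q2'(s', a~))."""
    next_action = smoothed_target_action(target_actor(next_state), **smoothing_kwargs)
    # Using the smaller of the two estimates counteracts overestimation.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    return reward + (0.0 if done else gamma * q_next)
```

The third idea, delaying actor and target-network updates relative to critic updates, lives in the training loop and is not shown here.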

By the end of the course, you will know the answers to the following fundamental questions in Actor-Critic methods:

    Why should we bother with actor-critic methods when deep Q-learning is so successful?

    Can the advances in deep Q learning be used in other fields of reinforcement learning?

    How can we solve the explore-exploit dilemma with a deterministic policy?

    How do we get overestimation bias in actor-critic methods?

    How do we deal with the inherent errors in deep neural networks?
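
The overestimation question in the list above can be illustrated with a small simulation that is not part of the course. Assume every action has a true value of exactly zero and the value estimates carry zero-mean Gaussian noise; taking the maximum over the noisy estimates still produces a positive number on average. The function name and its parameters are hypothetical.

```python
import random


def mean_max_of_noisy_estimates(n_actions=5, noise_std=1.0,
                                n_trials=10000, seed=0):
    """Average of max_a Q_hat(a) when every true Q(a) is 0 and Q_hat is noisy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        estimates = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        total += max(estimates)
    return total / n_trials


# With several actions the average maximum sits clearly above the true value 0;
# with a single action there is nothing to maximize over and the bias vanishes.
biased = mean_max_of_noisy_estimates(n_actions=5)
unbiased = mean_max_of_noisy_estimates(n_actions=1)
```

Because bootstrapped targets take a maximum (or follow a policy trained to maximize) over such noisy estimates, the upward bias can compound from one update to the next.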

This course is for the highly motivated and advanced student. To succeed, you must have prior coursework in all of the following topics:

    College level calculus

    Reinforcement learning

    Deep learning

The pace of the course is brisk, but the payoff is that you will come out knowing how to read cutting-edge research papers and turn them into functional code as quickly as possible.

Who this course is for:

    Advanced students of artificial intelligence who want to implement state-of-the-art academic research papers
