Autor Tópico: Coursera - Practical Reinforcement Learning (Higher School of Economics) (Lida 447 vezes)

mitsumi · « **em:** 27 de Abril de 2020, 08:14 »

Welcome to the Reinforcement Learning course. Here you will find out about: - foundations of RL methods: value/policy iteration, q-learning, policy gradient, etc. -- with math & batteries included - using deep neural networks for RL tasks -- also known as "the hype train"

- state of the art RL algorithms
-- and how to apply duct tape to them for practical problems.

- and, of course, teaching your neural network to play games
-- because that's what everyone thinks RL is about. We'll also use it for seq2seq and contextual bandits.

Jump in. It's gonna be fun!

Syllabus

Intro: why should I care?
-In this module we are gonna define and "taste" what reinforcement learning is about. We'll also learn one simple algorithm that can solve reinforcement learning problems with embarrassing efficiency.

At the heart of RL: Dynamic Programming
-This week we'll consider the reinforcement learning formalisms in a more rigorous, mathematical way. You'll learn how to effectively compute the return your agent gets for a particular action - and how to pick best actions based on that return.

Model-free methods
-This week we'll find out how to apply last week's ideas to the real world problems: ones where you don't have a perfect model of your environment.

Approximate Value Based Methods
-This week we'll learn to scale things even farther up by training agents based on neural networks.

Policy-based methods
-We spent 3 previous modules working on the value-based methods: learning state values, action values and whatnot. Now's the time to see an alternative approach that doesn't require you to predict all future rewards to learn something.

Exploration
-In this final week you'll learn how to build better exploration strategies with a focus on contextual bandit setup. In honor track, you'll also learn how to apply reinforcement learning to train structured deep learning models.

General
Complete name : 020. Monte-Carlo & Temporal Difference; Q-learning.mp4
Format : MPEG-4
Format profile : Base Media
Codec ID : isom (isom/iso2/avc1/mp41)
File size : 30.1 MiB
Duration : 8 min 50 s
Overall bit rate : 476 kb/s
Writing application : Lavf55.33.100

Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main@L3.1
Format settings : CABAC / 4 Ref Frames
Format settings, CABAC : Yes
Format settings, RefFrames : 4 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 8 min 50 s
Bit rate : 341 kb/s
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 25.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.015
Stream size : 21.6 MiB (72%)
Writing library : x264 core 142
Encoding settings : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x1:0x111 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=12 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=25 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=24.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00
Language : English

Audio
ID : 2
Format : AAC
Format/Info : Advanced Audio Codec
Format profile : LC
Codec ID : mp4a-40-2
Duration : 8 min 50 s
Duration_LastFrame : -21 ms
Bit rate mode : Constant
Bit rate : 128 kb/s
Channel(s) : 2 channels
Channel positions : Front: L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 8.10 MiB (27%)
Language : English
Default : Yes
Alternate group : 1

Screenshots

Download link:

Só visivel para registados e com resposta ao tópico.

Only visible to registered and with a reply to the topic.

Links are Interchangeable - No Password - Single Extraction

Cantinho Satkeys

Autor Tópico: Coursera - Practical Reinforcement Learning (Higher School of Economics) (Lida 447 vezes)

mitsumi

Coursera - Practical Reinforcement Learning (Higher School of Economics)