[Morvan Reinforcement Learning] Video notes (I) 3. Why use reinforcement learning?
2022-07-24 09:16:00 【Your sister Xuan】
[Morvan Reinforcement Learning videos] Notes
Section 3: Why use reinforcement learning?
Reinforcement learning is a major branch of machine learning, and in recent years it has often been combined with deep learning. In general, reinforcement learning lets your computer learn from scratch: without any "supervision" to refer to (um... basically self-study, going from complete novice to expert), it learns how to choose actions and how to act to obtain higher returns. As I said before, reinforcement learning is "score-oriented".
Here are some small examples of reinforcement learning. Take a maze as the environment: where the starting point is, where the walls are, and where the end is (these can be called states), plus the state transition probabilities, the visualization, and so on. The actions are, for example, east, south, west, and north, or up, down, left, and right. The computer keeps trying (early on it looks completely clueless), then keeps summing up its experience (updating its policy), and through this process it finally obtains the optimal policy.
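The maze description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the videos: the 4x4 layout, the reward values, the hyperparameters, and the names `MAZE`, `step`, `train`, and `greedy_path` are all made up here, the transitions are deterministic (every transition probability is 1), and tabular Q-learning is swapped in as one possible way to "update the policy", since this section of the notes does not name a specific update rule.

```python
import random

# Hypothetical 4x4 maze: S = start, G = end, # = wall, . = free cell
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
ACTION_NAMES = list(ACTIONS)


def find(symbol):
    """Return the (row, col) of the first cell containing symbol."""
    for r, row in enumerate(MAZE):
        for c, cell in enumerate(row):
            if cell == symbol:
                return (r, c)
    raise ValueError(f"{symbol!r} not found in maze")


START, GOAL = find("S"), find("G")


def step(state, action):
    """Deterministic transition: hitting a wall or the edge leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    next_state = (r, c)
    done = next_state == GOAL
    # Assumed "score": +1 for reaching the end, a small penalty for every other move
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Trial and error with an epsilon-greedy policy and a tabular Q-learning update."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated return; missing entries count as 0.0
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTION_NAMES)  # keep trying new things
            else:
                action = max(ACTION_NAMES, key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = step(state, action)
            best_next = max(q.get((next_state, a), 0.0) for a in ACTION_NAMES)
            target = reward + (0.0 if done else gamma * best_next)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)  # sum up experience
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned policy from the start without any exploration."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTION_NAMES, key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
print(path)
```

Early episodes wander around and bump into walls; as the table fills in, the greedy path printed at the end should lead from the start to the end cell.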
Reinforcement learning simulation videos (Youku):
- Robot walking through a maze
- Inverted pendulum
- Car climbing a hill (mountain car)
Previous: [Morvan Reinforcement Learning] Video notes (I) 2. Summary of reinforcement learning methods
Next: [Morvan Reinforcement Learning] Video notes (II) 1. What is Q-learning?