[Mofan Reinforcement Learning] Video Notes (I) 1. What Is Reinforcement Learning?

2022-07-24 09:16:00 【Your sister Xuan】

[Mofan Reinforcement Learning Videos] Notes

Section 1: What is reinforcement learning?

When we humans learn, we always start out knowing nothing. Through repeated attempts and corrections, we eventually arrive at the correct solution to a problem. This can be seen as a reinforcement learning process.
In practice, there are many examples of reinforcement learning:

AlphaGo, the master player that defeated humans at Go (image: AlphaGo, Baidu Encyclopedia)
Teaching a computer to play classic games, such as Atari games:

In all of these, the computer keeps trying and gradually learns a rule of behavior, whether to win a game of Go or to get a high score in the brick-breaking game.

How does it learn?

Imagine a virtual teacher who teaches the computer how to learn, but who can only score its behavior. How can it learn from scores alone? Quite simply: by remembering which behaviors earned high scores and which earned low scores, it avoids the low-scoring behaviors and builds up experience from what it has done. This property can be called being score-oriented.

Compare this with supervised learning, where we need data and labels up front. In reinforcement learning there is no data and there are no labels at the start. Instead, the learner produces behaviors by interacting with the environment again and again and receives the corresponding label for each one. It then learns which data correspond to which labels and, from that regularity, arrives at behaviors that earn high scores. Take the following example:
(Figure: the reinforcement learning process, from Mofan Python)

At first there is a blank table (much like the table in the Windows card game) with only two parts: data and labels. Our goal is to make happy expressions in order to get a higher score.
We keep making expressions (suppose we do not yet know which expression is happy (high score) and which is sad (low score)), and the "virtual teacher" tells us whether each expression scored low or high (that is the label). In this way we collect a lot of data and labels.
Having gathered labels for many expressions, we can extract a pattern from them. After some hard lessons, we find that making one expression gets a high score and making the other gets a low score.
To get high scores, we then keep making the high-scoring expression.
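
The expression example above can be sketched as a small trial-and-error loop. This is only an illustrative sketch, not code from the video: the expression names, the `teacher_score` function, the exploration rate `epsilon`, and the score values are all assumptions made for the example.

```python
import random

# Hypothetical setup: two expressions the learner can make.
EXPRESSIONS = ["sad", "happy"]

def teacher_score(expression):
    """Hypothetical virtual teacher: it only scores the behavior."""
    return 1.0 if expression == "happy" else -1.0

def learn_by_scores(n_trials=200, epsilon=0.1, seed=0):
    """Fill a blank table of (behavior, score) by trial and error."""
    rng = random.Random(seed)
    # The "blank table": total score and number of tries per expression.
    totals = {e: 0.0 for e in EXPRESSIONS}
    counts = {e: 0 for e in EXPRESSIONS}

    def average(e):
        # Untried expressions are treated as neutral (0.0).
        return totals[e] / counts[e] if counts[e] else 0.0

    for _ in range(n_trials):
        if rng.random() < epsilon:
            choice = rng.choice(EXPRESSIONS)        # occasionally try something else
        else:
            choice = max(EXPRESSIONS, key=average)  # repeat the best-scoring behavior
        score = teacher_score(choice)               # the "label" from the teacher
        totals[choice] += score
        counts[choice] += 1

    return {e: average(e) for e in EXPRESSIONS}

if __name__ == "__main__":
    averages = learn_by_scores()
    print(averages)
    print("preferred expression:", max(averages, key=averages.get))
```

The learner never sees a rule saying which expression is correct. It only accumulates scores and, most of the time, repeats whichever expression has the best average so far.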

What algorithms are there for reinforcement learning?

There are many kinds of reinforcement learning algorithms, for example:

Choosing behavior by value: Q-learning and Sarsa (both are tabular, i.e. they work on discrete data), and DQN (Deep Q Network, which uses a neural network)
Choosing behavior directly: Policy Gradients
Imagining the environment and learning from it (here there is really no actual environment): model-based reinforcement learning (Model-Based RL)
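
For the two tabular, value-based methods in the list, the core step is a single update to a table of action values. The sketch below shows the standard Q-learning and Sarsa update rules side by side. The state and action names, the learning rate `alpha`, and the discount factor `gamma` are assumed values chosen only for illustration.

```python
from collections import defaultdict

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Q-learning: bootstrap from the best action value in the next state."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Sarsa: bootstrap from the action that is actually taken next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])

if __name__ == "__main__":
    actions = ["left", "right"]  # hypothetical discrete actions

    # The table: every (state, action) value starts at 0.0.
    Q_q = defaultdict(float)
    Q_s = defaultdict(float)
    Q_q[("s1", "right")] = 1.0
    Q_s[("s1", "right")] = 1.0

    # Same transition for both methods: from s0, take "right", reward 0, land in s1.
    q_learning_update(Q_q, "s0", "right", 0.0, "s1", actions)
    sarsa_update(Q_s, "s0", "right", 0.0, "s1", a_next="left")

    print("Q-learning:", Q_q[("s0", "right")])
    print("Sarsa:     ", Q_s[("s0", "right")])
```

The two functions differ only in the target: Q-learning looks at the best next action regardless of what is done next, while Sarsa uses the next action that is actually chosen.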

Next: [Mofan Reinforcement Learning] Video Notes (I) 2. Summary of reinforcement learning methods

Copyright notice
This article was written by [Your sister Xuan]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/204/202207221617233183.html