Reinforcement Learning 2
2022-07-22 18:47:00 【大力力无穷】
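Markov process (MP) and Markov reward process (MRP)
Markov property: the next state depends only on the current state, not on the earlier history.
Horizon: the length of an episode, i.e. the maximum number of time steps per episode; it is a finite number of steps.
Return: the rewards accumulated step by step, each weighted by the discount factor.
Why a discount factor is needed: some Markov processes contain cycles and never terminate, so discounting keeps the return from growing without bound.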
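Discounting also expresses uncertainty about the future, since our model of the environment may be imperfect: we would rather obtain reward as soon as possible than at some point in the future.
When immediate reward is preferred, the discount factor encodes that preference. The discount factor is a hyperparameter.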
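To make the effect of the discount factor concrete, here is a minimal sketch that computes the return of a single episode. The function name `discounted_return` and the reward sequence are hypothetical, chosen only for illustration.

```python
def discounted_return(rewards, gamma):
    """Return r_1 + gamma * r_2 + gamma^2 * r_3 + ... for one episode."""
    g = 0.0
    # Work backwards so that each step applies G_t = r_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical episode: the only reward arrives at the last step.
rewards = [0.0, 0.0, 0.0, 10.0]
print(discounted_return(rewards, 1.0))  # no discounting
print(discounted_return(rewards, 0.5))  # the late reward counts for much less
```

With `gamma` close to 0 the agent effectively cares only about the immediate reward, while `gamma = 1` weights all future rewards equally.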
Bellman equation
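For reference, the standard form of the Bellman equation for an MRP is written out below. The symbols are assumptions, since the notes do not fix a notation: $V(s)$ is the value of state $s$, $R(s)$ the expected immediate reward in $s$, $P(s' \mid s)$ the transition probability, and $\gamma$ the discount factor.

```latex
V(s) = R(s) + \gamma \sum_{s' \in S} P(s' \mid s)\, V(s')
```

Stacking the values of all states into a vector gives $V = R + \gamma P V$, so for a small state space the values can be solved directly as $V = (I - \gamma P)^{-1} R$.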
Monte Carlo method
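A rough sketch of the Monte Carlo idea for an MRP: sample many episodes starting from a state and average their discounted returns. The three-state MRP (`P`, `R`, `TERMINAL`) and the function names are hypothetical, invented for this example.

```python
import random

# Hypothetical MRP: states 0 and 1 are non-terminal, state 2 is terminal.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]
R = [1.0, 2.0, 0.0]  # reward collected in each state
TERMINAL = 2


def sample_return(start, gamma, horizon, rng):
    """Simulate one episode and return its discounted return."""
    state, g, discount = start, 0.0, 1.0
    for _ in range(horizon):  # the horizon caps the episode length
        if state == TERMINAL:
            break
        g += discount * R[state]
        discount *= gamma
        # Draw the next state according to the transition row of the current state.
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return g


def mc_value(start, gamma=0.9, episodes=5000, horizon=100, seed=0):
    """Estimate V(start) as the average return over sampled episodes."""
    rng = random.Random(seed)
    total = sum(sample_return(start, gamma, horizon, rng) for _ in range(episodes))
    return total / episodes


print(mc_value(0), mc_value(1))
```

For this particular MRP the exact values are about 4.79 for state 0 and 3.64 for state 1, so the estimates should land near those numbers and get closer as `episodes` grows.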
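Dynamic programming: repeatedly apply the Bellman update and stop when the newly updated state values differ only slightly from those of the previous iteration (bootstrapping: each estimate is updated from other estimates).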
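The stopping rule described above can be sketched as follows, assuming a small MRP whose transition matrix and rewards are known. The function name `evaluate_mrp`, the tolerance `tol`, and the example MRP are hypothetical.

```python
def evaluate_mrp(P, R, gamma, tol=1e-8, max_iters=10_000):
    """Iterate V <- R + gamma * P V until the values stop changing."""
    n = len(R)
    v = [0.0] * n
    for _ in range(max_iters):
        # Bootstrapping: the new estimates are built from the current estimates.
        new_v = [R[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
                 for s in range(n)]
        # Largest change of any state value in this sweep.
        delta = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if delta < tol:  # values barely moved since the last sweep: stop
            break
    return v


# Hypothetical MRP: states 0 and 1 are non-terminal, state 2 is terminal.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]
R = [1.0, 2.0, 0.0]
print(evaluate_mrp(P, R, gamma=0.9))
```

Unlike Monte Carlo sampling, this approach needs the transition probabilities `P` to be known in advance.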
State-value function
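For reference, the usual definition of the state-value function of an MRP is the expected return when starting from a given state. The notation is an assumption, since the notes do not fix one: $r_t$ is the reward at step $t$, $\gamma$ the discount factor, and $T$ the final step of the episode.

```latex
G_t = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \dots + \gamma^{T-t-1} r_T

V(s) = \mathbb{E}\left[ G_t \mid s_t = s \right]
```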