[ICML 2022] CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer
2022-06-21 18:16:00 [BAAI Community (智源社区)]

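Transformer has achieved great success in learning vision and language representations that generalize across a wide range of downstream tasks. In visual control, learning a state representation that transfers between different control tasks is important for reducing the number of training samples required. However, porting Transformer to sample-efficient visual control remains a challenging and unsolved problem. To this end, we propose a novel Control Transformer (CtrlFormer), which has several appealing properties that prior methods lack. First, CtrlFormer jointly learns self-attention between visual tokens and policy tokens across different control tasks, so that multi-task representations can be learned and transferred without catastrophic forgetting. Second, we carefully design a contrastive reinforcement learning paradigm to train CtrlFormer, enabling it to reach high sample efficiency, which is very important in control problems. For example, on the DMControl benchmark, recent advanced methods fail with a score of zero on the "Cartpole" task after transfer learning with 100k samples, whereas CtrlFormer achieves a state-of-the-art score of 769±34 using only 100k samples while maintaining its performance on previous tasks. The code and models are released on our project homepage.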
Paper link:
https://arxiv.org/abs/2206.08883
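
The idea of letting per-task policy tokens and visual tokens attend to each other in one sequence can be illustrated with a minimal sketch. This is not the authors' implementation: it is a single-head, untrained attention layer in plain Python, and every name and size in it (`visual_tokens`, `policy_tokens`, `dim`, `num_patches`, `num_tasks`, the random weights) is a hypothetical stand-in chosen for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned parameters or embeddings."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, num_patches, num_tasks = 8, 4, 2  # hypothetical sizes

# Hypothetical inputs: patch embeddings of one image, plus one token per task.
visual_tokens = random_matrix(num_patches, dim, rng)
policy_tokens = random_matrix(num_tasks, dim, rng)

# Projection weights shared by all tokens (random here; a real model learns them).
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))

# Policy tokens and visual tokens are concatenated and attend to each other.
sequence = policy_tokens + visual_tokens
outputs, weights = self_attention(sequence, w_q, w_k, w_v)

# Assumption of this sketch: the output at a task's policy-token position
# is read out as that task's state representation.
task_id = 1
state_repr = outputs[task_id]
print(len(state_repr), round(sum(weights[task_id]), 6))
```

In this sketch, supporting a new task only appends one more policy token to the sequence while the visual tokens and projection weights stay shared, which is one plausible reading of how a representation could be reused across tasks.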
