UCLA | Generative Pretraining for Black-Box Optimization
2022-06-25 04:24:00 【Zhiyuan community】
The figure on the left shows an example of offline black-box optimization (BBO) on a one-dimensional problem. The red dashed lines mark the boundary of the domain. The correct optimum is x*, but gradient ascent on the fitted function outputs an out-of-distribution point x̄ that lies outside the domain.
The figure on the right shows example trajectories on the two-dimensional Branin function. The two dashed lines are trajectories from the offline dataset, and the solid line is a trajectory generated by the model. Blue dots indicate exploration and red dots indicate exploitation.

The figure above shows a schematic of BOOMER. BOOMER consists of three sequential stages: trajectory construction, autoregressive modeling, and rollout evaluation. In the first stage, SORT-SAMPLE converts the offline dataset D into a trajectory dataset D_traj. In the second stage, an autoregressive model is learned on D_traj. In the third stage, the model is conditioned on a prefix sequence taken from the offline data and rolled out further to produce Q candidate points, which are then evaluated. Here R̂ denotes the evaluation regret budget.

The figure above shows results on the Branin benchmark, a two-dimensional function with three global optima (value −0.398). For offline optimization, the paper samples N = 5000 points uniformly from the domain and removes the top 10% of them by function value. This discards the points closest to the optima and makes the task more challenging. The paper then uses the SORT-SAMPLE strategy to build 400 trajectories, each of length 64. During evaluation, it initializes four trajectories with prefixes of length 32, rolls each one out for a further 32 steps, and reports the best point found. This consumes a query budget of 4 × 32 = 128.
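The data setup described above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions: the standard Branin constants and domain ([−5, 10] × [0, 15]) are used, the function is negated so that larger is better, and `sorted_trajectories` is a hypothetical stand-in that samples points uniformly and sorts them from low to high value. It is not the paper's SORT-SAMPLE procedure, only the sorting idea behind it.

```python
import numpy as np

def neg_branin(x):
    """Negated Branin function (larger is better, global maximum about -0.398)."""
    x1, x2 = x[..., 0], x[..., 1]
    a, b, c = 1.0, 5.1 / (4 * np.pi**2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
    return -(a * (x2 - b * x1**2 + c * x1 - r) ** 2 + s * (1 - t) * np.cos(x1) + s)

def make_offline_dataset(n=5000, drop_top_frac=0.10, seed=0):
    """Sample n points uniformly from the domain, then drop the best 10% by value."""
    rng = np.random.default_rng(seed)
    low, high = np.array([-5.0, 0.0]), np.array([10.0, 15.0])
    x = rng.uniform(low, high, size=(n, 2))
    y = neg_branin(x)
    keep = y <= np.quantile(y, 1.0 - drop_top_frac)
    return x[keep], y[keep]

def sorted_trajectories(x, y, num_traj=400, length=64, seed=0):
    """Hypothetical simplification: each trajectory is a random subset of the
    offline points ordered from the lowest to the highest function value."""
    rng = np.random.default_rng(seed)
    trajs = []
    for _ in range(num_traj):
        idx = rng.choice(len(x), size=length, replace=False)
        idx = idx[np.argsort(y[idx])]  # ascending: poor points first, good points last
        trajs.append((x[idx], y[idx]))
    return trajs

x_off, y_off = make_offline_dataset()
trajs = sorted_trajectories(x_off, y_off)
```

Ordering each trajectory from poor to good points is what lets a sequence model trained on it imitate a move from exploration toward exploitation.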
BOOMER successfully generalizes beyond the best point in the offline dataset: the best value in the offline dataset is −6.119, while BOOMER reaches −1.79 ± 0.843. The gradient-ascent baseline trains a forward model (a 2-layer NN) on the offline dataset to map x to y, then runs gradient ascent on x to infer the optimum. It achieves −3.953 ± 4.258.
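The failure mode of the gradient-ascent baseline can be illustrated with a toy one-dimensional sketch. Everything here is an assumption made for illustration: the objective is f(x) = x on the domain [0, 1], and a quadratic least-squares fit is swapped in for the 2-layer NN forward model. The point is only that unconstrained ascent on a fitted surrogate can walk past the domain boundary.

```python
import numpy as np

def fit_quadratic(x, y):
    """Least-squares fit of y ~ w0 + w1*x + w2*x^2 (stand-in for the forward model)."""
    A = np.stack([np.ones_like(x), x, x**2], axis=1)
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def ascend(w, x0, lr=0.05, steps=200):
    """Plain gradient ascent on the fitted surrogate, with no domain constraint."""
    x = x0
    for _ in range(steps):
        x = x + lr * (w[1] + 2 * w[2] * x)  # derivative of the quadratic surrogate
    return x

# Toy offline data: the true optimum on the domain [0, 1] is x* = 1.
x_data = np.linspace(0.0, 1.0, 50)
y_data = x_data.copy()

w = fit_quadratic(x_data, y_data)
x_bar = ascend(w, x0=0.5)
print(x_bar > 1.0)  # the ascent ends outside the domain
```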
The Branin task is also evaluated at several values of R̂. Panel (b) shows rollouts with a prefix length of 32, panel (a) changes the prefix length to 16, and panel (c) plots the updated regret budget (RB) over the suffix subsequence. Rollouts with a low R̂ produce higher-quality points than rollouts with a high R̂. To verify that the regret budget acts as a knob controlling the exploration-exploitation trade-off, the paper also plots the trajectory of the updated RB values over the suffix (Figure 5c). If RB becomes non-positive, the rollout stops. For smaller R̂, the model moves quickly into high-quality regions and keeps exploiting them, whereas for larger R̂ it shifts toward high-quality points gradually. This shows how R̂ controls the speed of the transition from exploration to exploitation.
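The stopping rule described above can be sketched as a small rollout loop. This is a hedged illustration, not the paper's implementation: `propose` is a hypothetical stand-in for the autoregressive model, and the update rule (subtracting the instantaneous regret y* − y from the budget after every query) is an assumption about how RB is updated.

```python
def rollout(propose, f, y_star, regret_budget, max_steps):
    """Roll out candidate points until the regret budget is used up.

    propose(rb, points) is a hypothetical model call returning the next x;
    f is the black-box function; y_star is the assumed best achievable value.
    """
    rb = regret_budget
    points = []
    for _ in range(max_steps):
        if rb <= 0:  # stop once the budget is non-positive
            break
        x = propose(rb, points)
        y = f(x)
        points.append((x, y))
        rb -= y_star - y  # assumed update: spend the regret of this query
    return points, rb

# Toy usage: a proposer that always returns x = 1.0, with f(x) = -|x| and y* = 0.
points, rb = rollout(lambda rb, pts: 1.0, lambda x: -abs(x), 0.0, 3.5, 32)
print(len(points), rb)  # 4 -0.5
```

Under this assumed rule, a small initial budget leaves little room for low-value queries, which is consistent with the observation that low R̂ pushes the rollout toward high-quality points early.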

Next, the paper evaluates BOOMER on 5 complex real-world tasks from Design-Bench. These tasks are considered challenging because they have high dimensionality, low-quality points in the offline datasets, approximate oracles in some cases, and highly sensitive landscapes with narrow optimal regions.
Overall, BOOMER obtains a mean score of 0.772 and a mean rank of 2.4, the best among all baselines. BOOMER achieves the best results on two tasks, TF-Bind-10 (discrete) and D'Kitty (continuous). In addition, it ranks in the top two on five of the seven tasks. On TF-Bind-8, TF-Bind-10, Ant, and D'Kitty, BOOMER shows a clear improvement over methods such as MINs or CbAS and over forward-mapping methods such as COMs. The paper also notes that although BOOMER ranks second on Ant, its standard deviation (0.012) is much lower than that of the best method, CMA-ES, whose standard deviation is 0.928. BOOMER also has the lowest mean standard deviation across all tasks, which suggests that it is less sensitive to poor initialization than the other methods.
Key contributions
This paper proposes BOOMER, a new generative framework for pretraining a black-box optimizer on offline data. BOOMER consists of a three-stage process. In the first stage, a novel SORT-SAMPLE strategy generates trajectories from the offline data; these trajectories use a sorting heuristic to transition from exploration to exploitation. The paper further provides a mechanism, the regret budget, for controlling the degree of exploration and exploitation. In the second and third stages, the paper trains an autoregressive transformer on these trajectories and uses it to generate candidate points that maximize the black-box function.
