当前位置：网站首页>[sharing of competition experience] Rank5 in goose Rose Square - hard search

[sharing of competition experience] Rank5 in goose Rose Square - hard search

2022-06-24 05:14:00 【User 5556120】

First, briefly summarize the topic , It's a The sequence is completely fixed to 10000 A Tetris game , You need to give a sequence of operations to maximize the score , The number of scores is related to the number of lines eliminated in a single time and the number of blocks in the field .
The main algorithms used are ：
1. greedy
2. Single step dfs Search for
3. Monte Carlo search
- What else can I do besides searching , Just search hard
Go to the code warehouse first ：https://github.com/Suikasxt/tetris
- The contents are mainly in the following two files ：
  - GameController.cpp： Game logic
  - tetris.cpp： Policy algorithm
- Because the code was not written for sharing , Don't expect it to look good .

In fact, this is to be a The valuation function of the situation , Then enumerate all the operations currently available , Select an operation that makes the situation most valued , The main purpose of writing this algorithm is to adjust the valuation function first , This is the cornerstone of all the following algorithms .
Because I have participated in botzone Upper tetris match , The appraisal function written at that time can be used as a reference （ Of course, as the game environment and scoring methods have changed, some adjustments need to be made ）, It mainly includes the following aspects ：
- Maximum height of each column 、 Average height
- The sum of the XOR values of the occupied or not states of adjacent cells （ That is, the blank area and the square area shall be separated as far as possible ）
- The number of voids completely blocked 、 The number of elongated cavities （ Prevent the program from falling into a long hole and waiting too long I Type A causes sudden death ）
Used such a greedy , In conjunction with enumeration （ In fact, there is only one level of search ）, Has been able to have a certain degree of intelligent performance , About able to persist 100 Left and right squares .

On the basis of the above greed , many Search a few steps , You can get a very good algorithm .
say concretely , The number of possible operations at each step is large , Only the three operations with the highest evaluation can be selected for expansion , Search for 10 Around the floor , The amount of calculation is 10000*3^10, Multiply one in front 10000 Because every step of our operation is simulated 10 Step , After getting the operation, all the information obtained in this search will be discarded , Search further in the new situation 10 Step . This operation is actually a waste in this problem , Because our whole process is single person and completely certain , There is no randomness caused by competitors or other factors , But I didn't optimize this until the end .
At this point, our algorithm has been able to run safely 10000 Square block , In order to get a higher score, remember to add the current number of blocks to the evaluation function , Let the algorithm consciously store blocks .
At first, this method was used to search for low levels , Can achieve 70w About the score , It was later used MCTS not one's day , Back to search , Improved to search 11 layer , score 117w, But in fact search 13 Layer can get 122w about , It just takes a long time. About 12 Hours , I couldn't finish before the end of the race .

I understand MCTS The main idea is , Each step in the search process forks out 3 About States , As a result, the search can not be very deep , So we don't do this fixed bifurcation , But every time you search, you go all the way to the end , But search a few more times , At the same time, randomness is introduced into the strategy , It is expected that this randomness can realize the bifurcation sampling effect for us . This is because it can Deeper search layers , And to a certain extent, it takes into account the choice of strategies , Can often get better results .
In my old games AI Writing experience ,MCTS Is able to bring great improvement over ordinary search , But this time it seems different , Just used MCTS when , Can get 111w Score of , Looks better than the initial step-by-step search 70w Is much better , But I can only say 70w Not the upper limit of a step-by-step search . Later, I tried to use double-layer MCTS Nor has it been improved .

greedy ： Can't finish 10000 Square block .
Search for ：
- Don't Tun square ：38w
- Tun square ：68w
Monte Carlo ：
- Greed as a one-step strategy ：79w
- 3 Layer search as a one-step strategy ： Finally roll to 111w
Go back and search
- 11 layer ：117w, run 2 Hour or so
- 12 layer （ Before the end of the 4 Hours ran out but died suddenly and didn't get the operation sequence ）：119w, run 5 Hour or so
- 13 layer （ Can't run ）： expect 122w, run 12 Hour or so

Multithreading ： Originally my search 13 It takes more than a day to run , The day before the end of the race, I hung up and ran , At the end of the day, I realized that I couldn't finish the race , Just started handwriting multithreading , But it's too late . Not before c++ Too many threads are used on , That's why I didn't dare to do it , After writing, I found that it was very simple
State compression ：20 That's ok 10 Column , altogether 200 individual bit Information about , In fact, it can be compressed into 10 individual int, One in each column , Convenient operation , But because I don't think the promotion will be very big （ And lazy ） I didn't write this improvement .

After the game dalao Your notes are very rewarding , See the cluster search （ Solve my problem about information waste ）, I have seen the wonderful skills of dynamic planning , I see jemalloc And other optimization techniques .
Thank you for , Let me talk to dalao We have a chance to communicate , ~~Thank you for letting me show off this little white goose factory coding, And the mechanical keyboard of my dreams~~ .

版权声明
本文为[User 5556120]所创，转载请带上原文链接，感谢
https://yzsam.com/2021/08/20210820115502229z.html