当前位置:网站首页>Solving a random number problem
Solving a random number problem
2022-07-25 08:40:00 【Four fires】
There is such an algorithm problem :
Given one that can generate uniform 1~5 Function of random enumerator , Please design one that can generate uniform 1~7 Function of random enumerator .
That is to say , There is a function that generates random numbers rand5, May return 1、2、3、4、5 this 5 Enumeration values , The probability of each value being returned is strict 1/5, Now we need to design a similar random number function rand7, May return 1、2、3、4、5、6、7 These enumeration values , The probability of each value being returned is strict 1/7.
Cover your paper and think first , Thoughts that come to mind include :
- call rand5 Divided by 5, Multiplied by 7, The range of such results is 7/5~7, Not the desired result ;
- Call again and again rand5 function 7 Time , The result is divided by 5, The range of such results is also 7/5 ~ 7, Not the desired result .
What if the title is reversed , It is known that rand7, seek rand5 Well ?
Then I can call rand7, Look at the results , If the result is 1~5, Go straight back to ; If the result is 6、7, Just keep trying again ?
Then come back to reality , How to base it on rand5 seek rand7?
- If rand5 + rand5 Result , The scope is 2~10, You can only get 2~7 Value , Can't get 1, Not to the point .
([2019-4-5] Some people say , That can reduce the above result 1 Not to go ? namely rand5 + rand5 – 1, The range of the number obtained is 1~9, Give up 8、9, It seems that we can , But it's still wrong to think about it carefully : To abandon this 9 For example , You have to get it twice rand5 It has to be 5, So probability is 1/5*1/5=1/25, This probability is obviously lower than the average probability of a single value . in other words , This method yields 1~7 Every number in is not possible )
But still got a kind of inspiration , Call once rand5, The various possibilities of the result are 5 Kind of , To map to rand7 Of 7 There are two possible outcomes , It's unrealistic . But if you throw it twice , Without considering weight removal , There are 5*5=25 Maybe , Map in some way and keep the final 7 A possibility , But it is a thought worth trying .
Think of the 5*5, So try to build a two-dimensional array arr[5][5], Then every element of the array can represent the possibility of a result , Get the first in the array 7 Elements , Map to respectively 1~7:
[1, 2, 3, 4, 5]
[6, 7, 0, 0, 0]
[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]So call rand5 two , Get the abscissa respectively i And the vertical j, If arr[i][j]>0, The retention , Otherwise try again .
This method is not perfect , because 25 There are only 7 One is effective , In most cases, you can only try again , Efficiency is too low .
therefore , In this two-dimensional array, not only the former 7 position , But to keep as much as possible 7 Complete multiple of :
[1, 2, 3, 4, 5]
[6, 7, 1, 2, 3]
[4, 5, 6, 7, 1]
[2, 3, 4, 5, 6]
[7, 0, 0, 0, 0]thus , In most cases , Will hit more than 0 The elements of .
Then write the code :
public class R {
private static Random random = new Random();
private static int[][] arr = new int[][]{
{1,2,3,4,5},
{6,7,1,2,3},
{4,5,6,7,1},
{2,3,4,5,6},
{7,0,0,0,0}
};
public static int rand5(){
random.setSeed(System.nanoTime());
return random.nextInt(5) + 1;
}
public static int rand7() {
int i = rand5() - 1;
int j = rand5() - 1;
if (i == 4 && j >= 1)
return rand7();
return arr[i][j];
}
}Write another main Function test :
Map<Integer, Integer> counter = new HashMap<Integer, Integer>();
for (int i = 0; i < 10000000; i++) {
int key = rand7();
Integer val = counter.get(key);
if (null == val)
val = 0;
val++;
counter.put(key, val);
}
for (Map.Entry<Integer, Integer> entry : counter.entrySet()) {
System.out.println(entry.getKey() + ": " + entry.getValue());
}Repeat the test ten million times , But from the results , The distribution is not random enough :
1: 1476605
2: 1764393
3: 1274549
4: 1219960
5: 1454842
6: 1425833
7: 1383818The test was repeated several times , It's all output 2 Mostly . In terms of this amount of data , I don't think it's a coincidence .
In order to make this possible difference more obvious , I will rand5 seek rand7 Changed from rand2 seek rand3:
public class R {
private static Random random = new Random();
private static int[][] arr = new int[2][2];
static {
arr[0][0] = 1;
arr[0][1] = 2;
arr[1][0] = 3;
arr[1][1] = 0;
}
public static int rand2(){
random.setSeed(System.nanoTime());
return random.nextInt(2) + 1;
}
public static int rand3() {
int i = rand2() - 1;
int j = rand2() - 1;
if (i == 1 && j == 1)
return rand3();
return arr[i][j];
}
}Test again 10 million times , It turns out to be a surprise :
One: 5043264
Two: 2472499
Three: 2484237There is a double difference .
What's going on ? I began to wonder Java The result of this pseudo-random function of ( It is impossible to realize absolute randomness in the computer world ) Not random enough , So I wrote a program to call 10 million times Java From the pseudo-random function of :
Map<Integer, Integer> counter = new HashMap<Integer, Integer>();
for (int i = 0; i < 10000000; i++) {
random.setSeed(System.nanoTime());
int key = random.nextInt(7) + 1;
Integer val = counter.get(key);
if (null == val)
val = 0;
val++;
counter.put(key, val);
}
for (Map.Entry<Integer, Integer> entry : counter.entrySet()) {
System.out.println(entry.getKey() + ": " + entry.getValue());
}From the results, the distribution is very uniform :
1: 1429576
2: 1425422
3: 1427902
4: 1427424
5: 1429457
6: 1430664
7: 1429555I suddenly feel puzzled .
So I reexamine my thinking , I still feel there is nothing wrong . Although the overall results initially have 25 individual , But before 21 The results of each are consistent , The last four lose and start again , Continuous testing can still ensure equal probability of results . In essence, this method is statistically called “Reject Sampling”.
I consulted a mathematician @ Wanjing ink green , He said there was nothing wrong with thinking , If there's a problem , It can only be a matter of code .
There is actually another way , It is similar in essence , According to :
5 * (rand5()-1) + rand5()The result of the above formula can be obtained from 1 To 25 All the results , And obviously this 25 When results appear , these two items. rand5() Can be uniquely determined by the return value , Therefore, the probability of their occurrence is equal to each other . So we can use the above formula , When the result is greater than 3*7=21 Recalculate when , Otherwise, it returns divided by 7 The remainder of :
public class R {
private static Random random = new Random();
public static int rand5() {
random.setSeed(System.nanoTime());
return random.nextInt(5) + 1;
}
public static int rand7() {
int i = rand5();
int j = rand5();
int res = 5 * (i - 1) + j;
if (res > 21)
return rand7();
return res % 7 + 1;
}
}Also ran several tests , Test 10 million pieces of data each time , This time, I found that the larger number went to 3 It's gone. :
1: 1383566
2: 1486463
3: 1748051
4: 1275854
5: 1219712
6: 1451438
7: 1434916In this way, I feel a little enlightened , I think it's because Java Pseudo random number generation method , The generated number is not random enough ? Although it looks random , But that's just what it looks like . When used “ Small random ” To generate “ Big random ” When , Those non random defects are amplified . And compare rand2 Generate rand3, and rand5 Generate rand7, Obviously the former “ Zoom in ” The multiple of , So in the final result ,“ Randomness ” Look bad .
To further test this conjecture , I began to think about whether I could make the seeds of random numbers change more . Because the random number seed currently used is System.nanoTime(), This method looks like nanosecond , In fact, it's just :
Returns the current value of the most precise available system timer, in nanoseconds.
I think it is much more accurate than milliseconds in my experiment , But it is only guaranteed to be as accurate as possible .
Good. , To verify or partially verify such conjectures , Now suppose that this conjecture is correct , Then we can draw such a conclusion :
- If the random number seed is replaced by System.currentTimeMillis(), in other words , Change to milliseconds , Then the final result should be less random ;
- If I take a few milliseconds before each random number , Make the time seed difference between every two increases , It should be possible to see an increase in the randomness of the final result .
( notes , The version I tested JDK The treatment method for the set seed is :seed = (seed ^ 0x5DEECE66DL) & ((1L << 48) -1).)
ok , Now let's verify the first , In order to make the result as obvious as possible , Use rand2 Generate rand3 The plan of . Change the use of nanoseconds as random number seeds to milliseconds as random number seeds , It turned out to be :
One: 10000000
Two: 0
Three: 0In other words , The abscissa and ordinate of the two-dimensional array are actually in 10 million tests , All the results are the same , That is, in most cases i and j All operations are completed within the same millisecond .
Now let's verify the second , Before taking random numbers every time , Sleep 3 millisecond , Of course , This 3 Milliseconds must also be inaccurate 3 millisecond . In order to increase the sleep time , I can get the final result within my patient time , I can't test 10 million times , My test case is changed to test 10000 times , The result is :
One: 3691
Two: 3103
Three: 3206Sure enough , The uniformity of distribution is much better .
The problem is not over , In order to eliminate the pseudo randomness caused by this seed similarity as much as possible , In fact, you can only initialize the seed once , Then call continuously inside the iterative method Random Of nextInt Just the way , This is also a more reasonable way to use the random number generator in this case :
public class R {
private static Random random = new Random(System.nanoTime());
private static int[][] arr = new int[][]{
{1,2,3,4,5},
{6,7,1,2,3},
{4,5,6,7,1},
{2,3,4,5,6},
{7,0,0,0,0}
};
public static int rand5(){
return random.nextInt(5) + 1;
}
public static int rand7() {
int i = rand5() - 1;
int j = rand5() - 1;
if (i == 4 && j >= 1)
return rand7();
return arr[i][j];
}
public static void main(String[] args) {
Map<Integer, Integer> counter = new HashMap<Integer, Integer>();
for (int i = 0; i < 10000000; i++) {
int key = rand7();
Integer val = counter.get(key);
if (null == val)
val = 0;
val++;
counter.put(key, val);
}
for (Map.Entry<Integer, Integer> entry : counter.entrySet()) {
System.out.println(entry.getKey() + ": " + entry.getValue());
}
}
}This time, , Not only is it evenly distributed , And the execution speed is much faster :
1: 1428864
2: 1429574
3: 1427055
4: 1429929
5: 1429162
6: 1426933
7: 1428483It was fun .
The articles are all original without special indication , It shall not be used for any commercial purpose without permission , Please keep the integrity of reprint and indicate the source link 《 Four fire's nagging 》
边栏推荐
- [Sesame Street family] & Bert Bart Roberta
- JD cloud and Forrester consulting released a hybrid cloud report that cloud Nativity has become a new engine driving industrial development
- [dark horse programmer] redis learning notes 002: persistence: RDB and AOF
- Graduation project of wechat small program ordering system of small program completion works (4) opening report
- 聊下自己转型测试开发的历程
- 哈希表刷题(上)
- Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection
- nuscenes数据集3D MOT demo,端到端的目标检测和跟踪,检测跟踪联合框架
- read
- Recursive call to print every bit of an integer
猜你喜欢

为什么要使用MQ消息中间件?这几个问题必须拿下!

Graduation design of wechat small program ordering system of small program completion works (3) background function

Basis 33: XPath acquisition methods of page elements under various browsers

FreeMaker模板引擎

Wechat reservation applet graduation design of applet completion works (2) applet function

第3章业务功能开发(查询线索)

Pet adoption system based on ssm+mysql+layui

【黑马程序员】Redis学习笔记003:Redis事务

Swift初始化器及可选链

@Autowired注解的实现原理
随机推荐
Redis最佳实践
Redis4.0.14 sentinel automatic failover failed
Chapter 3 business function development (modifying clues, data echo and modifying data)
When testing VPN, the IP found by the command line is inconsistent with that of Baidu search
【芝麻街一家】& Bert Bart RoBERTa
Redis learning
JS cool rolling picture deformation animation JS special effects
Some easy-to-use plug-ins and settings installed in vscode
Swift initializer and optional chain
nuscenes数据集3D MOT demo,端到端的目标检测和跟踪,检测跟踪联合框架
Record the process of two multi terminal troubleshooting
Online shopping E-commerce mall system based on jsp+servlet+mysql+
Technical aspect ② what are the index types in MySQL and briefly introduce them? When do I need to create an index? When is it not necessary to create an index? Why does the query speed increase after
Qt|QLable多行展示时更改行间距
C语言基础
【无标题】
DIY can decorate the mall system, you can also have!
Pet adoption system based on ssm+mysql+layui
Apartment repair reporting system (idea, SSM, MySQL)
Huawei device remote login (Telnet, SSH) configuration