The Core Battlefield of the China-US AI Arms Race: Trillion-Parameter Pre-Trained Models
2022-06-24 01:52:00 [Data Ape]
In the field of artificial intelligence, the mainstream players are China and the United States. Overall, the United States leads and China is catching up. Both countries treat AI as strategic high ground and have poured enormous resources into it.

It is fair to say that the AI industry competition between China and the US is already fierce. In a sense, the two countries are engaged in an artificial intelligence "arms race".

How is that competition going?
Artificial intelligence is a vast industry, and a comprehensive assessment is difficult. We can, however, take one representative field as a sample: the super-large-scale pre-trained model.

The reason super-large pre-trained models make a good window onto the China-US AI competition is that the field matches several characteristics of an arms race:
First, the field occupies a significant strategic position.
At this stage, AI technology has serious limitations: a given model can only solve problems in a specific domain, and models generalize poorly. Artificial general intelligence is the ultimate goal, and today's narrow, special-purpose models clearly fall short. One line of thinking is to increase the number of parameters, and thus the complexity, of the model in order to improve its ability to generalize. The hope is that a larger parameter scale brings higher model accuracy and a single model that solves problems across more domains.

Whether super-large pre-trained models can actually achieve general intelligence is still unknown. For now, though, this is the most promising path. Quantitative change produces qualitative change, and only when the "quantity" is large enough does qualitative change become possible. Consider a set of figures for comparison: the adult human brain contains roughly 85-86 billion neurons, each connected through on the order of 30,000 synapses, and the total number of synapses in the brain is estimated at about 2,500 trillion.
Where does human intelligence come from? Essentially, from these neurons and synapses. The brain, too, is a computer, with neurons and synapses as its basic computing units. If artificial intelligence is to reach the human level, then matching or even surpassing the human brain in the number and scale of basic computing units is a necessary condition.

Following this line of thought, building a large-scale pre-trained model and adding parameters amounts to adding computing units to the model. Perhaps the "singularity" of artificial intelligence lies at 2,500 trillion computing units. Of course, a pre-trained model's parameters are not the same concept as a computing unit. But since no better approach exists today, we can only push model parameter counts toward the 2,500-trillion scale and see what happens; a miracle may emerge.
Seen this way, building a pre-trained model at the scale of trillions of parameters is a mega-project for humanity, one that could profoundly affect nations and even human society. Modern history has seen many such super-science projects: the Manhattan Project, the Apollo lunar program, and the Human Genome Project, among others. Each of these raised the "ceiling" of human development.
Second, the results of the competition are easy to evaluate.
There are many metrics for judging which of two pre-trained models is better, but one key indicator is parameter scale. In general, a model with 100 billion parameters is more capable than one with 10 billion.
It is a bit like a naval arms race. An important indicator of a warship's combat power is its tonnage: a 10,000-ton warship is generally stronger than a 1,000-ton one, and total fleet tonnage has likewise become a key measure of the relative strength of two navies.

By the same logic, when sizing up the China-US AI competition, the parameter scale of pre-trained models is a good indicator.
Third, the resource investment is enormous; it is a game of burning money.
As in an arms race, super-large pre-trained models require not only technical capability but also "banknote capability". AI has three core elements: algorithms, data, and compute. A successful large-scale pre-trained model needs brilliant people to solve algorithmic problems, massive accumulated data, and a great deal of computing power for training. Each of these demands the backing of deep pockets.

Super-large pre-trained models are therefore a game for giants. At present, the handful of players worldwide come from China and the United States.
Fourth, the "battle" is intense, with the two sides racing neck and neck.
I tallied the major large-scale pre-trained models in China and the United States, especially those that successively broke the parameter-scale record, and compiled them into the chart below:

[Chart: the China-US pre-trained model race]

Several features stand out from the chart:
(1) The United States started early on large-scale pre-trained models and has kept iterating. Beginning with AI2's ELMo in 2018, with only 94 million parameters, American companies such as Google, Microsoft, NVIDIA, and OpenAI took turns breaking the parameter-scale record. China did not enter the large-model race until 2021, three years behind the United States.
(2) Large-scale pre-training is a game for only a few players; between China and the US, the participants number in the single digits. That makes sense: the technical, data, and compute thresholds of pre-trained models are very high, so only giants can play.
(3) China enjoys a clear latecomer's advantage. Although it started a few years behind the United States, it raised the stakes as soon as it entered. America's famous GPT-3 is still at the hundred-billion-parameter scale, and Google's Switch Transformer has only just crossed the trillion threshold. China's "trillion club" already has two members: the model from the Zhiyuan Research Institute reaches 1.75 trillion parameters, surpassing Switch Transformer's 1.6 trillion, and Alibaba's newly released M6 has broken through 10 trillion.
It should be said that the ability of Chinese companies and institutes to catch up is inseparable from how pre-trained models themselves evolve. Parameter counts grow not linearly but exponentially: the next generation of a model is often not two or three times the previous generation's size but an order of magnitude larger. The United States' own trajectory over the past few years follows the same law, with parameter scale climbing step by step from hundreds of millions to billions, tens of billions, hundreds of billions, and then trillions.
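To make those order-of-magnitude jumps concrete, here is a quick sketch using the approximate, publicly reported parameter counts of the record-setting models named above. The figures are rough public estimates, not exact numbers:

```python
# Approximate parameter counts of successive record-setting models
# (public figures; treated here as rough estimates, not exact values).
models = [
    ("ELMo (AI2)", 2018, 94e6),
    ("GPT-3 (OpenAI)", 2020, 175e9),
    ("Switch Transformer (Google)", 2021, 1.6e12),
    ("Zhiyuan 1.75T", 2021, 1.75e12),
    ("M6 (Alibaba)", 2021, 10e12),
]

def growth_factors(models):
    """Ratio of each model's parameter count to its predecessor's."""
    return [
        (prev[0], cur[0], cur[2] / prev[2])
        for prev, cur in zip(models, models[1:])
    ]

for a, b, ratio in growth_factors(models):
    print(f"{a} -> {b}: about {ratio:,.0f}x")
```

The jumps between generations range from single-digit multiples to three orders of magnitude, which is why a record set one year can look modest the next.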
The record Alibaba set will therefore soon be broken again. Google, Microsoft, OpenAI, NVIDIA, and the other American players remain very strong, and one of them may well be the next to break it.

(4) China has also fielded a "legion" of players. For a country trying to catch up and overtake in a given field, relying on a single company or institute is not safe; multiple players are needed. Besides record-breaking Alibaba, the Zhiyuan Research Institute is also very strong. Huawei is in the game too; although it has so far only released a model at the hundred-billion-parameter scale, given Huawei's character and its emphasis on AI, it will surely not stop there.
In addition, the Chinese roster includes Tencent, Baidu, iFLYTEK, and others. Baidu has ERNIE-M, for example, and Tencent has a large model of its own. Although these did not break the parameter-scale record at the time, they each have their own characteristics and count as "small but beautiful".
It is important to point out that while China and the United States are rivals, in the face of nature they are also teammates. Consider another comparison: today's largest pre-trained model has 10 trillion parameters, while the human brain has more than 2,500 trillion synapses. The parameter scale of AI and the synapse count of the brain are still more than two orders of magnitude apart, and if we also account for the difference in "computing power" between a model parameter and a synapse, the gap is larger still.
[Chart: pre-trained model parameter scale vs. human-brain synapse count]
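The gap quoted above can be checked with one line of arithmetic, using the article's own figures of 10 trillion parameters and 2,500 trillion synapses:

```python
import math

model_params = 10e12       # M6, the largest model cited above
brain_synapses = 2500e12   # the article's estimate for the human brain

gap = brain_synapses / model_params   # 250x
orders = math.log10(gap)              # ~2.4 orders of magnitude
print(f"gap: {gap:.0f}x ({orders:.1f} orders of magnitude)")
```

A 250x gap sits between two and three orders of magnitude, consistent with the "more than two orders of magnitude" claim.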
Scaling model parameters up to the 2,500-trillion level is a challenge facing all of humanity, though of course the countries with the capacity to take it on are mainly China and the United States. The revolution has not yet succeeded; comrades must still press on.
As noted above, large-scale pre-training is a money-burning game, and the larger the parameter count, the higher the training cost. Take GPT-3, with its 175 billion parameters: a single training run costs as much as 12 million US dollars. How much would it cost to train a 2,500-trillion-parameter model? Training cost does not grow linearly with parameter count, but a bigger model certainly burns more money. If humanity builds a pre-trained model at the 2,500-trillion-parameter scale, the training cost could reach billions or even tens of billions.
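To see why cost grows so fast, one widely used back-of-envelope heuristic (not from the article) estimates training compute as roughly 6 x N x D floating-point operations, where N is the parameter count and D the number of training tokens. The sketch below applies it to a GPT-3-sized run and to a hypothetical giant model; the giant model's token count is a pure assumption chosen only to illustrate the trend:

```python
def train_flops(n_params: float, n_tokens: float) -> float:
    """Rough training compute: ~6 FLOPs per parameter per token
    (a common heuristic, not an exact accounting)."""
    return 6 * n_params * n_tokens

# GPT-3-like run: 175B parameters, ~300B training tokens (reported figures).
gpt3 = train_flops(175e9, 300e9)

# Hypothetical 2,500-trillion-parameter model; the 30-trillion-token
# figure is an assumption (100x GPT-3's data) used only for illustration.
giant = train_flops(2500e12, 30e12)

print(f"GPT-3-scale run: {gpt3:.2e} FLOPs")
print(f"Giant-model run: {giant:.2e} FLOPs ({giant / gpt3:,.0f}x more)")
```

Because compute scales with the product of parameters and data, a model four orders of magnitude larger needs vastly more than four orders of magnitude more compute once the training corpus grows with it, which is why per-run costs climb from millions toward billions.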
It is a bit like a particle accelerator. To probe the laws of physics at high energies, accelerators have grown ever larger and ever more expensive. The biggest in the world today is Europe's LHC, a machine built with the participation of dozens of countries at a cost in the tens of billions. Within the Chinese scientific community there has long been a debate over whether to spend tens or even hundreds of billions to build a particle collider a level above the LHC.
Controlled nuclear fusion has a similar project: the famous ITER (International Thermonuclear Experimental Reactor). The ITER device is a superconducting tokamak capable of large-scale fusion reactions, commonly called an "artificial sun". Even in 1998 dollars its cost was 5 billion US dollars, and dozens of countries are involved.
In the field of super-large pre-trained models, if we go all the way to the end, spending billions or tens of billions to build a model at the 2,500-trillion-parameter scale, could we likewise follow the examples above and pursue global cooperation? China and the United States would of course be the main force, with other countries playing supporting roles.
Just imagine: once the parameter scale of AI models reaches the level of synapses in the human brain, will a "singularity" appear? I hold out a little hope.
Author: Gaze into Deep Space / Data Ape