Graph execution flow in the HAMA BSP model
2022-06-22 16:57:00 【ZH519080】
Hama-architecture:

An Apache Hama cluster is built on the BSP framework and consists of one BSPMaster, multiple independent GroomServer nodes, and a ZooKeeper cluster that can run independently. The BSPMaster applies a first-in, first-out (FIFO) policy to monitor the GroomServers, process job submissions, assign tasks, and record the overall run state. It controls the supersteps by calling the setup, bsp, and cleanup methods of the BSP class. Each GroomServer sends heartbeat messages to the BSPMaster to report its current status, which includes the maximum number of tasks the node can run and its available memory. Based on the heartbeat information, the BSPMaster starts the BSP job, divides the job into individual tasks, and assigns the tasks to the GroomServer compute nodes; each GroomServer starts BSPPeers to execute the tasks assigned to it. ZooKeeper manages barrier synchronization among the BSPPeers and so implements the BarrierSynchronisation mechanism: the BSPPeer.sync(), enterBarrier(), and leaveBarrier() methods control the barrier synchronization phase of a BSP job. Messages are sent and received during the barrier synchronization phase.
1、 Job submission and distribution:

The waitForCompletion() method of the GraphJob class calls submit() to submit the job to the BSPMaster. The submission mainly contains the Vertex class implementation, the VertexInputReader implementation, the Edge properties, and the Vertex ID and value.
Based on the GroomServer status that it monitors, the BSPMaster assigns the tasks of a submitted job to the GroomServer compute nodes according to the number of tasks they can run. By default a GroomServer runs one task. If the number of running tasks is not 1, the maximum number of tasks the BSPMaster can allocate is the difference between the maximum number of tasks the cluster can carry and the number of tasks currently running. The data to be processed by the job is split according to the number of tasks allocated; after the split, the BSPMaster directs the GroomServers to assign the data to each BSPPeer.
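The allocation rule above can be illustrated with a small sketch. It is a toy model in Python, not Hama code: the GroomStatus class, its field names, and the contiguous splitting strategy are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class GroomStatus:
    """Hypothetical heartbeat payload; the field names are illustrative."""
    name: str
    max_tasks: int      # maximum number of tasks the node can run
    running_tasks: int  # tasks currently running on the node


def free_slots(grooms):
    """Free task slots per node: maximum capacity minus running tasks."""
    return {g.name: g.max_tasks - g.running_tasks for g in grooms}


def split_input(records, num_tasks):
    """Divide the records into num_tasks contiguous, roughly equal splits."""
    base, extra = divmod(len(records), num_tasks)
    splits, start = [], 0
    for i in range(num_tasks):
        size = base + (1 if i < extra else 0)
        splits.append(records[start:start + size])
        start += size
    return splits


grooms = [GroomStatus("groom1", 3, 1), GroomStatus("groom2", 2, 2)]
slots = free_slots(grooms)
total_tasks = sum(slots.values())
splits = split_input(list(range(7)), total_tasks)
```

Here groom1 has two free slots and groom2 has none, so the seven input records are divided into two splits, one per task.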
Hama Graph: setup and bsp operation diagram
Hama Graph also follows the Hama BSP framework: all work is done in the BSP.setup(), BSP.bsp(), BSP.cleanup(), and BSP.clear() methods. BSP.setup() starts the computation, and BSP.cleanup() ends it and outputs the results. The superstep computation takes place mainly in BSP.bsp(), so BSP.bsp() controls the core of the whole parallel computation. BSP.clear() clears the current superstep and prepares for the next one.
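The order in which these phases run can be sketched as follows. This is a minimal Python stand-in rather than the Hama BSP class: ToyPeer, SumBSP, and run_task are hypothetical names, and the sketch covers only setup, bsp, and cleanup.

```python
class ToyPeer:
    """Hypothetical peer holding one data split and a call log."""

    def __init__(self, data):
        self.data = data
        self.log = []
        self.result = None


class SumBSP:
    """Toy task that follows the setup / bsp / cleanup life cycle."""

    def setup(self, peer):
        # Runs once before the computation starts.
        peer.log.append("setup")
        self.total = 0

    def bsp(self, peer):
        # Core of the computation; in Hama the supersteps happen here.
        peer.log.append("bsp")
        for value in peer.data:
            self.total += value

    def cleanup(self, peer):
        # Runs once at the end and publishes the result.
        peer.log.append("cleanup")
        peer.result = self.total


def run_task(task, peer):
    """Call the three phases in the order described in the text."""
    task.setup(peer)
    task.bsp(peer)
    task.cleanup(peer)
    return peer


peer = run_task(SumBSP(), ToyPeer([1, 2, 3]))
```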

2、 Data loading and superstep initialization
Data loading and superstep initialization are mainly handled by BSP.setup(), which starts the superstep computation.

Within BSP.setup(), GraphJobRunner.loadVertices() loads the allocated data into memory. After the job is submitted, VertexInputReader.parseVertex() parses the data, and the parsed data is temporarily stored in a hashMap through its put(getNumPeers, GraphJobMessage) method. The hashMap entries (that is, the parsed values) are then passed between peers with BSPPeer.send(peerName, GraphJobMessage); what is delivered is mainly the peer address and the GraphJobMessage. There is no guarantee that messages are received in the order in which they were sent, so during the barrier synchronization phase messages sent together may not arrive at a BSPPeer together, and messages may still be in transit between BSPPeers. Once message transfer has completed in the barrier synchronization phase, each BSPPeer retrieves its messages with MessageQueue.poll().
Superstep initialization starts once all data has been loaded into memory, and it is executed only once for the whole program. It is performed by GraphJobRunner.doInitialSuperstep(). After every BSPPeer has received its messages, the next job is to initialize the superstep computation. This work is regarded as the first superstep after the data is loaded. Although VerticesInfo.startSuperstep() and VerticesInfo.finishSuperstep() run, no real superstep computation takes place, because the vertices are still inactive; this step only sets up the computing threads of the BSPPeers.
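The loading step can be pictured with the following toy sketch. It is not Hama code: the tab-separated input format, the byte-sum hash used to pick the owning peer, and the function names are all assumptions for illustration, and the delivery loop stands in for the barrier synchronization.

```python
from collections import defaultdict, deque


def parse_vertex(line):
    """Parse 'id<TAB>n1,n2' into (id, [neighbours]); the format is assumed."""
    vertex_id, _, rest = line.partition("\t")
    neighbours = [n for n in rest.split(",") if n]
    return vertex_id, neighbours


def owner(vertex_id, num_peers):
    """Pick the owning peer with a deterministic toy hash (byte sum)."""
    return sum(vertex_id.encode()) % num_peers


def load_vertices(lines, num_peers):
    # Buffer each parsed vertex under the peer that will own it.
    outgoing = defaultdict(list)
    for line in lines:
        vertex = parse_vertex(line)
        outgoing[owner(vertex[0], num_peers)].append(vertex)
    # Stand-in for the barrier: buffered messages reach the inboxes together.
    inboxes = [deque() for _ in range(num_peers)]
    for peer_index, messages in outgoing.items():
        inboxes[peer_index].extend(messages)
    return inboxes


inboxes = load_vertices(["a\tb,c", "b\tc", "c\t"], 2)
first_for_peer0 = inboxes[0][0]  # a peer would poll its queue like this
```

Nothing is visible in a peer's inbox until every vertex has been parsed and buffered, which mirrors the rule that messages only become readable after the barrier.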
3、 Barrier synchronization
So how are messages handled during the barrier synchronization phase?

During synchronization, message transfer mainly relies on outgoingMessageManager (the outgoing message manager) and localQueue (the local message queue). The outgoingMessageManager packs the peer address and the GraphJobMessage into outgoingBundles (the outgoing bundles).
Once all messages have been packed into outgoingBundles, preparation for transfer is complete. All messages are then gathered in the barrier synchronization phase (sync()). After the peer address and the GraphJobMessage are taken out of the outgoingBundle, LocalBSPRunner.transfer() loads the messages stored in the hashMap into localQueueForNextIteration (a SynchronizedQueue object), and MessageQueue.addBundle() adds them as a bundle.
Once the messages have been loaded into localQueue (the local message queue), they enter the enterBarrier phase. In this phase, the threads that have already arrived wait (wait()) for those that have not yet entered the barrier synchronization phase. When the last message-handling thread enters the barrier and a non-empty barrier action was provided when the barrier was constructed, that thread executes the action while the other threads keep waiting. If no barrier action was provided, the threads compete, and whichever thread obtains the right to run executes first. Only after all message threads have entered the barrier synchronization phase does message transfer between BSPPeers actually begin. During the barrier synchronization phase, the transfer between BSPPeers is completed in AbstractMessageManager.clearOutgoingMessages(), where the queue returned by localQueueForNextIteration.getMessageQueue() replaces localQueue. When all messages associated with each taskID have been sent through localQueue, the leaveBarrier phase begins.
4、 Superstep computation
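The barrier behaviour described above can be sketched with Python's threading.Barrier. This is a toy model, not Hama's ZooKeeper-based implementation: the three peers, the ring-shaped send pattern, and the variable names are assumptions. The barrier action plays the role of the queue swap, and it runs exactly once, after the last thread has arrived.

```python
import threading

NUM_PEERS = 3
# Messages buffered for delivery after the barrier, one list per destination.
next_iteration_queue = [[] for _ in range(NUM_PEERS)]
# Messages each peer may read in the current superstep.
local_queue = [[] for _ in range(NUM_PEERS)]
received = {}
lock = threading.Lock()


def swap_queues():
    """Barrier action: the buffered messages replace the local queues."""
    for dest in range(NUM_PEERS):
        local_queue[dest] = next_iteration_queue[dest]
        next_iteration_queue[dest] = []


barrier = threading.Barrier(NUM_PEERS, action=swap_queues)


def peer(index):
    dest = (index + 1) % NUM_PEERS
    with lock:
        next_iteration_queue[dest].append(f"hello from peer {index}")
    # Enter the barrier: block until every peer has arrived.
    barrier.wait()
    # After the barrier, the messages for this peer are visible.
    received[index] = list(local_queue[index])


threads = [threading.Thread(target=peer, args=(i,)) for i in range(NUM_PEERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No peer can read a message before all peers have finished sending, because the swap only happens when the last thread reaches barrier.wait().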


During a superstep, a BSPPeer can only send messages or process the messages received in the previous superstep.
The actual superstep computation is implemented in doSuperstep(GraphJobMessage, BSPPeer). Before the BSPPeers start a superstep, the vertices are activated using the AtomicInteger class. The messages and the vertices are then traversed in order, and the messages carrying the same ID as a vertex are handed to that vertex to start its superstep computation. startSuperstep() marks the beginning of the superstep, after which the user-defined compute() method runs. The superstep computation ends when no vertex is active or when the number of iterations set by the user is reached.
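The superstep loop and its two termination conditions can be sketched as follows. This is a toy single-process model, not Hama's doSuperstep(): the maximum-value propagation used as the compute step, the rule that a received message reactivates a vertex, and all names are assumptions for illustration.

```python
def run_supersteps(graph, initial_values, max_iterations):
    """graph maps a vertex to its neighbours; returns (values, supersteps)."""
    values = dict(initial_values)
    inbox = {v: [] for v in graph}
    active = set(graph)  # every vertex is active in the first superstep
    superstep = 0
    while active and superstep < max_iterations:
        outbox = {v: [] for v in graph}
        for v in sorted(active):
            # Toy compute(): keep the largest value seen so far.
            new_value = max([values[v]] + inbox[v])
            changed = new_value > values[v]
            values[v] = new_value
            # Send only in the first superstep or when the value changed.
            if superstep == 0 or changed:
                for neighbour in graph[v]:
                    outbox[neighbour].append(new_value)
        # Barrier: sent messages become readable in the next superstep,
        # and only vertices that received a message stay active.
        inbox = outbox
        active = {v for v, messages in inbox.items() if messages}
        superstep += 1
    return values, superstep


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
start = {"a": 3, "b": 6, "c": 2}
values, steps = run_supersteps(graph, start, max_iterations=10)
capped_values, capped_steps = run_supersteps(graph, start, max_iterations=1)
```

The first call stops because no vertex is active any more; the second stops because the iteration limit is reached before the maximum has spread.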