
[Parallel and Distributed Computing] 10B_MapReduce GFS Implementation

2022-06-21 20:49:00 【I'll carry you】

Contents

MapReduce
MapReduce Implementations (Hadoop is an open-source implementation)
Workflow (map worker -> local disk -> reduce worker)
Fault Tolerance: Worker Failure (worker re-execution)
Fault Tolerance: Master Failure (the master recovers from a checkpoint)
Task Granularity (ideally, M and R should be much larger than the number of worker machines)
Backup Tasks (near the end of the job, schedule backup executions of the remaining tasks)
Partition and Combiner
Skipping Bad Records (skip records that cause errors)
GFS
Distributed File System
GFS: Assumptions
GFS: Design Decisions
From GFS to HDFS
HDFS Architecture
- Read Flow
- Write and Control Data Flow

MapReduce

MapReduce Implementations (Hadoop is an open-source implementation)


Workflow (map worker -> local disk -> reduce worker)


Workflow:
1. The input files are split into M pieces, typically 16 to 64 MB each.
2. The master picks idle workers and assigns each one a map task or a reduce task.
3. A worker assigned a map task reads the contents of the corresponding input split. The intermediate key/value pairs produced by the Map function are buffered in memory.
4. Periodically, the buffered pairs are written to local disk, partitioned into R regions by the partitioning function. The locations of these buffered pairs on the local disk are passed back to the master.
5. When a reduce worker is notified by the master about these locations, it uses remote procedure calls to read the buffered data from the local disks of the map workers. Once a reduce worker has read all the intermediate data, it sorts the data by intermediate key so that all occurrences of the same key are grouped together.
6. The reduce worker iterates over the sorted intermediate data and, for each unique intermediate key encountered, passes the key and the corresponding set of intermediate values to the user's Reduce function. The output of the Reduce function is appended to the final output file for this reduce partition.
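
The steps above can be sketched in a single process. This is only an illustration, not how a real cluster runs: the word-count map_fn and reduce_fn, the crc32-based partition function, and the in-memory lists standing in for local disk are all assumptions made for the example.

```python
from collections import defaultdict
from zlib import crc32


def map_fn(_name, text):
    """Hypothetical user Map function: emit (word, 1) for every word."""
    for word in text.split():
        yield word, 1


def reduce_fn(key, values):
    """Hypothetical user Reduce function: sum the counts for one key."""
    return key, sum(values)


def partition(key, R):
    # Assumed partitioning function: a stable hash of the key modulo R.
    return crc32(key.encode("utf-8")) % R


def run_job(splits, R):
    # Steps 3-4: each map task buffers its pairs and writes them into
    # R regions (plain lists stand in for the map worker's local disk).
    map_outputs = []
    for name, text in splits.items():
        regions = [[] for _ in range(R)]
        for key, value in map_fn(name, text):
            regions[partition(key, R)].append((key, value))
        map_outputs.append(regions)

    # Steps 5-6: reduce task r reads region r from every map task,
    # sorts by key, groups equal keys, and calls the Reduce function.
    outputs = []
    for r in range(R):
        pairs = sorted(p for regions in map_outputs for p in regions[r])
        grouped = defaultdict(list)
        for key, value in pairs:
            grouped[key].append(value)
        outputs.append([reduce_fn(k, vs) for k, vs in grouped.items()])
    return outputs


splits = {"split-0": "a b a", "split-1": "b c a"}  # M = 2 input splits
result = run_job(splits, R=2)                      # one output list per reduce partition
```

Each key lands in exactly one of the R output lists, which is why every reduce task can produce its own output file independently.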

Fault Tolerance: Worker Failure (worker re-execution)


Fault Tolerance: Master Failure (the master recovers from a checkpoint)

The master periodically writes checkpoints of its data structures. If the master task dies, a new copy can be started from the last checkpointed state.
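
As a rough sketch of checkpoint-and-recover, assuming the master's state is a small dictionary of task statuses saved as JSON (the state layout, file name, and function names are hypothetical, not taken from any real implementation):

```python
import json
import os
import tempfile


def write_checkpoint(state, path):
    # Write to a temporary file and rename it, so a crash during the
    # write never leaves a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def recover(path):
    # A new master copy starts from the last checkpointed state.
    with open(path) as f:
        return json.load(f)


ckpt = os.path.join(tempfile.mkdtemp(), "master.ckpt")
state = {
    "map_tasks": {"0": "completed", "1": "in_progress"},
    "reduce_tasks": {"0": "idle"},
}
write_checkpoint(state, ckpt)

# Progress made after the checkpoint exists only in memory.
state["map_tasks"]["1"] = "completed"

# Simulated master failure: the replacement sees only the checkpoint.
restored = recover(ckpt)
```

Anything that changed after the last checkpoint is lost on recovery, so map task 1 shows up as in progress again in the restored state.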

Task Granularity (ideally, M and R should be much larger than the number of worker machines)

Ideally, M and R should be much larger than the number of worker machines.
Having each worker perform many different tasks improves dynamic load balancing and also speeds up recovery when a worker fails.
There are practical bounds on how large M and R can be, however: the master must make O(M + R) scheduling decisions and keep O(M * R) state in memory.
In practice, M is chosen so that each individual task has roughly 16 MB to 64 MB of input data (so that the locality optimization is most effective), and R is set to a small multiple of the number of worker machines expected to be used.
A typical configuration is M = 200,000 and R = 5,000, using 2,000 worker machines.
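
To make these bounds concrete, the arithmetic for the typical configuration can be worked out directly. The variable names are only illustrative, and reading the O(M * R) term as one entry per map/reduce task pair is an assumption about what the state contains.

```python
M, R, workers = 200_000, 5_000, 2_000

# O(M + R): one scheduling decision per task.
scheduling_decisions = M + R

# O(M * R): assumed to be one piece of state per (map task, reduce task) pair.
state_entries = M * R

# Average number of tasks each worker machine ends up running.
tasks_per_worker = (M + R) / workers

print(scheduling_decisions, state_entries, tasks_per_worker)
```

With these numbers the master makes 205,000 scheduling decisions and tracks 1,000,000,000 pair entries, while each worker handles about 100 tasks on average, which is what makes fine-grained load balancing possible.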

Backup Tasks (near the end of the job, schedule backup executions of the remaining tasks)

When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks. A task is marked as completed as soon as either the primary or the backup execution finishes.
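
The effect of this rule on a straggler can be illustrated with a small timing model. The task durations and the backup start time below are made-up numbers, and job_completion_time is a hypothetical helper, not part of any MapReduce library.

```python
def job_completion_time(primary_times, backups=None):
    """Return when the last task finishes.

    primary_times[i] is when task i's primary execution finishes.
    backups maps a task index to (start_time, duration) of its backup.
    A task counts as done when either execution finishes first.
    """
    backups = backups or {}
    finish_times = []
    for i, primary_finish in enumerate(primary_times):
        finish = primary_finish
        if i in backups:
            start, duration = backups[i]
            finish = min(primary_finish, start + duration)
        finish_times.append(finish)
    return max(finish_times)


# Hypothetical finish times in seconds; task 3 is a straggler.
primary = [10, 11, 12, 60]

no_backup = job_completion_time(primary)

# The master starts a backup of task 3 at t = 12 that takes 10 seconds.
with_backup = job_completion_time(primary, {3: (12, 10)})
```

In this model the job finishes at 60 seconds without the backup and at 22 seconds with it, at the cost of running one task twice.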


Partition and Combiner

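
A minimal sketch of the two ideas named in this heading, under stated assumptions: the partition function is taken to be a hash of the key modulo R, and the combiner is taken to be a partial merge of map output on the map worker before the data crosses the network. The function names and the word-count data are illustrative only.

```python
from collections import Counter
from zlib import crc32


def partition(key, R):
    # Assumed partitioner: stable hash of the key modulo R, so every
    # occurrence of a key goes to the same reduce task.
    return crc32(key.encode("utf-8")) % R


def combine(pairs):
    # Assumed combiner: pre-aggregate counts for equal keys locally,
    # so fewer pairs have to be sent to the reduce workers.
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return sorted(counts.items())


R = 4
map_output = [("the", 1), ("cat", 1), ("the", 1), ("the", 1)]

combined = combine(map_output)  # four pairs shrink to two

# Route each combined pair to its reduce region.
regions = {}
for key, value in combined:
    regions.setdefault(partition(key, R), []).append((key, value))
```

A combiner is only safe when the reduce operation can be applied to partial results, as with sums or counts.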

Skipping Bad Records (skip records that cause errors)

Sometimes there are bugs in user code that cause the Map or Reduce function to crash deterministically on certain records.

The MapReduce library detects which records cause deterministic crashes and skips these records in order to make forward progress.

Each worker process installs a signal handler that catches segmentation violations and bus errors.
If the user code generates a signal, the signal handler sends a "last gasp" UDP packet containing the sequence number of the record to the MapReduce master.
When the master has seen more than one failure on a particular record, it indicates that the record should be skipped.
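
The bookkeeping on the master side can be sketched as follows. This is a simplified model: a Python exception stands in for the signal handler and the UDP packet, and the Master class, run_task, and parse are hypothetical names invented for the example.

```python
from collections import Counter


class Master:
    """Tracks how many times each record sequence number has crashed."""

    def __init__(self):
        self.failures = Counter()

    def on_last_gasp(self, seq):
        # Stand-in for receiving the "last gasp" packet.
        self.failures[seq] += 1

    def should_skip(self, seq):
        # Skip a record once it has failed more than once.
        return self.failures[seq] > 1


def run_task(records, user_fn, master):
    """Run one task attempt; return None if the attempt crashes."""
    out = []
    for seq, record in enumerate(records):
        if master.should_skip(seq):
            continue
        try:
            out.append(user_fn(record))
        except Exception:
            master.on_last_gasp(seq)  # report the bad record, then die
            return None
    return out


def parse(record):
    # Hypothetical user function that crashes on malformed input.
    return int(record)


records = ["1", "2", "oops", "4"]
master = Master()
attempts = 0
result = None
while result is None:  # the master re-executes the failed task
    attempts += 1
    result = run_task(records, parse, master)
```

Here the first two attempts crash on the same record, and the third attempt skips it and completes with the remaining records.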