RPC internals you must know (worth bookmarking)!!!
2022-06-23 09:46:00 【58 Shen Jian】
We have talked about layered microservice architecture many times before. Microservices are inseparable from an RPC framework, so today let's talk about the principles, practice, and details of RPC frameworks.
This article is long, about 10,000 words, so consider bookmarking it first.
What are the benefits of servitization?
One benefit of servitization is that it places no restriction on the technology a service provider uses, so a large company can decouple technology stacks across teams, as shown in the figure below:

(1) Service A: maintained by the European team, whose technical background is Java;
(2) Service B: maintained by the American team, implemented in C++;
(3) Service C: maintained by the Chinese team, whose technology stack is Go;
An upstream caller only needs to follow the interface and the protocol to call the remote service.
In practice, however, most Internet companies have limited R&D teams, and most of them implement their services on a single technology stack:

In that case, without a unified service framework, each team's service provider has to implement its own serialization, deserialization, network framework, connection pool, send/receive threads, timeout handling, state machine, and other repetitive technical work "outside the business", which lowers overall efficiency.
Therefore, having a unified service framework take over this "outside the business" work is the first problem servitization has to solve.
What is RPC?
RPC stands for Remote Procedure Call (Protocol).
What does "remote" mean? Why "remote"?
Let's first look at what "near" means, namely a local function call.
When we write:
int result = Add(1, 2);
what happens in this line of code?

(1) two input parameters are passed in;
(2) the function in the local code segment is called and executes the computation;
(3) one output parameter is returned;
All three actions happen in the same process space. This is a local function call.
Is there a way to call a function in another process?
Typically, that process is deployed on another server.

The easiest idea is for the two processes to agree on a protocol format and communicate over a socket, transmitting:
(1) the input parameters;
(2) which function to call;
(3) the output parameters;
If this can be done, it is a "remote" procedure call.
Socket communication can only carry a continuous byte stream. How do we put the input parameters and the function into a continuous byte stream?
Suppose we design an 11-byte request message:

(1) the first 3 bytes hold the function name "add";
(2) the middle 4 bytes hold the first parameter "1";
(3) the last 4 bytes hold the second parameter "2";
Similarly, we can design a 4-byte response message:

(1) the 4 bytes hold the processing result "3";
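As a minimal sketch of the two messages above, Python's `struct` module can lay out exactly 11 and 4 bytes. The helper names and the choice of network byte order (`!`) are assumptions for illustration; the article does not specify a byte order.

```python
import struct

# Assumed layout: 3-byte function name, then two 4-byte signed integers,
# all in network byte order with no padding.
REQUEST_FORMAT = "!3sii"
RESPONSE_FORMAT = "!i"


def make_request(name, a, b):
    """Pack the function name and both parameters into an 11-byte message."""
    return struct.pack(REQUEST_FORMAT, name.encode("ascii"), a, b)


def parse_request(data):
    """Unpack an 11-byte request into (function name, first, second)."""
    name, a, b = struct.unpack(REQUEST_FORMAT, data)
    return name.decode("ascii"), a, b


def make_response(result):
    """Pack the processing result into a 4-byte message."""
    return struct.pack(RESPONSE_FORMAT, result)


def parse_response(data):
    """Unpack a 4-byte response into the integer result."""
    (result,) = struct.unpack(RESPONSE_FORMAT, data)
    return result
```

A fixed 3-byte name field obviously only fits names like "add"; a real protocol would need a length prefix or a function identifier.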
The caller's code might then become:
request = MakePacket("add", 1, 2);
SendRequest_ToService_B(request);
response = ReceiveResponse_FromService_B();
int result = unMakePacket(response);
These 4 steps are:
(1) turn the input parameters into a byte stream;
(2) send the byte stream to service B;
(3) receive the returned byte stream from service B;
(4) turn the returned byte stream into the output parameter;
The server's code might become:
request = ReceiveRequest();
args/function = unMakePacket(request);  // function "add", args (1, 2)
result = Add(1, 2);
response = MakePacket(result);
SendResponse(response);
These 5 steps are also easy to understand:
(1) the server receives the byte stream;
(2) it turns the byte stream into a function name and parameters;
(3) it calls the function locally to get the result;
(4) it turns the result into a byte stream;
(5) it sends the byte stream back to the caller;
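The caller's 4 steps and the server's 5 steps can be sketched end to end as follows. This is only an illustration under assumptions: `socket.socketpair()` stands in for a real network connection between two machines, the server handles a single request on a background thread, and all function names are made up for the example.

```python
import socket
import struct
import threading

REQUEST = struct.Struct("!3sii")   # "add" + two 4-byte integers = 11 bytes
RESPONSE = struct.Struct("!i")     # one 4-byte integer


def recv_exact(sock, size):
    """Read exactly `size` bytes, since recv may return fewer."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def serve_one(sock):
    """Server side: receive, unpack, call locally, pack, send."""
    name, a, b = REQUEST.unpack(recv_exact(sock, REQUEST.size))
    if name != b"add":
        raise ValueError("unknown function")
    result = a + b                      # the local Add(a, b)
    sock.sendall(RESPONSE.pack(result))


def remote_add(sock, a, b):
    """Caller side: pack, send, receive, unpack."""
    sock.sendall(REQUEST.pack(b"add", a, b))
    (result,) = RESPONSE.unpack(recv_exact(sock, RESPONSE.size))
    return result


def demo():
    caller_sock, server_sock = socket.socketpair()
    server = threading.Thread(target=serve_one, args=(server_sock,))
    server.start()
    try:
        return remote_add(caller_sock, 1, 2)
    finally:
        server.join()
        caller_sock.close()
        server_sock.close()
```

Calling `demo()` runs one round trip of the "add" request.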
The process is depicted in the figure below:

The processing steps of the caller and the server are very clear.
What is the biggest problem with this process?
The caller has too much to deal with and must handle many low-level details every time:
(1) converting input parameters to a byte stream, i.e. serialization, an application-layer protocol detail;
(2) socket send, i.e. a network transport protocol detail;
(3) socket receive;
(4) converting the byte stream to output parameters, i.e. deserialization, an application-layer protocol detail;
Can the calling layer be spared these details?
Yes. An RPC framework exists to solve exactly this problem: it lets the caller "call a remote function (service) as if it were a local function".
By this point, are you starting to get a feel for RPC and for serialization? Read on; there are more low-level details.
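To make "call a remote function as if it were a local function" concrete, here is a small sketch of a client-side proxy. Everything in it is an assumption for illustration: the `RpcProxy` class, the JSON message shape, and the in-process `loopback_transport` that stands in for socket send and receive.

```python
import json


class RpcProxy:
    """Turns calls such as proxy.add(1, 2) into request/response messages."""

    def __init__(self, transport):
        # transport: a callable taking request bytes and returning response bytes
        self._transport = transport

    def __getattr__(self, method):
        def call(*args):
            request = json.dumps({"method": method, "args": args}).encode("utf-8")
            response = self._transport(request)          # send + receive
            return json.loads(response.decode("utf-8"))["result"]
        return call


def loopback_transport(request):
    """Stand-in for the network: handles the request in the same process."""
    message = json.loads(request.decode("utf-8"))
    handlers = {"add": lambda a, b: a + b}
    result = handlers[message["method"]](*message["args"])
    return json.dumps({"result": result}).encode("utf-8")


service_b = RpcProxy(loopback_transport)
result = service_b.add(1, 2)   # looks like a local call
```

The business code only sees `service_b.add(1, 2)`; serialization and transport live behind the proxy, which is the division of labor the article goes on to describe.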
What is the responsibility of an RPC framework?
An RPC framework has to hide this complexity from the caller, and hide it from the service provider as well:
(1) the service caller (client) invokes a service as if it were calling a local function;
(2) the service provider (server) implements a service as if it were implementing a local function;
So the whole RPC framework is divided into a client part and a server part. Achieving the goals above and hiding the complexity is the responsibility of the RPC framework.

As shown in the figure above, the business side's responsibilities are:
(1) the caller A passes in parameters, executes the call, and gets the result;
(2) the service provider B receives the parameters, executes the logic, and returns the result;
The RPC framework's responsibility is the large blue box in the middle:
(1) client side: serialization, deserialization, connection pool management, load balancing, failover, queue management, timeout management, asynchronous management, and so on;
(2) server side: server components, server send/receive queues, I/O threads, worker threads, serialization and deserialization, and so on;
Server-side technology is already familiar to most of us, so next we focus on the technical details of the client side.
Let's first look at the "serialization and deserialization" part of RPC-client.
Why serialize?
Engineers usually manipulate data as "objects":
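#include <cstdint>
#include <string>
class User {
public:
    explicit User(const std::string& name) : user_name(name) {}
    void setUid(uint64_t id) { user_id = id; }
    void setAge(uint32_t age) { user_age = age; }
private:
    std::string user_name;
    uint64_t user_id = 0;
    uint32_t user_age = 0;
};
// usage, inside some function:
User u("shenjian");
u.setUid(123);
u.setAge(35);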
But when data has to be stored or transmitted, "objects" are not so convenient, and the data usually has to be converted into a "binary byte stream" in contiguous space. Some typical scenarios are:
(1) Disk storage of database indexes: a database index is a B+ tree in memory, but that format cannot be stored on disk directly, so the B+ tree has to be converted into a binary byte stream in contiguous space before it can be stored on disk;
(2) KV storage in caches: redis/memcache are KV caches, and the cached value must be a binary byte stream in contiguous space; it cannot be a User object;
(3) Network transmission of data: the data sent over a socket must be a binary byte stream in contiguous space; it cannot be an object;
Serialization is the process of converting data in "object" form into data in "contiguous binary byte stream" form. The reverse process is called deserialization.
How to serialize?
This is a very detailed question. If you had to convert an "object" into a byte stream, how would you do it? One approach that comes to mind easily is a self-describing markup language such as xml (or json):
<class name="User">
<element name="user_name" type="std::string" value="shenjian" />
<element name="user_id" type="uint64_t" value="123" />
<element name="user_age" type="uint32_t" value="35" />
</class>
Once the conversion rules are agreed, the sender can easily serialize an object of class User into xml, and after the service receives the xml byte stream it can just as easily deserialize it back into a User object.
Aside: when the language supports reflection, this job is easy.
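A minimal sketch of this XML approach, assuming Python's standard `xml.etree.ElementTree` and `dataclasses` modules. `dataclasses.fields` plays the role of reflection here, and the type names written into the XML are Python's (`str`, `int`) rather than the C++ ones shown above; both choices are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class User:
    user_name: str
    user_id: int
    user_age: int


# Maps the type name stored in the XML back to a converter.
CONVERTERS = {"str": str, "int": int}


def to_xml(obj):
    """Serialize a dataclass instance into self-describing XML bytes."""
    root = ET.Element("class", name=type(obj).__name__)
    for field in fields(obj):                      # reflection over attributes
        value = getattr(obj, field.name)
        ET.SubElement(root, "element", name=field.name,
                      type=type(value).__name__, value=str(value))
    return ET.tostring(root)


def from_xml(data, cls):
    """Deserialize XML bytes produced by to_xml back into an instance of cls."""
    root = ET.fromstring(data)
    values = {}
    for element in root.iter("element"):
        convert = CONVERTERS[element.get("type")]
        values[element.get("name")] = convert(element.get("value"))
    return cls(**values)
```

For example, `from_xml(to_xml(User("shenjian", 123, 35)), User)` reconstructs an equal `User` object.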
The second approach is to implement your own binary protocol for serialization. Taking the User object above as an example again, a general-purpose protocol could be designed like this:

(1) the first 4 bytes are the sequence number;
(2) the 4 bytes after the sequence number give the length of the key, m;
(3) the next m bytes are the key's value;
(4) the next 4 bytes give the length of the value, n;
(5) the next n bytes are the value's value;
(6) continue recursively, as with xml, until the whole object has been described;
Under this protocol, the User object above might be described as follows:

(1) first line: sequence number, 4 bytes (say 0 stands for the class name); class name length, 4 bytes (the length is 4); the next 4 bytes are the class name ("User"); 12 bytes in total;
(2) second line: sequence number, 4 bytes (1 stands for the first attribute); attribute name length, 4 bytes (the length is 9); the next 9 bytes are the attribute name ("user_name"); attribute value length, 4 bytes (the length is 8); attribute value, 8 bytes (the value is "shenjian"); 29 bytes in total;
(3) third line: sequence number, 4 bytes (2 stands for the second attribute); attribute name length, 4 bytes (the length is 7); the next 7 bytes are the attribute name ("user_id"); attribute value length, 4 bytes (the length is 8); attribute value, 8 bytes (the value is 123); 27 bytes in total;
(4) fourth line: sequence number, 4 bytes (3 stands for the third attribute); attribute name length, 4 bytes (the length is 8); the next 8 bytes are the attribute name ("user_age"); attribute value length, 4 bytes (the length is 4); attribute value, 4 bytes (the value is 35); 24 bytes in total;
The whole binary byte stream is 12+29+27+24=92 bytes.
A real serialization protocol has many more details to consider. For example, a strongly typed language has to restore not only attribute names and attribute values but also attribute types, and complex objects involve not only primitive types but also nested object types, and so on. In any case, the idea behind serialization is similar.
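The 92-byte layout above can be checked with a short sketch. The function names and the use of network byte order are assumptions; the field widths (8-byte `user_id`, 4-byte `user_age`) follow the article's description.

```python
import struct


def encode_user(user_name, user_id, user_age):
    """Encode a User following the sequence/length/value layout above."""
    def record(seq, key, value=None):
        out = struct.pack("!II", seq, len(key)) + key       # seq, key length, key
        if value is not None:
            out += struct.pack("!I", len(value)) + value    # value length, value
        return out

    return b"".join([
        record(0, b"User"),                                  # 12 bytes
        record(1, b"user_name", user_name.encode("utf-8")),  # 29 bytes for "shenjian"
        record(2, b"user_id", struct.pack("!Q", user_id)),   # 27 bytes
        record(3, b"user_age", struct.pack("!I", user_age)), # 24 bytes
    ])


def decode_user(data):
    """Decode bytes produced by encode_user into a plain dict."""
    offset = 0

    def read(size):
        nonlocal offset
        chunk = data[offset:offset + size]
        offset += size
        return chunk

    def read_block():
        (length,) = struct.unpack("!I", read(4))
        return read(length)

    read(4)                                   # sequence number 0
    class_name = read_block().decode("utf-8")
    raw = {}
    while offset < len(data):
        read(4)                               # sequence number of the attribute
        key = read_block().decode("utf-8")
        raw[key] = read_block()
    return {
        "class": class_name,
        "user_name": raw["user_name"].decode("utf-8"),
        "user_id": struct.unpack("!Q", raw["user_id"])[0],
        "user_age": struct.unpack("!I", raw["user_age"])[0],
    }
```

Note that the decoder has to know each attribute's type in advance, which is exactly the "restore the attribute type" problem mentioned above.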
What factors should a serialization protocol consider?
Whether you serialize objects with a mature protocol such as xml/json or with a custom binary protocol, the following factors need to be considered when designing the serialization protocol.
(1) Parsing efficiency: this is probably the first factor to consider. xml/json parsing is time-consuming because a DOM tree has to be parsed, whereas parsing a custom binary protocol is very efficient;
(2) Compression ratio and transmission efficiency: for the same object, xml/json carries a lot of markup, so the proportion of useful information is low; a custom binary protocol takes far less space;
(3) Extensibility and compatibility: whether fields can be added easily, and whether old clients must be forcibly upgraded after fields are added, are both questions to consider; xml/json and the binary protocol above can both be extended easily;
(4) Readability and debuggability: this one is easy to understand; xml/json is far more readable than a binary protocol;
(5) Cross-language support: both protocols above are cross-language, but some serialization protocols are tightly bound to a development language; for example, dubbo's serialization protocol only supports RPC calls in Java;
(6) Generality: xml/json are very general, have good third-party parsing libraries, and are easy to parse in every language; the custom binary protocol above can be cross-language, but each language needs its own simple protocol client;
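Factors (2) and (4) can be seen side by side in a tiny sketch: the same three values encoded as JSON and as a fixed binary layout. The fixed layout (`8s` for the name, then a 64-bit and a 32-bit integer) is an assumption chosen for this example and is not the self-describing protocol from the previous section.

```python
import json
import struct

user = {"user_name": "shenjian", "user_id": 123, "user_age": 35}

# Self-describing text: field names travel with every message.
as_json = json.dumps(user).encode("utf-8")

# Fixed binary layout: only the values travel; both sides must know the schema.
as_binary = struct.pack("!8sQI", user["user_name"].encode("utf-8"),
                        user["user_id"], user["user_age"])

json_size = len(as_json)
binary_size = len(as_binary)
```

The JSON bytes are readable in a log file as they are, while the binary form is smaller but opaque without the schema, which is the trade-off the list describes.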
What are the common serialization methods?
(1) xml/json: poor parsing efficiency and compression ratio; good extensibility, readability, and generality;
(2) thrift;
(3) protobuf: produced by Google and of high quality; strong in every respect and highly recommended. It is a binary protocol, so readability is somewhat poor, but it has a to-string-like facility to help debug problems;
(4) Avro;
(5) CORBA;
(6) mc_pack: those who know it will know. It was in use around 2009 and is rumored to outperform protobuf; readers who are familiar with it are welcome to share its current status;
(7) …

Besides:
(1) the serialization and deserialization part (1 and 4 in the figure above)
RPC-client also contains:
(2) sending and receiving the byte stream (2 and 3 in the figure above)
This part comes in two flavors, synchronous calls and asynchronous calls, which are covered one by one below.
Aside: getting RPC-client right is not easy.
A code fragment for a synchronous call:
Result = Add(Obj1, Obj2);// blocks until Result is obtained
A code fragment for an asynchronous call:
Add(Obj1, Obj2, callback);// returns right after the call, without waiting for the result
The result is handled in the callback:
callback(Result){// the callback is invoked once the result is available
…
}
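The two call shapes can be compared in a small runnable sketch. A `ThreadPoolExecutor` stands in for the RPC machinery and `remote_add` pretends to be the remote service; both are assumptions for illustration only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)   # stand-in for the RPC machinery


def remote_add(a, b):
    """Pretend remote service."""
    return a + b


def add_sync(a, b):
    """Synchronous call: the caller blocks until the result is available."""
    return executor.submit(remote_add, a, b).result()


def add_async(a, b, callback):
    """Asynchronous call: returns at once; callback runs when the result is ready."""
    future = executor.submit(remote_add, a, b)
    future.add_done_callback(lambda done: callback(done.result()))


results = []
finished = threading.Event()


def on_result(result):
    results.append(result)
    finished.set()


sync_result = add_sync(1, 2)       # blocks here
add_async(3, 4, on_result)         # does not block
finished.wait(timeout=5)           # only so the example can inspect the result
executor.shutdown()
```

The `finished.wait` at the end exists only so the example can observe the callback's result; real asynchronous business code would simply continue.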
These two kinds of calls are implemented in completely different ways inside RPC-client.
What does the RPC-client synchronous call architecture look like?

In a synchronous call, the caller blocks until the result arrives and occupies a worker thread the whole time. The figure above briefly shows the components, interactions, and steps:
The large box on the left represents one worker thread of the caller
The pink box in the middle left represents the RPC-client component
The orange box on the right represents RPC-server
The two small blue boxes represent the two core components of a synchronous RPC-client: the serialization component and the connection pool component
The white flow boxes and the arrows numbered 1-10 represent the serial execution steps of the whole worker thread:
1) Business code initiates an RPC call:
Result=Add(Obj1,Obj2)
2) The serialization component serializes the call into a binary byte stream, which can be thought of as the packet to send, packet1;
3) An available connection is obtained through the connection pool component;
4) packet1 is sent to RPC-server over the connection;
5) The request packet travels over the network to RPC-server;
6) The response packet travels over the network back to RPC-client;
7) The response packet packet2 is received from RPC-server over the connection;
8) The connection is put back into the pool through the connection pool component;
9) The serialization component deserializes packet2 into a Result object and returns it to the caller;
10) The business code gets the Result, and the worker thread continues;
Aside: read these alongside steps 1-10 in the architecture diagram.
What is the role of the connection pool component?
Features supported by an RPC framework such as load balancing, failover, and send timeout are all implemented through the connection pool component.

A typical connection pool component provides this interface:
int ConnectionPool::init(…);
Connection ConnectionPool::getConnection();
int ConnectionPool::putConnection(Connection t);
What does init do?
It establishes N long-lived TCP connections to the downstream RPC-server (usually a cluster); this is the so-called connection "pool".
What does getConnection do?
It takes a connection out of the "pool", locks it (sets a flag), and returns it to the caller.
What does putConnection do?
It puts an allocated connection back into the "pool" and unlocks it (again by setting a flag).
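The three operations can be sketched as below. This is not the article's implementation: the `Connection` class only records a host name instead of opening a TCP connection, and the `init` parameters (`hosts`, `per_host`) are assumptions, since the original signature elides them.

```python
import threading


class Connection:
    def __init__(self, host):
        self.host = host
        self.in_use = False          # the "lock" flag described above


class ConnectionPool:
    def __init__(self):
        self._connections = []
        self._mutex = threading.Lock()

    def init(self, hosts, per_host):
        """Create per_host connections to every host; returns the pool size.

        A real pool would open long-lived TCP connections here.
        """
        self._connections = [Connection(host)
                             for host in hosts for _ in range(per_host)]
        return len(self._connections)

    def get_connection(self):
        """Take a free connection and mark it as in use; None if all are busy."""
        with self._mutex:
            for connection in self._connections:
                if not connection.in_use:
                    connection.in_use = True
                    return connection
        return None

    def put_connection(self, connection):
        """Return a connection to the pool by clearing its flag."""
        with self._mutex:
            connection.in_use = False
```

A caller would typically wrap `get_connection` and `put_connection` in `try`/`finally` so a failed request still returns its connection.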
How is load balancing implemented?
The connection pool holds connections to an RPC-server cluster, so when the pool hands out a connection it needs to pick one at random.
How is failover implemented?
The connection pool holds connections to an RPC-server cluster. When the pool detects that the connections to a certain machine are abnormal, it removes that machine's connections and only returns healthy ones; once the machine recovers, its connections are added back.
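Random selection and failover can be sketched at the level of hosts, leaving actual connections aside. The `HostSelector` class and its method names are hypothetical, and how a machine is detected as abnormal or recovered is outside this sketch.

```python
import random


class HostSelector:
    """Picks a random healthy host; unhealthy hosts are skipped until restored."""

    def __init__(self, hosts):
        self._healthy = set(hosts)
        self._down = set()

    def pick(self):
        """Load balancing: choose uniformly at random among healthy hosts."""
        if not self._healthy:
            raise RuntimeError("no healthy RPC-server available")
        return random.choice(sorted(self._healthy))

    def mark_down(self, host):
        """Failover: stop handing out a host whose connections are abnormal."""
        if host in self._healthy:
            self._healthy.remove(host)
            self._down.add(host)

    def mark_up(self, host):
        """Recovery: add the host back once it is reachable again."""
        if host in self._down:
            self._down.remove(host)
            self._healthy.add(host)

    def healthy_hosts(self):
        return sorted(self._healthy)
```

Random choice spreads requests evenly only on average; weighted or least-connections strategies are common alternatives, though the article describes only the random one.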
How is send timeout implemented?
Because the call is synchronous and blocking, once a connection has been obtained, using send/recv with a timeout is enough to send and receive with a timeout.
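In Python, for instance, blocking send/recv with a timeout can be sketched with `socket.settimeout`. The function name and the choice to return `None` on timeout are assumptions for this example.

```python
import socket


def call_with_timeout(sock, request, response_size, timeout_seconds):
    """Send a request and wait for the response, giving up after the timeout.

    Returns the response bytes, or None if send/recv timed out.
    For brevity this assumes the response arrives in a single recv call.
    """
    sock.settimeout(timeout_seconds)     # applies to both sendall and recv
    try:
        sock.sendall(request)
        return sock.recv(response_size)
    except socket.timeout:
        return None
```

After a timeout the connection may still receive the late response, so a real client would usually close that connection rather than return it to the pool.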
Overall, a synchronous RPC-client is relatively easy to implement: a serialization component and a connection pool component, combined with multiple worker threads, are enough.
What does the RPC-client asynchronous callback architecture look like?

In an asynchronous callback, nothing blocks while waiting for the result. In theory no thread is blocked at any time, so the asynchronous callback model can in theory reach high throughput with only a few worker threads and service connections, as shown in the figure above:
The box on the left is a small number of worker threads (just a few) that make calls and run callbacks
The pink box in the middle represents the RPC-client component
The orange box on the right represents RPC-server
The six small blue boxes are the six core components of an asynchronous RPC-client: context manager, timeout manager, serialization component, downstream send/receive queues, downstream send/receive thread, and connection pool component
The white flow boxes and the arrows numbered 1-17 represent the serial execution steps of the whole worker thread:
1) Business code initiates an asynchronous RPC call:
Add(Obj1,Obj2, callback)
2) The context manager stores the request, the callback, and the context;
3) The serialization component serializes the call into a binary byte stream, which can be thought of as the packet to send, packet1;
4) The downstream send/receive queue puts the message into the "to-send queue"; at this point the call returns, without blocking the worker thread;
5) The downstream send/receive thread takes the message out of the "to-send queue" and obtains an available connection through the connection pool component;
6) packet1 is sent to RPC-server over the connection;
7) The request packet travels over the network to RPC-server;
8) The response packet travels over the network back to RPC-client;
9) The response packet packet2 is received from RPC-server over the connection;
10) The downstream send/receive thread puts the message into the "received queue" and puts the connection back into the pool through the connection pool component;
11) The message is taken out of the downstream send/receive queue and the callback is about to start; worker threads are not blocked;
12) The serialization component deserializes packet2 into a Result object;
13) The context manager retrieves the result, the callback, and the context;
14) The business code is called back through callback with the Result, and the worker thread continues;
If the request does not return for a long time, the flow is:
15) The context manager holds a request that has not returned for a long time;
16) The timeout manager gets hold of the timed-out context;
17) The business code is called back through timeout_cb, and the worker thread continues;
Aside: go through this flow a few times with the architecture diagram at hand.
The serialization component and the connection pool component were covered above, and the send/receive queues and send/receive thread are easy to understand. The following focuses on two general-purpose components: the context manager and the timeout manager.
Why is a context manager needed?
Because sending the request packet and calling back on the response packet are asynchronous, and may not even happen in the same worker thread, a component is needed to record the context of each request and match up the request, the response, the callback, and related information.
How are request, response, and callback matched up?
This is an interesting question. Three request packets a, b, c are sent to the downstream service over one connection, and three response packets x, y, z are received asynchronously:

How do we know which request packet corresponds to which response packet?
How do we know which response packet corresponds to which callback function?
A "request id" can be used to tie request, response, and callback together.

The whole process is as follows: through the request id, the context manager maintains the mapping between request, response, and callback:
1) generate a request id;
2) generate the request context, which contains the send time, the callback function callback, and other information;
3) the context manager records the mapping between req-id and the context;
4) put req-id into the request packet and send it to RPC-server;
5) RPC-server puts req-id into the response packet and returns it;
6) using the req-id in the response packet, look up the original context in the context manager;
7) get the callback function callback from the context;
8) callback carries the Result back and drives the business logic forward;
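Steps 1) to 8) can be sketched with a dictionary keyed by request id. The class and method names are hypothetical, the network is left out entirely, and a real implementation would need locking, because requests and responses are handled on different threads.

```python
import itertools
import time


class RequestContextManager:
    """Maps req-id -> context so a response can find its callback."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._contexts = {}

    def register(self, callback):
        """Steps 1-3: generate a req-id and record its context."""
        req_id = next(self._ids)
        self._contexts[req_id] = {"time": time.monotonic(), "callback": callback}
        return req_id            # step 4: the caller puts this in the request packet

    def on_response(self, req_id, result):
        """Steps 6-8: find the context by req-id and invoke its callback."""
        context = self._contexts.pop(req_id, None)
        if context is None:
            return False         # unknown or already handled request
        context["callback"](result)
        return True


manager = RequestContextManager()
received = []
ids = {name: manager.register(lambda result, name=name: received.append((name, result)))
       for name in ("a", "b", "c")}

# Responses may come back in any order; req-id still finds the right callback.
for name in ("c", "a", "b"):
    manager.on_response(ids[name], name.upper())
```

Even though the responses arrive as c, a, b, each one reaches the callback registered for its own request.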
How are load balancing and failover implemented?
The idea is similar to the synchronous connection pool. The differences are:
(1) a synchronous connection pool sends and receives in blocking mode, so it needs multiple connections to each IP of a service;
(2) with asynchronous send/receive, each IP of a service needs only a few connections (for example, a single TCP connection);
How is send/receive timeout implemented?
Timeout handling differs from the synchronous blocking implementation:
(1) for synchronous blocking calls, send/recv with a timeout can be used directly;
(2) for asynchronous non-blocking (NIO) network send/receive, the connection does not sit waiting for a response packet, so timeout is implemented by the timeout manager;
How does the timeout manager implement timeout management?

The timeout manager implements the callback handling for requests whose response packet times out.
For each request sent to the downstream RPC-server, the context manager stores the req-id together with its context. The context holds a lot of information about the request, such as the req-id, the response callback, the timeout callback, and the send time.
The timeout manager starts a timer that scans the contexts in the context manager to see whether any request was sent too long ago. If so, it stops waiting for the response packet, invokes the timeout callback directly to drive the business flow forward, and deletes the context.
If the normal response packet arrives after the timeout callback has run, its req-id will not be found in the context manager, and the response is simply dropped.
Aside: because the request has already timed out, its context cannot be recovered.
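The scan-and-expire logic can be sketched as follows. The names are hypothetical, the context table is a plain dictionary, and `scan` takes the current time as an argument so the example is deterministic; a real timeout manager would call it periodically from a timer.

```python
class TimeoutManager:
    """Scans the context table and fires timeout callbacks for stale requests."""

    def __init__(self, contexts, timeout_seconds):
        self._contexts = contexts          # shared table: req-id -> context
        self._timeout = timeout_seconds

    def scan(self, now):
        """Expire every request sent more than timeout_seconds before `now`."""
        expired = [req_id for req_id, context in self._contexts.items()
                   if now - context["sent_at"] > self._timeout]
        for req_id in expired:
            context = self._contexts.pop(req_id)   # delete the context
            context["timeout_cb"](req_id)          # drive the business flow on
        return expired


def deliver_response(contexts, req_id, result):
    """Normal path: run the callback, or drop the response if the context is gone."""
    context = contexts.pop(req_id, None)
    if context is None:
        return False                               # already timed out: drop it
    context["callback"](result)
    return True


timed_out, delivered = [], []
contexts = {
    1: {"sent_at": 0.0, "callback": delivered.append, "timeout_cb": timed_out.append},
    2: {"sent_at": 9.5, "callback": delivered.append, "timeout_cb": timed_out.append},
}
manager = TimeoutManager(contexts, timeout_seconds=1.0)
expired = manager.scan(now=10.0)                       # request 1 is too old
late = deliver_response(contexts, 1, "late result")    # dropped
on_time = deliver_response(contexts, 2, "result")      # delivered normally
```

Scanning every context on each tick is the simplest approach; with many in-flight requests, structures such as a timing wheel or a heap ordered by deadline are common replacements.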
In any case, compared with synchronous calls, asynchronous callbacks need more than the serialization and connection pool components: they add a context manager, a timeout manager, downstream send/receive queues, a downstream send/receive thread, and other components, and they change the caller's calling habits.
Aside: the programming style changes from synchronous calls to callbacks.
Asynchronous callbacks can raise overall system throughput. Which way to implement the RPC-client can be decided according to the business scenario.
Summary
What is an RPC call?
Calling a remote service just like calling a local function.
Why is an RPC framework needed?
An RPC framework hides technical details of the RPC call such as serialization and network transmission, so that the caller only has to care about making the call and the service side only has to care about implementing it.
What is serialization? Why is it needed?
Serialization is the process of converting an object into a contiguous binary stream. Disk storage, cache storage, and network transmission can only operate on binary streams, so objects must be serialized.
What are the core components of a synchronous RPC-client?
The core components of a synchronous RPC-client are the serialization component and the connection pool component. It achieves load balancing and failover through the connection pool, and implements timeout handling through blocking send/receive.
What are the core components of an asynchronous RPC-client?
The core components of an asynchronous RPC-client are the serialization component, connection pool component, send/receive queues, send/receive thread, context manager, and timeout manager. It uses the "request id" to associate request packet, response packet, and callback function, uses the context manager to manage contexts, and uses a timer in the timeout manager to trigger timeout callbacks that drive the business flow's timeout handling.
Ideas matter more than conclusions.
The Architect's Way: sharing technical ideas
Survey:
Which RPC frameworks' source code have you read?