当前位置：网站首页>The solution of distributed system: directory, message queue, transaction system and others

The solution of distributed system: directory, message queue, transaction system and others

2022-06-24 12:03:00 【Hua Weiyun】

One 、 A directory service （ZooKeeper）

A distributed system is a whole composed of many processes , Every part of the whole , Will have some state , For example, load conditions , The mastery of certain data and so on . And the data related to other processes , In recovery 、 It's very important to expand and shrink .

Simple distributed systems , You can record this data through a static configuration file ： Connection correspondence between processes , Their IP Address and port, etc . However , A highly automated distributed system , It is necessary to keep these state data dynamically . In this way, the program can do it by itself disaster and Load balancing The job of .

Some programmers write their own DIR service （ A directory service ）, To record the running status of processes in the cluster . The process in the cluster will interact with this DIR Automatic association of services , This is disaster recovery 、 Capacity expansion 、 When the load is balanced , It's automatically based on these DIR Data in service , To adjust the destination of the request , So as to avoid the fault machine 、 Or connect to a new server .

However , If you just use a process to do the job , So this process becomes the cluster's “ Single point ”—— It means , If the process fails , Then the whole cluster may not work ( A single point of failure ). Therefore, the directory service storing the cluster state also needs to be distributed . Fortunately, there is ZooKeeper This excellent open source software , It is a distributed directory server .

ZooKeeper You can simply start an odd number of processes , To form a small cluster of directory services . This cluster will be available to all other processes , It's a huge “ Configuration tree ” The ability of . This data will not just be stored in a Zookeeper In progress , It's based on a very secure algorithm , Let multiple processes host . This makes Zookeeper Become an excellent distributed data storage system .

because Zookeeper Data storage structure of , It is a tree system similar to file directory , So we often take advantage of its functions , Bind each process to one of them “ m. ” On , And then by checking these “ Branch ”, To forward the server request , We can simply solve the problem of request routing （ Who will do it ） The problem of . In addition, you can also use the “ Branch ” State of the payload of the marked process on , So load balancing is easy to do .

Directory service is one of the most important components in distributed system . and ZooKeeper It's a good open source software , It's just for this task .

Two 、 Message Queuing service （ActiveMQ、ZeroMQ、Jgroups）

If two processes want to communicate across machines , We almost all use TCP/UDP These agreements . But using the Internet directly API To write cross process communication , It's a very troublesome thing . In addition to writing a lot of the underlying socket Out of code , Also deal with things like ： How to find the process to interact with data , How to protect the integrity of data packets from loss , If the other process of communication fails , Or how the process needs to be restarted and so on . These questions include Capacity expansion and disaster 、 Load balancing And so on .

In order to solve the problem of inter process communication in distributed system , People have come up with an effective model , Namely “ Message queue ” Model . Message queuing model , It's the interaction between processes , Abstract processing of message by message , And for the news , There are some. “ queue ”, That's the pipe , To hold messages . Each process can access one or more queues , Reading messages from inside （ consumption ） Or write a message （ production ）. Because there is a cache pipeline , You can safely change the state of the process . When the process starts , It will automatically consume messages . And the routing of the message itself , It's also determined by the stored queue , In this way, the complex routing problem , It becomes a question of how to manage static queues .

General Message Queuing service , It's all about simple “ The delivery ” and “ collect ” Two interfaces , But the management of message queue itself is more complicated , Generally speaking, there are two kinds . Part of the Message Queuing service , Point to point queue management is advocated ： Between each pair of communication nodes , Each has a separate message queue . The advantage of this approach is that information from different sources , Can not affect each other , Not because there are too many messages in a queue , Crowding out the message cache space of other queues . And the program that processes the message can also define the priority of processing itself —— Charge first 、 Multiprocessing a queue , And less processing of other queues .

But this kind of peer-to-peer message queuing , A large number of queues will be added as the cluster grows , This is a complex matter for memory occupation and operation and maintenance management . So more advanced message queuing services , Start by allowing different queues to share memory space , And the address information of the message queue 、 Create and delete , It's all automated .—— These automations often need to rely on the “ A directory service ”, To register the queue ID The corresponding Physics IP And ports . For example, many developers use ZooKeeper To act as the central node of the Message Queuing service ; But similar Jgropus Such software , Maintain a cluster state to store the information of each node .

Another kind of message queue , It's like a public mailbox . A Message Queuing service is a process , Any user can post or receive messages in the process . This makes it easier to use message queues , Operation and maintenance management is also more convenient . No ** In this way , Any message from sending to processing , At least two interprocess communications , The delay is relatively high .** And because there's no scheduled delivery 、 Charge constraints , So it's easier to BUG.

No matter which Message Queuing service you use , In a distributed server-side system , Interprocess communication is a problem that must be solved , So as a server-side programmer , When writing distributed system code , The most used code is based on message queue driver , It also led directly to EJB3.0 hold “ Message driven Bean” Add to the specification .

3、 ... and 、 Transaction system

In a distributed system , Transaction is one of the most difficult technical problems to solve . Since a process may be distributed among different processes , Any process can fail , And this failure problem needs to cause a rollback . Most of this rollback involves many other processes . This is a diffusive multiprocess communication problem . To solve transaction problems on Distributed Systems , There must be two core tools ： One is a stable state storage system ; Another is a convenient and reliable broadcasting system .

The state of any step in a transaction , Must be visible throughout the cluster , And also have the ability of disaster recovery . This requirement , It's usually made up of clusters “ A directory service ” To undertake . If our directory service is robust enough , So we can put the processing state of each transaction , Write to the directory service synchronously .Zookeeper Once again, it can play an important role in this place .

If a transaction breaks , Need to roll back , So this process involves several steps that have been performed . Maybe this rollback only needs to be rolled back at the entrance （ Join there to save the data needed for rollback ）, You may also need to roll back on each processing node . If it's the latter , Then you need the abnormal nodes in the cluster , Broadcast one to all other related nodes “ Roll back ！ Business ID yes XXXX” Such news . The bottom layer of this broadcast is usually hosted by the Message Queuing service , But similar Jgroups Such software , Direct delivery of broadcasting services .

Now we're talking about transactional systems , But actually distributed systems often need “ Distributed lock ” function , It's also something that this system can do at the same time . So-called “ Distributed lock ”, That is, a constraint that allows each node to check before executing . If we have an efficient and atomically operated directory service , So this lock state is actually a kind of “ One step transaction ” State records of , The rollback operation defaults to “ Suspend operation , Try again later ”. such “ lock ” The way , It's simpler than transaction , So it's more reliable , So now more and more developers , Willing to use this “ lock ” service , Not to achieve a “ Transaction system ”.

Four 、 Automated deployment tools （Docker）

** Because of the biggest demand of distributed system , It's at run time （ There may be a need to interrupt service ） To change the service capacity ： Expand or shrink .** When some nodes in the distributed system fail , New nodes are also needed to get back to work . If these are still like the old-fashioned Server Management , By filling in the form 、 declare 、 Enter the machine room 、 Install the server 、 Deploy software …… This set of practices , That's definitely not efficient .

** In a distributed system environment , We usually use “ pool ” The way to manage services .** We will apply for a batch of machines in advance , Then run the service software on some machines , Others serve as backup . Obviously, our batch of servers can't serve only one business , Instead, it will provide multiple different service carriers . Those backup servers , It will become a common backup for multiple businesses “ pool ”. As business needs change , Some servers might “ sign out ”A Service and “ Join in ”B service .

This frequent service change , Rely on highly automated software deployment tools . Our operation and maintenance personnel , You should master the deployment tools provided by the developers , Instead of a thick manual , To do this kind of operation and maintenance .** Some experienced development teams , It will unify all the underlying business frameworks , With a view to most of the deployment 、 Configuration tools can be managed by a common system .** And the open source community , There are similar attempts , The most well-known is RPM Installation package format , However RPM It's still too complicated to pack , It doesn't meet the deployment requirements of server-side programs . So it came back later Chef Is the representative of the programmable universal deployment system .

After the advent of virtual machine technology ,PaaS The platform provides strong support for automatic deployment ： If we press a PaaS Platform specification to write the application , You can completely leave the application to the platform for deployment , Calculation of bearing capacity 、 Deployment planning is done automatically . The best in this field is Google Of AppEngine： We can use it directly Eclipse Develop a local Web application , And then upload it to AppEngine Inside , All deployment is complete .AppEngine Will be automatically based on this Web Application visits , To expand 、 Shrinkage capacity 、 Fault recovery .

However , A truly revolutionary tool , yes Docker Appearance . Although the virtual machine 、 Sandbox technology is not a new technology for a long time , But really use these technologies as *** Deployment tools *** The time is not long .Linux Efficient lightweight container technology , Provides ease of deployment —— It can be found in various libraries 、 Packaging applications in a collaborative software environment , And then randomly deploy in any one Linux On the system .

To manage a large number of distributed server-side processes , We really need to work a lot , Optimize their deployment management efforts . Unifies the server side process the movement standard , It is the basic condition to realize automatic deployment management . We can use “ operating system ” As a norm , use Docker technology ; It can also be based on “Web application ” As a norm , Adopt some PaaS Platform technology ; Or define some more specific specifications yourself , Develop a complete distributed computing platform .

5、 ... and 、 The log service （log4j）

Server side logs , It has always been an important and easily overlooked issue . A lot of teams at the beginning , Just think of logging as development debugging 、 exclude BUG The auxiliary tools of . But it will soon be discovered , After the service is operational , The log is almost the only effective means that the server-side system can use to understand the program situation at runtime .

Although we have all kinds of profile Tools , However, most of these tools are not suitable for opening on the service of formal operation , Because it will seriously reduce its performance . So we need to analyze it according to the log more often . Although logs are essentially lines of text information , But because of its great flexibility , So it will be very valued by the development and operation and maintenance personnel .

Log itself is a very vague thing in concept . You can open any file you like , And then write some information . But modern server systems , Generally, we will make some standardized requirements for logs ： The log must be line by line , This is more convenient for future statistical analysis ; Each line of log text , There should be some unified head , For example, date and time are basic requirements ; Log output should be hierarchical , such as fatal/error/warning/info/debug/trace wait , Programs can adjust the level of output at run time , So as to save the consumption of log printing ; The head of the log usually needs some similar users ID perhaps IP Header information like address , It is used to quickly locate and filter a batch of log records , Or there are some other fields that can be used to filter and narrow down the viewing range of logs , It's called dyeing function ; Log files also need to have “ Roll back ” function , That is, to keep multiple files of a fixed size , Avoid long-term operation , Fill up the hard disk .

Because of the above needs , So the open source community provides a lot of game log component libraries , For example, famous log4j, And a large number of log4X Family library , These are widely used and well received tools .

However, compared with the log printing function , The collection and statistics functions of logs are often ignored . As a distributed system programmer , I'm sure I want to start from a centralized node , Can collect statistics to the whole cluster log situation . And there are some log statistics , Even hope to get it repeatedly in a very short time , Used to monitor the health of the entire cluster . Do that , You have to have a distributed file system , It's used to store logs that keep coming in （ These logs tend to go through UDP Send the protocol ）. And on this file system , You need to have a similar Map Reduce The statistical system of Architecture , Only in this way can the massive log information , Fast statistics and alarms . Some developers will use it directly Hadoop System , Some use Kafka As a log storage system , Then build your own statistical program .

Log service is the dashboard of distributed operation and maintenance 、 Periscope . Without a reliable logging service , The whole system may run out of control . So no matter how many nodes you have in your distributed system , It takes a lot of effort and dedicated development time , To build a system for automatic statistical analysis of logs .