Message queues: a comparison of features, performance, and operations
2022-06-26 06:11:00 【Impl_ Sunny】
1. Features
1.1 Push and pull consumption modes

1.2 Delay queue
Delayed delivery means that after a message is produced and delivered to the message queue, consumers should not receive it immediately. It becomes available for consumption only after a specified waiting time.
There are two kinds of delay queue, message-based and queue-based:
Message-based delay: each message carries its own delay time, and messages are sorted by delay time as they enter the queue. The sorting has a significant performance cost.
Queue-based delay: queues are created for fixed delay levels, and every message in a queue has the same delay. This removes the cost of sorting by delay time; expired messages are delivered by a periodic scan.
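The difference between the two designs can be sketched in memory. This is a minimal illustration, not any broker's implementation: the class names `MessageDelayQueue` and `LevelDelayQueue` are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
import heapq
from collections import deque
from itertools import count


class MessageDelayQueue:
    """Per-message delay: a heap ordered by due time (sorting cost on every insert)."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker so equal due times stay FIFO

    def put(self, msg, now, delay):
        heapq.heappush(self._heap, (now + delay, next(self._seq), msg))

    def poll(self, now):
        """Return every message whose delay has expired at time `now`."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


class LevelDelayQueue:
    """Fixed delay levels: one FIFO per level, so no sorting is needed."""

    def __init__(self, levels):
        self._queues = {level: deque() for level in levels}

    def put(self, msg, now, level):
        # All messages in a level share one delay, so append order equals due order.
        self._queues[level].append((now + level, msg))

    def poll(self, now):
        """Scan only the head of each level queue for expired messages."""
        due = []
        for queue in self._queues.values():
            while queue and queue[0][0] <= now:
                due.append(queue.popleft()[1])
        return due
```

The level-based variant trades flexibility for cost: a message can only use one of the preconfigured delays, but each insert is a plain append.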
Typical uses of delayed messages include exception detection with retry and cancelling orders on timeout. For example:
A service request fails; the failed request is placed on a separate queue and retried after 5 minutes.
A user places an order but has not paid; the user is reminded to pay at intervals, and the order is closed once the deadline passes.
For an interview or meeting appointment, a reminder is sent half an hour before it starts.
Support across the MQs is as follows:
Kafka: does not support delayed messages.
Pulsar: supports delayed messages with second-level precision. The index of every delayed message is recorded by the Delayed Message Tracker. When a consumer consumes, it first checks the Delayed Message Tracker for messages that are due. If any are due, it takes their indexes from the tracker and fetches the corresponding messages; otherwise it consumes normal messages directly. Messages with long delays are stored on disk and loaded into memory as their delivery time approaches.
RocketMQ: in the open source version, delayed messages are staged in an internal topic. Arbitrary delay times are not supported, only fixed levels such as 5s, 10s, and 1m.
RabbitMQ: requires the rabbitmq_delayed_message_exchange plugin.
1.3 Dead letter queue
When a message cannot be delivered or processed correctly, it is usually placed in a special-purpose queue instead of being discarded, so that it is not lost without trace. This queue is generally called the dead letter queue. A related concept is the fallback queue. Suppose the consumer hits an error while processing a message and therefore never acknowledges it (Ack). After the message is rolled back it returns to the head of the queue, is processed and rolled back again, and the queue falls into an endless loop.
To solve this, each queue can be given a fallback queue, which, like the dead letter queue, is a safety mechanism for exception handling. In practice, the role of the fallback queue is played by the dead letter queue together with a retry queue.
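The retry-then-dead-letter flow can be sketched as follows. This is an illustrative model under simple assumptions: `consume_with_retry`, `reject_poison`, and the `max_retries` parameter are hypothetical names, and a failed message is re-queued behind the others instead of at the head, which is what breaks the endless loop described above.

```python
from collections import deque


def reject_poison(msg):
    """Example handler: fails on the hypothetical poison message 'bad'."""
    if msg == "bad":
        raise ValueError("cannot process")


def consume_with_retry(messages, handler, max_retries=3):
    """Process messages; failed ones are retried, then parked as dead letters.

    Returns (processed, dead_letters). `handler` raises an exception on failure.
    """
    main = deque((msg, 0) for msg in messages)  # (message, failed attempts)
    processed, dead_letters = [], []
    while main:
        msg, attempts = main.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            attempts += 1
            if attempts >= max_retries:
                dead_letters.append(msg)      # stop retrying, keep for inspection
            else:
                main.append((msg, attempts))  # retry later, behind other messages
    return processed, dead_letters
```

With `max_retries=2`, the poison message is attempted twice and then lands in the dead letter list, while the healthy messages are processed without being blocked.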
Support across the MQs is as follows:
Kafka: has no dead letter queue; consumption progress is recorded as an Offset.
Pulsar: has a retry mechanism. When a message is consumed for the first time without a normal response, it enters a retry topic. After a set number of retries, retrying stops and the message is delivered to a dead letter topic.
RocketMQ: records all messages that failed consumption in a DLQ.
RabbitMQ: implements the dead letter queue in a form similar to the delay queue.
1.4 Priority queue
Unlike a FIFO queue, a priority queue lets high-priority messages be consumed first, which makes it possible to offer different service levels to downstream consumers.
Priority only matters when there is a backlog. If consumers consume faster than producers produce and no messages accumulate on the message middleware server (usually just called the Broker), setting priorities on messages has no practical effect. Each message is consumed as soon as it is sent, so the Broker holds at most one message at a time, and priority is meaningless for a single message.
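A small sketch makes the backlog condition concrete. `PriorityBroker` and `run` are hypothetical names for a toy in-memory broker; the only point is to compare the consumption order with and without accumulated messages.

```python
import heapq
from itertools import count


class PriorityBroker:
    """Toy broker: higher `priority` is delivered first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = count()

    def publish(self, msg, priority=0):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._seq), msg))

    def consume(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


def run(events):
    """Replay ('pub', msg, priority) and ('con',) events; return consumption order."""
    broker, consumed = PriorityBroker(), []
    for event in events:
        if event[0] == "pub":
            broker.publish(event[1], event[2])
        else:
            consumed.append(broker.consume())
    return consumed
```

When every publish is followed immediately by a consume, messages leave in arrival order regardless of priority. Only when both messages are published before the first consume does the high-priority one overtake the low-priority one.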
Support across the MQs is as follows:
Kafka, RocketMQ, Pulsar: do not support priority queues; message priority can be approximated by using separate queues.
RabbitMQ: supports priority messages.
1.5 Message backtracking
Normally a message is finished once it has been consumed and cannot be consumed again. Message backtracking is the opposite: messages that have already been consumed can be consumed again.
A common problem with messaging is message loss, and it is often hard to tell whether the loss was caused by a defect in the middleware or by misuse on the user's side. If the middleware supports backtracking, the supposedly lost messages can be replayed to find the source of the problem.
Backtracking has other uses as well, such as index recovery and rebuilding local caches. Some business compensation schemes can also be implemented with it.
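Backtracking is possible when consuming only moves a position and does not delete data. The sketch below models this with an append-only log and a consumer offset that can be reset by timestamp. The names `Log`, `Consumer`, and `seek_to_time` are hypothetical and do not correspond to any client library.

```python
import bisect


class Log:
    """Append-only log: consuming moves an offset, it does not delete messages."""

    def __init__(self):
        self._timestamps = []  # append times, assumed non-decreasing
        self._messages = []

    def append(self, timestamp, msg):
        self._timestamps.append(timestamp)
        self._messages.append(msg)

    def offset_for_time(self, timestamp):
        """First offset whose timestamp is >= `timestamp`."""
        return bisect.bisect_left(self._timestamps, timestamp)

    def read(self, offset):
        return self._messages[offset:]


class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0  # next position to read

    def poll(self):
        batch = self.log.read(self.offset)
        self.offset += len(batch)
        return batch

    def seek_to_time(self, timestamp):
        """Rewind (or skip ahead) so the next poll starts at `timestamp`."""
        self.offset = self.log.offset_for_time(timestamp)
```

After a consumer has read everything, calling `seek_to_time` with an earlier timestamp makes the next `poll` return the messages from that point again.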
Support across the MQs is as follows:
Kafka: supports backtracking. A Consumer's Offset can be reset, either by timestamp or to a specific Offset, so that messages are consumed again.
Pulsar: supports backtracking by time.
RocketMQ: supports backtracking by time; the implementation principle is the same as Kafka's.
RabbitMQ: does not support backtracking. Once a message is acknowledged, it is marked for deletion.
1.6 Message persistence
Peak shaving is a very important function of message middleware, and it depends on the ability to accumulate messages. In a sense, middleware that cannot accumulate messages does not qualify as message middleware.
Messages are accumulated either in memory or on disk. Disk capacity is generally much larger than memory, and with disk accumulation the capacity is bounded by the size of the whole disk. Accumulation also gives the middleware redundant storage.
Support across the MQs is as follows:
Kafka and RocketMQ: flush messages directly to disk files for persistence, so all messages are stored on disk. As long as there is enough disk capacity, accumulation is effectively unlimited.
RabbitMQ: typically accumulates in memory, though not exclusively. When certain conditions are triggered, messages in memory are paged out to disk (paging reduces throughput). Alternatively, lazy queues persist messages directly to disk.
Pulsar: messages are stored on a BookKeeper storage cluster, which also uses disk files.
1.7 Message confirmation mechanism
A message queue has to track consumption progress and confirm that consumers have processed messages successfully. Queue components that use push delivery usually confirm messages one at a time; unconfirmed messages are redelivered after a delay or moved to the dead letter queue.
Support across the MQs is as follows:
Kafka: confirms messages by Offset.
RocketMQ: commits an Offset, like Kafka. The difference is that a consumer can mark a message as failed; the Broker then retries delivery, and after repeated failures the message is delivered to the dead letter queue.
RabbitMQ: consumers confirm individual messages; an unconfirmed message is put back in the queue to wait for the next delivery.
Pulsar: manages confirmation with a dedicated Cursor. Cumulative confirmation has the same effect as Kafka's Offset; individual or selective confirmation is also available.
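The two confirmation styles listed above can be contrasted in a few lines. This is a sketch, not the data structure of any broker: `CumulativeCursor` and `IndividualCursor` are hypothetical names, and `mark_delete` is used here simply to mean "everything up to this offset is acknowledged".

```python
class CumulativeCursor:
    """Offset-style confirmation: one number covers everything before it."""

    def __init__(self):
        self.committed = -1  # highest acknowledged offset

    def ack(self, offset):
        self.committed = max(self.committed, offset)

    def redeliver_from(self):
        return self.committed + 1


class IndividualCursor:
    """Per-message confirmation: acks may arrive out of order and leave holes."""

    def __init__(self):
        self.mark_delete = -1  # every offset <= this is acknowledged
        self._acked = set()    # acknowledged offsets above mark_delete

    def ack(self, offset):
        if offset > self.mark_delete:
            self._acked.add(offset)
        # Advance the contiguous acknowledged prefix as far as possible.
        while self.mark_delete + 1 in self._acked:
            self.mark_delete += 1
            self._acked.remove(self.mark_delete)

    def unacked(self, up_to):
        """Offsets in [0, up_to] that still need redelivery."""
        return [o for o in range(self.mark_delete + 1, up_to + 1)
                if o not in self._acked]
```

With cumulative confirmation, acknowledging offset 2 implicitly confirms 0 and 1. With individual confirmation, acknowledging 0 and 2 leaves 1 outstanding, so only that message is a candidate for redelivery.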
1.8 Message TTL
Message TTL is the lifetime of a message. If a message is not consumed within its TTL, the message queue deletes it or moves it to the dead letter queue.
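As a rough sketch of this rule, the queue below checks the age of each message when a consumer asks for it and diverts expired messages to a dead letter list. `TtlQueue` is a hypothetical name, time is passed in explicitly, and real brokers may expire messages on a timer instead of at consume time.

```python
from collections import deque


class TtlQueue:
    """FIFO queue that dead-letters messages older than `ttl` time units."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._queue = deque()   # (enqueue_time, message)
        self.dead_letters = []

    def publish(self, msg, now):
        self._queue.append((now, msg))

    def consume(self, now):
        """Return the next live message, moving expired ones to dead_letters."""
        while self._queue:
            enqueued_at, msg = self._queue.popleft()
            if now - enqueued_at > self.ttl:
                self.dead_letters.append(msg)  # expired before anyone consumed it
                continue
            return msg
        return None
```

A message published at time 0 with a TTL of 10 is skipped by a consumer that arrives at time 15, while a message published at time 8 is still delivered.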
Support across the MQs is as follows:
Kafka: does not support TTL. Messages are deleted according to the configured retention period, so a message may be deleted on expiry without ever having been consumed.
Pulsar: supports TTL. If a message is not consumed by any consumer within the configured TTL, it is automatically marked as acknowledged. The difference between message retention and message TTL is that retention applies to messages that are already acknowledged and due for deletion, while TTL applies to messages that have not been acked. The original figure illustrates TTL in Pulsar: if subscription B has no active consumers, message M10 is automatically marked as acknowledged once the configured TTL elapses, even though no consumer has read it.
RocketMQ: little documentation mentions message TTL, though the interface appears to support it.
RabbitMQ: offers two ways. TTL can be set as a queue attribute when the queue is declared, so that all messages in the queue share the same validity period, or as a message property at send time, so that each message can have a different TTL.
1.9 Multi-tenant isolation
Multi-tenancy is the ability of a single software instance to serve multiple tenants. A tenant is a group of users who share the same view of the system. In systems without multi-tenancy, separate message queue instances often have to be created for different users or clusters to achieve physical isolation, which raises operating costs.
As an enterprise-grade messaging system, Pulsar designed its multi-tenancy to meet the following requirements:
Strict SLAs can be met reliably.
Tenants are isolated from one another.
Resource usage quotas are enforced.
Security is provided both per tenant and at system level.
Operations stay low-cost and management stays as simple as possible.
Pulsar meets these requirements in the following ways:
Authentication, authorization, and ACLs (access control lists) for each tenant provide the required security.
Storage quotas are enforced for each tenant.
All isolation mechanisms are defined as policies that can be changed at runtime, which lowers operating costs and simplifies management.
1.10 Message sequencing
Message ordering guarantees that messages are consumed in the same order in which they were produced.
Support across the MQs is as follows:
Kafka: guarantees ordering within a partition.
Pulsar: supports two consumption models. The stream model (exclusive subscription) guarantees message order; the queue model (shared subscription) does not.
RocketMQ: uses locks to ensure that only one consumer thread consumes a queue at a time, which keeps messages in order.
RabbitMQ: the conditions for ordering are stricter. Messages must be sent by a single thread and consumed by a single thread, and advanced features such as delay queues and priority queues must not be used.
1.11 Message query
In day-to-day development it is often necessary to inspect the contents of messages in the MQ, for example to look up a specific message by MessageKey or ID, or to trace a message's path (where it came from and where it was sent) in order to locate a problem quickly.
Support across the MQs is as follows:
Kafka: the storage layer is a distributed commit log. Every write is appended to the end of the log, and reads are sequential as well. Message retrieval is not supported.
Pulsar: the content, parameters, and trace of a specific message can be queried by message ID.
RocketMQ: supports querying messages by Message Key, Unique Key, and Message Id.
RabbitMQ: uses an index-based storage system that keeps data in a tree structure to give the fast access needed to acknowledge individual messages. Because RabbitMQ deletes messages after acknowledgement, only unacknowledged messages can be queried.
1.12 Consumption patterns
Support across the MQs is as follows:
Kafka: has two consumption modes. In both, a partition ends up being consumed by only one consumer:
subscribe: a rebalance is triggered when the number of topic partitions or the number of consumers changes. If a rebalance listener is registered, offsets can be managed manually; without a listener, Kafka manages them automatically.
assign: consumers are mapped to partitions manually, and Kafka does not rebalance.
Pulsar: has four consumption modes. Exclusive and Failover are similar to Kafka: they follow the stream model, where each partition has only one consumer and message order is guaranteed. Shared and Key_Shared follow the queue model, where multiple consumers increase consumption speed but order is not guaranteed.

Exclusive (the default mode): a Subscription can be associated with only one Consumer, and only that Consumer receives all messages of the Topic. If the Consumer fails, consumption stops.
Failover: when there are multiple consumers, they are sorted lexicographically and the first is initialized as the only consumer that receives messages. When it disconnects, all messages (unacknowledged ones and those arriving later) are delivered to the next consumer in line.
Shared: messages are distributed to consumers by round-robin (the policy can be customized), and each message goes to only one consumer. When a consumer disconnects, messages that were sent to it but not acknowledged are rescheduled and delivered to the remaining consumers.
Key_Shared: when there are multiple consumers, messages are distributed by key, and messages with the same key go only to the same consumer.
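The difference between the Shared and Key_Shared dispatch rules can be sketched as below. This is an illustration only: the function names are hypothetical, messages are `(key, value)` tuples, and the CRC32 hash is an assumption standing in for whatever hashing scheme the broker actually uses.

```python
import zlib
from itertools import cycle


def dispatch_shared(messages, consumers):
    """Shared: round-robin, each message goes to exactly one consumer."""
    assignment = {c: [] for c in consumers}
    for consumer, (key, value) in zip(cycle(consumers), messages):
        assignment[consumer].append((key, value))
    return assignment


def dispatch_key_shared(messages, consumers):
    """Key_Shared: messages with the same key go to the same consumer."""
    assignment = {c: [] for c in consumers}
    for key, value in messages:
        # Stable hash of the key picks the consumer (hash choice is an assumption).
        index = zlib.crc32(key.encode()) % len(consumers)
        assignment[consumers[index]].append((key, value))
    return assignment
```

Under round-robin, messages with the same key can land on different consumers. Under key-based dispatch, each key has exactly one owner, so the relative order of messages with the same key is preserved on that consumer.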
RocketMQ: has two consumption modes, BROADCASTING (broadcast mode) and CLUSTERING (cluster mode).
Broadcast consumption: a message is consumed by multiple consumers. Even when these consumers belong to the same ConsumerGroup, each Consumer in the group consumes the message once. In broadcast consumption, the ConsumerGroup concept is effectively meaningless for dividing up messages.
Cluster consumption: the Consumer instances in a ConsumerGroup share the messages evenly. For example, if a Topic has 9 messages and a ConsumerGroup has 3 instances (3 processes or 3 machines), each instance consumes only part of them, and a message consumed by one instance is not consumed by the others.
RabbitMQ: similar to Pulsar's Shared mode. It uses the queue model, and adding consumers to a consumer group increases consumption speed.
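RocketMQ's two modes described above can be illustrated with the 9-message, 3-instance example. The functions `broadcasting` and `clustering` are hypothetical helpers; in particular, the per-message modulo split is a simplification, since a real cluster divides work by queue and not by individual message.

```python
def broadcasting(messages, instances):
    """BROADCASTING: every instance in the group receives every message."""
    return {name: list(messages) for name in instances}


def clustering(messages, instances):
    """CLUSTERING: the group shares the messages; each is consumed once."""
    shares = {name: [] for name in instances}
    for index, msg in enumerate(messages):
        # Simplified even split by message index (an assumption for illustration).
        shares[instances[index % len(instances)]].append(msg)
    return shares
```

With 9 messages and 3 instances, broadcast mode delivers 27 copies in total (9 per instance), while cluster mode delivers 9 in total (3 per instance).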
1.13 Message reliability
Message loss is a common concern when using message middleware, and message reliability is a key measure of middleware quality. It is especially important in financial payments.
For example, when a service fails, will messages that producers believe were produced successfully be lost during the high-availability switchover? Synchronous disk flushing is an effective way to strengthen a component's reliability, and message middleware is no exception; Kafka and RabbitMQ both support it. In most cases, though, reliability should not depend on an operation as costly as synchronous flushing. A multi-replica mechanism should provide it instead.
Support across the MQs is as follows:
Kafka: the reliability level is set with the request.required.acks parameter, which specifies how many replicas must have received a message before the send is considered successful.
request.required.acks=-1 (all replicas confirm; strong reliability guarantee)
request.required.acks=1 (the leader confirms receipt; default)
request.required.acks=0 (no confirmation; highest throughput)
Pulsar: has a concept similar to Kafka's, called Ack Quorum Size (Qa). Qa is the number of Bookies that must acknowledge each write request. The larger the value, the longer a write takes to be confirmed; its upper limit is the number of replicas, Qw. For consistency and data safety, Qa should be at least (Qw+1)/2.
RocketMQ: similar to Kafka.
RabbitMQ: uses a master-slave architecture. Multiple replicas and strong-consistency semantics are implemented with mirrored queues. With multiple replicas, a slave can be promoted to the new master after the master node goes down abnormally and continue to provide service, which preserves availability.
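The acknowledgement rules for Kafka and Pulsar above reduce to simple arithmetic, sketched here. The function names are hypothetical, and rounding (Qw+1)/2 up to a whole number of Bookies is an assumption, since the text does not say how fractional values are handled.

```python
import math


def kafka_acks_needed(acks, in_sync_replicas):
    """Confirmations a producer waits for under each acks setting."""
    if acks == 0:
        return 0                 # fire and forget
    if acks == 1:
        return 1                 # leader only
    if acks == -1:
        return in_sync_replicas  # every in-sync replica
    raise ValueError("acks must be -1, 0 or 1")


def min_ack_quorum(write_quorum):
    """Lower bound for Qa: (Qw + 1) / 2, rounded up (rounding is an assumption)."""
    return math.ceil((write_quorum + 1) / 2)


def write_confirmed(acks_received, ack_quorum):
    """A write is confirmed once at least Qa Bookies have acknowledged it."""
    return acks_received >= ack_quorum
```

For Qw = 3 the lower bound is Qa = 2, so a write survives the loss of one replica while still being confirmed by a majority.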
2. Performance
Performance tests involve many client and server parameters and machine configurations, such as the message reliability level and the compression algorithm, so it is hard to run a fully fair test with all variables controlled. A few points are still worth noting:
RabbitMQ's latency is at the microsecond level, while the other components' latency is at the millisecond level; RabbitMQ's latency is relatively low among MQ components.
A single Kafka instance slows down significantly when it has many topics or partitions:
Kafka stores each partition as a file. When there are too many topics, the total number of partitions grows and Kafka holds too many files. When messages are flushed to disk, the files contend for the disk and performance drops.
In addition, every consumer joining or leaving triggers a rebalance. With many partitions a rebalance can take a long time, and consumers cannot consume messages while it is in progress.
Pulsar, by contrast, separates storage from compute, which lets it support millions of topics.
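The rebalancing cost mentioned above can be illustrated with a toy assignment function. This is not how Kafka's built-in assignors work in detail; `assign` and `moved` are hypothetical helpers that only show how many partitions change owner when one consumer joins.

```python
def assign(partitions, consumers):
    """Deal partitions out to the sorted consumers in round-robin order."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    return assignment


def moved(before, after):
    """Partitions whose owner differs between two assignments."""
    old_owner = {p: c for c, parts in before.items() for p in parts}
    new_owner = {p: c for c, parts in after.items() for p in parts}
    return sorted(p for p, c in new_owner.items() if old_owner.get(p) != c)
```

In this model, going from 2 to 3 consumers on 6 partitions changes the owner of 4 partitions. The more partitions there are, the more ownership changes have to be coordinated during a rebalance.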
Pulsar and Kafka are both widely used in enterprises, each has its own strengths, and both can handle heavy traffic on roughly the same amount of hardware. Some users mistakenly believe that because Pulsar uses many components, it needs many servers to match Kafka's performance.
This holds for some specific hardware configurations, but in most cases with equal resources Pulsar has the clearer advantage and achieves better performance.
For instance, Splunk recently explained why they chose Pulsar over Kafka, mentioning that "thanks to its layered architecture, Pulsar helped them reduce costs by 30%-50%, latency by 80%-98%, and operating costs by 33%-50%". The Splunk team found that Pulsar makes better use of disk IO, lowers CPU utilization, and controls memory better.
In a distributed system, single-machine performance figures are important, but the overall performance of the system, its ability to scale out and in, and capabilities such as high availability and disaster recovery are also important evaluation criteria. The actual performance of MQ middleware has to be assessed case by case, through load testing and tuning with the cluster configuration actually purchased and the client parameters actually used.
3. Operations and maintenance
Abnormal situations such as downtime, network jitter, and capacity expansion are unavoidable in operation. Capabilities such as remote disaster recovery and a high-availability architecture let a message queue avoid failures caused by unavailable compute nodes, networks, or other infrastructure.
3.1 High availability
Support across the MQs is as follows:
Kafka: achieves high availability through partitions with multiple replicas.
Pulsar: the Brokers of the compute cluster are stateless and can be scaled out and in flexibly. On the storage nodes (Bookies), messages are partitioned and stored as replicated segments. Each segment has one or more replicas, so that when a Bookie goes down, other replicas of the segment can still provide service.
RocketMQ and RabbitMQ: both use a master-slave architecture. When the master goes down, the slave node continues to provide service. The standby serves consumption so that messages are not lost, but it does not accept writes.
3.2 Cross regional disaster recovery
Pulsar natively supports cross-region disaster recovery. In the original figure, whenever producers P1, P2, and P3 send messages to topic T1 in Cluster-A, Cluster-B, and Cluster-C respectively, the messages are quickly replicated across the clusters. Once a message has been replicated, consumers C1 and C2 consume it from their own clusters.
This design has two benefits. First, services can easily be spread across multiple data centers. Second, it handles data-center-level failures: when one data center becomes unavailable, services can move to another and continue serving.
In one sentence: Pulsar's cross-region replication creates a Producer in the local cluster whose target address is the remote cluster, sends the local cluster's messages to it, and maintains a local Cursor to ensure message reliability and idempotency.
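The one-sentence summary above can be turned into a small model: a replicator reads the local log from a cursor and republishes to the remote cluster, which ignores duplicates. The classes `Cluster` and `Replicator` and the `(origin, sequence)` deduplication key are assumptions made for illustration, not Pulsar's actual implementation.

```python
class Cluster:
    def __init__(self, name):
        self.name = name
        self.log = []        # messages stored in this cluster
        self._seen = set()   # (origin, sequence) pairs already stored

    def publish(self, origin, sequence, payload):
        """Store a message once; a repeated (origin, sequence) is ignored."""
        if (origin, sequence) in self._seen:
            return False
        self._seen.add((origin, sequence))
        self.log.append((origin, sequence, payload))
        return True


class Replicator:
    """Reads the local log from a cursor and republishes to a remote cluster."""

    def __init__(self, local, remote):
        self.local, self.remote = local, remote
        self.cursor = 0      # next local position to replicate

    def run_once(self):
        while self.cursor < len(self.local.log):
            origin, sequence, payload = self.local.log[self.cursor]
            if origin == self.local.name:  # do not bounce replicated messages back
                self.remote.publish(origin, sequence, payload)
            self.cursor += 1
```

The cursor provides reliability, because replication resumes from the last recorded position, and the deduplication on the remote side provides idempotency, because replaying from an older cursor does not create duplicates.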

3.3 Cluster expansion
When message volume rises sharply and the message queue cluster hits a bottleneck, the cluster has to be expanded. Expansion is either horizontal or vertical: horizontal expansion adds nodes to the cluster, and vertical expansion upgrades the configuration of some nodes to increase their processing power.
Support across the MQs is as follows:
Kafka: topic partitions are physically stored on Broker nodes, so a newly added node holds no partitions and cannot serve immediately. Some topic partitions have to be assigned to the new node, which involves a partition rebalancing process that copies the data of those partitions to it. How long this takes depends on the amount of data accumulated in the partitions and on Broker performance; an overloaded Broker or a very large backlog can make rebalancing take much longer.
Pulsar: is built around an unbounded distributed log and segments, implemented with scalable log storage (Apache BookKeeper) and built-in tiered storage support, so segments can be spread evenly across storage nodes. Because the data of any given topic is not tied to specific storage nodes, storage nodes are easy to replace, remove, or add. In addition, the smallest or slowest node in the cluster does not become a storage or bandwidth bottleneck.
RocketMQ: new nodes join the cluster directly. New topics and queues are created on the new broker, or queues are allocated on it for existing topics. Unlike Kafka, whose partitions live on different physical machines, RocketMQ's partitions are logical and take the form of queues, so there is no data imbalance.
RabbitMQ: since little message persistence is involved, nodes are simply added to the cluster.
4. Summary
| Category | Feature | Kafka | Pulsar | RocketMQ | RabbitMQ |
| --- | --- | --- | --- | --- | --- |
| Features | Push/pull consumption | pull | push | pull | push |
| | Delay queue | No | Yes | Yes | Yes |
| | Dead letter queue | No | Yes | Yes | Yes |
| | Priority queue | No | No | No | Yes |
| | Message backtracking | Yes | Yes | Yes | No |
| | Message persistence | Yes | Yes | Yes | Yes |
| | Message confirmation mechanism | Offset | Offset + single | Offset | Single |
| | Message TTL | No | Yes | Yes | Yes |
| | Multi-tenant isolation | No | Yes | No | No |
| | Message ordering | Ordered within a partition | Ordered in stream mode | Consumer lock | No (only under strict conditions) |
| | Message query | No | Yes | Yes | Yes |
| | Consumption modes | Stream mode | Stream mode + queue mode | Broadcast mode + cluster mode | Queue mode |
| | Message reliability | request.required.acks | Ack Quorum Size (Qa) | Similar to Kafka | Mirrored queues |
| Performance | Single-machine throughput | 605MB/s | 605MB/s | Similar to Kafka | 38MB/s |
| | Message latency | 5ms | 5ms | Millisecond level | Microsecond level |
| | Number of supported topics | Dozens to hundreds | Millions | Hundreds to thousands | Thousands |
| Operations | High availability | Distributed architecture | Distributed architecture | Master-slave architecture | Master-slave architecture |
| | Cross-region disaster recovery | - | Yes | - | - |
| | Cluster expansion | Add nodes, rebalance by copying data | Add nodes, balance by adding segments | Add nodes | Add nodes |
Kafka was released earlier and has mature solutions for many scenarios, such as logging and big data processing.
Pulsar is the newcomer. It supports a richer feature set than Kafka, including cross-region disaster recovery and multi-tenancy, addresses many of Kafka's design shortcomings and operating costs, and offers stronger overall stability. Many large companies at home and abroad have put Pulsar into practice.
Therefore, for traditional logging and big data processing scenarios that need high throughput and have less strict message reliability requirements, Kafka is a reasonable choice, and there is plenty of good documentation on tuning its parameters for performance. For scenarios that are sensitive to message reliability, need better disaster recovery, or require many partitions or delay queues, Pulsar is a reasonable choice.
Reference:
Understand in 10 minutes: a comprehensive comparison for message queue selection