Message queues: a comparison of features, performance, and operations

2022-06-26 06:11:00 Impl_ Sunny

1. Features

1.1 Push and pull consumption modes

1.2 Delay queue


Delayed delivery means that after a message is produced and sent to the queue, consumers do not receive it immediately; it only becomes available for consumption after a specified waiting time.

There are two types of delay queue, message-based and queue-based:

Message-based delay: each message carries its own delay time, and new messages are sorted by delay time as they enter the queue. The sorting has a significant impact on performance.
Queue-based delay: queues are set up for different delay levels, and every message in a given queue has the same delay time. This avoids the cost of sorting by delay time; expired messages can be delivered by a periodic scanning strategy.
Delayed messages are used in scenarios such as exception detection and retry, or cancelling orders on timeout. For example:

A service request fails; the failed request is put on a separate queue and retried after 5 minutes.

A user places an order but does not pay; the user is reminded to pay at regular intervals, and the order is closed once the time limit is exceeded.

For an interview or meeting appointment, a reminder is sent again half an hour before it starts.
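
The two delay-queue designs described above can be contrasted in a small sketch. This is only an illustration, not how any particular MQ implements delays: the class names `MessageDelayQueue` and `LevelDelayQueue` are made up for this example, and time is passed in as a plain number so the behaviour is easy to follow.

```python
import heapq
from collections import deque


class MessageDelayQueue:
    """Message-based delay: every message has its own delay, kept sorted in a heap."""

    def __init__(self):
        self._heap = []  # entries are (due_time, sequence, payload)
        self._seq = 0    # tie-breaker so payloads are never compared

    def put(self, payload, now, delay):
        # Each insert pays an O(log n) sorting cost.
        heapq.heappush(self._heap, (now + delay, self._seq, payload))
        self._seq += 1

    def poll(self, now):
        """Return all payloads whose due time has passed, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


class LevelDelayQueue:
    """Queue-based delay: one FIFO queue per fixed delay level, no sorting needed."""

    def __init__(self, levels):
        self._queues = {level: deque() for level in levels}

    def put(self, payload, now, level):
        # O(1) append: within one level, arrival order equals expiry order.
        self._queues[level].append((now + level, payload))

    def poll(self, now):
        """Scan only the head of each level queue for expired messages."""
        due = []
        for queue in self._queues.values():
            while queue and queue[0][0] <= now:
                due.append(queue.popleft()[1])
        return due
```

The heap version accepts any delay but sorts on every insert; the level version only accepts the delays it was configured with, which is the trade-off the level-based designs make.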

Support in each MQ is as follows:

Kafka: delayed messages are not supported.

Pulsar: supports delayed messages with second-level precision. All delayed messages are indexed by the Delayed Message Tracker. When a consumer consumes, it first checks the Delayed Message Tracker for messages that are due. If there are any, it takes the corresponding index from the tracker and finds the message to consume; if not, it consumes normal messages directly. Messages with a long delay are stored on disk and loaded into memory when the delay is about to expire.

RocketMQ: in the open-source version, delayed messages are temporarily stored in an internal topic. Arbitrary time precision is not supported; only specific delay levels are available, for example 5s, 10s, 1m.

RabbitMQ: requires the rabbitmq_delayed_message_exchange plugin to be installed.

1.3 Dead letter queue


When a message cannot be delivered correctly for some reason, it is usually placed in a special-purpose queue so that it is not silently discarded; this queue is generally called a dead letter queue. A related concept is the "fallback queue". Suppose the consumer hits an error while processing a message and therefore never acknowledges it (Ack). After the message is rolled back it is placed at the head of the queue again, where it is processed and rolled back over and over, so the queue falls into an endless loop.

To solve this problem, a fallback queue can be set up for each queue. Together with the dead letter queue it provides a safety mechanism for exception handling. In practice, the role of the fallback queue is played by the dead letter queue and the retry queue.
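
The retry-then-dead-letter idea can be sketched in a few lines. This is a simplified, single-process illustration; the function name `consume_with_retry` and the `max_attempts` parameter are assumptions made for the example and do not correspond to any MQ's API.

```python
from collections import deque


def consume_with_retry(messages, handler, max_attempts=3):
    """Process messages; failed ones go to a retry queue, then to a dead letter list."""
    main_queue = deque((msg, 0) for msg in messages)  # (payload, attempts so far)
    retry_queue = deque()
    dead_letters = []
    processed = []

    while main_queue or retry_queue:
        # Drain the main queue first so a failing message never blocks healthy ones.
        payload, attempts = (main_queue or retry_queue).popleft()
        try:
            handler(payload)
            processed.append(payload)         # acknowledged
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letters.append(payload)  # give up and park it for inspection
            else:
                retry_queue.append((payload, attempts))
    return processed, dead_letters
```

Because the failing message is moved aside instead of being put back at the head of the main queue, the endless process-and-roll-back loop described above cannot occur.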

Support in each MQ is as follows:

Kafka: has no dead letter queue; it records the current consumption position through the Offset.

Pulsar: has a retry mechanism. When a message is consumed for the first time without a normal response, it enters the retry topic; after a certain number of retries it stops being retried and is delivered to the dead letter topic.

RocketMQ: uses a DLQ to record all messages that failed consumption.

RabbitMQ: the dead letter queue is implemented in a form similar to the delay queue.

1.4 Priority queue


A priority queue differs from a FIFO queue in that high-priority messages are consumed first, which lets downstream users be given different levels of service.

Priority only matters under one precondition, though. If consumers consume faster than producers produce and there is no backlog on the message middleware server (usually just called the Broker), setting a priority on messages has no practical effect: each message is consumed as soon as it is produced, so the Broker holds at most one message at a time, and priority is meaningless for a single message.
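
The precondition above is easy to see in a toy replay. The sketch below assumes a heap-backed Broker and a made-up event format (`("send", priority, payload)` and `("recv",)`); lower numbers mean higher priority.

```python
import heapq


def delivery_order(events):
    """Replay send/recv events against a priority queue and return consumed payloads."""
    heap, seq, consumed = [], 0, []
    for event in events:
        if event[0] == "send":
            _, priority, payload = event
            # seq keeps FIFO order among messages with the same priority.
            heapq.heappush(heap, (priority, seq, payload))
            seq += 1
        elif heap:
            consumed.append(heapq.heappop(heap)[2])
    return consumed


# Consumer keeps up: the Broker never holds more than one message, priority is irrelevant.
fast = delivery_order([("send", 5, "low"), ("recv",), ("send", 1, "high"), ("recv",)])

# Backlog: both messages are waiting, so the high-priority one overtakes.
slow = delivery_order([("send", 5, "low"), ("send", 1, "high"), ("recv",), ("recv",)])
```

Only in the second case, where a backlog exists, does the high-priority message get consumed ahead of the earlier low-priority one.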

Support in each MQ is as follows:

Kafka, RocketMQ, Pulsar: priority queues are not supported; message priority can be approximated by using separate queues.

RabbitMQ: supports priority messages.

1.5 Message backtracking


Normally, once a message has been consumed and processed, it cannot be consumed again. Message backtracking is the opposite: after a message has been consumed, it can still be consumed again.

A common problem with messaging is "lost messages", and it is often hard to tell whether the loss was caused by a defect in the message middleware or by misuse on the user's side. If the middleware supports backtracking, the "lost" messages can be reproduced by consuming them again, which helps locate the source of the problem.

Backtracking is useful for much more than that: index recovery, rebuilding a local cache, and some business compensation schemes can all be implemented with it.
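
Backtracking comes almost for free when messages live in an append-only log and a consumer is just a position in that log. The sketch below is a minimal model under that assumption; `AppendOnlyLog` and its method names are invented for the example, and timestamps are assumed to be appended in non-decreasing order.

```python
import bisect


class AppendOnlyLog:
    """Messages are never removed on consumption; a consumer is just an offset."""

    def __init__(self):
        self._timestamps = []  # append order, assumed non-decreasing
        self._payloads = []

    def append(self, timestamp, payload):
        self._timestamps.append(timestamp)
        self._payloads.append(payload)

    def offset_for_time(self, timestamp):
        """First offset whose message timestamp is >= the given timestamp."""
        return bisect.bisect_left(self._timestamps, timestamp)

    def read_from(self, offset):
        return self._payloads[offset:]


log = AppendOnlyLog()
for ts, payload in [(100, "a"), (200, "b"), (300, "c")]:
    log.append(ts, payload)

offset = 3                          # the consumer has already consumed everything
offset = log.offset_for_time(200)   # rewind by timestamp
replayed = log.read_from(offset)    # messages from timestamp 200 onwards
```

A queue that deletes a message as soon as it is acknowledged has nothing left to rewind to, which is why backtracking depends on the storage model.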

Support in each MQ is as follows:

Kafka: supports backtracking. The Consumer's Offset can be reset by timestamp or to a specified Offset so that messages can be consumed again.

Pulsar: supports backtracking by time.

RocketMQ: supports backtracking by time; the implementation principle is the same as Kafka's.

RabbitMQ: does not support backtracking; once a message is acknowledged it is marked for deletion.

1.6 Message persistence


Smoothing traffic peaks is a very important function of message middleware, and it relies on the middleware's ability to accumulate messages. In a sense, message middleware that cannot accumulate messages cannot be considered qualified.

Message accumulation is divided into in-memory accumulation and on-disk accumulation. Disk capacity is generally much larger than memory capacity, so for on-disk accumulation the capacity is the size of the whole disk. Message accumulation also gives the middleware redundant storage.

Support in each MQ is as follows:

Kafka and RocketMQ: messages are written directly to disk files for persistence, and all messages are stored on disk. As long as there is enough disk capacity, messages can accumulate without limit.

RabbitMQ: typically accumulates messages in memory, but not exclusively. Under certain conditions it pages messages from memory out to disk (paging affects throughput), or lazy queues can be used to persist messages directly to disk.

Pulsar: messages are stored on the BookKeeper storage cluster, which also uses disk files.

1.7 Message confirmation mechanism


A message queue has to manage consumption progress and confirm that the consumer has processed each message successfully. Message queue components that use the push model often acknowledge messages individually; unacknowledged messages are redelivered after a delay or moved to the dead letter queue.

Support in each MQ is as follows:

Kafka: acknowledges messages through the Offset.

RocketMQ: commits an Offset, similar to Kafka. The difference is that a consumer can mark a message as failed; the Broker then retries delivery, and after repeated failures the message is delivered to the dead letter queue.

RabbitMQ: the consumer acknowledges individual messages; a message that is not acknowledged is put back in the queue to wait for the next delivery.

Pulsar: acknowledgements are managed by a dedicated Cursor. Cumulative acknowledgement has the same effect as in Kafka; individual (selective) acknowledgement is also provided.
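
The difference between offset-style (cumulative) and per-message (individual) acknowledgement can be shown with two tiny trackers. These classes are illustrative only; their names and methods are assumptions for this sketch, not any client library's API.

```python
class CumulativeAck:
    """Offset-style confirmation: one number covers everything before it."""

    def __init__(self):
        self.committed = 0  # next offset that still needs to be delivered

    def ack(self, offset):
        self.committed = max(self.committed, offset + 1)

    def unacked(self, log_end):
        return list(range(self.committed, log_end))


class IndividualAck:
    """Per-message confirmation: each message id is tracked on its own."""

    def __init__(self):
        self.acked = set()

    def ack(self, message_id):
        self.acked.add(message_id)

    def unacked(self, log_end):
        return [i for i in range(log_end) if i not in self.acked]
```

With five messages (ids 0 to 4), acknowledging offset 2 cumulatively also confirms 0 and 1, even if message 1 actually failed. With individual acknowledgement, acknowledging 0 and 2 leaves 1 outstanding so it can be redelivered or dead-lettered.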

1.8 Message TTL


Message TTL is the lifetime of a message. If a message has been sent and no consumer consumes it within the TTL, the message queue deletes it or moves it to the dead letter queue.
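
As a rough sketch of that rule, the function below splits a queue into live and expired messages. The function name `expire`, the `(enqueue_time, payload)` tuple layout, and the optional `dead_letter` list are all assumptions made for this example.

```python
def expire(messages, now, ttl, dead_letter=None):
    """Split queued (enqueue_time, payload) pairs into live and expired messages.

    Expired payloads are appended to dead_letter when one is given, otherwise dropped.
    """
    live = []
    for enqueue_time, payload in messages:
        if now - enqueue_time >= ttl:
            if dead_letter is not None:
                dead_letter.append(payload)
        else:
            live.append((enqueue_time, payload))
    return live


dlq = []
remaining = expire([(0, "old"), (50, "new")], now=60, ttl=30, dead_letter=dlq)
```

Here "old" has waited 60 time units and exceeds the TTL of 30, so it ends up in `dlq`, while "new" stays in the queue.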

Support in each MQ is as follows:

Kafka: deletes messages according to the configured retention period, so a message may be deleted on expiry without ever having been consumed. TTL is not supported.

Pulsar: supports TTL. If a message is not consumed by any consumer within the configured TTL period, it is automatically marked as acknowledged. The difference between the message retention period and message TTL is that the retention period applies to messages that have been acknowledged and are due for deletion, whereas TTL applies to messages that have not been acknowledged. In the figure illustrating TTL in Pulsar, for example, if subscription B has no active consumers, message M10 is automatically marked as acknowledged once the configured TTL period has elapsed, even though no consumer has actually read it.

RocketMQ: there is little documentation on message TTL, but the interface appears to support it.

RabbitMQ: there are two ways. One is to set the TTL in the queue attributes when declaring the queue, so that all messages in the queue have the same validity period. The other is to set it in the message properties when sending, so that each message can have a different TTL.

1.9 Multi-tenant isolation


Multi-tenancy is the ability to serve multiple tenants from a single software instance. A tenant is a group of users who share the same "view" of the system. In systems without multi-tenancy, separate message queue instances often have to be created for different users or different clusters to achieve physical isolation, which raises operation and maintenance costs.

As an enterprise-grade messaging system, Pulsar designs its multi-tenancy to meet the following requirements:

Ensure that strict SLAs can be met.

Ensure isolation between tenants.

Enforce quotas on resource usage.

Provide security at both the tenant level and the system level.

Keep operation and maintenance costs low and management as simple as possible.

Pulsar meets these requirements in the following ways:

Security is provided through authentication, authorization, and ACLs (access control lists) for each tenant.

Storage quotas are enforced for each tenant.

All isolation mechanisms are defined as policies that can be changed at runtime, which reduces operation and maintenance costs and simplifies management.

1.10 Message ordering


Message ordering means that messages are consumed in the same order in which they were produced.

Support in each MQ is as follows:

Kafka: guarantees that messages within a partition are ordered.

Pulsar: supports two consumption models. Only the streaming model with an exclusive subscription guarantees message order; the queue model with a shared subscription does not.

RocketMQ: uses a lock to ensure that only one consumer thread consumes a queue at a time, which keeps messages in order.

RabbitMQ: the conditions for ordering are stricter: a single thread must send, a single thread must consume, and advanced features such as delay queues and priority queues must not be used.
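
Per-partition ordering is usually combined with key-based routing so that all messages for one business key land in the same partition. The text above does not describe the routing itself, so the sketch below is an assumption-laden illustration: it uses CRC32 as a stand-in stable hash (real clients use their own hash functions), and `partition_for` and `route` are made-up names.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a message key to a partition with a stable hash (CRC32 chosen for the sketch)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Append (key, payload) messages to per-partition lists, preserving arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in messages:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


msgs = [
    ("order-1", "created"), ("order-2", "created"),
    ("order-1", "paid"), ("order-2", "paid"),
    ("order-1", "shipped"),
]
parts = route(msgs, 3)
```

Messages for different orders may interleave across partitions, but every event for `order-1` sits in one partition in the order it was produced, which is the guarantee that matters to a consumer of that order.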
 

1.11 Message query

In day-to-day development it is often necessary to inspect the content of messages in the MQ, for example to look up a specific message by MessageKey or ID, or to trace a message and see where it came from and where it was sent, so that problems can be located quickly.

Support in each MQ is as follows:

Kafka: the storage layer is implemented as a distributed commit log. Every write is appended sequentially to the end of the log, and reads are sequential as well. Retrieval is not supported.

Pulsar: a specific message's content, parameters, and trace can be queried by message ID.

RocketMQ: messages can be queried by Message Key, Unique Key, or Message Id.

RabbitMQ: uses an index-based storage system that keeps data in a tree structure to provide the fast access needed to acknowledge individual messages. Because RabbitMQ deletes messages after they are acknowledged, only unacknowledged messages can be queried.

1.12 Consumption patterns

Support in each MQ is as follows:

Kafka: has two consumption modes; in both, a partition is consumed by only 1 consumer at a time:

  • subscribe mode: a rebalance is triggered when the number of topic partitions or the number of consumers changes. If a rebalance listener is registered, the offset can be managed manually; if no listener is registered, Kafka manages it automatically.

  • assign mode: consumers are mapped to partitions manually, and Kafka does not rebalance.

Pulsar: has four consumption modes. Exclusive and Failover modes are similar to Kafka: they follow the streaming model, each partition is consumed by only 1 consumer, and message order is guaranteed. Shared and Key_Shared modes follow the queue model: multiple consumers increase consumption speed, but order is not guaranteed.

  • Exclusive mode (the default): a Subscription can be attached to only one Consumer, and only that Consumer receives all messages of the Topic. If the Consumer fails, consumption stops.

  • Failover mode: when there are multiple consumers, they are sorted lexicographically and the first consumer is initialized as the only one receiving messages. When the first consumer disconnects, all messages (unacknowledged ones and those arriving later) are distributed to the next consumer in line.

  • Shared mode: messages are distributed to different consumers by round robin (a custom policy is also possible), and each message goes to only one consumer. When a consumer disconnects, all messages that were sent to it but not acknowledged are rescheduled and distributed to the remaining consumers.

  • Key_Shared mode: when there are multiple consumers, messages are distributed by key, and messages with the same key are delivered only to the same consumer.
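
The Shared and Key_Shared dispatch rules can be sketched as follows. This is a simplification: the key-to-consumer mapping here is a plain hash modulo the consumer count, chosen only for illustration, and the function names are invented for the example.

```python
import itertools
import zlib


def dispatch_shared(messages, consumers):
    """Shared mode: round robin, each message goes to exactly one consumer."""
    assignment = {c: [] for c in consumers}
    for consumer, message in zip(itertools.cycle(consumers), messages):
        assignment[consumer].append(message)
    return assignment


def dispatch_key_shared(messages, consumers):
    """Key_Shared mode: messages with the same key always go to the same consumer."""
    assignment = {c: [] for c in consumers}
    for key, payload in messages:
        consumer = consumers[zlib.crc32(key.encode("utf-8")) % len(consumers)]
        assignment[consumer].append((key, payload))
    return assignment
```

In Shared mode consecutive messages are spread across consumers, so their relative order is lost; in Key_Shared mode order is kept per key because one key never spans two consumers.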

RocketMQ: has two consumption modes, BROADCASTING (broadcast mode) and CLUSTERING (cluster mode).

Broadcast consumption: a message is consumed by more than one consumer. Even when the consumers belong to the same ConsumerGroup, the message is consumed once by every Consumer in the group, so in broadcast consumption the ConsumerGroup concept can be considered meaningless for partitioning messages.

Cluster consumption: the Consumer instances in a ConsumerGroup share the messages between them. For example, if a Topic has 9 messages and a ConsumerGroup has 3 instances (3 processes or 3 machines), each instance consumes only part of them, and a message consumed by one instance is not consumed by the others.
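
The 9-messages, 3-instances example can be written out directly. This sketch shares out individual messages for simplicity, whereas a real broker may allocate by queue; the function name `consume` is an assumption for the example, and only the mode names come from the text above.

```python
def consume(messages, instances, mode):
    """Return {instance: [messages]} for one ConsumerGroup under the given mode."""
    if mode == "BROADCASTING":
        # Every instance in the group receives every message.
        return {inst: list(messages) for inst in instances}
    if mode == "CLUSTERING":
        # Messages are shared out; each one is consumed by exactly one instance.
        n = len(instances)
        return {inst: messages[i::n] for i, inst in enumerate(instances)}
    raise ValueError(f"unknown mode: {mode}")


topic = [f"m{i}" for i in range(1, 10)]  # 9 messages
group = ["inst-a", "inst-b", "inst-c"]   # 3 instances

clustered = consume(topic, group, "CLUSTERING")
broadcast = consume(topic, group, "BROADCASTING")
```

Under CLUSTERING each instance ends up with 3 of the 9 messages and no message appears twice; under BROADCASTING every instance sees all 9.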

RabbitMQ: consumption is similar to Pulsar's Shared mode and takes the form of a queue; increasing the number of consumers in a consumer group improves consumption speed.

1.13 Message reliability

Message loss is a common pain point when using message middleware, and the message reliability behind it is a key factor in judging the quality of a message middleware. In areas such as financial payments, message reliability is particularly important.

For example, when a service fails, will messages that producers have already produced successfully be lost during the high-availability switchover? Synchronous flushing to disk is an effective way to increase a component's reliability, and message middleware is no exception; both Kafka and RabbitMQ support it. In most cases, however, reliability should not be guaranteed by synchronous flushing, which is very costly for performance, but by a multi-replica mechanism.

Support in each MQ is as follows:

Kafka: the reliability level is set with the request.required.acks parameter, which specifies how many replicas must confirm receipt of a message before it is considered successfully sent.

  • request.required.acks=-1 (confirmation from all in-sync replicas, strong reliability guarantee)

  • request.required.acks=1 (the leader acknowledges receipt, default)

  • request.required.acks=0 (no confirmation, but high throughput)

Pulsar: has a concept similar to Kafka's, called Ack Quorum Size (Qa). Qa is the number of Bookies that must reply with an acknowledgement after each write request is sent. The larger the value, the longer it takes to confirm a successful write; its upper limit is the number of replicas Qw. For consistency, Qa should be (Qw+1)/2 or more; that is, to keep data safe, the lower limit of Qa is (Qw+1)/2.

RocketMQ: similar to Kafka.

RabbitMQ: uses a master-slave architecture; multiple replicas and strong consistency semantics are achieved through mirrored ring queues. With multiple replicas, when the master node goes down unexpectedly a slave can be promoted to the new master and continue serving, which ensures availability.

2. Performance

In performance testing there are many variables: client and server parameter settings, machine performance and configuration, message reliability level, compression algorithm, and so on. It is hard to run a "completely" fair test with all variables controlled. A few points are still worth noting:

  • RabbitMQ's latency is at the microsecond level while the other components are at the millisecond level, so RabbitMQ has relatively low latency among the MQ components.

  • A single Kafka instance slows down significantly when it has many topics or partitions:

  1. Kafka stores each partition as a file. With too many topics, the total number of partitions grows and Kafka holds too many files; when messages are flushed, the files contend for the disk and performance degrades.

  2. In addition, Kafka rebalances every time a consumer joins or leaves. With many partitions the rebalance can take a long time, and consumers cannot consume messages while it is in progress.

  • Pulsar, thanks to its separation of storage and compute, can support millions of topics.

Pulsar and Kafka are both widely used in enterprises, each has its own strengths, and both can handle heavy traffic with roughly the same amount of hardware. Some users mistakenly think that because Pulsar uses many components, it needs many servers to match Kafka's performance.

This holds for some specific hardware configurations, but in most cases with the same resources Pulsar has the clearer advantage and achieves better performance.

For instance, Splunk recently shared why they chose Pulsar over Kafka, mentioning that "thanks to its layered architecture, Pulsar helped them reduce costs by 30%-50%, reduce latency by 80%-98%, and reduce operating costs by 33%-50%". The Splunk team found that Pulsar makes better use of disk IO, lowers CPU utilization, and controls memory better.

In a distributed system, single-machine performance figures matter, but the overall performance of the distributed system, its ability to scale elastically, and capabilities such as high availability and disaster recovery are also important criteria. The actual performance of MQ middleware should be evaluated case by case, through load testing and tuning with the cluster configuration actually purchased and the client parameters actually used.

3. Operation and maintenance

Abnormal situations such as downtime, network jitter, and capacity expansion inevitably occur in use. Capabilities such as remote disaster recovery and a high-availability architecture let a message queue avoid failures caused by the unavailability of some compute nodes, the network, or other infrastructure.

3.1 High availability

Support in each MQ is as follows:

Kafka: achieves high availability through multiple replicas of each partition.

Pulsar: the Brokers in the compute cluster are stateless and can be scaled out and in flexibly. The Bookie storage nodes split message partitions into segments and replicate them; each segment has one or more replicas, so that when a Bookie goes down, other copies of the segment can still serve requests.

RocketMQ and RabbitMQ: both use a master-slave architecture. When the master goes down, the former slave node continues to provide service. The standby serves consumers so that messages are not lost, but it does not accept writes.

3.2 Cross-region disaster recovery

Pulsar natively supports cross-region disaster recovery. In the figure, whenever producers P1, P2, and P3 send messages to topic T1 in Cluster-A, Cluster-B, and Cluster-C respectively, the messages are quickly replicated across the clusters. Once a message has been replicated, consumers C1 and C2 consume it from their own clusters.

With this cross-region design, first, services can easily be distributed across multiple data centers; second, data-center-level failures can be handled: when one data center is unavailable, services can be moved to another data center and continue serving.

In one sentence: Pulsar's cross-region replication creates a Producer in the local cluster that uses the remote cluster as its send target, sends messages from the local cluster to the remote one, and maintains a local Cursor to ensure message reliability and idempotency.
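
The one-sentence summary above can be turned into a toy model. Everything here is a hypothetical sketch rather than Pulsar's implementation: `Replicator`, `RemoteTopic`, and the use of the log index as a sequence id are assumptions made to show why a cursor plus de-duplication gives reliable, idempotent replication.

```python
class Replicator:
    """Forwards a local topic to a remote cluster, tracking progress with a cursor."""

    def __init__(self, local_log, send_remote):
        self.local_log = local_log      # list of payloads in the local cluster
        self.send_remote = send_remote  # callable(sequence_id, payload) -> bool
        self.cursor = 0                 # index of the next message to replicate

    def run_once(self):
        """Replicate until the log is drained or a send fails; return messages sent."""
        sent = 0
        while self.cursor < len(self.local_log):
            if not self.send_remote(self.cursor, self.local_log[self.cursor]):
                break                   # cursor stays put, so the message is retried later
            self.cursor += 1
            sent += 1
        return sent


class RemoteTopic:
    """Remote side: drops duplicates by sequence id so retries are idempotent."""

    def __init__(self):
        self.messages = []
        self.last_sequence_id = -1

    def receive(self, sequence_id, payload):
        if sequence_id > self.last_sequence_id:
            self.messages.append(payload)
            self.last_sequence_id = sequence_id
        return True
```

The cursor only advances after a successful send, so nothing is skipped on failure; the sequence-id check on the remote side means a message that is resent after an uncertain failure is not stored twice.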

3.3 Cluster expansion

When message volume rises sharply and the message queue cluster reaches a bottleneck, the cluster has to be expanded. Expansion is generally either horizontal or vertical: horizontal expansion adds nodes to the cluster, and vertical expansion upgrades the configuration of some nodes to increase their processing power.

Support in each MQ is as follows:

Kafka: topic partitions are physically stored on Broker nodes, so a newly added node holds no partitions and cannot serve immediately. Some topic partitions have to be reassigned to the new node, which involves rebalancing partition data: the data of those partitions is copied to the new node. How long this takes depends on the amount of data accumulated in the partitions and on Broker performance; an overloaded Broker or a very large backlog can make data balancing take a long time.

Pulsar: is built around an unbounded distributed log made of segments, implemented with scalable log storage (Apache BookKeeper) and built-in tiered storage support, so segments can be evenly distributed across storage nodes. Because the data of any given topic is not tied to a specific storage node, it is easy to replace storage nodes or to scale in or out. In addition, the smallest or slowest node in the cluster does not become a bottleneck for storage or bandwidth.

RocketMQ: new nodes join the cluster directly; new topics can be created and queues allocated on the new broker, or queues can be allocated for existing topics. The difference from Kafka is that Kafka's partitions are physical partitions on different machines, whereas RocketMQ's are logical partitions in the form of queues, so there is no data imbalance problem.

RabbitMQ: since little message persistence is involved, nodes are added to the cluster directly.

4. Summary

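| Category | Feature | Kafka | Pulsar | RocketMQ | RabbitMQ |
| --- | --- | --- | --- | --- | --- |
| Features | Push/pull consumption mode | pull | push | pull | push |
| Features | Delay queue | No | Yes | Yes | Yes |
| Features | Dead letter queue | No | Yes | Yes | Yes |
| Features | Priority queue | No | No | No | Yes |
| Features | Message backtracking | Yes | Yes | Yes | No |
| Features | Message persistence | Yes | Yes | Yes | Yes |
| Features | Message confirmation mechanism | Offset | Offset + individual | Offset | Individual |
| Features | Message TTL | No | Yes | Yes | Yes |
| Features | Multi-tenant isolation | No | Yes | No | No |
| Features | Message ordering | Ordered within a partition | Ordered in streaming mode | Consumer lock | No |
| Features | Message query | No | Yes | Yes | Yes |
| Features | Consumption modes | Streaming mode | Streaming mode + queue mode | Broadcast mode + cluster mode | Queue mode |
| Features | Message reliability | request.required.acks | Ack Quorum Size (Qa) | Similar to Kafka | Mirror mode |
| Performance | Single-machine throughput | 605 MB/s | 605 MB/s | Similar to Kafka | 38 MB/s |
| Performance | Message latency | 5 ms | 5 ms | Millisecond level | Microsecond level |
| Performance | Number of supported topics | Dozens to hundreds | Millions | Hundreds to thousands | Thousands |
| Operation and maintenance | High availability | Distributed architecture | Distributed architecture | Master-slave architecture | Master-slave architecture |
| Operation and maintenance | Cross-region disaster recovery | - | Yes | - | - |
| Operation and maintenance | Cluster expansion | Add nodes; balance by copying data | Add nodes; balance by adding new segments | Add nodes | Add nodes |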

Kafka was released earlier and has mature solutions for scenarios such as logging and big data processing.

Pulsar is the newcomer. It supports a richer set of features than Kafka, including cross-region disaster recovery and multi-tenancy, addresses many of Kafka's design shortcomings and operational costs, and has stronger overall stability. Many large companies at home and abroad have published practical Pulsar use cases.

Therefore, for traditional logging and big data processing scenarios that need high throughput and are less demanding about message reliability, Kafka is a reasonable choice, and there is plenty of good documentation on tuning its parameters for performance. For scenarios that are sensitive to message reliability, have higher disaster recovery requirements, or need a large number of partitions or delay queues, Pulsar is a reasonable choice.

References:

Understand in 10 minutes: a comprehensive comparison for choosing a message queue

Copyright notice
This article was written by [Impl_ Sunny]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/177/202206260600328584.html