Many business scenarios must take message ordering into account:

(1) One-on-one chat: the order in which the receiver displays messages must match the order in which the sender sent them;

(2) Group chat: every recipient must see the messages in the same order;

(3) Top-up and payment requests: requests initiated by the same user must be executed on the server in the same order in which they were initiated;

Message ordering is a very hard problem in distributed system architecture design. What are the common optimization practices?

Trade-off one: whose clock prevails, the client’s or the server’s?

Whatever the scenario, you need a ruler against which order is measured. Depending on the business scenario, that ruler can be the client’s clock or the server’s clock. For example:

(1) Emails are displayed in order of the send time set by the client;

Aside: if the sender sets the time field in the mail protocol to 1970 or 2970, the message can stay permanently at the top or the bottom of the recipient’s list.

(2) The start of a flash sale must be judged by the server’s clock; otherwise a client could change its local time and buy before the sale opens;
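
The flash-sale case can be sketched in a few lines. This is only an illustration: `SALE_START` and `try_purchase` are hypothetical names, and the start timestamp is an arbitrary value chosen for the example. The point is that the time claimed by the client is never consulted.

```python
import time

# Hypothetical sale start, in seconds since the epoch on the SERVER clock.
SALE_START = 1_700_000_000.0


def try_purchase(client_claimed_time, now=None):
    """Accept a flash-sale request only if the server clock says the sale is open.

    client_claimed_time is deliberately ignored because the client controls it.
    `now` can be injected so the check is testable without waiting.
    """
    server_now = time.time() if now is None else now
    return server_now >= SALE_START


# A client that has set its local clock forward still gets rejected.
early = try_purchase(client_claimed_time=SALE_START + 3600, now=SALE_START - 1)
on_time = try_purchase(client_claimed_time=0, now=SALE_START)
```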

Trade-off two: The server generates a monotonically increasing id as a timing basis

For scenarios that require strict ordering, you can use the seq/auto_inc_id of a single-point-write database to generate monotonically increasing ids, which guarantees order.

Aside: this single id-generating point can easily become a bottleneck.
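
A minimal sketch of this idea, assuming an in-memory SQLite database stands in for the single-point-write db. The table name `id_seq` and the helper `next_id` are made up for the example; a production system would use its own database's sequence or auto-increment facility.

```python
import sqlite3

# One database with one writer: its auto-increment column is the single ruler.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE id_seq (id INTEGER PRIMARY KEY AUTOINCREMENT, stub INTEGER)"
)


def next_id():
    """Insert a row and return its auto-increment id, which only ever grows."""
    cur = conn.execute("INSERT INTO id_seq (stub) VALUES (0)")
    conn.commit()
    return cur.lastrowid


ids = [next_id() for _ in range(3)]
```

Every caller has to go through the same database, which is exactly why this point tends to become the bottleneck.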

Trade-off three: if the business can tolerate small errors, use trend-increasing ids

Sending messages, publishing posts, and even flash sales do not need such precise ordering:

(1) Chat messages sent within the same 1s may be displayed out of order, and that is fine;

(2) Posts published within the same 1s may be ranked slightly wrong, and that is fine;

(3) For a flash sale, clock differences of under 1s between servers may mean that a request landing on server A succeeds while the sale has not yet started on server B; the business can accept this too (users do not notice);

In most businesses, then, ids that increase as a long-term trend are enough, and ordering errors within a very short window are acceptable to a certain extent.

A distributed id generation algorithm can therefore be used to generate the ids that serve as the ordering basis.
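
The article does not name a specific algorithm. As one possible illustration, here is a minimal sketch of a Snowflake-style layout (timestamp, then worker id, then a per-millisecond sequence). The class name `TrendIdGenerator`, the bit widths, and the injectable `clock` are assumptions made for this example, and the sketch is not thread-safe.

```python
import time


class TrendIdGenerator:
    """Snowflake-style ids: [timestamp in ms | worker id | per-ms sequence]."""

    WORKER_BITS = 10  # assumed width: up to 1024 workers
    SEQ_BITS = 12     # assumed width: up to 4096 ids per ms per worker

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        # clock() returns milliseconds; injectable so behaviour can be tested.
        self.clock = clock or (lambda: int(time.time() * 1000))
        self.last_ms = -1
        self.seq = 0

    def next_id(self):
        # Never move backwards on this worker, even if the wall clock does.
        now = max(self.clock(), self.last_ms)
        if now == self.last_ms:
            self.seq += 1
            if self.seq >= (1 << self.SEQ_BITS):
                # Sequence exhausted in this millisecond: borrow the next one.
                now += 1
                self.seq = 0
        else:
            self.seq = 0
        self.last_ms = now
        shift = self.WORKER_BITS + self.SEQ_BITS
        return (now << shift) | (self.worker_id << self.SEQ_BITS) | self.seq


# Two workers whose clocks differ by 1 ms.
a = TrendIdGenerator(worker_id=1, clock=lambda: 1000)
b = TrendIdGenerator(worker_id=2, clock=lambda: 1001)
a1, a2, b1 = a.next_id(), a.next_id(), b.next_id()
```

Ids from one worker strictly increase, and ids across workers follow the clocks, so a worker whose clock lags slightly can issue a smaller id for a later event. That is the small error this trade-off accepts.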

Trade-off four: single-point serialization keeps multiple machines in the same order

For high availability, data has to be redundant: the same data is stored in several places. How do you ensure that modifications are applied to all copies in a consistent order?

“Single-point serialization” can do this:

(1) First serialize the operations on one machine;

(2) Then distribute that operation sequence to all machines, so that every machine applies the operations in the same order and the final data is consistent;
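
The two steps above can be sketched as follows. `Primary` and `Replica` are hypothetical classes written for this illustration, and operations are simplified to key/value writes. Each replica buffers entries and applies them strictly by log index, so the delivery order over the network does not matter.

```python
class Primary:
    """The single point: gives every incoming operation a position in one log."""

    def __init__(self):
        self.log = []

    def serialize(self, op):
        self.log.append(op)  # arrival order at the primary becomes THE order
        return len(self.log) - 1, op


class Replica:
    """Applies operations strictly in log-index order, whatever the arrival order."""

    def __init__(self):
        self.state = {}
        self.next_index = 0
        self.pending = {}

    def receive(self, index, op):
        self.pending[index] = op
        # Apply every buffered operation whose turn has come.
        while self.next_index in self.pending:
            key, value = self.pending.pop(self.next_index)
            self.state[key] = value
            self.next_index += 1


primary = Primary()
entries = [primary.serialize(op) for op in [("x", 3), ("x", 1), ("x", 2)]]

r1, r2 = Replica(), Replica()
for entry in entries:            # r1 receives the log in order
    r1.receive(*entry)
for entry in reversed(entries):  # r2 receives it in reverse order
    r2.receive(*entry)
```

Both replicas end up with the same value for `x`, because both apply the sequence chosen by the primary.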

Typical scenario 1: Database master-slave synchronization

In a master-slave database architecture, suppose upstream clients issue three operations: op1, op2, and op3. The master serializes all SQL writes, say as op3, op1, op2, and then sends that same sequence to each slave, so that every database holds the same data. This is the “single-point serialization” idea at work.

Typical scenario 2: File consistency in GFS

GFS (Google File System) stores multiple replicas of each file to keep it available. When several upstream clients write to the same file, a primary chunk-server first serializes the writes and then sends the serialized operations to the other chunk-servers, which keeps the redundant copies consistent.

One-on-one chat: how do you keep the display order consistent with the sending order?

In one-on-one chat, sender A sends three messages, msg1, msg2, and msg3, to receiver B in turn. How can B be sure to display them in the order in which they were sent?

The design reasoning is as follows:

(1) If ordering is decided by single-point serialization on the server, the server may receive the messages as msg3, msg1, msg2, which does not match the sending order;

(2) The business does not need a global message order; it only needs the messages that a given sender A sends to B to stay in order. A common optimization is to attach to each message from A to B an absolute sequence value generated locally by sender A, and have receiver B display messages according to that value.

One problem may remain: if receiver B receives msg3 first, msg3 is displayed first; when msg1 and msg2 arrive later, they are inserted in front of msg3.
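
A minimal sketch of the client-seq approach, including the late-arrival behaviour just described. `Sender` and `ChatWindow` are hypothetical names, and a simple local counter stands in for the sender's sequence value.

```python
import bisect


class Sender:
    """Stamps each outgoing message with a locally generated, increasing seq."""

    def __init__(self):
        self.seq = 0

    def send(self, text):
        self.seq += 1
        return (self.seq, text)


class ChatWindow:
    """Receiver side: keeps messages sorted by the sender's seq."""

    def __init__(self):
        self.messages = []

    def receive(self, msg):
        bisect.insort(self.messages, msg)  # insert at the position seq dictates

    def display(self):
        return [text for _, text in self.messages]


a = Sender()
m1, m2, m3 = a.send("msg1"), a.send("msg2"), a.send("msg3")

window = ChatWindow()
window.receive(m3)                  # the network delivers msg3 first
first_view = window.display()
window.receive(m1)
window.receive(m2)
final_view = window.display()       # msg1 and msg2 land in front of msg3
```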

Group chat: how do you ensure that every member sees messages in the same order?

In group chat, N members talk in one group. How do you ensure that all members display the messages they receive in the same order?

The design reasoning is as follows:

(1) Suppose we use the sender’s seq, as in one-on-one chat. Because there are multiple senders rather than a single point, no seq can be generated uniformly across them, so members may end up with inconsistent orders;

(2) Therefore, a single point on the server can be used to do the serialization.

As shown in the above figure, the group-chat sending flow then becomes:

(1) sender1 sends msg1 and sender2 sends msg2;

(2) msg1 and msg2 pass through the access cluster and the service cluster;

(3) The service layer fetches a unique seq from the underlying layer to determine the display order on the receivers;

(4) The service obtains seq 20 for msg2 and seq 30 for msg1;

(5) The delivery service sends the messages to all group members, who display them by seq even if msg1 and msg2 arrive at different times;

This approach works: every group member displays the messages in the same order.

The downside is that a service that generates a globally incremented sequence number can easily become a system bottleneck.
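
The flow above can be sketched as follows. `SeqService` and `display` are hypothetical names, and the counter simply starts at 1 rather than reproducing the seq values 20 and 30 from the example.

```python
import itertools


class SeqService:
    """Single point that hands out a globally increasing seq."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next_seq(self):
        return next(self._counter)


def display(received):
    """Each group member sorts what it received by the server-assigned seq."""
    return [text for _, text in sorted(received)]


seq_service = SeqService()
# msg2 happens to reach the service layer before msg1.
msg2 = (seq_service.next_seq(), "msg2")
msg1 = (seq_service.next_seq(), "msg1")

# Delivery order differs per member, but the displayed order does not.
member_a_view = display([msg1, msg2])
member_b_view = display([msg2, msg1])
```

Every message from every group passes through `next_seq`, which is where the bottleneck comes from.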

Are there any further optimization methods?

In fact, group chat does not need a global order across all messages; messages only need to be ordered within each group. This makes “id serialization” a good fit.

In this scheme, the service layer no longer fetches a global seq from a shared backend. Instead, a small change at the service connection-pool level routes all messages of a given group to the same service instance, and that instance uses a local seq to serialize all messages of the group, so that every member sees them in the same order.

At this point, generating the seq from the local clock works. Isn’t that clever?
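
A minimal sketch of id serialization, assuming a fixed list of service instances and a CRC32 hash of the group id as the routing rule. `Router` and `ServiceNode` are hypothetical names, and a per-group local counter is used here in place of the local clock so the result is deterministic.

```python
import zlib
from collections import defaultdict


class ServiceNode:
    """One service instance; it orders only the groups routed to it."""

    def __init__(self, name):
        self.name = name
        self.local_seq = defaultdict(int)  # one local counter per group

    def stamp(self, group_id, text):
        self.local_seq[group_id] += 1
        return (group_id, self.local_seq[group_id], text)


class Router:
    """Connection-pool level change: the same group id always maps to the same node."""

    def __init__(self, nodes):
        self.nodes = nodes

    def node_for(self, group_id):
        index = zlib.crc32(group_id.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]


router = Router([ServiceNode("svc-0"), ServiceNode("svc-1"), ServiceNode("svc-2")])

first = router.node_for("group-42").stamp("group-42", "hello")
second = router.node_for("group-42").stamp("group-42", "world")
other = router.node_for("group-7").stamp("group-7", "hi")
```

No shared backend is involved: each group gets its own ordered sequence, and different groups can be handled by different instances in parallel.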


Summary

(1) To be “ordered”, there must be a ruler against which order is measured; it can be the client’s or the server’s;

(2) Most businesses can accept order that holds as a broad trend with small local errors; businesses that need absolute order can rely on the server to provide an absolute sequence;

(3) Single-point serialization is a common way to keep ordering consistent across machines; typical scenarios are db master-slave consistency and GFS multi-replica file consistency;

(4) One-on-one chat only needs the display order to match the sending order, so the client seq can be used;

(5) Group chat needs every receiver to see the same order, so a server seq is required; there are two methods: single-point absolute ordering and id serialization;

The reasoning matters more than the conclusions, and I hope everyone takes something away from it.
