
From the "Flink from beginner to proficient" series of articles

Most people have seen flash sales, such as JD.com's or Taobao's limited-time deals, or Xiaomi phone flash sales. So how is the backend of a flash sale system implemented? How do we design one? What issues must be considered? How do we design a robust flash sale system? In this article we explore these questions.

What issues a flash sale system must consider

Oversold problem
Analyzing the flash sale business scenario, the most important point is overselling: if the stock is only 100 units but 200 end up being sold, there is a serious problem. Flash sale prices are usually very low, so overselling directly damages the company's financial interests. Preventing goods from being oversold is therefore the first problem to solve.
High concurrency
Flash sales are short and highly concurrent: a sale lasts only a few minutes, and companies typically advertise a very low price to create a sensation, so many users join the rush. A huge number of requests pour in within a short period. How to prevent the high concurrency from breaking the cache (breakdown or invalidation) and collapsing the database behind it are all issues that must be considered.
Protection against interface brushing
Most flash sales today attract dedicated bot software written for the event. Such software keeps firing requests at the backend server; hundreds of requests per second are common. How to reject these repeated, invalid requests and stop the continuous hammering also needs targeted consideration.
The flash sale URL
Ordinary users only see a simple flash sale page: before the start time the buy button is gray, and once the time arrives it becomes clickable. That is fine for casual users, but anyone with a little technical skill can press F12, watch the browser's network panel to discover the flash sale URL, and then drive the purchase with a tool. Anyone who knows the URL in advance can complete the purchase the instant requests are accepted. We need to account for this.
Database design
A flash sale risks bringing down our servers, and if it shares a database with our other businesses, it can easily drag them down too. How do we prevent this, so that even if the flash sale suffers downtime or stalled servers, it affects the normal online business as little as possible?
The problem of massive request volume
Following on from "High concurrency": even a cache may not withstand the short burst of high-concurrency traffic. Carrying such enormous traffic while still providing stable, low-latency service is a major challenge. Let's do the math: a single Redis server can sustain roughly 40,000 QPS. If the flash sale attracts enough users, peak QPS can reach several hundred thousand, which a single Redis cannot support. The cache would be broken through, requests would penetrate straight to the DB and crush MySQL, and the backend would throw a flood of errors.
Design and technical solutions of the flash sale system


Database of the flash sale system
For the problem raised in "Database design" above, a separate flash sale database should be set up, so that the high concurrent traffic of a flash sale cannot drag down the entire site. Only two tables are needed here: a flash sale order table and a flash sale goods table.
In a full system there would be a few more tables. A goods table: via goods_id you can look up the product details, images, name, usual price, flash sale price, and so on. A user table: via user_id you can look up the user's nickname, mobile number, delivery address and other information. These are not shown in detail here.
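As a sketch, the two flash sale tables might look like the following; all table and column names here are assumptions for illustration, not the article's actual schema (the version column supports the optimistic-lock update discussed below):

```sql
-- Hypothetical schema sketch; table and column names are assumptions.
CREATE TABLE miaosha_goods (
    id            BIGINT PRIMARY KEY AUTO_INCREMENT,
    goods_id      BIGINT NOT NULL,           -- reference into the normal goods table
    miaosha_price DECIMAL(10, 2) NOT NULL,   -- discounted flash sale price
    stock         INT NOT NULL,              -- remaining flash sale stock
    version       INT NOT NULL DEFAULT 0,    -- optimistic-lock version number
    start_time    DATETIME NOT NULL,
    end_time      DATETIME NOT NULL
);

CREATE TABLE miaosha_order (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id     BIGINT NOT NULL,
    goods_id    BIGINT NOT NULL,
    create_time DATETIME NOT NULL,
    UNIQUE KEY uq_user_goods (user_id, goods_id)  -- one flash sale order per user per item
);
```

The unique key on (user_id, goods_id) is one common way to stop a single user from placing duplicate flash sale orders at the database level.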
Design of the flash sale URL
To prevent people with programming experience from calling the order interface directly by its URL, the flash sale URL must be made dynamic: even the developers of the system cannot know it before the sale starts. The concrete approach is to MD5-hash a random string and use the digest as part of the flash sale URL. The front end first asks the backend for the concrete URL, and the purchase proceeds only after the backend verifies it.
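A minimal sketch of this idea in Java; the class and method names here are assumptions for illustration, not the article's code. The backend generates a random salt per user and item, MD5-hashes it into an opaque path segment, stores it (for example in Redis keyed by the user), and later compares it against the path the client actually requests:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Sketch of a dynamic flash sale URL (hypothetical names).
public class MiaoshaUrl {

    // Build an unpredictable path segment for this user and item;
    // the server stores it and verifies it on the order request.
    public static String createPath(long userId, long goodsId) {
        String salt = UUID.randomUUID().toString();
        return md5(salt + "_" + userId + "_" + goodsId);
    }

    // Hex-encoded MD5 digest of the input string.
    static String md5(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The order endpoint might then look like /miaosha/{path}/order (a hypothetical route), and the backend rejects any request whose path does not match the value stored for that user.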


Static flash sale page
The product description, parameters, transaction records, images, reviews and so on are all written into a static page, so user requests do not need to hit the backend server or go through the database, and the page is served directly to the client, minimizing server pressure. Concretely, FreeMarker template technology can be used: create a page template, fill in the data, and render the page.

Upgrading standalone Redis to a Redis cluster
A flash sale is a read-heavy, write-light scenario, well suited to Redis caching. But considering the cache breakdown problem, we should build a Redis cluster and use sentinel mode, which improves both the performance and the availability of Redis.
Using Nginx
Nginx is a high-performance web server that can handle tens of thousands of concurrent connections, while a single Tomcat handles only a few hundred. Having Nginx receive client requests and distribute them across a cluster of backend Tomcat servers greatly improves the concurrency we can serve.
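A minimal nginx configuration sketch for this setup; the upstream name and server addresses are assumptions:

```nginx
# Hypothetical upstream of three Tomcat instances.
upstream miaosha_tomcat {
    server 192.168.0.11:8080;
    server 192.168.0.12:8080;
    server 192.168.0.13:8080;
}

server {
    listen 80;

    location / {
        # Forward client requests to the Tomcat cluster (round-robin by default).
        proxy_pass http://miaosha_tomcat;
    }
}
```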
Streamline SQL
A typical scenario is deducting inventory. The traditional practice is to query the stock first and then update it, which takes two SQL statements, while in fact one is enough. It can be done like this: update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0; This both guarantees the stock cannot be oversold and updates it in a single statement. Note that an optimistic lock on the version column is used here (the update must also increment the version), which performs better than a pessimistic lock.
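The effect of that single statement can be sketched in plain Java: the (stock, version) pair is updated with one compare-and-set, so a concurrent deduction carrying a stale version simply matches nothing, just as the SQL UPDATE would affect zero rows. This is an in-memory illustration of the idea, not the article's code:

```java
import java.util.concurrent.atomic.AtomicLong;

// In-memory sketch of the one-statement optimistic-lock decrement; it stands in
// for "update ... set stock = stock - 1, version = version + 1
//      where goods_id = ? and version = ? and stock > 0".
public class StockTable {
    // Pack (stock, version) into one atomic value so a single compare-and-set
    // mirrors the single UPDATE statement.
    private final AtomicLong state; // high 32 bits: stock, low 32 bits: version

    public StockTable(int stock) {
        this.state = new AtomicLong((long) stock << 32);
    }

    public int stock()   { return (int) (state.get() >>> 32); }
    public int version() { return (int) state.get(); }

    // Returns true if one unit was deducted, false if sold out or the version raced.
    public boolean tryDeduct(int expectedVersion) {
        long cur = state.get();
        int stock = (int) (cur >>> 32);
        int version = (int) cur;
        if (stock <= 0 || version != expectedVersion) {
            return false; // the WHERE clause did not match: 0 rows updated
        }
        long next = ((long) (stock - 1) << 32) | ((version + 1) & 0xFFFFFFFFL);
        return state.compareAndSet(cur, next);
    }
}
```

In the real SQL version, the application re-reads the row and retries (or gives up) whenever the update reports 0 affected rows.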
Redis pre-deducted inventory
Many requests arrive and each needs the stock checked in the backend, a frequently-read scenario, so Redis can be used to pre-deduct the inventory. Before the sale starts, set the value in Redis, for example redis.set(goodsId, 100), where the pre-placed stock of 100 can be kept as a constant. On each request, read Integer stock = (Integer) redis.get(goodsId); and decrement it only while it is positive. Note that when an order is cancelled the stock must be added back, and when adding back you must never exceed the configured total. Reading the stock and deducting it must be a single atomic operation, which a Lua script in Redis provides. When the order is actually placed and the stock is needed again, it can be read straight from Redis.
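The atomic check-and-decrement mentioned above can be expressed as a small Redis Lua script; this is a sketch, and the key is simply whatever the stock was stored under (e.g. the goodsId):

```lua
-- KEYS[1] = stock key; returns 1 on success, 0 if sold out.
local stock = tonumber(redis.call('GET', KEYS[1]))
if stock and stock > 0 then
    redis.call('DECR', KEYS[1])
    return 1
end
return 0
```

Because Redis executes a Lua script atomically, no other client can read or decrement the key between the GET and the DECR.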

Interface throttling
The ultimate destination of a request is a database update, but a large share of the requests are invalid; what we ultimately need to do is filter those invalid requests out so they never reach the database. Throttling can be applied at several levels:
Front-end throttling
The first step is to limit traffic at the front end: once the flash sale button is clicked, disable it for the next 5 seconds. This measure costs little to develop but is effective.
Rejecting repeated requests from the same user within N seconds
The concrete number of seconds depends on the business and the scale of the sale; 10 seconds is a common limit. The method uses Redis key expiry. For each request, first run String value = redis.get(userId); if the value is null, this is the first (valid) request and it is let through; otherwise it is a repeat and is simply discarded. For a valid request, call redis.setex(userId, 10, value), setting userId as the key with a 10-second expiry (after 10 seconds the key disappears and get returns null again). The value can be anything, though storing a business attribute is a good choice.
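A deterministic in-memory sketch of this filter, with hypothetical names: a ConcurrentHashMap stands in for Redis, and the caller passes the clock in explicitly. In production, Redis SETEX expires the key server-side, whereas here entries are never evicted:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of "reject repeated requests from the same user within N seconds".
public class DedupFilter {
    private final ConcurrentHashMap<String, Long> lastSeen = new ConcurrentHashMap<>();
    private final long windowMillis;

    public DedupFilter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns true for the first request in the window, false for repeats.
    public boolean allow(String userId, long nowMillis) {
        Long prev = lastSeen.putIfAbsent(userId, nowMillis);
        if (prev == null) {
            return true; // first request from this user: allow
        }
        if (nowMillis - prev >= windowMillis) {
            // window elapsed: refresh the timestamp atomically and allow
            return lastSeen.replace(userId, prev, nowMillis);
        }
        return false; // repeat inside the window: drop the request
    }
}
```

With Redis the equivalent logic is a GET followed by SETEX, which should itself be made atomic (e.g. SET with NX and EX options) under heavy concurrency.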
Token bucket throttling
There are many strategies for interface throttling; here we use the token bucket algorithm. Its basic idea is that every request tries to acquire a token and the backend only handles requests that hold one; we control the rate at which tokens are produced. Guava provides the RateLimiter API for this. Here is a simple example; note that Guava must be added as a dependency:

import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Produce 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);

        for (int i = 0; i < 10; i++) {
            // acquire() blocks the thread until a token is available in the
            // bucket, and returns the time spent waiting (in seconds).
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " executed, waited " + waitTime + "s");
        }
        System.out.println("End of execution");
    }
}



The idea of the above code is to make our token bucket produce 1 token per second via RateLimiter (a deliberately low production rate) and loop 10 times executing the task. acquire() blocks the current thread until it obtains a token; if no token is available, the task waits indefinitely. Each request is therefore held back for the configured interval before it can proceed, and the method returns the concrete time the thread waited. The output looks like this:

You can see that the first task does not need to wait, because a token was already produced during the first second after start. Each subsequent task must wait until the token bucket produces a new token before proceeding; until then it blocks (there is a visible pause). However, this approach is not ideal: if many client requests pile up, they all stall in the backend waiting for tokens (a poor user experience), and no task is ever abandoned. We need a better strategy: if a token cannot be obtained within a certain time, reject the request outright. Here is another example:


import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        final RateLimiter rateLimiter = RateLimiter.create(1);

        for (int i = 0; i < 10; i++) {
            // Wait at most 500 ms for a token; tryAcquire returns false
            // immediately if the token cannot be obtained in that time.
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " valid: " + isValid);
            if (!isValid) {
                continue;
            }
            System.out.println("Task " + i + " executed");
        }
    }
}


Here the tryAcquire method is used. Its main feature is a timeout: it estimates whether a token can be obtained within the given time (note: it estimates, it does not actually wait), returning true if so and false otherwise. We then simply skip the invalid tasks. With 1 token produced per second and each task allowed 0.5 seconds to try, any task that cannot get a token is skipped directly (in the flash sale setting, the request is simply discarded). The actual run looks like this:

Only the first task obtained a token and ran; the rest were essentially all discarded, because within 0.5 seconds the token bucket (producing 1 token per second) cannot yet have generated a new token, so tryAcquire returns false.
How effective is this throttling strategy? Suppose 4 million requests arrive instantaneously, token production is set to 20 per second, and each request tries for at most 0.05 seconds to obtain a token. Testing shows that only about 4 requests are allowed through at a time, while the vast majority are rejected. This is the strength of the token bucket algorithm.
Asynchronous ordering
To improve ordering throughput and to survive failures of the order service, order placement should be processed asynchronously. The most common approach is a queue; the three most significant advantages of queues are asynchrony, peak shaving, and decoupling. RabbitMQ can be used here: once a request has passed throttling and inventory checks, it is a valid request and is sent to the queue; the queue consumer picks up the message and places the order asynchronously. Once the order has been written successfully, the user can be notified by SMS that the purchase succeeded; on failure, a compensation mechanism retries.
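A minimal sketch of the queueing step, with a JDK BlockingQueue standing in for RabbitMQ and all names being assumptions: the request thread only enqueues, a consumer drains and persists, and a full queue rejects further requests, which is exactly the peak-shaving behavior:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of asynchronous order placement through a bounded queue.
public class AsyncOrdering {
    private final BlockingQueue<long[]> queue;          // each message: {userId, goodsId}
    private final AtomicInteger placedOrders = new AtomicInteger();

    public AsyncOrdering(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Fast path: a validated request is only enqueued, so the HTTP thread
    // returns immediately. A full queue means the peak is shaved: reject.
    public boolean submit(long userId, long goodsId) {
        return queue.offer(new long[]{userId, goodsId});
    }

    // Consumer side (in a real system this runs on its own thread): drain the
    // queue and place orders; on failure a real system would retry/compensate
    // and notify the user.
    public int drainAndPlaceOrders() {
        long[] msg;
        int n = 0;
        while ((msg = queue.poll()) != null) {
            placedOrders.incrementAndGet(); // stand-in for writing the order row
            n++;
        }
        return n;
    }

    public int placed() {
        return placedOrders.get();
    }
}
```

With RabbitMQ the shape is the same: the web tier publishes the message and returns, and a separate consumer process performs the database write.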
Service degradation
If a server goes down or a service becomes unavailable during the flash sale, a fallback should be prepared. As introduced in an earlier post, circuit breaking and degradation with Hystrix let us provide an alternative service; if the primary service really is down, the user gets a friendly message in return rather than a hang or a raw server error.

Flash sale flowchart:

This is the flash sale flowchart I designed. Of course, the technical choices differ with the expected volume: this process can support hundreds of thousands of requests, but for tens of millions or more it would have to be redesigned, for example by sharding the database into multiple tables and growing the queue and cache clusters with Kafka and Redis. The main point of this design is to show how to approach high-concurrency problems and start solving them; thinking and practicing more at work improves our skills. If there are any errors in this post, please take the trouble to point them out, which is appreciated.
Original link:


