The Alipay card pack holds a user's membership cards and coupons. Whether it is a card or coupon cell in the list or a card or coupon details page, the content shown to the end user is assembled from a static template configuration plus dynamic variable data.

[Figure 1] shows how card and coupon data appears to consumer-side (C-end) users, and [Figure 2] shows how that data is assembled.

In [Figure 2], for example, the template contains two variables, availableAmount and voucherName, and each has a corresponding value in the dynamic variable data. Replacing the two variables in the template with their dynamic values produces “100 yuan red packet name”. After this red packet is used once and 30 yuan is spent, the value of availableAmount in the dynamic data becomes 70. When the user opens the red packet details page again, the reassembled text reads “70 yuan red packet name”.

While working on a recent project, I went through the card and coupon assembly and rendering logic and studied the template variable substitution code in [Figure 3] closely. It is an old piece of code that has been around since the card pack product was born, almost a decade ago. Its job is to replace the variables in the template with dynamic data. At first glance the logic looks fine: every variable enclosed in a pair of $ signs (delimiters included) is replaced with dynamic data. Since this is core logic that is called at very high frequency, it is worth checking whether there is room for performance optimization.

Once the replacement logic was clear, my first impression was that this code had room for performance improvement, for two main reasons:

1. Each while loop performs two indexOf operations

2. Each while loop performs a substring operation
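Figure 3 is not reproduced in the text, so the following is only a hypothetical reconstruction of what such a loop might look like, assuming variables are written as `$name$` and the dynamic data is a `Map<String, String>`. The class and method names are made up for illustration. It shows where the two indexOf calls, the substring call, and String.replace sit in each iteration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reconstruction of the original style of loop; not the code from Figure 3.
final class OriginalStyleRenderer {

    static String render(String template, Map<String, String> data) {
        String result = template;
        int from = 0;
        while (true) {
            int start = result.indexOf('$', from);          // first indexOf in this iteration
            if (start < 0) break;
            int end = result.indexOf('$', start + 1);       // second indexOf in this iteration
            if (end < 0) break;
            String name = result.substring(start + 1, end); // substring in this iteration
            String value = data.get(name);
            if (value == null) {
                from = end + 1;                             // unknown variable: leave it as is
                continue;
            }
            // String.replace returns a new String, which is then reassigned.
            result = result.replace("$" + name + "$", value);
            from = start + value.length();
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("availableAmount", "100");
        data.put("voucherName", "red packet");
        System.out.println(render("$availableAmount$ yuan $voucherName$", data)); // 100 yuan red packet
    }
}
```

The sample values mirror the red packet example above; changing availableAmount to 70 yields the reassembled text for the second visit.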

That raises two questions:

1. Can the indexOf and substring operations be reduced?

2. Does the template variable lookup really have to run on every call?

With these two questions in mind, the optimization and testing proceeded step by step.

Five versions were iterated over the course of the optimization, and the end result was a performance improvement of more than 10 times. The implementation of each version and the performance comparisons are described below.

This version (V1) removes the indexOf and substring operations and extracts the variables in a different way.

The previous logic walked the template string from beginning to end and replaced each variable between $ signs as it went, which required repeated indexOf and substring calls. The new implementation first makes one pass over the template string with two pointers to extract all of its variables, and then loops over the resulting variable collection to replace each variable in the template content in turn.
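As a rough sketch of this idea (the article's own V1 code is not shown in the text, so the names and details here are assumptions): `left` remembers the opening $ and `right` scans forward to the closing one, and the names are copied straight out of the character array. The replacement step still uses String.replace, as V1 does.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of a two-pointer extraction pass followed by replacement.
final class TwoPointerRenderer {

    static Set<String> extractVariables(String template) {
        Set<String> names = new LinkedHashSet<>();
        char[] chars = template.toCharArray();
        int left = -1;                                   // index of the opening $, or -1 if none is open
        for (int right = 0; right < chars.length; right++) {
            if (chars[right] != '$') continue;
            if (left < 0) {
                left = right;                            // found an opening $
            } else {
                // Closing $ found: copy the characters strictly between the two pointers.
                names.add(String.valueOf(chars, left + 1, right - left - 1));
                left = -1;
            }
        }
        return names;
    }

    static String render(String template, Map<String, String> data) {
        String result = template;
        for (String name : extractVariables(template)) {
            String value = data.get(name);
            if (value != null) {
                result = result.replace("$" + name + "$", value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("availableAmount", "100");
        data.put("voucherName", "red packet");
        System.out.println(extractVariables("$availableAmount$ yuan $voucherName$")); // [availableAmount, voucherName]
        System.out.println(render("$availableAmount$ yuan $voucherName$", data));     // 100 yuan red packet
    }
}
```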

In the next version (V2), a cache is added. The static template configuration rarely changes, so the variables belonging to a given template are fixed. The template id can therefore be cached one-to-one against its variable collection, which avoids extracting the variables again before every replacement.

Before adopting a cache, it is worth thinking about how to implement it. Two things need attention:

1. Use a local cache rather than TBase, to reduce the pressure on TBase in high-traffic scenarios.

2. Control the number of entries kept in the local cache, so that cache efficiency is maximized within a limited memory budget.

The caching logic can be implemented with the cache classes of the Google Guava library; sample code is shown in [Figure 5].
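The Guava code of Figure 5 is not reproduced in the text. To illustrate the two points above without any dependency, the sketch below swaps in a plain `LinkedHashMap` used as a size-bounded LRU map; it is a stand-in for the idea, not the article's Guava implementation, and all names are hypothetical. It caches the variable set per template id and evicts the least recently used entry once the limit is exceeded.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Stand-in for a bounded local cache: template id -> variable set, with LRU eviction.
final class TemplateVariableCache {
    private final Map<String, Set<String>> cache;
    private int extractions; // number of cache misses; demo counter only, not thread-safe

    TemplateVariableCache(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry at the head of the map.
        this.cache = Collections.synchronizedMap(
            new LinkedHashMap<String, Set<String>>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Set<String>> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    Set<String> variablesFor(String templateId, String template) {
        Set<String> variables = cache.get(templateId);
        if (variables == null) {
            variables = extract(template);   // miss: extract once and remember the result
            cache.put(templateId, variables);
        }
        return variables;
    }

    boolean isCached(String templateId) {
        return cache.containsKey(templateId);
    }

    int extractionCount() {
        return extractions;
    }

    private Set<String> extract(String template) {
        extractions++;
        Set<String> names = new LinkedHashSet<>();
        char[] chars = template.toCharArray();
        int left = -1;
        for (int right = 0; right < chars.length; right++) {
            if (chars[right] != '$') continue;
            if (left < 0) {
                left = right;
            } else {
                names.add(String.valueOf(chars, left + 1, right - left - 1));
                left = -1;
            }
        }
        return names;
    }

    public static void main(String[] args) {
        TemplateVariableCache cache = new TemplateVariableCache(2);
        cache.variablesFor("t1", "$availableAmount$ yuan $voucherName$");
        cache.variablesFor("t1", "$availableAmount$ yuan $voucherName$"); // served from the cache
        System.out.println(cache.extractionCount()); // 1
    }
}
```

With Guava, the size limit and eviction policy would be handled by its cache builder instead of the `removeEldestEntry` override used here.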

After these two steps, a performance test was run; the comparison is shown in [Figure 7].

The comparison shows that V1 performs better than the original version, and that V2, with the cache, performs better than V1 without it. As traffic increases, however, the benefit of these optimizations gradually shrinks. This indicates that the cost removed by V1 and V2 is only a small share of the total time spent on template variable replacement, and that something else in the replacement logic is more expensive.

Looking at the variable substitution logic once more, I suddenly realized that a “big problem” had been overlooked: the String.replace method, which costs time in two ways:

1. Each call to replace compiles a pattern.

2. Each call to replace creates and returns a new String object.

In addition, every replace is followed by a reassignment of the variable.

V3 builds on V2: the replace method is removed and the result is assembled with StringBuilder instead.

One thing needs attention in the StringBuilder implementation. In V2, variable extraction returns a Set. The order of the variables in that collection does not necessarily match their order in the template, and because a Set holds each variable only once, a variable that appears several times in the template would be replaced only at its first occurrence. The extraction result must therefore become an ordered List that allows duplicates, so that the logic stays correct.
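A minimal sketch of this assembly step, assuming `$name$` placeholders and hypothetical names (the article's V3 code is not shown in the text, and the cache is left out here for brevity): extraction returns a List in template order with duplicates kept, and rendering appends the literal text and the values to a StringBuilder in a single forward walk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: ordered, duplicate-preserving extraction plus StringBuilder assembly.
final class ListBasedRenderer {

    static List<String> extractVariables(String template) {
        List<String> names = new ArrayList<>();
        char[] chars = template.toCharArray();
        int left = -1;
        for (int right = 0; right < chars.length; right++) {
            if (chars[right] != '$') continue;
            if (left < 0) {
                left = right;
            } else {
                names.add(String.valueOf(chars, left + 1, right - left - 1));
                left = -1;
            }
        }
        return names;
    }

    static String render(String template, List<String> variables, Map<String, String> data) {
        StringBuilder out = new StringBuilder(template.length());
        int pos = 0;
        for (String name : variables) {
            // The list is in template order, so the next $ from pos opens this variable.
            int start = template.indexOf('$', pos);
            int next = start + name.length() + 2;        // index just past the closing $
            out.append(template, pos, start);            // literal text before the variable
            String value = data.get(name);
            if (value != null) {
                out.append(value);
            } else {
                out.append(template, start, next);       // unknown variable: keep it unchanged
            }
            pos = next;
        }
        out.append(template, pos, template.length());    // trailing literal text
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("a", "1");
        data.put("b", "2");
        String template = "$a$ and $b$ and $a$";
        List<String> variables = extractVariables(template);
        System.out.println(variables);                         // [a, b, a]
        System.out.println(render(template, variables, data)); // 1 and 2 and 1
    }
}
```

Because the list keeps the second `a`, both occurrences are filled in; a Set would have yielded only one entry for it.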

After the V3 optimization the performance improvement is obvious, which confirms that String.replace is the most time-consuming part of the whole template variable substitution logic. V4 therefore goes back to the original method and changes only one thing: String.replace is replaced with StringBuilder.
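One way this could look, as a sketch under the same assumptions as before (`$name$` placeholders, a `Map<String, String>` of values, hypothetical names): the indexOf and substring calls of the original loop stay, but each piece is appended to a StringBuilder instead of rewriting the whole string with String.replace.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative single-pass variant: original loop shape, StringBuilder instead of String.replace.
final class SinglePassRenderer {

    static String render(String template, Map<String, String> data) {
        StringBuilder out = new StringBuilder(template.length());
        int pos = 0;
        while (pos < template.length()) {
            int start = template.indexOf('$', pos);
            int end = start < 0 ? -1 : template.indexOf('$', start + 1);
            if (end < 0) break;                           // no complete $...$ pair left
            String value = data.get(template.substring(start + 1, end));
            out.append(template, pos, start);             // literal text before the variable
            if (value != null) {
                out.append(value);
            } else {
                out.append(template, start, end + 1);     // unknown variable: keep it unchanged
            }
            pos = end + 1;
        }
        out.append(template, pos, template.length());     // whatever remains after the last pair
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("availableAmount", "70");
        data.put("voucherName", "red packet");
        System.out.println(render("$availableAmount$ yuan $voucherName$", data)); // 70 yuan red packet
    }
}
```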

[Figure 11] clearly shows that with the StringBuilder implementation, performance improves by more than 10 times.

V4 actually takes less time than V3, even though V3 has the cache. This indicates that V3's approach of extracting the variables first and then assembling the result with StringBuilder costs relatively more. On the other hand, V4's code is less readable than V3's. Combining the two and dropping the cache dependency gives V5, which aims to balance code readability and performance.

V5 extracts the variables first, removes the cache dependency, and uses StringBuilder instead of String.replace, which keeps the code readable.

Across these five versions, performance improved by more than 10 times.

Ranked from fastest to slowest: V4 > V3 > V5 > V2 > V1 > the original, unoptimized version. V3, V4, and V5 are significantly faster than V1 and V2, which confirms that String.replace is the most time-consuming part of this template replacement logic. V3 > V5 and V2 > V1 show that introducing the cache still helps performance. In terms of code readability, V4 is inferior to V3 and V5.

The whole optimization comes down to two main points:

1. String.replace involves pattern compilation and the creation of a new string on every call, which makes it fairly resource-hungry.

2. Using StringBuilder instead of String.replace not only shortens the call time but also reduces memory usage. Compared with String.replace, StringBuilder.append avoids creating and discarding a large number of intermediate String objects, which eases the pressure on the GC and in turn lowers the CPU load.

The obvious benefit of performance optimization is that it saves machine resources. If an application running on 2,000 servers improves its overall performance by 10 percent, that is theoretically equivalent to saving 200 machines. Beyond that, an application with good performance is less likely to hit the machine's performance bottleneck when traffic surges, and for the same traffic it needs fewer additional machines when scaling out, so capacity expansion and emergency operations finish faster. An application with good performance is therefore also more stable than one with poor performance.

Finally, back to the topic of this article: what makes a 20-line piece of code 10 times faster?

My answer is: StringBuilder yyds (the greatest of all time)!
