Our organization's cloud office systems had no mature smooth release scheme: every release went straight to production, and any change to a DLL or configuration file caused the site to restart.
The cloud office system has more than 10,000 regular users, so even an interruption of just over half a minute brings in a wave of complaints. Based on this, we worked out a smooth release scheme.
Implementation scheme
1. Agree on a health check interface with the nginx proxy server.
2. Let nginx divert user requests according to the HTTP status code returned by that interface (this is standard practice in our technical department).
3. Implement the interface in this service accordingly.
Health check: as soon as the interface of an instance returns a 5xx status code, nginx takes that instance offline (nginx no longer forwards traffic to it).
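To make the health check rule concrete, here is a minimal sketch in Python of the two sides of the agreement: what the instance reports, and how the proxy reacts. The names `Instance`, `health_status` and `should_forward`, and the default offline code of 503, are assumptions made for illustration and are not taken from our actual service.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One application instance behind the proxy (hypothetical model)."""
    name: str
    online: bool = True


def health_status(instance: Instance, offline_status: int = 503) -> int:
    """Status code the health check interface would return for this instance.

    offline_status is an assumed default; the real value is whatever was
    agreed with the proxy team.
    """
    return 200 if instance.online else offline_status


def should_forward(status_code: int) -> bool:
    """Proxy-side rule from the text: a 5xx answer takes the instance out of rotation."""
    return not (500 <= status_code <= 599)
```

Switching `online` to `False` is all the management page would need to do in this model; the proxy notices the change on its next health check.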
Release process
The goal is mainly to be able to release smoothly, so QA and developers follow these steps when publishing:
1. Open the nginx list management page of the system: [/publish/ngxconfig]
2. Remove an instance (assuming that the system cluster has A, B, and C instances), such as A instance
3. Check whether the removal succeeded by calling the health check interface we agreed on with nginx: it returns status 200 in the normal online state and status 401 after the instance has been switched offline.
Online availability:
Offline:
4. Watch the monitoring site until the Req and Connection traffic on the instance disappears.
5. Release the new version on this instance.
6. Open Fiddler, map the host to the instance being released, and check whether the release succeeded (when DLL and configuration files are released, the IIS site restarts briefly).
7. QA engineers run through the grayscale instance A server to make sure it works normally, and the same steps are repeated until all servers are released.
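The manual steps above amount to a rolling release loop. The sketch below expresses that loop in Python, assuming each step is available as a callable; `take_offline`, `is_drained`, `deploy`, `verify` and `bring_online` are hypothetical hooks, not an existing tool of ours. Putting the instance back online is implied by the process rather than listed as a step, so treat that part as an assumption too.

```python
from typing import Callable, Iterable, List


def rolling_release(
    instances: Iterable[str],
    take_offline: Callable[[str], None],
    is_drained: Callable[[str], bool],
    deploy: Callable[[str], None],
    verify: Callable[[str], bool],
    bring_online: Callable[[str], None],
    max_polls: int = 60,
) -> List[str]:
    """Release one instance at a time and return the names released so far."""
    released: List[str] = []
    for name in instances:
        take_offline(name)                 # steps 2-3: remove from rotation
        for _ in range(max_polls):         # step 4: wait for traffic to drain
            if is_drained(name):
                break
        else:
            raise TimeoutError(f"{name} still has traffic")
        deploy(name)                       # step 5: release the new version
        if not verify(name):               # steps 6-7: check before moving on
            raise RuntimeError(f"verification failed on {name}; stopping the rollout")
        bring_online(name)                 # assumed: put the instance back
        released.append(name)
    return released
```

A real script would sleep between polls in the drain loop. If verification fails, the instance stays offline and the remaining instances are left untouched, so the rest of the cluster keeps serving the old version.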
Further optimization of ABTesting
Background
Once smooth release was in place, it brought us great convenience: there is no need to post an announcement for every release, and minor or non-functional changes can simply be released.
Over time, however, as the number of customers grew, another problem appeared: every major business change or large-scale release went straight to production, which is risky. Users may not fully accept what the designers came up with, and once a new version is online we receive a lot of complaints from them. If a small group of people could trial the change in a grayscale release first, we could collect usage feedback, smooth the transition, and optimize before rolling out to production.
We therefore needed to think through and design a unified technical solution, so that in future cloud office and other business systems alike can be tried and verified by a small, specifiable group of users through grayscale release.
Building on the smooth release work above, we turned to the Nginx reverse proxy server and had nginx do the ABTesting for us. Here are the scenarios we tried:
1. Nginx Reverse Proxy: Incoming IP Policy
Process steps
1. A request for the cloud office system first reaches the Nginx reverse proxy server.
2. Nginx reads the AB list of incoming IPs.
3. Nginx forwards traffic according to the IP AB list (list A goes to specific instances, list B goes to the original cloud office cluster).
# geo belongs in the http block, outside server; it maps $remote_addr to a group
geo $group {
    default          default;
    192.168.254.4    ACluster;
    192.168.254.170  ACluster;
}
server {
    listen 80;
    server_name officecloud.com;
    access_log officecloud.com/logs main;
    location / {
        # default and ACluster are upstream groups defined elsewhere
        proxy_pass http://$group;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        index index.html index.htm;
    }
}
Advantages and disadvantages
1. Simple configuration. The grayscale upgrade of the original resource platform split traffic by IP list in this way.
2. Many external computers do not have a fixed IP, so this approach suits the company intranet, for example by configuring only intranet IPs.
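For readers who want to test the IP policy outside Nginx, the routing decision can be sketched in a few lines of Python. `A_LIST` and `group_for_ip` are illustrative names; the two addresses come from the configuration above, and the group names `ACluster` and `default` are assumed to match the upstream names.

```python
import ipaddress

# List A: addresses (or whole ranges) that should see the grayscale version.
A_LIST = [
    ipaddress.ip_network("192.168.254.4/32"),
    ipaddress.ip_network("192.168.254.170/32"),
]


def group_for_ip(remote_addr: str, a_list=A_LIST) -> str:
    """Return the upstream group name for a client IP."""
    addr = ipaddress.ip_address(remote_addr)
    return "ACluster" if any(addr in net for net in a_list) else "default"
```

Because the list holds networks rather than single addresses, an entire intranet segment such as `192.168.254.0/24` can be placed on list A with one entry.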
2. Nginx Reverse Proxy: Cookie Policy
Process steps
1. A request for the cloud office system first reaches the Nginx reverse proxy server.
2. Nginx reads the version information from the Cookie of the HTTP request (another key could be used instead).
3. Nginx forwards traffic according to the value of that key (for example, version 1.1 goes to a specific cluster and version 1.0 goes to the general cluster).
server {
    listen 80;
    server_name officecloud.com;
    access_log officecloud.com/logs main;
    # default and ACluster are upstream groups defined elsewhere
    set $group default;
    if ($http_cookie ~* "version=V1.0") {
        set $group default;
    }
    if ($http_cookie ~* "version=V1.1") {
        set $group ACluster;
    }
    location / {
        proxy_pass http://$group;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        index index.html index.htm;
    }
}
Advantages and disadvantages
1. Simple configuration: the decision can be made from Nginx's $cookie_version variable.
2. Relatively stable: for users who need to be on the list, a specific version is added to the cookie header, which needs only a small amount of application development.
3. The first visit to a static page may not have generated the cookie yet.
Note: the team considers this the best of the Nginx proxy solutions. User-Agent and other headers can be used for the same kind of judgment, but a custom header means intruding into the underlying HttpRequest to add business logic, which is not recommended.
3. AB cluster + business proxy mode
Process steps
1. There are two ways to enter the cloud office system: logging in on the login page (~/login), or opening the default page with a uckey login (~/default?usertoken=#usertoken#).
2. On login, or when a usertoken arrives, the routing proxy module verifies the user information and, according to the person and department (the person and department configuration forms the AB list), diverts traffic to one of the two AB clusters.
3. The user is redirected to the domain name of the chosen cluster (the AB clusters can be given different domain names, which makes them easier to tell apart).
Advantages and disadvantages
1. Decoupled from Nginx: it does not depend on the company's general platform or on the technical department to implement it.
2. You need to apply for the AB clusters, and the AB clusters have different domain names.
3. If the front end and back end are separated, both the static site and the service site need their own AB clusters.
4. All entry points must go through the proxy uniformly, which takes a certain amount of development.
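The core of the business proxy mode is a small routing function that runs after the user has been verified. The sketch below shows one way it could look in Python; the `User` and `ABConfig` types, the `redirect_url` function and the domain names are all hypothetical and only illustrate the person/department lookup described above.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class User:
    user_id: str
    department: str


@dataclass(frozen=True)
class ABConfig:
    """AB list configuration: who belongs to cluster A, and both cluster domains."""
    a_users: FrozenSet[str]
    a_departments: FrozenSet[str]
    a_domain: str
    b_domain: str


def redirect_url(user: User, config: ABConfig, path: str = "/") -> str:
    """Return the URL on cluster A or B that a verified user should be sent to."""
    in_a = user.user_id in config.a_users or user.department in config.a_departments
    domain = config.a_domain if in_a else config.b_domain
    return f"https://{domain}{path}"
```

Both entry points (the login page and the usertoken link) would call the same function once the user is known, which is what keeps the entrances unified.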
Application
At present, the 2 systems we have on hand have been implemented according to this scheme. For reference, see https://github.com/CNSRE/ABTestingGateway
ABTestingGateway
ABTestingGateway is a dynamic routing system open-sourced by Sina. It is a grayscale release system whose diversion strategy can be set dynamically. It works at layer 7, is developed on nginx and ngx-lua, and uses redis as the diversion policy database, which is what makes dynamic scheduling possible.