In recent years, the steady growth of customers and business volume in the securities market, together with the rise of Internet finance and financial technology, has led securities companies to set strategic goals for digital transformation. To keep pace with the new wave of digital technology, enterprise information system architectures are continually being upgraded, and many traditional software systems within enterprises are being migrated to microservice architectures. Splitting services reduces system coupling and achieves “high cohesion and low coupling”, which provides more flexible service support.
As developers decouple and split systems, it becomes difficult to manage the large number of microservice instances effectively and to maintain the service quality of the running system. To keep up with the Internet+ era and respond to rapidly changing business needs, Orient Securities urgently needs to build its systems with a unified, service-oriented approach. This means building a service governance platform that unifies service governance across new and existing systems by analyzing service call relationships and topologies, optimizing service quality, and defining service protocol specifications. The overall goal of the enterprise technology architecture transformation is threefold: light applications (business-oriented applications that can be built in an agile way and respond to market demand promptly), a heavy platform (data and core applications turned into platform services that form the core of the whole architecture), and servitization (a core service network that simplifies application development and deployment), together with application lifecycle management.
Traditional information systems mostly use a monolithic architecture, in which an application packages all of its functions into a single unit that is developed, tested, and deployed as a whole [1]. A Java Web application is a typical example: the project is packaged as a WAR file and deployed in a single web container, and it contains the DAO objects of the data access layer, the modules of the business logic layer, the UI of the presentation layer, and other functions. The advantage of the monolithic architecture is that development, debugging, and deployment are simple. In the early stage of a business, when the information system is small, a traditional monolithic architecture supports business growth effectively. However, as the business grows explosively and the application system keeps expanding, the monolithic architecture causes serious problems for development, maintenance, and deployment.
First, development efficiency keeps declining. The huge code base and intricate business coupling make new features much harder to develop. Developers must understand not only the modules they own but also the logic of the entire application; otherwise a code change may cause conflicts or break other modules;
Second, continuous iteration is obstructed. Any small change to a non-core function requires the entire project to be redeployed, which significantly increases release risk in operations and maintenance;
Third, system reliability degrades. A traditional monolithic architecture deploys the whole application in one process, so a failure in a single interface can prevent the entire system from serving requests normally, and a sudden surge of traffic can easily cause a system avalanche;
Fourth, scalability is inherently limited. A monolithic application can only be scaled along one dimension, yet different modules may have different resource needs: some are compute-intensive and some are IO-intensive. Because they run in the same instance, individual modules cannot be scaled separately;
Fifth, the technology stack is rigid and hard to refactor. The technology used by an individual service is bound to the technology stack of the whole application, which makes it difficult to upgrade SDK versions or adopt new frameworks.
Since the monolithic architecture can no longer meet the needs of modern enterprise information systems, microservice architecture has been widely adopted in recent years and has been put into practice at a growing number of securities companies. Microservice architecture evolved gradually from the traditional monolithic architecture [2]. It splits a large monolithic application, along business function lines, into multiple single-purpose services that run independently and communicate with one another through a unified protocol [3][4].
Microservice architecture solves many of the problems of the monolithic architecture. First, the huge monolithic application is split into smaller services whose internal logic is simple and highly cohesive, so they are easy to develop and maintain. Second, each microservice is deployed independently, so a change can be released for the affected part only, which enables continuous deployment and faster iteration. Third, when a single service fails, only that service needs to be taken offline and repaired, and the failure does not cascade through the entire system. Fourth, each microservice can be scaled horizontally or vertically on demand according to its traffic and resource requirements, which greatly improves resource utilization. Fifth, each R&D team can choose the programming language and technology stack that suit its needs, which gives greater flexibility.
Although microservice architecture has clear advantages, the system heterogeneity that is common among securities companies poses major challenges to its adoption.
(1) Business interface standards are not uniform, and management and control risks are high
The core systems of the securities industry are built by traditional vendors. Take the core system of Orient Securities' brokerage business as an example: it combines architectures from Jinshida, Xinyi, Hang Seng, Verbox, Flush, and other vendors, and service interfaces of many types, such as SPX, T2, REST, and WebService, coexist inside the company. Adapting multiple businesses to work together is a prominent problem, and the diversity of services imposes technical requirements for synchronous, asynchronous, and streaming data that are hard to unify. Effective traffic control for critical business is lacking. Coordination and scheduling across the platform as a whole are difficult, and there is no global view from which to manage internal services in a unified way.
(2) Self-developed systems face many difficulties in going live
As financial technology develops, the securities industry has begun to build its own core systems. Because there is no unified development framework, however, each business R&D team must attend to many technical details in addition to business analysis, such as integration with dependent service interfaces, programming language skills, support for a flexible and scalable architecture, service governance guarantees, selection of external service protocols, service fault localization, request traffic control, service security configuration, and configuration management. Self-developed systems therefore face many practical problems.
(3) The traditional gateway model has shortcomings
Traditional core systems generally expose their services through a gateway, which controls access and typically provides identity authentication, routing configuration, load balancing, and similar functions. This works well for clients such as mobile phones, but it has obvious deficiencies for internal service calls within the core computer room.
With the gateway mode, the channel side must wrap the TCP SDK itself and handle gateway switching; all traffic hits a single gateway node, so the gateway itself often becomes a bottleneck;
With the gateway mode, scale-out is usually achieved by deploying multiple gateway nodes, which increases the O&M deployment workload and consumes additional resources;
The gateway mode adds one more network hop, which increases network time and system latency and reduces the concurrency the system as a whole can sustain under the same deployment;
With the gateway mode, the internal microservices of a system are exposed through the gateway and cannot benefit from automatic registration and discovery. New services often require changes to the gateway configuration before they can be discovered, and the overall architecture degenerates into the traditional model.
Ideally, business staff focus on sorting out the business and defining scenarios, while developers convert the business into service definitions, generate interface code automatically with code generation tools, and then implement the internal logic of each interface according to the business. The development framework and external tools take care of non-business functions such as architecture extensibility, service governance, and configuration management, which decouples the business from the framework and improves development efficiency.
A sound service governance scheme is the cornerstone of stable operation for applications built on a microservice architecture. Drawing on its accumulated technology and practical experience in service governance, Orient Securities added service governance features to the gRPC framework and built the gRPC-Nebula service governance framework and the Star service governance platform. Together they provide unified management of internal and external services, build service call relationships and topologies, and optimize service quality. Figure 1 shows the overall architecture of the Orient Securities service governance project.
Figure 1 The overall structure of the Orient Securities Service Governance Project
The Orient Securities Service Governance Plan mainly includes the following modules:
(1) Registration Center
The registry is a distributed, highly available configuration maintenance system used for service registration and subscription. It holds all service description information and service interface information. A microservice system has a very large number of services and interfaces, and because the system is adjusted dynamically, the number of running instances of each service also changes dynamically. By managing services in one place, the registry makes it much easier for service consumers to discover and track service providers and avoids hard-coded address information.
(2) Service consumer (client)
The service consumer obtains service registration information from the registry and uses it to initiate calls to the server. It also collects call-side information and sends it to the data processing engine for analysis, which provides the client-side data for call chain analysis.
(3) Service provider (server)
The service provider publishes its service information through the registry and responds to call requests from consumers. It also responds to configuration management operations initiated from the console and similar tools, applying settings for service quality, security policies, data collection, and so on.
(4) Information collector
The information collector is an independently deployed service. It collects the information that service providers and consumers generate during calls, including service calls, responses, exceptions, elapsed times, call links, internal queue lengths, and security events, and sends it to the data processing engine for processing.
(5) Data processing engine
The data processing engine analyzes and processes, in real time, the event stream sent by the information collector. This includes performance statistics, dependency analysis, threshold alarms, correlation clustering, status tracking, and availability analysis. It also stores the data in the performance management database for further analysis.
(6) Service governance portal
The service governance portal brings together the operational data and management systems of the service governance domain and serves as the company-wide platform for service governance. In the portal you can query key information for each microservice, such as instance information, interface information, service status, dependencies and dependents, statistics, and service call trace records, which gives a panoramic view of the enterprise service governance ecosystem. The platform also supports management functions such as blacklists and whitelists, traffic control, weight configuration, primary and standby configuration, and online/offline status, as well as monitoring and alarms on call volume, performance, service quality, reliability, and fault events.
Orient Securities evaluated the popular open source microservice frameworks, including Alibaba’s Dubbo[5], Facebook’s Thrift[6], Google’s gRPC[7], and Spring Cloud[8], which is built on the Spring Boot framework. All of them offer good connectivity, robustness, scalability, and extensibility. However, Dubbo and Spring Cloud do not support multiple languages, and the Dubbo open source community stopped maintaining the project for a period and only recently resumed updates.
For historical reasons, the existing core systems of the securities industry are written in a variety of languages. For example, the core trading system and the Tonghuashun (Flush) online trading system are developed in C++, while the account, product, asset allocation, app, and self-developed systems are mostly developed in Java. To address the cross-language scenarios that naturally exist in the securities industry, we ultimately chose to build the gRPC-Nebula service governance framework on top of gRPC and to provide business teams with an overall service governance solution.
Compared to several other frameworks, gRPC has the following advantages:
Full multi-language support. gRPC supports many languages, including C, C++, Java, Python, PHP, Node.js, C#, Objective-C, Go, Ruby, and Dart. At present, the brokerage’s online trading and core trading systems are built in C++, while other self-developed systems are mostly built in Java and Python, so gRPC effectively solves the problem of cross-language service calls;
With strong support from Google and a large open source community, gRPC is actively maintained and frequently updated, and it runs in production at many large technology companies around the world;
gRPC uses Google’s open source Protobuf 3.0 protocol to define service interfaces. Protobuf is a platform-independent, language-independent, extensible, lightweight, and efficient protocol for serializing structured data, widely used in network communication and data storage. Many engineers are already familiar with Protobuf, which helps promote gRPC within the enterprise;
gRPC transport is based on the HTTP/2 standard. It supports synchronous, asynchronous, and bidirectional streaming calls, SSL and custom authentication, and platforms such as iOS, Android, Windows, and Linux, and it makes multiplexed RPC calls from clients to back-end services simple to implement.
Compared with the native gRPC framework, the gRPC-Nebula service governance framework introduces ZooKeeper as the registry, as shown in Figure 2. It integrates service governance mechanisms such as service registration and discovery, load balancing, blacklists and whitelists, dynamic grouping, cluster fault tolerance, and traffic control. The technical solutions behind these mechanisms are described in detail later in this chapter.
Figure 2 Orient Securities gRPC-Nebula Service Governance Framework Architecture Diagram
We ran performance tests on the Dubbo, native gRPC, and gRPC-Nebula frameworks. As shown in Table 1, the performance of gRPC-Nebula is only about 1% lower than that of Dubbo and native gRPC, which meets the requirements of a high-performance service framework.
Table 1 Multi-frame performance test comparison
Service registration and discovery is the core mechanism of service governance. At startup, a service provider registers the services it offers with the registry, and a service consumer obtains the provider address list from the registry, as shown in Figure 3. The gRPC-Nebula service governance framework uses ZooKeeper as the registry and has the following features:
Figure 3 Service Registration Discovery Mechanism
(1) High availability. When any node of the registry goes down, services automatically switch to another healthy node. When every registry node is down, only the publication of new services and the removal of published services are affected; running services continue to operate normally, and service consumers keep calling providers from their locally cached address list.
(2) Data consistency. The address lists that service consumers obtain from different registry nodes at the same moment are identical, with no inconsistency between reads and writes. ZooKeeper, a distributed coordination service with strict access control, uses the ZAB protocol as its core algorithm for data consistency [10].
(3) Proactive notification of service changes. A service consumer only needs to pull the full address list from the registry once at startup and then subscribe to changes for the relevant services. Whenever the address list changes, the registry pushes the change to the consumer, which immediately adjusts the addresses it calls.
(4) Real-time awareness of service status. The registry keeps a long-lived connection with each service and checks its health periodically through heartbeat detection. When a service process terminates unexpectedly or its server goes down, the registry immediately pushes an offline notification to service consumers, which isolates the fault.
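The pull-once-then-subscribe behaviour described above can be sketched with a small in-memory stand-in for the registry. This is only an illustration under stated assumptions: the class and method names (InMemoryRegistry, Consumer, register, unregister, subscribe) are hypothetical and are neither the ZooKeeper API nor the gRPC-Nebula implementation.

```python
from collections import defaultdict


class InMemoryRegistry:
    """Toy stand-in for the registry (hypothetical; not the ZooKeeper API)."""

    def __init__(self):
        self._providers = defaultdict(set)     # service name -> provider addresses
        self._subscribers = defaultdict(list)  # service name -> consumer callbacks

    def register(self, service, address):
        # A provider announces itself at startup.
        self._providers[service].add(address)
        self._notify(service)

    def unregister(self, service, address):
        # Called on shutdown, or when heartbeat detection decides a node is gone.
        self._providers[service].discard(address)
        self._notify(service)

    def subscribe(self, service, callback):
        # One full pull at subscription time, then a push on every later change.
        self._subscribers[service].append(callback)
        callback(sorted(self._providers[service]))

    def _notify(self, service):
        addresses = sorted(self._providers[service])
        for callback in self._subscribers[service]:
            callback(addresses)


class Consumer:
    """Keeps a local cache, so calls can continue if the registry is unreachable."""

    def __init__(self, registry, service):
        self.addresses = []
        registry.subscribe(service, self._on_change)

    def _on_change(self, addresses):
        self.addresses = list(addresses)
```

A consumer created after the first provider registers sees that provider immediately, and later registrations or removals reach it through the callback without any polling.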
In a production environment, microservices are deployed as multiple instances, and service routing determines how a service consumer selects a provider from the address list. The gRPC-Nebula service governance framework provides the following service routing mechanisms:
(1) Load balancing mechanism
The gRPC-Nebula service governance framework supports two modes: connection load balancing and request load balancing. The default mode, connection load balancing, offers four algorithms: random, round-robin (polling), weight-based priority, and consistent hashing.
The random policy picks a service provider at random. The round-robin policy walks through the address list and selects the next provider on each call. The weight-based priority policy selects a provider according to the weight configured for each service node in the configuration file or the management portal. With the consistent hashing policy, requests with the same parameters are always handled by the same provider, and when a provider node goes down, the system selects another node according to the consistent hashing algorithm.
Figure 4 gRPC-Nebula Server Load Balancing Configuration
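As a rough illustration of the four policies, the following sketch implements each one over a plain list of addresses. The function and class names, the use of MD5, and the number of virtual nodes per address are assumptions made for this example; they are not taken from the gRPC-Nebula source.

```python
import bisect
import hashlib
import itertools
import random


def pick_random(addresses, rng=random):
    # Random policy: any provider, uniformly.
    return rng.choice(addresses)


def round_robin(addresses):
    # Round-robin (polling) policy: an endless iterator over the address list.
    return itertools.cycle(addresses)


def pick_weighted(weights, rng=random):
    # Weight policy: `weights` maps address -> configured weight.
    addresses = list(weights)
    return rng.choices(addresses, weights=[weights[a] for a in addresses], k=1)[0]


class ConsistentHashRing:
    """Consistent hashing with virtual nodes (replica count is an assumption)."""

    def __init__(self, addresses, replicas=100):
        self._ring = sorted(
            (self._hash(f"{addr}#{i}"), addr)
            for addr in addresses
            for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode()).hexdigest(), 16)

    def pick(self, request_key):
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(request_key)) % len(self._ring)
        return self._ring[idx][1]
```

Rebuilding the ring without a failed address moves only the keys that the failed node owned; keys owned by the surviving nodes keep their provider.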
(2) Black and white list mechanism
Setting a blacklist and whitelist on server-side instances makes it possible to redirect requests dynamically and to control access to those instances. If an IP is blacklisted for a service, service consumers deployed on that IP cannot obtain the address list of that service from the registry.
Figure 5 Black and White List Settings
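The blacklist rule above can be expressed as a filter applied when a consumer asks for a service's address list. This is a minimal sketch: the function name is hypothetical, and the whitelist semantics (a non-empty whitelist admits only the listed IPs) are an assumption, since the text only spells out the blacklist case.

```python
def visible_addresses(consumer_ip, providers, blacklist=(), whitelist=()):
    """Return the provider addresses a consumer at `consumer_ip` may see."""
    if consumer_ip in blacklist:
        # Blacklisted consumers get no address list for this service.
        return []
    if whitelist and consumer_ip not in whitelist:
        # Assumed semantics: a non-empty whitelist admits only listed IPs.
        return []
    return list(providers)
```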
(3) Weight configuration
gRPC-Nebula can set different service weights on service provider instances, and the framework distributes traffic according to the weights, which can achieve dynamic balancing according to different back-end resource capabilities.
Figure 6 List of service weights
Figure 7 Service Weight Settings
(4) Dynamic grouping mechanism
Each microservice instance has a group attribute stored in the registry, which can be preset in a configuration file or configured dynamically through the management platform. Grouping divides a microservice cluster into multiple sets, and service consumers can call several specific groups in priority order. The dynamic grouping mechanism flexibly supports scenarios such as same-room invocation and service isolation.
Server configuration:
Client configuration:
Figure 8 Dynamic Grouping Configuration
Now that computer room security is increasingly important, multi-room disaster recovery is widely used by all kinds of enterprises, but the high latency of cross-room calls can reduce system capacity. As shown in Figure 9, assume that all service instances are deployed in two remote computer rooms, A and B, and that service consumers want to call providers in their own room first. Defining each computer room by its IP segments and mapping it to a group gives a flexible and extensible policy that meets this need. For example, the service providers in room A are defined as one group, and the service consumers in room A are configured to call the nodes of that group first; room B is configured in the same way. The consumers in room A then call the providers in room A first, which avoids slow cross-room calls, and when Server1 and Server2 are both down, the consumers in room A automatically switch their requests to Server3 and Server4 in room B.
Figure 9 Same-room invocation scenario
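The same-room priority rule can be sketched as a walk over the consumer's group priority list, returning the first group that still has a healthy instance. The data layout and the function name are assumptions made for illustration; the group names A and B and the server names follow the example above.

```python
def candidates_by_group(instances, group_priority):
    """Return addresses from the highest-priority group with a healthy instance."""
    for group in group_priority:
        matched = [
            inst["address"]
            for inst in instances
            if inst["group"] == group and inst["healthy"]
        ]
        if matched:
            return matched
    return []


# Hypothetical instance table: two servers per computer room.
instances = [
    {"address": "Server1", "group": "A", "healthy": True},
    {"address": "Server2", "group": "A", "healthy": True},
    {"address": "Server3", "group": "B", "healthy": True},
    {"address": "Server4", "group": "B", "healthy": True},
]
```

A consumer in room A would pass the priority list ["A", "B"]: it stays on Server1 and Server2 while either is healthy and falls back to room B only when both are down.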
Service isolation is a common strategy for avoiding the avalanche effect in microservice systems. When a service provider is called by multiple consumers, a traffic surge from one consumer can overload the entire provider cluster and affect the calls of every consumer. The service grouping function of the service governance platform divides the provider cluster into independent sets, and each consumer calls only the instances of specific groups, so consumers are isolated from one another and the high availability of the whole system is protected. As shown in Figure 10, by setting the tc, wmp, and matrix groups, the back-end service can provide different service level guarantees for systems such as transaction access, the wealth sales center, and Oriental Rui, which meets the business isolation requirement.
Figure 10 Service isolation scenario
When a service provider cannot serve consumers normally, for example because of a refused connection, a request timeout, or a back-end exception, the service framework must apply cluster fault tolerance and re-route the call. The gRPC-Nebula service governance framework supports two strategies, Failfast and Failover:
Failfast means that if the service provider returns an exception, the consumer returns the error directly without retrying. This strategy suits non-core services and saves valuable resources for important core services.
Failover means that when a service call fails, the framework re-routes, finds the next available service provider, and issues a new call. When calls to a node fail several times in a row, or the failure rate over a period exceeds a limit, the framework considers the node unfit to serve, and the service consumer removes it from the address list so that the abnormal node is not called for a period of time. Requiring repeated failures before removal makes the system less sensitive to network jitter: traffic distribution is not adjusted because of a single accidental failure, and server load stays stable. Like the service group attribute, the thresholds for consecutive failures and for failure rate over time can be set through configuration files and the management platform.
Figure 11 Cluster Fault Tolerant Configuration
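The Failover behaviour can be sketched as follows. This is a simplified illustration under stated assumptions: the class name and parameters are hypothetical, only the consecutive-failure threshold is modelled, and the failure-rate threshold and the timed return of a removed node are left out.

```python
class FailoverInvoker:
    """Retries on the next provider and drops nodes after repeated failures."""

    def __init__(self, addresses, max_consecutive_failures=3):
        self.addresses = list(addresses)
        self.max_consecutive_failures = max_consecutive_failures
        self._failures = {address: 0 for address in self.addresses}

    def available(self):
        # Nodes at or over the threshold are treated as removed from the list.
        return [
            a for a in self.addresses
            if self._failures[a] < self.max_consecutive_failures
        ]

    def invoke(self, call):
        last_error = None
        for address in self.available():
            try:
                result = call(address)
            except ConnectionError as exc:
                # Record the failure and fail over to the next provider.
                self._failures[address] += 1
                last_error = exc
                continue
            self._failures[address] = 0  # a success resets the streak
            return result
        if last_error is None:
            raise ConnectionError("no available provider")
        raise last_error
```

A Failfast variant would simply call the first available address and let the exception propagate without trying another node.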
Historically, core system incidents at securities companies have often been caused by traffic surges. When network traffic grows explosively in an instant, contention for server CPU and IO resources creates a bottleneck and the service error rate rises rapidly. Retries from upstream systems or users then add even more traffic, until the service collapses completely and takes a long time to recover. This is the avalanche effect. To prevent avalanches, the service call process must be controlled: some traffic is shed according to policy so that the requests reaching the back-end service stay within what it can handle.
The gRPC-Nebula service governance framework implements dynamic flow control for each service interface by limiting the number of requests and the number of connections. The request limit discards excess requests when too many arrive within a unit of time. The connection limit caps the number of connections each IP may open to the service provider. Within the framework, service calls keep long-lived connections over gRPC's HTTP/2 transport, and once the number of connections reaches the threshold, the service provider rejects requests to establish new ones.
Figure 12 Flow Control Configuration
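The two limits can be sketched as a fixed-window request counter and a per-IP connection counter. The class names, the fixed-window algorithm, and the injectable clock are assumptions chosen for a compact example; the framework's actual limiting algorithm is not described in the text.

```python
import time
from collections import Counter


class RequestLimiter:
    """Fixed-window limiter: at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests, window_seconds=1.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._window_start = clock()
        self._count = 0

    def allow(self):
        now = self._clock()
        if now - self._window_start >= self.window_seconds:
            # A new window starts; forget the previous count.
            self._window_start = now
            self._count = 0
        if self._count >= self.max_requests:
            return False  # excess request is discarded
        self._count += 1
        return True


class ConnectionLimiter:
    """Caps the number of open connections per client IP."""

    def __init__(self, max_per_ip):
        self.max_per_ip = max_per_ip
        self._open = Counter()

    def try_connect(self, ip):
        if self._open[ip] >= self.max_per_ip:
            return False  # new connection is rejected
        self._open[ip] += 1
        return True

    def disconnect(self, ip):
        if self._open[ip] > 0:
            self._open[ip] -= 1
```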
The access protection status function lets the service governance platform bring server-side nodes online and offline. It is useful for scenarios such as lossless releases and rapid removal of faulty nodes. For example, when a system is updated in production, shutting down or restarting a service raises the call failure rate for a period of time. To avoid these failures, the running instance can be marked inaccessible through the service governance platform, and the registry notifies all consumers to stop calling it. Once the O&M staff confirm that the instance receives no more requests, they perform the update and then mark the instance accessible again after it succeeds. Traffic is removed from the instance and returned to it smoothly during the update, with no call failures.
Figure 13 Access Protection Flow
Figure 14 Access Protection Settings
For security control reasons, many enterprises divide their networks into multiple segments and deploy a separate registry in each segment. The gRPC-Nebula framework supports registering a service with multiple registries at the same time, and a custom server IP and port can be registered with each registry to accommodate IP address mapping or proxying between CIDR segments.
Figure 15 Multi-registry support
The gRPC-Nebula framework supports active-standby services: each instance can be designated as a primary or a standby server. While any primary server is available, clients call only primary servers and never the standby servers. When all primary servers are unavailable, clients automatically switch to the standby servers.
Figure 16 Schematic of the active and standby services
Figure 17 Active/standby service settings
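The active-standby rule reduces to a two-step selection on the client side, sketched below. The field names (role, available) and the function name are hypothetical.

```python
def select_active_standby(instances):
    """Prefer available primaries; use standbys only when no primary is left."""
    primaries = [
        inst["address"]
        for inst in instances
        if inst["role"] == "primary" and inst["available"]
    ]
    if primaries:
        return primaries
    return [
        inst["address"]
        for inst in instances
        if inst["role"] == "standby" and inst["available"]
    ]
```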
The gRPC-Nebula framework lets different gRPC services in the same project have different visibility. The interfaces implemented by a service provider fall into two categories. gRPC services used for calls between internal projects are not exposed externally, so external projects should not be able to see them. gRPC services that the project provides externally must be visible to external systems.
Figure 18 Internal and External Services
When a gRPC connection to the server fails, it is usually better not to retry immediately (to avoid flooding the network or the server with requests) but to apply some form of exponential backoff. The reference link is as follows:
https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md
With this scheme, however, if the server stays down, the client keeps lengthening its reconnection interval, which can grow very large over time. When the server finally restarts successfully, the client may be unable to reconnect for a long time. gRPC-Nebula therefore modifies the native framework so that the client can configure a maximum reconnection interval and avoid this risk.
Figure 19 Backoff algorithm settings
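The effect of capping the reconnection interval can be seen in a small generator of backoff delays. This is a simplified sketch, not the gRPC or gRPC-Nebula code: the function name is hypothetical, the default values are the ones I recall from the linked connection-backoff document (1 s initial, 1.6 multiplier, 120 s cap, 0.2 jitter) and should be checked against it, and the connection-timeout handling in that document is omitted.

```python
import random


def backoff_delays(initial=1.0, multiplier=1.6, max_backoff=120.0,
                   jitter=0.2, rng=random):
    """Yield successive reconnect delays in seconds, capped at `max_backoff`."""
    current = initial
    while True:
        # Apply +/- jitter so that many clients do not reconnect in lockstep.
        yield current * (1 + rng.uniform(-jitter, jitter))
        current = min(current * multiplier, max_backoff)
```

Lowering max_backoff bounds how long a client can wait after the server comes back, at the cost of more reconnect attempts while it is down.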
gRPC also provides a keepalive (heartbeat) mechanism for long-lived connections. The reference link is as follows:
https://github.com/grpc/grpc/blob/master/doc/keepalive.md
However, in real production networks, firewalls are usually configured to actively drop TCP connections that have carried no requests for 15 minutes. Because of the nature of the securities industry, service requests are concentrated in the 9:15-15:30 period, so a large number of TCP connections would be dropped outside trading hours. We therefore modified the gRPC framework so that the client and the server can each configure their own heartbeat interval.
Figure 20 Client Heartbeat Settings
Figure 21 Server heartbeat settings
Due to the actual needs of the business and technological development, development departments and suppliers often choose different microservice frameworks as needed, presenting a variety of options. How to manage these services has become a problem for R&D and O&M departments. If these frameworks and services can be docked to a unified service governance platform, the cost of collaborative development can be greatly reduced and the overall version iteration efficiency can be improved, so the construction goals of the Orient Securities Star Service Governance Platform include:
General governance capabilities: Introduce a middle-layer design that provides common governance capabilities across multiple frameworks. A distributor and a set of governance components work together: the distributor sends tasks to the different governance components, and each governance component manages its framework on behalf of the platform and completes the governance tasks issued by the distributor.
Platform self-service: The platform itself adopts a microservice architecture and integrates with the container platform to provide deeper governance functions, including platform application lifecycle management, component deployment management, grayscale release, elastic scaling, and unified configuration support.
Multi-framework compatible application management: Compatible with the three microservice frameworks gRPC, Spring Cloud, and Dubbo, helping users quickly deploy or migrate microservice applications.
Business service architecture anti-corrosion: Strong and weak dependencies between services are analyzed through the service registration center and combined with runtime call chain analysis, so that unreasonable dependencies and call paths can be identified and the service-oriented architecture optimized to prevent architectural decay.
Fast fault demarcation and location: Service call chain logs, service performance KPI data, service interface logs, and operation logs are summarized and analyzed in real time to achieve automatic fault discovery, automatic analysis, and online conditional retrieval, making fault diagnosis easier for O&M personnel.
Service micro-control: Runtime service governance, including rate limiting and degradation, moving services in and out, service timeout control, intelligent routing, and unified configuration. Through a series of fine-grained governance policies, several measures can be applied together and adjusted online so that business recovers quickly when a fault occurs.
Service lifecycle management: Includes launch approval, decommissioning notification, online upgrade of services, bringing services online and offline, automatic scaling, and resource expansion.
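The distributor and governance component design in the first goal above can be sketched roughly as follows. This is only an illustration of the dispatch pattern under stated assumptions: the class name, the task format, and the three component functions are hypothetical and do not reflect the platform's real interfaces.

```python
from typing import Callable, Dict


class Distributor:
    """Routes a governance task to the component registered for a framework."""

    def __init__(self) -> None:
        self._components: Dict[str, Callable[[dict], str]] = {}

    def register(self, framework: str, component: Callable[[dict], str]) -> None:
        self._components[framework] = component

    def dispatch(self, framework: str, task: dict) -> str:
        try:
            component = self._components[framework]
        except KeyError:
            raise ValueError(f"no governance component for {framework!r}") from None
        return component(task)


# Hypothetical governance components, one per supported framework.
def grpc_component(task: dict) -> str:
    return f"gRPC: applied {task['action']} to {task['service']}"


def spring_cloud_component(task: dict) -> str:
    return f"Spring Cloud: applied {task['action']} to {task['service']}"


def dubbo_component(task: dict) -> str:
    return f"Dubbo: applied {task['action']} to {task['service']}"


distributor = Distributor()
distributor.register("grpc", grpc_component)
distributor.register("spring-cloud", spring_cloud_component)
distributor.register("dubbo", dubbo_component)

# The same governance task is expressed once and routed by framework.
print(distributor.dispatch("dubbo", {"action": "rate-limit", "service": "account"}))
```

The point of the middle layer is that the console only needs to know the common task format, while framework-specific details stay inside each component.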
The Star Service Governance Platform contains the following functional modules:
(1) Service governance
The Star Service Governance Platform supports the management of the gRPC, Spring Cloud, and Dubbo microservice frameworks, as shown in Figure 22. It supports querying the service instance information maintained by the registration center, and configuring the registration center, access control, primary/standby, grouping, blacklists and whitelists, traffic control, circuit breaking, and other settings through the console.
Figure 22 Service Governance Features
Figure 23 Inter-service calls
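Of the governance settings listed above, circuit breaking is perhaps the least self-explanatory. The sketch below shows the general technique in a few lines. It is a generic illustration rather than the platform's implementation, and the class name, thresholds, and injectable clock are assumptions.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: opens after repeated failures, retries after a cool-down."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock          # injectable so the behaviour can be tested
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True
        # After the cool-down, let a trial request through (half-open state).
        return self._clock() - self._opened_at >= self.reset_timeout

    def record_success(self) -> None:
        # A successful call closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Open (or re-open) the circuit and restart the cool-down.
            self._opened_at = self._clock()


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0)
breaker.record_failure()
breaker.record_failure()
print(breaker.allow_request())  # prints False
```

While the circuit is open, callers fail fast instead of piling requests onto a struggling downstream service.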
(2) Service map
The service map shows the invocation relationships and call volumes between projects and between services in the form of a topology diagram, as shown in Figure 24. System architects can extract the company’s core system topology from the service map and find unreasonable circular call chains. O&M personnel can use the service map to identify the upstream systems on which a core system depends and give them the same level of priority support as the core system. When a traffic surge is expected, the service map can also be used for traffic estimation: traffic is estimated most accurately at entry points such as the client, and the traffic reaching each microservice can then be calculated by following the service map downstream, which helps the back-end systems formulate expansion plans.
Figure 24 Service Map of the Star Service Governance Platform
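Both uses of the service map described above, finding circular call chains and estimating downstream traffic from the entry point, can be sketched on a small call graph. The graph format (caller mapped to callees with a calls-per-request ratio), the function names, and the service names are assumptions made for this example.

```python
def find_cycle(graph):
    """Return one circular call chain as a list of services, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color, path = {}, []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for callee in graph.get(node, {}):
            state = color.get(callee, WHITE)
            if state == GRAY:                      # back edge: a cycle is closed
                return path[path.index(callee):] + [callee]
            if state == WHITE:
                cycle = visit(callee)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def estimate_traffic(graph, entry, entry_qps):
    """Propagate entry traffic downstream in topological order."""
    if find_cycle(graph):
        raise ValueError("traffic estimation needs an acyclic call graph")
    nodes = set(graph) | {c for callees in graph.values() for c in callees}
    nodes.add(entry)
    indegree = {n: 0 for n in nodes}
    for callees in graph.values():
        for callee in callees:
            indegree[callee] += 1
    qps = {n: 0.0 for n in nodes}
    qps[entry] = float(entry_qps)
    ready = [n for n in nodes if indegree[n] == 0]
    while ready:
        node = ready.pop()
        for callee, ratio in graph.get(node, {}).items():
            qps[callee] += qps[node] * ratio       # calls made per incoming request
            indegree[callee] -= 1
            if indegree[callee] == 0:
                ready.append(callee)
    return qps


# Hypothetical topology: each app request triggers 1 trade call and 2 quote calls.
calls = {
    "app": {"trade": 1.0, "quote": 2.0},
    "trade": {"account": 1.0},
    "quote": {"account": 0.5},
}
print(estimate_traffic(calls, "app", 1000)["account"])  # prints 2000.0
print(find_cycle({"a": {"b": 1}, "b": {"c": 1}, "c": {"a": 1}}))  # prints ['a', 'b', 'c', 'a']
```

In this example the account service receives 1000 requests per second through trade and another 1000 through quote, so it needs capacity for twice the entry traffic.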
(3) Link tracing
In a microservice architecture, a single user operation requires the coordination of multiple microservices, and an exception or network timeout in any one microservice on the call link causes the operation to fail. With link tracing, we can easily see the time consumed and the exceptions raised at each step of each request, which helps us optimize the system. The link tracing feature of the Star platform is based on Google’s Dapper paper [11]. After the system entry point receives a user request, the request is assigned a TraceID that uniquely identifies the call chain. The TraceID travels with each remote call message to the downstream services until every node on the link carries it, and the complete call link for the request can then be reconstructed from the TraceID.
Figure 25 Call Chain Relationships
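The TraceID propagation described above can be sketched with an in-memory simulation. The metadata key `trace-id`, the function names, and the service topology are assumptions for illustration; a real system would carry the TraceID in the RPC metadata and write span records to a collector.

```python
import uuid


def handle(service, topology, metadata, log, parent=None):
    """Simulate one service handling a request and calling its downstream services."""
    # The entry point has no incoming TraceID, so it assigns a new one.
    trace_id = metadata.get("trace-id") or uuid.uuid4().hex
    log.append({"trace_id": trace_id, "service": service, "parent": parent})
    for callee in topology.get(service, []):
        # The TraceID follows the remote call to the downstream service.
        handle(callee, topology, {"trace-id": trace_id}, log, parent=service)
    return trace_id


def call_chain(log, trace_id):
    """Reconstruct the (caller, callee) hops recorded for one request."""
    return [(r["parent"], r["service"]) for r in log if r["trace_id"] == trace_id]


# Hypothetical topology: gateway -> trade -> account.
topology = {"gateway": ["trade"], "trade": ["account"]}
records = []
first = handle("gateway", topology, {}, records)
second = handle("gateway", topology, {}, records)

# Records from both requests are mixed in one log; the TraceID separates them.
print(call_chain(records, first))
# prints [(None, 'gateway'), ('gateway', 'trade'), ('trade', 'account')]
```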
(4) Document Center
The document center automatically parses interface definition files in ProtoBuf format and lets technicians query the annotations and interface definitions of each service. We think of the document center as a service bazaar: before implementing a function that involves a common module or a third-party application, technicians can browse the document center to find the details of the interface they need. In the future, we will also strengthen the interactive features of the document center, adding Q&A and comments to open up communication between the upstream and downstream parties of each service.
Figure 26 The Document Center
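To give a feel for what parsing ProtoBuf interface definitions involves, the sketch below indexes services, methods, and their leading `//` comments from a `.proto` text. It is a deliberately simplified, line-oriented illustration (one `rpc` per line, no block comments, no nested options) and not the document center's actual parser. The function name and the sample definition are hypothetical.

```python
import re

SERVICE = re.compile(r"service\s+(\w+)\s*\{")
RPC = re.compile(
    r"rpc\s+(\w+)\s*\(\s*(stream\s+)?([\w.]+)\s*\)"
    r"\s*returns\s*\(\s*(stream\s+)?([\w.]+)\s*\)"
)


def index_proto(text):
    """Map each service name to its methods with request/response types and docs."""
    index, current, comment = {}, None, []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("//"):
            comment.append(line[2:].strip())   # collect the annotation lines
            continue
        service, rpc = SERVICE.match(line), RPC.match(line)
        if service:
            current = service.group(1)
            index[current] = []
        elif rpc and current is not None:
            index[current].append({
                "method": rpc.group(1),
                "request": rpc.group(3),
                "response": rpc.group(5),
                "doc": " ".join(comment),
            })
        comment = []                           # a comment only documents the next line
    return index


SAMPLE = """
syntax = "proto3";
// Account queries.
service AccountService {
  // Returns the balance of one account.
  rpc GetBalance (BalanceRequest) returns (BalanceReply);
}
"""
print(index_proto(SAMPLE)["AccountService"][0]["doc"])
# prints Returns the balance of one account.
```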
(5) Statistical analysis
The statistical analysis module supports performance monitoring of services, instances, and endpoints, including response time, availability, and throughput. It provides a large-screen data dashboard that gives a panoramic view of the current running status of all services, records the processing time of user requests to analyze the performance of the called services, and records service response times to show the services with the longest response times, that is, the slow service list.
Figure 27 Platform Statistical Analysis
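The slow service list mentioned above amounts to ranking services by response time. A minimal sketch, assuming the input is a list of `(service, response_time_ms)` samples and that the ranking uses the average (the article does not say which statistic the platform uses):

```python
from collections import defaultdict


def slowest_services(samples, top_n=3):
    """Return the top_n services ranked by average response time, slowest first."""
    totals = defaultdict(lambda: [0.0, 0])      # service -> [sum of ms, sample count]
    for service, elapsed_ms in samples:
        totals[service][0] += elapsed_ms
        totals[service][1] += 1
    averages = {name: total / count for name, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical samples collected from request logs.
samples = [("quote", 10), ("trade", 120), ("trade", 80), ("account", 40)]
print(slowest_services(samples, top_n=2))
# prints [('trade', 100.0), ('account', 40.0)]
```

In practice a percentile such as p99 is often more informative than the average, since a few very slow requests can hide behind a healthy mean.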
(6) Alarm center
Alarm Center supports alarm rule settings based on monitoring data and issues alarm notifications in a customized manner.
Figure 28 Alarm settings
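An alarm rule of the kind described above pairs a condition on monitoring data with a notification channel. The sketch below is a generic illustration: the rule fields, the metric name `p99_ms`, and the use of a plain callback as the notification method are assumptions.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict

# Comparison operators a rule may use.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class AlarmRule:
    metric: str                      # name of the monitored metric
    op: str                          # one of the keys in OPS
    threshold: float
    notify: Callable[[str], None]    # customizable notification channel

    def evaluate(self, metrics: Dict[str, float]) -> bool:
        """Fire a notification if the rule matches; return whether it fired."""
        value = metrics.get(self.metric)
        if value is None:
            return False             # no data for this metric, nothing to check
        if OPS[self.op](value, self.threshold):
            self.notify(f"{self.metric}={value} {self.op} {self.threshold}")
            return True
        return False


sent = []
rule = AlarmRule(metric="p99_ms", op=">", threshold=500, notify=sent.append)
rule.evaluate({"p99_ms": 800})
print(sent)  # prints ['p99_ms=800 > 500']
```

Swapping `sent.append` for a function that sends an email or an instant message changes the notification method without touching the rule logic.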
(7) Unified framework management
To better support the enterprise architecture transformation and give each business system a wider choice of microservice frameworks, the Star Service Governance Platform manages multiple microservice frameworks in a unified way, adding support for the dubbox and Spring Cloud frameworks on top of its existing support for gRPC services.
Figure 29 Multi-frame support
Over the years, as the various businesses of Orient Securities have developed and hundreds of business and support systems have gone into production, complex dependencies have emerged between application systems, and the complexity of system operation and maintenance has increased dramatically. In particular, because of how systems were built in the past, there are a large number of heterogeneous business systems within Orient Securities, and the interfaces they expose take various forms. Each vendor has its own private protocol, and there are many types of heterogeneous interfaces such as SPX, T2, Web Service, REST, and TCP, which further increases the difficulty of system development and operation and maintenance. For these reasons, Orient Securities has made changes in corporate strategy and technology management. At the beginning of 2018, the company put forward the strategic goal of “digital transformation” and listed “enhancing the application of fintech” as one of its six strategic development tasks, which prompted us to develop technology with a more proactive and open mind.
At the same time, Orient Securities formulated a big middle office strategy for its enterprise technology architecture, which is one of the three core strategies in the company’s digital transformation (the other two being independent research and development of core systems and the reshaping of the enterprise technology architecture). The strategy aims to lay a solid foundation for the long-term development of the company’s technology work through architecture transformation. To promote this work, in early 2019 Orient Securities established an architecture committee with Chief Information Officer Shu Hongzong as its director, so that the work of enterprise architecture transformation could be driven through the committee. It also created the Fintech Innovation Research Institute, which integrates the technical resources of Orient Securities and its subsidiaries and takes research-led technology application as its main task.
To enable every business party to develop according to common standards, and to give enterprise architecture construction a basis to rely on, the Orient Securities Architecture Committee formulated a decision-making process and mechanism for architectural standards, and uses these standards to constrain the development practices of each system. In May 2019, through the normative architectural standard “Service Governance Platform Access Specification”, we established the core framework for the transformation of the enterprise technology architecture. Every system adopts a unified interface call mode: calls between systems must be served over gRPC, while inside a system any of the three frameworks gRPC, Dubbo, or Spring Cloud may be used. This supports the comprehensive transformation of Orient Securities’ IT architecture from a traditional architecture to a modern service-oriented architecture with microservices at its core.
Figure 30 Service governance architecture decisions
To achieve digital transformation, Orient Securities has formulated a three-year architecture plan. Through architecture planning and implementation, the plan aims to create an industry-leading IT architecture and a first-class IT team, forming the company’s competitiveness in technology development. The plan contains seven key tasks, one of which is to build a strong shared business middle office that refines the core business processes, provides shared business capabilities for the front-end product lines, gives the front office strong “artillery support”, and ensures the successful landing of “thick middle office, thin application”.
Based on the characteristics of the securities industry, we divide the middle office domain into centers for products, accounts, wealth, transactions, assets, quotes, information, logs, authentication, and so on, and the service governance framework has become the core infrastructure of the entire business middle office.
Figure 31 Orient Securities Big Middle Office Strategy
Orient Securities’ gRPC-Nebula [9] itself stands on the shoulders of giants, having been developed on top of open source gRPC. To give back to the community, in mid-June 2019 Orient Securities announced the open sourcing of the gRPC-Nebula service governance framework at https://github.com/grpc-nebula. The community has set up a decision-making committee, initially planned to have 7 members including 1 chairman, with dedicated personnel responsible for tracking, maintaining, and resolving issues in the GitHub code. The committee will also regularly organize discussions and routine communication, hold community technical exchanges, coordinate development resources for community development, raise funds for the community, review the versions and feature sets proposed by version maintainers, and hold committee elections. The community will uphold the principles of fintech innovation and external technology output, and strives to become the first open source community in the industry built on the gRPC framework. The project won the 2019 OSCAR Peak Open Source Technology Innovation Award (community-based open source secondary development) and the first prize of the 4th China Excellent Cloud Computing Open Source Case.
The securities industry as a whole has long lacked a unified standard for technical architecture. Each vendor has its own technical framework, which has left the industry with a persistent heterogeneity problem. Open sourcing gRPC-Nebula is also intended to provide a reference for the entire industry. We have also actively joined the Shenzhen Stock Exchange Technology Product Alliance to promote the use of gRPC-Nebula in the industry, hoping to reach an industry consensus, form a unified technical standard, and greatly reduce the cost of integration between systems in the industry.
Orient Securities has been developing the service governance framework since the beginning of 2019. By September 2020, the Java version of the gRPC-Nebula framework had gone through 14 releases, the C++ version through 8 releases, and the platform through 4 releases, better supporting the various needs of the business.
Orient Securities has also vigorously promoted the service governance framework and platform internally. As of September 2020, a total of 46 applications of various types, 99 projects, 369 services, and 3125 methods had been connected, and the daily volume of requests carried had reached tens of millions.
In terms of systems, the Orient Securities service governance framework has been connected to core internal systems including the Oriental Winner App, the private bank App, Tongdaxin online trading, Flush Institutional Edition, Oriental Rui, Oriental Brain, Robo, the large operating platform, the business middle office (wealth, trading, products, accounts, assets, quotes, information, etc.), and asset allocation. In terms of vendors, core vendors such as Microsoft, Hang Seng, Jinshida, Xinyi, Apex, Tongdaxin, and Flush, as well as in-house development teams, have connected to the service governance framework and platform in accordance with the architectural standards, and the practical results are clear.
Figure 32 Service Governance Platform Practices
This paper discusses key technologies in the field of enterprise architecture and describes in detail the construction results and practical experience of the Orient Securities service governance project. Orient Securities has formulated a big middle office strategy at the enterprise architecture level, aiming to lay a solid foundation for the long-term development of the company’s technology work through architecture transformation. As a core component of the big middle office strategy, the service governance project is an important breakthrough in improving the company’s core fintech competitiveness. The gRPC-Nebula framework and the Star Service Governance Platform have been applied to dozens of projects, including the Orient Securities business middle office (Wealth Center, Trading Center, Account Center, Product Center, etc.), the Oriental Winner App, the private bank App, and the Oriental Rui institutional trading product line. The platform also provides unified management of several microservice frameworks (gRPC-Nebula, dubbox, and Spring Cloud), giving business lines a wider choice of development frameworks. As the platform ecosystem continues to be optimized and developed, it will be rolled out fully within the company, serving more product lines and users and contributing further to the company’s IT governance standards and systematic architecture construction.
[1] Fowler M, Lewis J. Microservices. Viittattu, 2014, 28: 2015.
[2] Aisopos F, Tserpes K, Varvarigou T. Resource management in software as a service using the knapsack problem model. International Journal of Production Economics, 2013(2).
[3] Thones J. Microservices. IEEE Software, 2015, 32(1): 116-116.
[4] Li Zhenhao. Development and Impact Analysis of Microservices Architecture. Information Systems Engineering, 2017(1): 154-155.
[5] https://dubbo.io/
[6] https://thrift.apache.org
[7] https://grpc.io/
[8] https://spring.io/projects/spring-cloud/
[9] https://github.com/grpc-nebula
[10] Hunt P, Konar M, Junqueira F P, et al. ZooKeeper: Wait-free Coordination for Internet-scale Systems[C]// USENIX Annual Technical Conference. 2010, 8: 9.
[11] Sigelman B H, et al. Dapper, a Large-Scale Distributed Systems Tracing Infrastructure. Technical report, Google, 2010.