Guide: Looking back on 2021, technological change, conceptual innovation in the industry, and shifts in policy produced many hot concepts in the data middle office field. We have taken stock of the top ten hot keywords of the past year, including hot technologies such as cloud native and privacy computing; market trends such as domestic XinChuang localization and PBC; and popular business directions such as the lightweight data middle office and the CDP. While enriching existing models, these concepts also bring new inspiration for the future development of the industry.

01 Privacy Computing

Yan Shu, Deputy Director of the Big Data and Blockchain Department of the Cloud Computing and Big Data Research Institute at the China Academy of Information and Communications Technology (CAICT), commented: security is the basic guarantee, and performance determines whether it is usable.

Related background
On September 1, 2021, China's first special law on data security, the Data Security Law of the People's Republic of China, came into force and has become an important cornerstone of data security and the development of the digital economy. The Data Security Law places equal emphasis on development and security: while stressing data security, it encourages data application and circulation and the research and application of related technologies. As a key technology that can balance data application with security protection, privacy computing has become the focus of industry attention. According to relevant statistics, privacy computing products grew rapidly in 2021, and an industrial explosion is becoming a reality. At present, in the big data field, privacy computing is mainly applied by embedding it in big data platforms and sharing interfaces with them, making it a key component of such platforms.

Definition of terms
Privacy computing refers to a class of information technologies that enable data analysis and computation while keeping the data itself from being disclosed externally. It falls mainly into two fields: trusted hardware and cryptography. Data releases value through flow, and with the market-oriented allocation of data elements becoming a national strategy, privacy computing is of vital significance.

Q: What are the main technical routes of privacy computing?
At present there are two main technical routes. One is secure multi-party computation, in which multiple parties compute collaboratively and securely without a trusted third party: in a distributed network, several participants each hold a secret input and jointly compute a function, and no participant may learn anything about the others' inputs beyond the final result. Secure multi-party computation is currently used mainly in finance, most commonly in joint risk control and precision marketing scenarios. The other route is federated learning, currently the route receiving the most attention; according to statistics, about 42% of current products use a federated learning architecture. Its security, however, remains controversial in the industry.
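To make the secure multi-party computation route concrete, here is a minimal sketch (illustrative only, not any vendor's product) of a secure sum built on additive secret sharing: three hypothetical parties each split a private number into random shares modulo a large prime, exchange shares, and publish only per-position share sums, so the joint total is computed without any party revealing its input.

```python
import random

PRIME = 2**61 - 1  # field modulus; all arithmetic is done mod PRIME

def share(secret, n_parties):
    """Split a secret into n additive shares that sum to it mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

# Each party holds a private input (e.g., a local revenue figure).
inputs = {"party_a": 120, "party_b": 340, "party_c": 75}

# Step 1: every party splits its secret into one share per participant.
all_shares = {p: share(v, 3) for p, v in inputs.items()}

# Step 2: participant i locally adds up the i-th share received from everyone.
partial_sums = [sum(all_shares[p][i] for p in inputs) % PRIME for i in range(3)]

# Step 3: only the partial sums are published; combining them yields the total.
total = sum(partial_sums) % PRIME
print(total)  # 535 -- the joint sum, with no individual input disclosed
```

Production MPC frameworks add secure channels, malicious-party protections, and protocols for multiplication and comparison; this sketch shows only the core secret-sharing idea.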
02 Cloud Native

Xu Xibin, CTO of Shulan Technology and Dean of the Shulan Research Institute, commented: there are no pseudo-concepts, only real demand, and the cloud era needs cloud native.

Related background
In September 2020, Warren Buffett, the "stock god" who had not participated in an IPO in 54 years, uncharacteristically bet on the cloud-native data warehouse vendor Snowflake. Snowflake rose more than 110% on its first day of trading; its share price has since soared and its market value has reached $120 billion, surpassing the established IT giant IBM. According to 36Kr, Snowflake's rapid stock growth comes from capital's optimism about the "cloud native" concept: "It's not so much that Buffett is betting on Snowflake as on a cloud-native future that previous data warehouses cannot match." Under the Snowflake boom, many "cloud native" concept vendors appeared in China in 2021, with some data middle office startups shouting slogans such as "cloud-native data middle office" and "China's Snowflake".

Definition of terms
Cloud Native refers to a methodology for building and running applications on the "cloud". Literally, the term splits into two parts: "cloud" and "native". "Cloud" is relative to "local": storage and computing traditionally ran on local servers and, in the cloud computing era, have gradually moved to the cloud. "Native" means born in, and designed for, that environment.

Q: Will the cloud-native data middle office be the future trend of the industry?
It is safe to say that cloud computing is key to the future, and likewise cloud native will be a trend of industry development. However, compared with peers promoting the cloud-native data middle office concept, it is worth calmly asking what the cloud penetration rate of Chinese enterprises actually is. To borrow one statistic: "Tmall Double 11 in 2021 was the first Double 11 run 100% on the cloud." This means that even industry pioneers have only just completed full cloudification, while most traditional companies are less than 20% of the way through the process. So at this stage, Shulan Technology and I advocate "multi-cloud adaptation" to address the increasingly complex, increasingly cloud-based IT architectures enterprises face in the course of digitalization.

03 Stream-Batch Integration

Bai Song, deputy general manager of the Shulan Technology R&D Center, commented: stream-batch integration will be a necessary choice for enterprises building a data middle office.

Hot Event
Under the traditional Lambda architecture, data development requirements are met by the output of two separate compute engines, one for streams and one for batches. This split architecture not only doubles development costs but also makes it hard to keep data logic and calibers aligned. The industry therefore proposed the technical concept of stream-batch integration, whose original intention is to let developers implement both stream and batch computation of big data with the same set of code, ensuring consistency of the processing logic and the results. In the past two years, as major data vendors have put forward their own solutions, the idea of stream-batch integration has become widely recognized in the industry.
At present, stream-batch integration has moved from theory to implementation: Flink's unified stream-batch processing has been successfully verified in many business scenarios and can be expected to enter a period of rapid development. Moreover, as more large Internet companies land the architecture successfully and export their experience to the industry, more small and medium-sized enterprises will follow and try it.

Q: What are the core aspects of stream-batch integration?
Stream-batch integration is mainly reflected in four aspects: unified metadata, a unified compute engine, unified data storage, and a unified IDE.
Unified metadata: offline and real-time metadata are stored together; the industry generally chooses Hive Metastore.
Unified compute engine: offline and real-time computing run on one engine, with a single set of logic covering both scenarios.
Unified data storage: offline and real-time data are stored together, avoiding inconsistency, duplicate storage, and double computation.
Unified IDE: one set of development tools supports data development, O&M, and management in the integrated stream-batch mode.
A minimal sketch of the "one set of code, two execution modes" idea follows.
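As the forward reference above suggests, here is a minimal PyFlink Table API sketch: the same SQL aggregation is submitted either as a batch job or as a streaming job, with the runtime mode being the only difference. The `orders` table, its columns, and the datagen source are assumptions made for self-containedness, not part of any cited vendor solution.

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

def build_job(streaming: bool):
    # The only difference between stream and batch is the runtime mode.
    settings = (EnvironmentSettings.in_streaming_mode() if streaming
                else EnvironmentSettings.in_batch_mode())
    t_env = TableEnvironment.create(settings)

    # Illustrative source: a bounded datagen table standing in for real events.
    t_env.execute_sql("""
        CREATE TABLE orders (
            user_id INT,
            amount  DOUBLE
        ) WITH (
            'connector' = 'datagen',
            'number-of-rows' = '1000'
        )
    """)

    # Identical business logic in both modes: per-user order totals.
    return t_env.execute_sql(
        "SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id"
    )

build_job(streaming=False)  # batch backfill over historical data
build_job(streaming=True)   # real-time run of the very same SQL
```

Because the metadata, SQL, and storage definitions are shared, the batch backfill and the streaming job cannot drift apart in logic or caliber, which is exactly the consistency goal described above.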
04 Localization of Information and Innovation (XinChuang)

Xu Shaochun, Chairman & CEO of Kingdee Group, commented: huge things always have a small beginning.

Related background
With the intensification of Sino-US competition, China's high-tech enterprises have faced US technology blockades. Chips, basic software, and middleware have been largely monopolized by American companies, and "lacking cores and souls" has become a major problem in the development of China's information industry, attracting widespread attention across the country.
In this context, the concept of "localization of information and innovation" (XinChuang) was proposed, shouldering the important mission of developing China's information and innovation ecosystem and strengthening the security, controllability, and open innovation of China's information technology.
Definition of terms
XinChuang localization refers to achieving comprehensive domestic substitution across the information technology application innovation industry chain (including chips, system platforms, middleware, application software, etc.).
05 PBC (Packaged Business Capability)
Gartner commented (quoted from a Gartner public report).
Hot Event
Gartner predicts that by 2023, 60% of mainstream enterprises will list the composable enterprise as a strategic goal and will use more and more packaged business capabilities.
Sun Xin, research director at Gartner, said in an interview: "The ultimate purpose of building a middle office is for users to use it; only when users keep assembling the 'building blocks' into more innovative, data-monetizing applications can the value of the middle office be maximized."
Definition of terms
A Packaged Business Capability (PBC) is a software-defined, packaged business capability, most often delivered as modular API services.
PBC has four attributes:
Highly modular: components can be quickly applied to enterprise business scenarios;
Autonomous: self-sufficient enough to guarantee flexibility during assembly;
Orchestratable: the "building blocks" can be assembled into a business process through event interfaces or via APIs;
Discoverable: the packaged business capability is semantically clear and easily accessible (a minimal API sketch of such a block follows this list).
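As the forward reference above suggests, here is a minimal sketch of what one such block might look like in practice: a hypothetical customer-credit-scoring capability packaged behind an HTTP API with Flask. The endpoint paths, payload fields, and scoring rule are all invented for illustration; the point is the shape, a self-contained, discoverable, API-delivered module that an assembler can orchestrate with others.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Machine-readable description so the capability can be *discovered*.
CAPABILITY_META = {
    "name": "customer-credit-score",   # hypothetical PBC name
    "version": "1.0",
    "inputs": ["customer_id", "monthly_spend"],
    "outputs": ["score"],
}

@app.get("/capability")
def describe():
    """Discovery endpoint: tells assemblers what this block does."""
    return jsonify(CAPABILITY_META)

@app.post("/score")
def score():
    """The encapsulated business logic, invoked via API orchestration."""
    payload = request.get_json()
    # Toy scoring rule standing in for a real model.
    value = min(100, int(payload.get("monthly_spend", 0)) // 100)
    return jsonify({"customer_id": payload.get("customer_id"), "score": value})

if __name__ == "__main__":
    app.run(port=8080)
```

An orchestrator can first call `/capability` to discover the block's contract, then chain `/score` with other PBCs (e.g., marketing or risk blocks) into a larger business process.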
Q: What types of PBC does the data middle office need?
The data middle office requires the following types of PBC:
Encapsulated analytical capabilities to aid decision-making;
Self-service capabilities: traditional IT-driven reporting can no longer meet user needs, so analysis capabilities need to be packaged and delivered to the business, letting users do exploratory drag-and-drop analysis in low-code or no-code form;
Predictive analytics powered by machine learning, making analytics more forward-looking through better simulation and prediction.
06 Data Intelligence

Zhao Donghui, head of the data intelligence business at Shulan Technology, commented: data intelligence will become the key to the digital transformation of enterprises.
Related background
Data intelligence is a new concept born around 2018 that received widespread industry attention in 2021.
Data intelligence was proposed for two main reasons. First, after years of moving business online and digitizing it, Chinese companies now hold substantial data assets. Second, as digital transformation deepens, enterprises need more diversified technologies to solve problems in complex business scenarios. Data intelligence emerged to take on this role.
Definition of terms
Data intelligence refers to treating data as a production factor and integrating big data and artificial intelligence technologies, such as data processing, data analysis and mining, data visualization, and machine learning, to achieve data-driven intelligent analysis and decision-making.
Q: How is data intelligence different from AI?
The core of data intelligence lies in being "big data driven and business scenario driven". Artificial intelligence technology leans toward general scenarios applicable across industries, such as face recognition and intelligent customer service, whereas data intelligence must combine big data with industry business scenarios, integrate business knowledge, and then develop and optimize the technology accordingly.
Data intelligence inherits the strengths of artificial intelligence, such as self-learning, while focusing more on business scenarios, helping enterprises solve complex, deep business problems.
07 CDP (Customer Data Platform)

Fu Dengpo, author of the best-selling book Data Middle Office: Making Data Work, commented: CDP will bring a new round of technological change to the MarTech ecosystem.
Related background
The Internet marketing environment has changed drastically: the traffic dividend is gradually drying up, usable external data is increasingly limited, and enterprises' customer acquisition costs keep rising. Their focus has shifted from rapidly acquiring traffic in bulk to improving the full life-cycle value of consumers, building refined operation capabilities around user data collection and accumulation, user label systems, and user operation platforms, and thereby driving overall growth. CDP has thus become the most popular MarTech option today.
According to MarketsandMarkets, the global CDP market will keep expanding: it reached $2.5 billion in 2020 and is expected to reach $10.4 billion by 2025, a compound annual growth rate (CAGR) of 34.2% over the forecast period.
David Raab, founder of the CDP Institute, first proposed the concept of the Customer Data Platform (CDP) in 2013: a CDP is packaged software that creates a persistent, unified consumer database accessible to other systems, capturing, cleansing, and integrating data from different sources to generate a single consumer profile.
Definition of terms
At present, definitions of CDP vary across the industry, but the core lies in acquiring, managing, and applying full life-cycle consumer behavior data from private domain traffic. All of this aims to help enterprises make better use of data, apply it to marketing activities, and optimize consumer experience management, ultimately driving business growth. The core functions of a CDP fall roughly into four modules: data collection and cleaning, Super ID (One ID) creation, label classification, and data servitization.
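To make the Super ID (One ID) module concrete, here is a minimal sketch of identity stitching with a union-find structure: records from different channels that share any identifier (phone, email, device ID) are merged under one profile. The records and field names are invented for the example; production systems add fuzzy matching, survivorship rules, and incremental processing.

```python
from collections import defaultdict

# Toy records from different channels; all values are fabricated.
records = [
    {"src": "app",     "phone": "13800000001", "device": "dev-001"},
    {"src": "wechat",  "phone": "13800000001", "email": "user_a@example.com"},
    {"src": "website", "email": "user_a@example.com", "device": "dev-009"},
    {"src": "sms",     "phone": "13900000022"},
]

parent = list(range(len(records)))

def find(i):
    """Find the root of record i, compressing the path as we go."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def union(i, j):
    parent[find(i)] = find(j)

# Any two records sharing an identifier value belong to the same person.
seen = {}
for idx, rec in enumerate(records):
    for key in ("phone", "email", "device"):
        if key in rec:
            token = (key, rec[key])
            if token in seen:
                union(idx, seen[token])
            else:
                seen[token] = idx

# Group records by their root: each connected group gets one Super ID.
profiles = defaultdict(list)
for idx in range(len(records)):
    profiles[find(idx)].append(records[idx]["src"])

for one_id, sources in enumerate(profiles.values(), start=1):
    print(f"OneID-{one_id}: merged from {sources}")
```

Here the app, WeChat, and website records chain together through a shared phone and email and collapse into one profile, while the SMS-only record stays separate; label classification and servitization then operate on these unified IDs.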
Q: Which companies need a CDP?
The enterprise owns two or more digital touchpoints (WeChat, website, SMS, app, etc.) with a sizable number of users participating or interacting;
The enterprise needs to design different marketing plans for different types of customers, prospects, or users, and to deliver different marketing experiences to these groups based on those designs;
The enterprise needs its marketing plans or advertising to get closer to target consumer groups, to obtain better user response rates, sales conversion rates, and brand recognition;
The enterprise often runs remarketing (retargeted advertising) campaigns and hopes to expand their scale by advertising to more finely segmented user bases;
The enterprise needs to adopt marketing automation at scale (especially automated marketing based on WeChat official accounts, mini programs, SMS, and email).
08 Urban Data Middle Office

Li Yan, urban data middle office project expert at the Guiyang Economic Development Zone, commented: the urban data middle office helps the digital transformation of cities.
Related background
Urban data is scattered across hundreds of systems owned by governments, enterprises, residents, and other entities. Information is hard to obtain, data resources are fragmented, and transmission is dispersed, so massive amounts of urban data are used inefficiently and cannot fully support urban renewal and upgrading. The urban data middle office has emerged to resolve this dilemma.
The urban data middle office is composed of data production factors, data production tools, and a data production environment. With urban data at its core and data security as its foundation, it covers the whole data life cycle of generation, ingestion, processing, analysis, application, and destruction. It can rapidly sense how the city is running, establish closed-loop management grounded in data security, support precise monitoring, decision-making, and policy implementation for economic and social development, and enable continuous iterative upgrading of urban operations.
Definition of terms
The urban data middle office operates on two levels. The first is connecting the various information systems within the government affairs system, that is, the aggregation and sharing of government data we often see.
The second is collecting data from the city's IoT sensing network: by building a unified IoT device management platform, sensing data from across the region is gathered to achieve unified scheduling and command for public affairs and urban governance. For example, the terminal data from the face recognition systems popularized in communities in recent years also falls under the IoT management module of the data middle office.
09 Lightweight Data Middle Office

The general manager of the OLAP product department of Alibaba Cloud Database commented: the data middle office is a very important part of the XinChuang ecological industry chain.
Related background
The digital transformation of enterprises is driving growing market demand; the data middle office industry has clear growth momentum and the market is expanding rapidly. Meanwhile, as data middle offices have landed in industry-leading enterprises, providing standardized, lightweight overall solutions to small and medium-sized enterprises with relatively simple data middle office needs has become a market trend.
On October 21, 2021, Alibaba Cloud and Shulan Technology jointly released the "Lightweight Data Middle Office Solution" at the Cloud Native Data Warehouse AnalyticDB Technology and Practice Summit of the Yunqi Conference.
The lightweight data middle office solution takes Shulan Technology's data middle office construction suite as its core and integrates Alibaba Cloud's self-developed cloud-native data warehouse AnalyticDB. Building on AnalyticDB's stability, real-time responsiveness, and high throughput, it offers great advantages in online processing of petabyte-scale data, can greatly improve data processing efficiency, and provides strong support for customers in finance, government and enterprise, and the broader Internet sector to build data assets and release data value.
At present, the “lightweight data middle office solution” has been deeply applied in scenarios such as management and finance integration, receivables and payables, human resource topics, time series data analysis, multi-dimensional real-time data monitoring, and digital twins.
Q: Compared with the traditional data middle office, where does the "lightness" come in?
Agile development: integrated storage, design, computing, and services; consolidated industry-oriented business models; agile customization and rapid deployment;
Lightweight and efficient: simplifies the traditional middle office architecture by replacing the Hadoop ecosystem with AnalyticDB (ADB), reducing the complexity and cost of enterprise storage and computing resources, while high-performance database storage underpins multi-dimensional analysis and decision-making;
Scenario matching: start from large scenarios with small entry points, apply quickly, and gain strong traction; build data assets bottom-up around the scenario; make data modular, pluggable, and scenario-based to match needs quickly and stay close to value.
10 Chief Data Officer

Peng Wenhua, expert of the Data Resource Service Innovation Committee of the China Electronic Chamber of Commerce, commented: the key to managing data, in one sentence, lies in people.
Related background
In May 2021, the General Office of the Guangdong Provincial Government issued the Guangdong Province Chief Data Officer System Pilot Work Plan, encouraging pilot units to take the lead, strengthen cross-departmental, cross-level, and cross-field coordination mechanisms, and accumulate replicable, scalable experience and practices for full implementation of the chief data officer system. Subsequently, Shaoxing City in Zhejiang Province and the Binjiang District of Hangzhou launched chief data officer systems, opening a new chapter in digital government reform.
Definition of terms
The position of Chief Data Officer (CDO) was first created in enterprises. Its main responsibility is to provide strategic suggestions on a company's future business development and operation, through data mining, processing, and analysis, according to the company's business needs.
Today, the position of chief data officer is also being used in digital government reform. According to the definition of the chief data officer system in Shaoxing, Zhejiang Province, the chief data officer is the person ultimately responsible for the overall management of public data resources, component resources, and application resources in their unit (department), and the technical gatekeeper for the overall construction of multi-domain, cross-scenario applications.
Q: Why do government units need a chief data officer?
Using big data to modernize national governance and promote digital government has become the general trend. According to relevant statistics, 203 cities (prefectures and leagues) in 17 provinces (and municipalities directly under the central government) in China have set up dedicated big data management institutions. In practice, however, the channels for sharing data resources between government bodies are not smooth: government data is often held by ministries, commissions, or directly subordinate units at each level; the functions of the established big data institutions are limited; and the aggregation, integration, and sharing of data often has to be pushed top-down from a higher level.
Second, problems such as the sheer volume of public data, inconsistent data standards, and poor scheduling of data resources make it difficult to effectively develop and utilize the large amount of valuable government data. It is therefore necessary for governments to establish a chief data officer system.