Xiaohongshu is a life recording and sharing platform for young people: users record the bits and pieces of their lives and share their lifestyles through short videos, pictures, and text. After 2017, with the explosive growth in business types and user volume, demand for data analysis and application systems emerged rapidly, for example business intelligence analysis, data application reports, user behavior analysis, and algorithm strategy data. The Xiaohongshu big data team gradually introduced a variety of OLAP analysis engines to better meet these needs. DorisDB is a powerful new-generation MPP database built on fully vectorized computing technology. By introducing DorisDB, Xiaohongshu has built a new unified data service platform, greatly reducing the complexity of data link development and improving its high-concurrency, extremely fast query capabilities.
Author: Wu Haoliang, Data Warehouse Architect, Xiaohongshu Big Data Team

The evolution of OLAP engines at Xiaohongshu

The first stage: before 2017, the total amount of data was not particularly large, and AWS Redshift was used. At this point the data warehouse system had not been fully established; many data requirements were met with quick, one-off, siloed ("chimney-style") development. Data ETL, data warehouse modeling, and the final reports were all completed in one stop inside Redshift.
However, with increasing business complexity and rapid growth in data volume, this model quickly hit a bottleneck. The main problems were:
  • Redshift cannot scale elastically without affecting the performance of online queries. Scaling involves data redistribution, which affects the performance and availability of the cluster.

  • ETL tasks seriously affect cluster availability. When ETL tasks run in Redshift they preempt large amounts of resources, which reduces data analysis efficiency and leads to query timeouts, or even to the whole cluster crashing and becoming unavailable under excessive load.

  • Without a good separation of storage and compute, data storage capacity becomes a bottleneck and cannot keep up with storage requirements that grow rapidly with the business.

The second stage: with the data warehouse being built out and improved on the Hadoop/Hive system, all ETL tasks were moved to the Hadoop cluster, and Presto was used for OLAP analysis in this stage. Presto naturally shares metadata with Hive and uses the same physical data storage, so it was effectively plug and play. A large number of flexible queries against the data warehouse were handled by Presto.
The third stage: real-time services were enhanced, requirements for query performance kept increasing, and many data applications were created. In this phase, ClickHouse was introduced to build a data analysis platform with stronger performance and shorter response times to meet real-time requirements.
The fourth stage: the Xiaohongshu big data team carried out the overall design and construction of the real-time data warehouse, and at the same time built a data service platform that provides data interfaces for each business team and connects multiple internal and external (To B) application systems. These services not only need to run complex queries at low latency, but also have high concurrency requirements. At this stage, we introduced DorisDB scenario by scenario to meet these needs.

Xiaohongshu data analysis architecture

Status quo of Xiaohongshu OLAP system
Xiaohongshu's entire data analysis system consists of a data collection layer, a data storage and processing layer, a data sharing layer, and an application layer.

Data collection
Server logs and app logs are collected as tracking logs through Flume, and the data is distributed to offline storage (S3) and real-time storage (Kafka). Information such as the MySQL binlog of online business databases is collected in real time through Canal.
Data storage and processing

Offline data processing: Hive/Spark's highly scalable batch processing capabilities undertake all the ETL and data model processing work of the offline data warehouse.
Real-time data processing: Flink performs the ETL of real-time data (including dimension enrichment, dual-stream joins, and real-time aggregation). Offline tables are synchronized to ClickHouse/DorisDB through the scheduling platform, and Flink implements sink connectors for ClickHouse and DorisDB to land data in DorisDB or ClickHouse.
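A sink connector for DorisDB is commonly built on top of DorisDB's Stream Load HTTP interface, which accepts micro-batches over HTTP PUT with a unique label per batch so that retries are safe. The sketch below is a minimal Python illustration of constructing such a request; the FE host, database, table, and column names are hypothetical, and a real Flink connector additionally handles batching, retries, and failover.

```python
import base64
import json
import urllib.request
import uuid

def build_stream_load_request(fe_host, db, table, rows, user="root", password=""):
    """Build (but do not send) a DorisDB Stream Load HTTP request.

    Stream Load is the HTTP bulk-ingest interface that sink connectors
    are typically layered on: a PUT of the row payload to the FE node,
    with a unique label so a retried batch is not loaded twice.
    """
    url = f"http://{fe_host}:8030/api/{db}/{table}/_stream_load"
    # One JSON object per line, as the json format of Stream Load expects.
    payload = "\n".join(json.dumps(r) for r in rows).encode("utf-8")
    auth = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {auth}",
        "Expect": "100-continue",                # required by Stream Load
        "label": f"ad_load_{uuid.uuid4().hex}",  # unique label -> safe retries
        "format": "json",
        "strip_outer_array": "false",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")

# Example: two clickstream rows destined for a hypothetical ads table.
req = build_stream_load_request(
    "doris-fe.internal", "ads", "ad_click_agg",
    [{"advertiser_id": 1, "clicks": 3}, {"advertiser_id": 2, "clicks": 1}],
)
print(req.get_method(), req.full_url)
```

Actually sending the request would be a matter of passing it to `urllib.request.urlopen`; DorisDB answers with a JSON body describing the load status.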


Data sharing

The data sharing layer mainly provides the underlying data storage for external services: offline or real-time data is written into the relevant database components, providing query capabilities for multiple services and different scenarios.
The data sharing layer mainly includes TiDB/HBase/ClickHouse/DorisDB. Through the high-speed OLAP query capabilities provided by DorisDB and ClickHouse, it supports the reporting platform on the application side, provides an ad-hoc analysis platform, exposes data interfaces to the development side, and implements multiple data products (such as the traffic analysis platform and the user labeling platform).

Application layer

The application layer mainly serves management and operations personnel, with requirements around concurrency, latency, and frequently changing demands; ad-hoc queries from data analysts require support for complex SQL processing, queries over massive data, and similar capabilities.
Comparison of OLAP analysis tools
ClickHouse

Advantages:


  • Strong single-table query performance, suitable for flexible ad-hoc queries on large, wide tables.

  • Contains a rich MergeTree Family with support for pre-aggregation.

  • Well suited to large-scale detailed log ingestion and analysis.

Disadvantages:

  • Does not support true deletes and updates.

  • Joins are not handled in a very friendly way.

  • Concurrency capacity is relatively low.

  • MergeTree background merging is incomplete (merges are eventual, not guaranteed).



DorisDB

Advantages:

  • Strong single-table and multi-table query performance; it can support both wide-table query scenarios and complex multi-table query scenarios.
  • Supports high concurrency queries.
  • Supports real-time data micro-batch ETL processing.
  • Both streaming and batch data writes are strong.
  • Compatible with MySQL protocol and standard SQL.


TiDB

Advantages:

  • Supports updates and deletes.
  • Takes OLTP needs into account.
  • Supports Flink exactly-once semantics and idempotency.

Disadvantages:

  • Query performance is weak, and OLAP query scenarios cannot be well supported.
  • Real-time pre-aggregation is not supported.
  • TiFlash does not yet support all SQL syntax and functions.
DorisDB application practice in the advertising data center

There are two core kinds of data in the advertising business: the first is the advertising exposure clickstream, i.e., the impression and click information of all ad units; the second is advertising effect attribution data, such as order conversions within the Xiaohongshu site, related form submissions, and likes, favorites, and follows of notes.
Based on this data, relevant business statistical indicators are aggregated in real time according to the needs of different business scenarios, and query and analysis services are provided externally.

Technical architecture

Before DorisDB was introduced, a large number of Flink tasks wrote to MySQL/Redis/HDFS/ClickHouse to land the data.
There are several types of core processing logic in Flink:

• The front-end user ad impression event stream and the back-end algorithm recommendation stream are joined and deduplicated to enrich the advertising information.

• Anti-cheating logic is applied to eliminate fraudulent events.

• Results are aggregated according to the requirements of different business scenarios and written to different database components.

Technical pain points

The original architecture mainly had the following problems:

• The data logic is not well integrated and merged, the maintenance workload is large, and new requirements cannot be responded to quickly.

• ClickHouse's insufficient concurrency capacity and the complexity of scaling it will, in the foreseeable future, become a bottleneck for the overall advertising system.

• Because the logic of the Flink layer is scattered across a large number of small Flink tasks, the overall architecture cannot meet high-availability requirements: as soon as any task has a problem, the online business is affected.

Therefore, we hoped to optimize the original system. The core idea is to unify this layer with a single OLAP engine, and the requirements for that OLAP engine are relatively demanding:

• It can support high-throughput data writes.

• It can support flexible queries over multi-dimensional combinations, with TP99 below 100 ms.

• It can aggregate and roll up data in real time to improve query performance, supporting QPS requirements in the tens of thousands.

• It can synchronize MySQL data in real time via binlog and keep the data up to date in a timely manner.

• Preferably, it supports multi-table joins.

After extensive research, DorisDB fit the overall requirements of the advertising data center. Based on DorisDB's efficient query capabilities and its support for high QPS, it can provide integrated services for advertising algorithm strategy, real-time advertising billing, and real-time data reporting for the advertising platform.
    The new architecture has the following advantages:
• Clear structure: Flink focuses on data cleaning, while business logic computation is migrated from Flink into DorisDB, making DorisDB the end point of the data business logic.

• A unified data caliber can be maintained: one data input, one set of advertising statistics output.

• An active-standby DorisDB deployment is implemented at the bottom layer to better support high-QPS scenarios.

Data table design

Data model design

DorisDB provides three data models: the detail model, the aggregation model, and the update model. The Xiaohongshu advertising business makes full use of all three:

• The ad exposure clickstream is written into the aggregation model. All aggregation dimensions are designed according to the dimensions the business needs, such as advertiser, ad type, creative, ad unit, search term, region, and user attributes, and the required metrics are aggregated accordingly.

• There are many online MySQL databases in the advertising back-end; they are connected to DorisDB through the update model so that tables are updated in real time.

• In the Hadoop offline data warehouse, some data reports are regularly synchronized to DorisDB; this data uses DorisDB's detail model.
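Since DorisDB is compatible with the MySQL protocol, such tables can be created through any MySQL client. As a sketch of the update model, the Python snippet below simply holds the rough shape of the DDL for a table mirroring a binlog-synced MySQL source: with a `UNIQUE KEY`, a newly arriving row replaces the previous row with the same key, i.e. an upsert. Every table and column name here is hypothetical, not the production schema.

```python
import textwrap

# Hypothetical update-model table mirroring an online MySQL table that is
# synced in real time via binlog. The UNIQUE KEY column must be declared
# first; all remaining columns are value columns that get overwritten on
# each update for the same plan_id.
AD_PLAN_DDL = textwrap.dedent("""
    CREATE TABLE ads.ad_plan (
        plan_id       BIGINT NOT NULL,
        advertiser_id BIGINT NOT NULL,
        plan_name     VARCHAR(256),
        budget        DECIMAL(18, 2),
        update_time   DATETIME
    )
    UNIQUE KEY (plan_id)
    DISTRIBUTED BY HASH (plan_id) BUCKETS 8
""").strip()

print(AD_PLAN_DDL)
```

In practice this statement would be submitted once through a MySQL-protocol client, after which binlog-driven writes keep the table in step with the source MySQL table.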

Data partitioning and bucketing

The data partitioning function provided by DorisDB can improve query performance in advertising scenarios. For example, a common ad-side query pattern is to query data over a recent time window; by partitioning by time in DorisDB, unneeded partitions can be pruned. In addition, advertising queries filter by advertiser, so we place the advertiser ID at the front of the sort key to quickly locate an advertiser's data. DorisDB also supports hash-bucketing by advertiser ID, which reduces the amount of data each query scans. This matters greatly in high-concurrency scenarios: minimizing the data range covered by each query statement improves concurrency.
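Putting the above together, the following sketch shows what such a table's DDL might look like: the advertiser ID leads the aggregate key, data is range-partitioned by day, and rows are hash-bucketed by advertiser ID. The table name, columns, partitions, and bucket count are illustrative assumptions, not the production schema.

```python
import textwrap

# Hypothetical aggregate-model table for the ad exposure clickstream.
# Key columns come first and in key order; advertiser_id leads the key so
# advertiser-filtered queries touch few rows; SUM columns are rolled up
# on ingest; range partitions by day allow partition pruning.
AD_AGG_DDL = textwrap.dedent("""
    CREATE TABLE ads.ad_stats_agg (
        advertiser_id BIGINT       NOT NULL,
        event_date    DATE         NOT NULL,
        ad_unit_id    BIGINT       NOT NULL,
        region        VARCHAR(64)  NOT NULL,
        impressions   BIGINT SUM DEFAULT "0",
        clicks        BIGINT SUM DEFAULT "0"
    )
    AGGREGATE KEY (advertiser_id, event_date, ad_unit_id, region)
    PARTITION BY RANGE (event_date) (
        PARTITION p20210601 VALUES LESS THAN ("2021-06-02"),
        PARTITION p20210602 VALUES LESS THAN ("2021-06-03")
    )
    DISTRIBUTED BY HASH (advertiser_id) BUCKETS 16
""").strip()

print(AD_AGG_DDL)
```

A query filtered on `event_date` and `advertiser_id` then only reads the matching partitions and buckets, which is what keeps high-QPS point-range queries cheap.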
Materialized views

DorisDB materialized views can be built in real time or in batches, can be flexibly added and dropped, and are transparent to queries. We established materialized views at advertiser granularity, user-feature granularity, ad-unit granularity, and specific-creative granularity. Based on these materialized views, queries can be greatly accelerated.
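A materialized view of this kind is declared with a single statement over a base table; the sketch below shows a hypothetical advertiser-granularity rollup (the base table `ads.ad_stats_detail` and all column names are illustrative assumptions).

```python
import textwrap

# Hypothetical advertiser-granularity materialized view: DorisDB maintains
# the pre-aggregated result as the base table receives writes, and the
# optimizer transparently answers matching queries from the view.
ADVERTISER_MV_DDL = textwrap.dedent("""
    CREATE MATERIALIZED VIEW advertiser_daily_mv AS
    SELECT
        advertiser_id,
        event_date,
        SUM(impressions) AS impressions,
        SUM(clicks) AS clicks
    FROM ads.ad_stats_detail
    GROUP BY advertiser_id, event_date
""").strip()

print(ADVERTISER_MV_DDL)
```

A query such as `SELECT advertiser_id, SUM(clicks) FROM ads.ad_stats_detail GROUP BY advertiser_id` can then be served from the view without touching the detail rows.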


Real-time data import

There are two types of real-time data import:

• If ETL processing is required, Flink performs the ETL logic and the Flink DorisDB Connector writes the results to DorisDB.

• In the public layer of the real-time data warehouse, Routine Load tasks are configured to write data into DorisDB tables in 10-second micro-batches.
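A Routine Load job is declared once and then continuously consumes Kafka on DorisDB's side. The sketch below shows the rough shape of such a statement, with `max_batch_interval` set to 10 seconds to match the batching described above; the job, table, column, topic, and broker names are illustrative assumptions.

```python
import textwrap

# Hypothetical Routine Load job: DorisDB itself consumes the Kafka topic
# and lands micro-batches into the target table, with no external ETL job.
ROUTINE_LOAD_DDL = textwrap.dedent("""
    CREATE ROUTINE LOAD ads.load_ad_events ON ad_events
    COLUMNS (event_date, advertiser_id, ad_unit_id, region, impressions, clicks)
    PROPERTIES (
        "desired_concurrent_number" = "3",
        "max_batch_interval" = "10",
        "format" = "json"
    )
    FROM KAFKA (
        "kafka_broker_list" = "broker1:9092,broker2:9092",
        "kafka_topic" = "ad_events"
    )
""").strip()

print(ROUTINE_LOAD_DDL)
```

Compared with a Flink-based pipeline, this pushes the consumption loop into the database, which suits the no-ETL case in the second bullet above.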

Offline data reports are imported into DorisDB: