Calcite is the most widely used SQL parsing engine in the big data field, powering SQL parsing for large projects such as Flink, Hive, Kylin, and Druid. If you want to study the Flink SQL source code in depth, Calcite is a prerequisite skill well worth learning.

Internally, we also use a self-developed SQL engine built on Calcite to support correlated queries over any number of heterogeneous data sources through a single SQL statement (e.g., joining a MySQL table with an HBase table and then running an aggregation on the result).

Because Calcite has a large feature set, this article focuses on the source code of its main flow, mainly:

sorting out the stages of Calcite SQL execution

the VolcanoPlanner optimizer

To sum up, the stages are:


1. The incoming SQL is parsed by the Parser into a syntax tree, with SqlNode as the tree's node type

2. The validator performs syntax validation: type checking, metadata verification, etc.

3. The validated SqlNode tree is converted into the corresponding relational algebra expression, which is also a tree, with RelNode as the node type

4. The RelNode tree is optimized by the two built-in optimizers, VolcanoPlanner and HepPlanner, yielding the optimal logical algebra tree, still made of RelNodes

5. The optimal logical expression (RelNode) is converted into the corresponding executable physical plan (the conversion logic varies by framework); Flink, for example, converts it into its Operators


Let's run the code and look at each stage in detail.

1. Parsing SQL statements into a syntax tree (SQL -> SqlNode)


This stage is not actually hand-implemented by Calcite itself; rather, Calcite defines a set of SQL grammar rule templates, and the parser is generated through the JavaCC framework.

Pull the code and take a look.

Parser.jj in the source code is the core grammar template of Calcite. For example, if we want to add syntax to Flink SQL, such as a count window, this is the file to modify: it defines the tokens and the specific logic for how each production returns a SqlNode.

Take an example:

select ID, NAME from MYHBASE.MYHBASE where ID = '1'


It will be parsed into a SqlNode tree like this:
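To make the shape of that tree concrete, here is a minimal hand-rolled sketch (the Node class and kind names are invented stand-ins, not Calcite's real SqlNode subclasses) showing how the statement decomposes into a select list, a from clause, and a where clause:

```java
import java.util.List;

// Simplified stand-ins for Calcite's SqlNode hierarchy (illustrative only).
class Node {
    final String kind;        // e.g. SELECT, IDENTIFIER, BINARY_OP, LITERAL
    final String text;        // the token(s) this node covers
    final List<Node> children;

    Node(String kind, String text, Node... children) {
        this.kind = kind;
        this.text = text;
        this.children = List.of(children);
    }

    // Pretty-print the tree with indentation, one node per line.
    String dump(String indent) {
        StringBuilder sb = new StringBuilder(indent + kind + "(" + text + ")\n");
        for (Node c : children) sb.append(c.dump(indent + "  "));
        return sb.toString();
    }
}

public class SqlTreeSketch {
    // Builds the tree for: select ID, NAME from MYHBASE.MYHBASE where ID = '1'
    static Node build() {
        return new Node("SELECT", "select",
            new Node("SELECT_LIST", "",
                new Node("IDENTIFIER", "ID"),
                new Node("IDENTIFIER", "NAME")),
            new Node("FROM", "",
                new Node("IDENTIFIER", "MYHBASE.MYHBASE")),
            new Node("WHERE", "",
                new Node("BINARY_OP", "=",
                    new Node("IDENTIFIER", "ID"),
                    new Node("LITERAL", "'1'"))));
    }

    public static void main(String[] args) {
        System.out.print(build().dump(""));
    }
}
```

Calcite's actual tree uses classes such as SqlSelect, SqlIdentifier, and SqlBasicCall, but the nesting follows the same shape.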

JavaCC itself is not covered in detail here; refer to the official JavaCC website.

2. Syntax validation (validator) stage

Validation is carried out here by the validator; it is not expanded on, since it is not the focus of this article.

3. Converting the SqlNode tree into a logical relational expression tree (SqlNode -> RelNode)

Calcite provides a default converter for this, org.apache.calcite.sql2rel.SqlToRelConverter. It will not be expanded on here.

4. Relational algebra tree optimization (RelNode -> RelNode)

This is the focus of this article! The reason so many frameworks choose Calcite is its SQL optimization.

After the first three stages we have a RelNode tree, but this tree is not yet the optimal solution; Calcite obtains an optimized best tree through its two optimizer planners.

This is the core of Calcite. Calcite provides two optimizers:

HepPlanner, a rule-based optimizer (simply put, many rules (Rule) are defined, and whenever a tree node matches an optimization rule it is transformed according to that rule, yielding a rule-optimized tree; relatively simple)


VolcanoPlanner, a cost-based optimizer (based on cost, the tree is iteratively transformed according to the rules, constantly recomputing the cost of the root RelNode to find the optimal tree)
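To illustrate the difference between the two planners, here is a toy sketch in plain Java (the plan strings, the rule, and the costs are all made up; the real planners work on RelNode trees): the Hep-style step applies a matching rule unconditionally, while the Volcano-style step keeps both alternatives and retains the cheaper one.

```java
import java.util.Map;
import java.util.Optional;

public class PlannerStyles {
    // A "plan" is just a string of node names here; the costs are made up.
    static final Map<String, Integer> COST = Map.of(
        "Project<-Filter<-Scan", 100,
        "Project<-FilteredScan", 60,    // fusing the filter helps this plan...
        "Agg<-Filter<-Scan", 50,
        "Agg<-FilteredScan", 80);       // ...but (we pretend) hurts this one

    // A toy rule: fuse a Filter into the Scan below it, if present.
    static Optional<String> fuseFilterIntoScan(String plan) {
        return plan.contains("Filter<-Scan")
            ? Optional.of(plan.replace("Filter<-Scan", "FilteredScan"))
            : Optional.empty();
    }

    // Hep style: the rule matches, so apply it - no cost check.
    static String hep(String plan) {
        return fuseFilterIntoScan(plan).orElse(plan);
    }

    // Volcano style: keep both the original and the alternative, pick the cheaper.
    static String volcano(String plan) {
        String alt = fuseFilterIntoScan(plan).orElse(plan);
        return COST.get(alt) < COST.get(plan) ? alt : plan;
    }

    public static void main(String[] args) {
        System.out.println(volcano("Project<-Filter<-Scan")); // picks the fused plan
        System.out.println(volcano("Agg<-Filter<-Scan"));     // keeps the original
        System.out.println(hep("Agg<-Filter<-Scan"));         // fuses regardless of cost
    }
}
```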

Let's first look at what the RelNode tree converted from the SQL

select ID, NAME from a where ID = '1'

looks like.


You can see that many nodes are named Logical*, because these are logical nodes produced by Calcite's default converter (SqlToRelConverter) in stage 3; logical nodes have no physical traits and cannot be executed directly.

Next we step into Calcite's cost-based optimizer, VolcanoPlanner, which returns the plan with the best cost. Let's look at how Calcite optimizes.

First, Calcite sets the RelNode obtained from the previous stage as the root of the Volcano cost optimizer, which leads into the org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl() method.

At the breakpoint you can see that the RelNode's inputs are registered first, in the ensureRegistered method.

During the register process there is a loop back into the registerImpl() method; that is, the tree's child nodes are registered first, in a depth-first traversal.

Since it is a depth-first traversal, let's go back to the VolcanoPlanner.registerImpl() method we just looked at and see what the registration process does next.

You can see that rules are about to be fired. Here we need to introduce a concept: the Rule in Calcite.

From the class description we can see that a rule converts one expression into another. What does that mean? Let's look at its abstract methods. To sum up, there are two core methods:

matches(): returns whether the current RelNode matches this rule

onMatch(): called when the rule matches; inside it, transformTo() can be called to convert one RelNode into another

Rules are the core of Calcite: essentially all SQL optimizations are made up of corresponding rules. Each optimization is implemented as a rule that transforms the matching RelNode tree nodes, eventually producing the optimal execution plan.
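The shape of these two methods can be sketched with a hand-rolled mini framework (MiniRule, MiniRuleCall, and the string-based nodes are invented for illustration; Calcite's real RelOptRule works with RelOptRuleCall and RelNode):

```java
import java.util.ArrayList;
import java.util.List;

// A hand-rolled mini rule framework mirroring matches()/onMatch()/transformTo().
interface MiniRule {
    boolean matches(String node);                  // does this node match the rule?
    void onMatch(String node, MiniRuleCall call);  // fire: may call transformTo()
}

class MiniRuleCall {
    final List<String> equivalents = new ArrayList<>();
    // Registers a new node equivalent to the matched one.
    void transformTo(String newNode) { equivalents.add(newNode); }
}

public class RuleSketch {
    // A toy rule that removes a trivially-true filter above a scan.
    static final MiniRule REMOVE_TRIVIAL_FILTER = new MiniRule() {
        public boolean matches(String node) { return node.equals("Filter(TRUE)<-Scan"); }
        public void onMatch(String node, MiniRuleCall call) {
            call.transformTo("Scan");   // the equivalent, cheaper expression
        }
    };

    // Fire a rule against a node and collect the equivalents it registered.
    static List<String> fire(MiniRule rule, String node) {
        MiniRuleCall call = new MiniRuleCall();
        if (rule.matches(node)) rule.onMatch(node, call);
        return call.equivalents;
    }

    public static void main(String[] args) {
        System.out.println(fire(REMOVE_TRIVIAL_FILTER, "Filter(TRUE)<-Scan"));
    }
}
```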

OK, back to our main flow. Let's continue with the VolcanoPlanner fireRule() method above to see how rules are triggered.

The logic here is relatively simple: when the RelNode satisfies a rule, the match() method of VolcanoRuleCall is called.

One thing to note is that classOperands here maps a RelNode class to all the rule operands that may match it, whether the RelNode appears as the top operand or a lower operand of the rule.


Say we have a LogicalFilter RelNode and define two rules:

RuleA: operand(LogicalFilter.class, operand(TableScan.class))

RuleB: operand(LogicalProject.class, operand(LogicalFilter.class))

Then both rules are entered into the classOperands mapping as possible matches for the LogicalFilter.

Let's continue with the code after a rule is matched.

We then step into the onMatch of VolcanoPlanner.DeferringRuleCall.

Here the rule is added to the ruleQueue in IterativeRuleDriver. This queue is dedicated to storing the matched rules. Note that at this point the matched rules merely sit in the queue; they have not yet been executed.

So when will they be executed?

Going back to the main flow: once all of the RelNode's children have been registered in setRoot, the planner's findBestExp() method is invoked; as the name suggests, it finds the optimal expression.


One thing to say in advance: Calcite's optimization relies on the principle of optimal substructure, i.e., it assumes that if an expression is optimal, its sub-expressions are also optimal. So for the best plan of the current RelNode, we only need to consider:

1. its own cost plus the sum of the best costs of its child nodes

2. for every rule it can match, the transformed alternative together with the best costs of the remaining parts

Comparing these gives the optimal solution for the current RelNode.

To quote a diagram: if A can match only these two rules, then we only need to consider these few cases when enumerating the optimal solution.
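The enumeration above can be sketched as a small recursion over a toy plan type (the names and costs are made up; Calcite's real search memoizes over equivalence sets rather than recursing naively):

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class BestCostSketch {
    // A toy plan node: a name, its own cost, and its child plans.
    record Plan(String name, int selfCost, List<Plan> children) {}

    // Optimal substructure: the best cost of a plan is its own cost plus the
    // best costs of its children, minimized over rule-generated alternatives.
    static int best(Plan p, List<UnaryOperator<Plan>> rules) {
        int cost = p.selfCost();
        for (Plan c : p.children()) cost += best(c, rules);
        for (UnaryOperator<Plan> rule : rules) {
            Plan alt = rule.apply(p);               // null = rule does not match
            if (alt != null) cost = Math.min(cost, best(alt, rules));
        }
        return cost;
    }

    // A made-up rule: replace a nested-loop join (cost 50) with a hash join (cost 20).
    static final UnaryOperator<Plan> NL_TO_HASH = p ->
        p.name().equals("Join(NL)") ? new Plan("Join(Hash)", 20, p.children()) : null;

    static int demo() {
        Plan join = new Plan("Join(NL)", 50,
            List.of(new Plan("Scan(A)", 10, List.of()),
                    new Plan("Scan(B)", 10, List.of())));
        return best(join, List.of(NL_TO_HASH));     // min(50, 20) + 10 + 10
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```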

If you are not familiar with this principle, you can take a look at this article.

Then let's look at findBestExp().

This is the main loop of the whole optimization, searching for the optimal solution bestExp: it keeps taking rules from the queue and running them, and does not exit until the queue has been drained.

That's right, this is the queue mentioned earlier: when a RelNode is registered, its matching rules are put into this queue.

A natural question arises here: as mentioned earlier, running a rule changes the RelNode tree, that is, equivalent nodes are added for a RelNode. Such a structural change may enable rules that previously could not match to match now. Will those rules be missed? To see why not, look at transformTo(), the method used inside a rule's onMatch() to register equivalent nodes.

Inside the transformTo method, the new node produced by the transformation is registered again. That is, the new node also goes through the standard RelNode registration process: when it is registered as an equivalent node, new rules are matched against it, and those rules are added to the ruleQueue to await the next round of execution.
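The interplay between registration, matching, and the queue can be sketched like this (toy string nodes and an invented rule set): firing a rule registers the new equivalent node, which is matched against all rules again, so follow-on matches are appended to the queue rather than lost.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class RuleQueueSketch {
    // A toy rule: a name plus a rewrite (empty Optional = no match).
    record Rule(String name, Function<String, Optional<String>> rewrite) {}

    // Register a node: enqueue every (rule, node) pair that matches.
    static void register(String node, List<Rule> rules, Deque<String[]> queue) {
        for (Rule r : rules)
            if (r.rewrite().apply(node).isPresent())
                queue.add(new String[]{r.name(), node});
    }

    // Drain the queue, firing rules; new equivalents are registered, which
    // may enqueue further matches, so nothing is missed.
    static List<String> findBestExpLoop(String root, List<Rule> rules) {
        List<String> equivalents = new ArrayList<>(List.of(root));
        Deque<String[]> queue = new ArrayDeque<>();
        register(root, rules, queue);
        while (!queue.isEmpty()) {
            String[] match = queue.poll();
            Rule rule = rules.stream()
                .filter(r -> r.name().equals(match[0])).findFirst().get();
            String newNode = rule.rewrite().apply(match[1]).get();
            if (!equivalents.contains(newNode)) {   // register the equivalent node
                equivalents.add(newNode);
                register(newNode, rules, queue);
            }
        }
        return equivalents;
    }

    static List<String> demo() {
        // UseIndex only matches the output of FuseFilter: without
        // re-registration of new nodes, it would be missed.
        Rule fuse = new Rule("FuseFilter", n -> n.equals("Filter<-Scan")
                ? Optional.of("FilteredScan") : Optional.empty());
        Rule index = new Rule("UseIndex", n -> n.equals("FilteredScan")
                ? Optional.of("IndexScan") : Optional.empty());
        return findBestExpLoop("Filter<-Scan", List.of(index, fuse));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```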


In addition, when a RelNode is transformed by a rule, the newly generated RelNode is added as an equivalent node of the original RelNode. When an equivalent node is added, the propagateCostImprovement method computes whether the new equivalent node lowers the cost of the current RelNode; if it does, the best cost of the current RelNode is updated, and the improvement bubbles up to update the best cost of the parent RelNodes.
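The bubbling-up of a cost improvement can be sketched with a toy parent chain (the structure and numbers are invented; Calcite does this per equivalence subset, but the idea of propagating the delta upward is roughly similar):

```java
public class CostBubbleSketch {
    // A toy node with a mutable best cost and a link to its parent.
    static class Node {
        final String name;
        int bestCost;       // own cost + best costs of inputs (precomputed here)
        Node parent;
        Node(String name, int bestCost) { this.name = name; this.bestCost = bestCost; }
    }

    // A new equivalent of `node` was found with cost `newCost`. If that is an
    // improvement, update node.bestCost and propagate the delta to ancestors.
    static void propagateCostImprovement(Node node, int newCost) {
        if (newCost >= node.bestCost) return;   // no improvement: stop
        int delta = node.bestCost - newCost;
        node.bestCost = newCost;
        for (Node p = node.parent; p != null; p = p.parent)
            p.bestCost -= delta;                // parents get cheaper by the same delta
    }

    static int demo() {
        Node scan = new Node("Scan", 10);
        Node filter = new Node("Filter", 15);   // 5 own + Scan's 10
        Node project = new Node("Project", 17); // 2 own + Filter's 15
        scan.parent = filter;
        filter.parent = project;
        propagateCostImprovement(scan, 4);      // a cheaper equivalent of Scan appeared
        return project.bestCost;                // 17 - 6
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```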

To sum up the loop: while(true), rules are continually pulled from the ruleQueue and fired until the queue is empty. A fired rule may add a new equivalent node; if that node's cost is better, the best RelNode of the whole tree is updated. The new equivalent RelNode may in turn match new rules, which are added to the ruleQueue, and the next iteration begins. When no rules are left to match, findBestExp() returns the optimized, optimal RelNode.

Different frameworks then translate this optimal RelNode into their own APIs.

That finally wraps up Calcite; next, we can start digging into the Flink SQL source code.

Local address:


Reply with keywords such as Face, ClickHouse, ES, Flink, Spring, Java, Kafka, or Monitoring in the public account (zhisheng) to view more articles on the corresponding topics.

Like + Looking, fewer bugs 👇