How is a ClickHouse query completed?
2022-06-24 05:58:00 【felixxdu】
Introduction to ClickHouse SQL functions
ClickHouse functions fall roughly into three categories:
- Ordinary functions, also called single-row or row-level ("detail") functions, are defined by the IFunction interface. They return one result value for each row of the queried table or view. Typical examples are arithmetic functions, type conversion functions, conditional functions and comparison functions. ClickHouse supports more than 600 such functions, and the number keeps growing with each release. At the time of writing, the only way to add a new function is to hard-code it in the source; a UDF facility along the lines of create udf return type as ... is not yet available. The supported functions can be listed with:
select * from system.functions where "is_aggregate"=0
- Aggregate functions are defined by the IAggregateFunction interface. They run an aggregate calculation over a group of rows and return exactly one value per group. Common examples are sum and avg. The state of an aggregate function can be serialized and deserialized, so it can be shipped between distributed nodes and used for incremental computation. The supported aggregate functions can be listed with:
select * from system.functions where "is_aggregate"=1
- Table functions act as a data source (storage) and appear after the FROM clause. Common ones are mysql, url, numbers and remote. Typical uses:
select * from mysql('host:port','database', 'table', 'user','password') -- read data from a MySQL database
select * from numbers() limit 10,1000000; -- single-threaded generation: skip the first 10 numbers, then return 1000000 of them
select * from numbers_mt() limit 10,1000000; -- multithreaded generation of the same range
For an introduction to all functions, see the official documentation.
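The point about serializable aggregate state can be illustrated with a small sketch. It is not ClickHouse's IAggregateFunction; AvgState, its text serialization format and distributedAvgDemo are hypothetical names chosen for this example. The idea it shows is that a partial avg state (sum and count) can be built on one node, serialized, shipped, and merged on another node.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical partial state of an avg() aggregation; not a ClickHouse class.
struct AvgState
{
    double sum = 0.0;
    std::uint64_t count = 0;

    // Fold one more row into the state.
    void add(double value) { sum += value; ++count; }

    // Combine with a state computed elsewhere, for example on another node.
    void merge(const AvgState & other) { sum += other.sum; count += other.count; }

    // Final value for the group; an empty group yields 0 in this sketch.
    double result() const { return count == 0 ? 0.0 : sum / static_cast<double>(count); }

    // Text serialization, chosen only for readability.
    std::string serialize() const
    {
        std::ostringstream out;
        out << sum << ' ' << count;
        return out.str();
    }

    static AvgState deserialize(const std::string & data)
    {
        AvgState state;
        std::istringstream in(data);
        in >> state.sum >> state.count;
        assert(!in.fail());  // the input must contain both fields
        return state;
    }
};

// Two "nodes" aggregate their own rows; node A merges the state shipped by node B.
inline double distributedAvgDemo()
{
    AvgState node_a;
    node_a.add(1.0);
    node_a.add(2.0);

    AvgState node_b;
    node_b.add(3.0);
    node_b.add(6.0);

    const AvgState received = AvgState::deserialize(node_b.serialize());
    node_a.merge(received);
    return node_a.result();
}
```

Because only the compact state travels between nodes, the raw rows never have to be moved, which is what makes incremental and distributed aggregation practical.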
Structure of the AST
Parser and Interpreter are two very important sets of interfaces: the Parser creates the AST object, and the Interpreter interprets the AST and goes on to build the query's execution pipeline. Together with IStorage, they tie the whole data query process together.
A Parser recursively parses a SQL statement into an AST (abstract syntax tree). Different SQL statements are parsed by different Parser implementation classes. On the community master branch at the time of writing, Parser has more than 170 subclasses. The main ones live under src/Parsers and parse ClickHouse's own SQL dialect; the parsers under the mysql directory mainly handle syntax parsing for the case where ClickHouse acts as a MySQL client.
Each of them implements the two main interfaces, getName() and parseImpl(), according to its responsibility. For example, ParserRenameQuery, ParserDropQuery and ParserAlterQuery parse DDL statements, ParserInsertQuery parses INSERT statements, and ParserSelectWithUnionQuery parses SELECT statements.
The parsers work hierarchically. When a SQL statement arrives, a root parser, ParserQuery, is constructed first. The root parser decides which top-level category the statement belongs to, then the parseImpl of that category's parser is called, which in turn dispatches to several second-level parsers, and so on.
The root (first-level) parser, ParserQuery, contains the following second-level parsers, each annotated with its function (ClickHouse/src/Parsers/ParserQuery.cpp):
ParserQueryWithOutput query_with_output_p; // most common SQL statements match this parser
ParserInsertQuery insert_p(end); // INSERT statements
ParserUseQuery use_p; // USE db statements
ParserSetQuery set_p; // SET key1 = value1 statements
ParserSystemQuery system_p; // statements starting with SYSTEM, see https://clickhouse.tech/docs/en/sql-reference/statements/grant/#grant-system
ParserCreateUserQuery create_user_p; // CREATE USER or ALTER USER
ParserCreateRoleQuery create_role_p; // CREATE ROLE or ALTER ROLE
ParserCreateQuotaQuery create_quota_p; // CREATE QUOTA or ALTER QUOTA
ParserCreateRowPolicyQuery create_row_policy_p; // row-level access control
ParserCreateSettingsProfileQuery create_settings_profile_p; // CREATE SETTINGS PROFILE or ALTER SETTINGS PROFILE
ParserDropAccessEntityQuery drop_access_entity_p; // DROP USER | ROLE | QUOTA
ParserGrantQuery grant_p; // GRANT or REVOKE; table- and column-level access control
ParserSetRoleQuery set_role_p; // SET ROLE
ParserExternalDDLQuery external_ddl_p; // EXTERNAL DDL FROM external_source(...) DROP|CREATE|RENAME
The most important second-level parser, ParserQueryWithOutput, in turn contains the following parsers:
ParserShowTablesQuery show_tables_p; // parses SHOW [TABLES | DATABASES | ...]
ParserSelectWithUnionQuery select_p; // entry point for SELECT query parsing; contains more parsers internally
ParserTablePropertiesQuery table_p; // (EXISTS | SHOW CREATE) [TABLE|DICTIONARY] [db.]name [FORMAT format]
ParserDescribeTableQuery describe_table_p; // (DESCRIBE | DESC) ([TABLE] [db.]name | tableFunction) [FORMAT format]
ParserShowProcesslistQuery show_processlist_p; // SHOW PROCESSLIST
ParserCreateQuery create_p; // CREATE|ATTACH TABLE ...
ParserAlterQuery alter_p; // ALTER TABLE [db.]name
ParserRenameQuery rename_p; // RENAME TABLE [db.]name TO [db.]name, [db.]name TO [db.]name
ParserDropQuery drop_p; // DROP|DETACH|TRUNCATE TABLE [IF EXISTS] [db.]name
ParserCheckQuery check_p; // CHECK [TABLE] [database.]table
ParserOptimizeQuery optimize_p; // OPTIMIZE TABLE [db.]name [PARTITION partition] [FINAL] [DEDUPLICATE]
ParserKillQueryQuery kill_query_p; // KILL QUERY WHERE ... [SYNC|ASYNC|TEST]
ParserWatchQuery watch_p; // WATCH [db.]table EVENTS, see https://clickhouse.tech/docs/en/sql-reference/statements/watch/
ParserShowAccessQuery show_access_p; // SHOW ACCESS
ParserShowAccessEntitiesQuery show_access_entities_p; // SHOW USERS; SHOW [CURRENT|ENABLED] ROLES; SHOW [SETTINGS] PROFILES etc.
ParserShowCreateAccessEntityQuery show_create_access_entity_p; // SHOW CREATE USER [name | CURRENT_USER]
ParserShowGrantsQuery show_grants_p; // SHOW GRANTS [FOR user_name]
ParserShowPrivilegesQuery show_privileges_p; // SHOW PRIVILEGES
ParserExplainQuery explain_p; // EXPLAIN AST|PLAN|SYNTAX|PIPELINE SELECT...
And so on. Each parser ultimately produces a node of the AST. These nodes share a common interface, IAST, whose inheritance hierarchy closely mirrors that of the parsers.
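The hierarchical dispatch described in this section can be sketched in a few lines. This is a simplified model, not ClickHouse code: MiniParser, KeywordParser, AnyOfParser and parseToNodeLabel are hypothetical names, the parsers here look at raw strings rather than tokens, and an AST node is reduced to a label string. Only the two method names, getName() and parseImpl(), are taken from the text.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the parser hierarchy; a node is just a label string.
struct MiniParser
{
    virtual ~MiniParser() = default;
    virtual std::string getName() const = 0;
    // Returns true and fills `node` when this parser accepts the query.
    virtual bool parseImpl(const std::string & query, std::string & node) const = 0;
};

// Leaf parser: accepts a query that starts with a keyword (case-insensitive).
class KeywordParser : public MiniParser
{
public:
    KeywordParser(std::string keyword_, std::string node_label_)
        : keyword(std::move(keyword_)), node_label(std::move(node_label_)) {}

    std::string getName() const override { return keyword + " query"; }

    bool parseImpl(const std::string & query, std::string & node) const override
    {
        if (query.size() < keyword.size())
            return false;
        for (std::size_t i = 0; i < keyword.size(); ++i)
            if (std::toupper(static_cast<unsigned char>(query[i])) != keyword[i])
                return false;
        node = node_label;
        return true;
    }

private:
    std::string keyword;     // expected in upper case
    std::string node_label;  // stands in for the AST node this parser would build
};

// Category parser: tries its children in order and stops at the first match.
class AnyOfParser : public MiniParser
{
public:
    AnyOfParser(std::string name_, std::vector<std::shared_ptr<MiniParser>> children_)
        : name(std::move(name_)), children(std::move(children_)) {}

    std::string getName() const override { return name; }

    bool parseImpl(const std::string & query, std::string & node) const override
    {
        for (const auto & child : children)
            if (child->parseImpl(query, node))
                return true;
        return false;
    }

private:
    std::string name;
    std::vector<std::shared_ptr<MiniParser>> children;
};

// Builds a two-level hierarchy and returns the node label, or "" if nothing matched.
inline std::string parseToNodeLabel(const std::string & query)
{
    using Children = std::vector<std::shared_ptr<MiniParser>>;
    auto with_output = std::make_shared<AnyOfParser>("QueryWithOutput", Children{
        std::make_shared<KeywordParser>("SELECT", "select node"),
        std::make_shared<KeywordParser>("DROP", "drop node")});
    AnyOfParser root("Query", Children{
        with_output,
        std::make_shared<KeywordParser>("INSERT", "insert node")});

    std::string node;
    root.parseImpl(query, node);
    return node;
}
```

Calling parseToNodeLabel("select 1") walks from the root into the QueryWithOutput category and returns "select node"; a statement no parser recognizes yields an empty label.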
Lexical and syntactic analysis
Two concepts are involved:
Token: a meaningful "word" made up of several characters. There are many token types; see the macro definition in src/Parsers/Lexer.h.
Lexer: the lexical analyzer. It takes a SQL statement as input and emits tokens one by one. The parser then attaches meaning to these tokens and organizes them, according to the grammar rules, into an AST.
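As a rough illustration of what a lexer does, the sketch below splits a SQL string into tokens. It is a toy, not the ClickHouse Lexer: the TokenType values, the Token struct and tokenize are hypothetical and cover only words, integers and a handful of punctuation characters.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily reduced set of token types for this sketch.
enum class TokenType { Word, Number, Comma, OpeningBracket, ClosingBracket, Operator, Unknown };

struct Token
{
    TokenType type;
    std::string text;
};

// Scans the input left to right and emits one token per "word".
inline std::vector<Token> tokenize(const std::string & sql)
{
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < sql.size())
    {
        const unsigned char c = static_cast<unsigned char>(sql[pos]);
        if (std::isspace(c))
        {
            ++pos;  // whitespace separates tokens but is not a token itself
            continue;
        }

        std::size_t end = pos + 1;
        TokenType type = TokenType::Unknown;
        if (std::isalpha(c) || c == '_')
        {
            // Keyword or identifier: letters, digits and underscores.
            while (end < sql.size()
                   && (std::isalnum(static_cast<unsigned char>(sql[end])) || sql[end] == '_'))
                ++end;
            type = TokenType::Word;
        }
        else if (std::isdigit(c))
        {
            // Integer literal.
            while (end < sql.size() && std::isdigit(static_cast<unsigned char>(sql[end])))
                ++end;
            type = TokenType::Number;
        }
        else if (c == ',')
            type = TokenType::Comma;
        else if (c == '(')
            type = TokenType::OpeningBracket;
        else if (c == ')')
            type = TokenType::ClosingBracket;
        else if (c == '+' || c == '-' || c == '*' || c == '/' || c == '=' || c == '<' || c == '>')
            type = TokenType::Operator;

        tokens.push_back(Token{type, sql.substr(pos, end - pos)});
        pos = end;
    }
    return tokens;
}
```

For the input SELECT age + 1 FROM t1 this produces six tokens: four words, one operator and one number. Deciding that SELECT is a keyword and age is a column is left to the parser.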
How functions are parsed into the AST
The parser entry point most relevant to functions is ParserExpressionList; the actual parsing ends up in the parseImpl of ParserLambdaExpression. The parser stage cannot check whether a function exists. It first builds an ASTIdentifier and then, together with the arguments, builds an ASTFunction. The existence check happens only later, when the pipeline is actually executed.
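The split between a purely syntactic parse and a later existence check can be sketched as follows. All names here (FunctionNode, parseFunction, functionRegistry, execute, executionFails) are hypothetical and much simpler than ASTFunction and the real function factory; the sketch only assumes that name lookup is deferred until execution, as the text describes.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function node: a name plus constant arguments.
struct FunctionNode
{
    std::string name;
    std::vector<long> arguments;
};

// "Parse" stage: purely syntactic, so any function name is accepted.
inline FunctionNode parseFunction(std::string name, std::vector<long> arguments)
{
    return FunctionNode{std::move(name), std::move(arguments)};
}

using FunctionImpl = std::function<long(const std::vector<long> &)>;

// Hypothetical registry of the functions known at execution time.
inline const std::map<std::string, FunctionImpl> & functionRegistry()
{
    static const std::map<std::string, FunctionImpl> registry = {
        {"plus", [](const std::vector<long> & args) {
             long total = 0;
             for (long a : args)
                 total += a;
             return total;
         }},
    };
    return registry;
}

// Execution stage: only now is the function name looked up.
inline long execute(const FunctionNode & node)
{
    const auto it = functionRegistry().find(node.name);
    if (it == functionRegistry().end())
        throw std::runtime_error("Unknown function " + node.name);
    return it->second(node.arguments);
}

// Helper for demonstration: reports whether execution rejects the node.
inline bool executionFails(const FunctionNode & node)
{
    try
    {
        execute(node);
        return false;
    }
    catch (const std::runtime_error &)
    {
        return true;
    }
}
```

With this split, parseFunction("no_such_function", {1}) succeeds and yields a node, while execute on that node throws. A syntactically valid query that names a missing function therefore fails late rather than in the parser.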
From Interpreter to pipeline execution
The Interpreter works like a service layer: it gathers the resources each operator needs and ties the whole query process together. It first interprets the AST object, then executes the "business logic" (branching, setting parameters, calling interfaces and so on), and finally returns an IBlock object, setting up a query execution pipeline in the form of threads.
The processing flow of a query is roughly as follows.
In ClickHouse, a transformer is the equivalent of an operator. All transformers are arranged into a pipeline, which is then handed to the pipelineExecutor for streaming execution. Each transformer processes one batch of data at a time and passes its output downstream, all the way to the sinker.
ClickHouse implements a series of basic transformer modules (see src/Processors/Transforms), for example:
- FilterTransform: WHERE filtering
- SortingTransform: ORDER BY sorting
- LimitByTransform: LIMIT truncation
- ExpressionTransform: expression evaluation
When we execute:
SELECT age + 1 FROM t1 WHERE id=1 ORDER BY time DESC LIMIT 10
ClickHouse's QueryPipeline arranges and assembles it as follows:
QueryPipeline::addSimpleTransform(Source)
QueryPipeline::addSimpleTransform(FilterTransform)
QueryPipeline::addSimpleTransform(SortingTransform)
QueryPipeline::addSimpleTransform(LimitByTransform)
QueryPipeline::addSimpleTransform(ExpressionTransform)
QueryPipeline::addSimpleTransform(Sinker)
While QueryPipeline arranges the transformers, it also has to build the lower-level DAG connections between them:
connect(Source.OutPort, FilterTransform.InPort)
connect(FilterTransform.OutPort, SortingTransform.InPort)
connect(SortingTransform.OutPort, LimitByTransform.InPort)
connect(LimitByTransform.OutPort, ExpressionTransform.InPort)
connect(ExpressionTransform.OutPort, Sinker.InPort)
This establishes the data flow: the OutPort of one transformer is connected to the InPort of the next. In addition, transformers whose work can run in parallel (filter and expression, for example) are split into multiple instances, which speeds up execution through parallelism.
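The chain of transformers for the example query can be modelled with a short sketch. It is deliberately simplified and is not the QueryPipeline API: Row, Chunk, Transform and the make... helpers are hypothetical, there are no ports and no parallel execution, and a single chunk flows through the stages in order. The stage order follows the arrangement listed in the article (filter, sort, limit, expression).

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical row of table t1 and a batch ("chunk") of such rows.
struct Row
{
    int id;
    int age;
    int time;
};
using Chunk = std::vector<Row>;

// In this sketch a transformer is simply a function from chunk to chunk.
using Transform = std::function<Chunk(Chunk)>;

// WHERE: keep only the rows that satisfy the predicate.
inline Transform makeFilter(std::function<bool(const Row &)> predicate)
{
    return [predicate](Chunk in) {
        Chunk out;
        for (const Row & row : in)
            if (predicate(row))
                out.push_back(row);
        return out;
    };
}

// ORDER BY time DESC.
inline Transform makeSortByTimeDesc()
{
    return [](Chunk in) {
        std::stable_sort(in.begin(), in.end(),
                         [](const Row & a, const Row & b) { return a.time > b.time; });
        return in;
    };
}

// LIMIT n: keep at most the first n rows.
inline Transform makeLimit(std::size_t n)
{
    return [n](Chunk in) {
        if (in.size() > n)
            in.resize(n);
        return in;
    };
}

// Expression: rewrite every row, here age -> age + 1.
inline Transform makeAgePlusOne()
{
    return [](Chunk in) {
        for (Row & row : in)
            row.age += 1;
        return in;
    };
}

// Pushes one chunk through all stages, the output of one feeding the next.
inline Chunk runPipeline(Chunk chunk, const std::vector<Transform> & stages)
{
    for (const Transform & stage : stages)
        chunk = stage(std::move(chunk));
    return chunk;
}

// SELECT age + 1 FROM t1 WHERE id=1 ORDER BY time DESC LIMIT 10
inline std::vector<int> selectAgePlusOne(const Chunk & t1)
{
    const std::vector<Transform> stages = {
        makeFilter([](const Row & row) { return row.id == 1; }),
        makeSortByTimeDesc(),
        makeLimit(10),
        makeAgePlusOne(),
    };
    std::vector<int> result;
    for (const Row & row : runPipeline(t1, stages))
        result.push_back(row.age);
    return result;
}
```

The vector of stages plays the role of the addSimpleTransform calls, and the loop in runPipeline stands in for the connected ports: each stage consumes exactly what the previous one produced.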


