Table API & SQL example module for Flink
2022-06-21 19:38:00 【Dragon man who can't fly】
Official introduction
APIs in Flink
Flink offers different levels of abstraction for developing streaming/batch applications.

- The lowest level of abstraction in the Flink API is stateful real-time stream processing. This abstraction is implemented by the Process Function, which the Flink framework integrates into the DataStream API for us to use. It allows users to freely process events (data) from one or more streams in an application and provides globally consistent, fault-tolerant state. In addition, users can register event-time and processing-time callbacks at this level of abstraction, which allows programs to perform complex computations.
- The second level of abstraction in the Flink API is the Core APIs. In fact, many applications do not need the lowest-level abstraction described above and can instead program against the Core APIs, which consist of two parts: the DataStream API (for bounded/unbounded stream scenarios) and the DataSet API (for bounded data set scenarios). These fluent APIs provide common building blocks for data processing, such as various forms of user-defined transformations, joins, aggregations, windows, and state operations. The data types processed at this layer are represented as classes in each programming language.
The integration of the low-level Process Function abstraction with the DataStream API lets users drop down to the lower-level abstraction when they need it. The DataSet API additionally offers some extra primitives on bounded data sets, such as loop/iteration operations.
- The third level of abstraction in the Flink API is the Table API. The Table API is a declarative DSL centered on tables; in streaming scenarios, for example, a table can represent dynamically changing data. The Table API follows the (extended) relational model: a table has a schema (similar to a schema in a relational database), and the API offers comparable operations, such as select, project, join, group-by, aggregate, and so on. A Table API program declaratively defines which logical operations should be performed rather than specifying exactly what code the program should execute. Although the Table API is easy to use and can be extended with various kinds of user-defined functions, it is less expressive than the Core APIs. In addition, before execution, Table API programs are optimized by applying the optimizer's optimization rules to the expressions the user has written.
Tables and DataStream/DataSet can be converted back and forth seamlessly, and Flink allows applications to freely mix the Table API with the DataStream/DataSet APIs.
- The highest level of abstraction in the Flink API is SQL. This level is similar to the Table API both semantically and in expressiveness, but represents programs as SQL query expressions. The SQL abstraction interacts closely with the Table API, and SQL queries can be executed over tables defined with the Table API.
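To make the lowest abstraction level concrete, here is a minimal sketch of a Process Function, assuming Flink 1.11 is on the classpath; the stream contents, the one-minute timer, and the field names are hypothetical, chosen only to show keyed state plus a time callback:

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Sketch only: counts events per key in fault-tolerant keyed state
// and registers an event-time callback one minute after each event.
public class CountWithTimeout
        extends KeyedProcessFunction<String, String, Tuple2<String, Long>> {

    private transient ValueState<Long> countState;

    @Override
    public void open(Configuration parameters) {
        countState = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(String value, Context ctx,
                               Collector<Tuple2<String, Long>> out) throws Exception {
        Long count = countState.value();
        countState.update(count == null ? 1L : count + 1);
        // Register an event-time timer one minute after this event's timestamp.
        ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 60_000L);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx,
                        Collector<Tuple2<String, Long>> out) throws Exception {
        // Emit the current count for this key when the timer fires.
        out.collect(Tuple2.of(ctx.getCurrentKey(), countState.value()));
    }
}
```

The point of this level is visible in the code: the user touches state and timers directly, with no relational or fluent operators in between.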
Table API and SQL
Apache Flink provides two relational APIs for unified stream and batch processing: the Table API and SQL.
- The Table API is a language-integrated query API for Scala and Java that makes it possible to compose queries from relational operators such as selection, filter, and join in a very intuitive way. Flink SQL is standard SQL implemented on top of Apache Calcite. Queries in both APIs have the same semantics for batch (DataSet) and stream (DataStream) inputs and produce the same results.
- The Table API and SQL are tightly integrated with each other and with the DataStream and DataSet APIs. You can easily switch between these APIs and the libraries built on top of them. For example, you can use CEP to do pattern matching on a DataStream and then use the Table API to analyze the matches; or you can use SQL to scan, filter, and aggregate a batch table and then run a Gelly graph algorithm on the preprocessed data.
Note: the Table API and SQL are still under active development and not all features are fully implemented yet; not all combinations of [Table API, SQL] and [stream, batch] are supported.
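The two relational APIs can be seen side by side in a minimal sketch, assuming Flink 1.11 with the Blink planner on the classpath; the `Orders` table and its `user`/`amount` fields are hypothetical and would have to be registered beforehand:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import static org.apache.flink.table.api.Expressions.$;

// Sketch only: the same aggregation expressed once per relational API.
public class RelationalApiDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance()
                        .useBlinkPlanner()
                        .inStreamingMode()
                        .build());

        // Table API: fluent, language-integrated operators.
        Table byUser = tEnv.from("Orders")
                .groupBy($("user"))
                .select($("user"), $("amount").sum().as("total"));

        // SQL: the same query as a standard SQL string, parsed by Apache Calcite.
        // `user` is backquoted because it is a reserved word in Calcite.
        Table byUserSql = tEnv.sqlQuery(
                "SELECT `user`, SUM(amount) AS total FROM Orders GROUP BY `user`");
    }
}
```

Both `Table` objects describe the same logical query, which is what the documentation means by the two APIs having identical semantics.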
Official documents
https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/dev/table/
Table API & SQL development
Starting with this article, Table API & SQL demo content is added: on top of the existing project, a new tableapi module is created. This module will demonstrate basic usage of the Table API and SQL with the following components:
- elasticsearch
- kafka
- jdbc (mysql)
Adding the tableapi module
In the current project, create a Maven module named tableapi.
pom.xml
<artifactId>tableapi</artifactId>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-jdbc -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-jdbc_2.12</artifactId>
<version>1.11.1</version>
<!--<scope>provided</scope>-->
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-common</artifactId>
<version>1.11.1</version>
<!--<scope>provided</scope>-->
</dependency>
<!-- flink-connector-kafka -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka_2.11</artifactId>
<version>1.11.1</version>
</dependency>
<!-- MySQL driver -->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.47</version>
</dependency>
<!-- elasticsearch6 dependency -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch6_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<!--<dependency>-->
<!--<groupId>org.apache.flink</groupId>-->
<!--<artifactId>flink-sql-connector-elasticsearch6_2.11</artifactId>-->
<!--<version>${flink.version}</version>-->
<!--</dependency>-->
</dependencies>
Refresh the project's Maven configuration to download the required dependency packages.
Project modules

Subsequent Table API & SQL demo examples will be developed on top of this tableapi module.
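As a preview of what the module will demonstrate, here is a minimal sketch registering a Kafka-backed table with SQL DDL, assuming Flink 1.11 with the dependencies declared above; the topic name, broker address, and schema are hypothetical:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

// Sketch only: registers a Kafka source table via Flink 1.11 DDL
// (the 'connector' = 'kafka' property style introduced in 1.11).
public class KafkaTableDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance()
                        .useBlinkPlanner()
                        .inStreamingMode()
                        .build());

        // A hypothetical user-activity topic exposed as a dynamic table.
        tEnv.executeSql(
                "CREATE TABLE user_log (" +
                "  user_id STRING," +
                "  item_id STRING," +
                "  ts TIMESTAMP(3)" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'user_log'," +
                "  'properties.bootstrap.servers' = 'localhost:9092'," +
                "  'format' = 'json'" +
                ")");

        // Standard SQL can now query the stream as if it were a table.
        tEnv.sqlQuery("SELECT user_id, COUNT(*) AS cnt FROM user_log GROUP BY user_id");
    }
}
```

The jdbc and elasticsearch connectors follow the same pattern, differing only in the `WITH` options of the DDL statement.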
Source download