Spark Learning: Constructing SQL That Triggers Specified Optimization Rules
2022-07-24 09:37:00 【I love evening primrose a】
Compiling the Spark source code on Win10
Adding custom commands to Spark SQL
Constructing SQL that triggers specified optimization rules

I. Connect to a local Oracle database
import org.apache.spark.sql.DataFrame
val dfOracle: DataFrame = spark.read.format("jdbc").option("driver", "oracle.jdbc.driver.OracleDriver").option("url", "jdbc:oracle:thin:@192.168.0.101:1521:orcl").option("user", "scott").option("password", "scott").option("numPartitions", 20).option("dbtable", "EMP").load()
// Use the EMP table of the scott user for testing
// Register the DataFrame as a temporary view
dfOracle.createTempView("t")
// Log each rule that is applied, together with the plan change it makes
spark.sql("SET spark.sql.planChangeLog.level=WARN")
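With the level set to WARN, every rule that changes the plan is logged, which can be noisy. The log can be narrowed to particular rules. A minimal sketch, assuming Spark 3.1 or later and assuming that the internal setting `spark.sql.planChangeLog.rules` accepts a comma-separated list of fully qualified rule class names (worth checking against your Spark version); `rulesToLog` is an illustrative name:

```scala
// Assumption: Spark 3.1+, run in the same spark-shell session as above.
// Assumption: rule names must be fully qualified class names.
val rulesToLog = Seq(
  "org.apache.spark.sql.catalyst.optimizer.CombineFilters",
  "org.apache.spark.sql.catalyst.optimizer.CollapseProject",
  "org.apache.spark.sql.catalyst.optimizer.BooleanSimplification"
)
spark.conf.set("spark.sql.planChangeLog.rules", rulesToLog.mkString(","))
```

Rules left out of the list may still fire; they are simply not written to the log.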
II. Three rules
1. Understanding the rules
- CombineFilters: merges two adjacent Filter operators into one by combining their conditions with AND. It goes hand in hand with predicate pushdown: filters on the same table end up merged, so sal > 1000 and sal < 2000 become 1000 < sal < 2000.
- CollapseProject: merges two adjacent Project operators into one. As I understand it, the effect is similar to column pruning: columns that are not needed are not projected.
- BooleanSimplification: simplifies Boolean expressions. A condition that is always true can be dropped, such as 1 = 1 in the SQL below.
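The effect of CombineFilters and CollapseProject can be illustrated on a toy plan tree. The sketch below is plain Scala and does not use Spark's Catalyst classes: `Plan`, `Table`, `Filter`, `Project`, `ToyRules`, and `ToyRulesDemo` are hypothetical names, and conditions are kept as strings for brevity.

```scala
// Toy model only; these are NOT Spark's Catalyst classes.
sealed trait Plan
case class Table(name: String) extends Plan
case class Filter(condition: String, child: Plan) extends Plan
case class Project(columns: Seq[String], child: Plan) extends Plan

object ToyRules {
  // CombineFilters: Filter(outer, Filter(inner, child)) => Filter(inner AND outer, child)
  def combineFilters(plan: Plan): Plan = plan match {
    case Filter(outer, Filter(inner, child)) =>
      combineFilters(Filter(s"($inner) AND ($outer)", child))
    case Filter(cond, child)  => Filter(cond, combineFilters(child))
    case Project(cols, child) => Project(cols, combineFilters(child))
    case t: Table             => t
  }

  // CollapseProject: Project(outer, Project(inner, child)) => Project(outer, child)
  // Valid in this toy because projections only select columns (no aliases or expressions).
  def collapseProject(plan: Plan): Plan = plan match {
    case Project(outer, Project(_, child)) => collapseProject(Project(outer, child))
    case Project(cols, child)              => Project(cols, collapseProject(child))
    case Filter(cond, child)               => Filter(cond, collapseProject(child))
    case t: Table                          => t
  }
}

object ToyRulesDemo {
  def main(args: Array[String]): Unit = {
    val plan = Project(Seq("ename"),
      Project(Seq("ename", "sal"),
        Filter("sal < 2000", Filter("sal > 1000", Table("t")))))
    // Apply the two toy rules one after the other and print the resulting tree.
    println(ToyRules.combineFilters(ToyRules.collapseProject(plan)))
  }
}
```

In a real query the two Filters are usually not adjacent at first; a filter has to be pushed through a Project before CombineFilters can merge it with the one below.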
2. Code
val query: String = "select a.ename from (select ename, sal from t where 1 = 1 and sal > 1000) a where a.sal < 2000"
spark.sql(query).explain(true)
3. Log output

III. Five rules
1. Understanding the rules
- ConstantFolding: constant folding. Expressions made up only of constants are evaluated ahead of time and replaced by their result, such as 1 = 1 in the example.
- PushDownPredicates: predicate pushdown. Filter conditions are pushed toward the data source to reduce the amount of data read, such as sal > 1000 in the example.
- ReplaceDistinctWithAggregate: rewrites DISTINCT deduplication as a GROUP BY aggregation, such as distinct in the example.
- ReplaceExceptWithAntiJoin: rewrites an EXCEPT set difference as a left anti join, such as except in the example.
- FoldablePropagation: foldable propagation. I am not clear on this one yet …
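How ConstantFolding and BooleanSimplification cooperate on `1 = 1 and sal > 1000` can be shown with a toy expression tree. This is a plain Scala sketch, not Spark's implementation: `Expr`, `Lit`, `Bool`, `Col`, `Eq`, `Gt`, `And`, and `ToyExprRules` are hypothetical names, and only the few cases needed for the example are handled.

```scala
// Toy model only; these are NOT Spark's Catalyst expressions.
sealed trait Expr
case class Lit(value: Int) extends Expr
case class Bool(value: Boolean) extends Expr
case class Col(name: String) extends Expr
case class Eq(left: Expr, right: Expr) extends Expr
case class Gt(left: Expr, right: Expr) extends Expr
case class And(left: Expr, right: Expr) extends Expr

object ToyExprRules {
  // Constant folding: evaluate comparisons whose operands are both literals.
  def constantFolding(e: Expr): Expr = e match {
    case Eq(Lit(a), Lit(b)) => Bool(a == b)
    case Gt(Lit(a), Lit(b)) => Bool(a > b)
    case And(l, r)          => And(constantFolding(l), constantFolding(r))
    case other              => other
  }

  // Boolean simplification: true AND x => x, false AND x => false.
  def booleanSimplification(e: Expr): Expr = e match {
    case And(l, r) =>
      (booleanSimplification(l), booleanSimplification(r)) match {
        case (Bool(true), x)                     => x
        case (x, Bool(true))                     => x
        case (Bool(false), _) | (_, Bool(false)) => Bool(false)
        case (x, y)                              => And(x, y)
      }
    case other => other
  }
}

object ToyExprDemo {
  def main(args: Array[String]): Unit = {
    // 1 = 1 and sal > 1000
    val cond = And(Eq(Lit(1), Lit(1)), Gt(Col("sal"), Lit(1000)))
    val folded = ToyExprRules.constantFolding(cond)
    println(folded)
    println(ToyExprRules.booleanSimplification(folded))
  }
}
```

The order matters in this toy: the simplification step can only drop `1 = 1` after folding has turned it into a literal true.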
2. Code
// In Oracle (not in spark-shell): create a new table
create table emp1 as select * from emp where deptno = 20;
// Create a DataFrame for the new EMP1 table
val dfOracle1: DataFrame = spark.read.format("jdbc").option("driver", "oracle.jdbc.driver.OracleDriver").option("url", "jdbc:oracle:thin:@192.168.0.101:1521:orcl").option("user", "scott").option("password", "scott").option("numPartitions", 20).option("dbtable", "EMP1").load()
// Register the DataFrame as a temporary view
dfOracle1.createTempView("t1")
val query: String = "select distinct a.job from (select * from t where sal > 1000 and 1 = 1) a where a.deptno = 10 except select job from t1 where sal > 2000"
spark.sql(query).explain(true)
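Besides `explain(true)`, the plan before and after optimization can be read from the `queryExecution` of the resulting DataFrame. A brief sketch for the same spark-shell session, reusing `spark` and `query` from above; `qe` is an illustrative name:

```scala
// Assumes the spark-shell session above, with `query` already defined.
val qe = spark.sql(query).queryExecution
println(qe.analyzed)       // logical plan after analysis, before the optimizer runs
println(qe.optimizedPlan)  // logical plan after the optimizer rules have been applied
```

Comparing the two printed plans is a quick way to see the combined effect of the rules without reading the full plan change log.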
3. Log output

