Spark Learning: build SQL to meet the specified optimization rules
2022-07-24 09:37:00 【I love evening primrose a】
Compiling the Spark source code on Win10
Adding custom commands to Spark SQL
Building SQL that triggers the specified optimization rules

I. Connect to a local Oracle database
import org.apache.spark.sql.DataFrame
// Connect to the EMP table under the scott user for testing
val dfOracle: DataFrame = spark.read.format("jdbc").option("driver", "oracle.jdbc.driver.OracleDriver").option("url", "jdbc:oracle:thin:@192.168.0.101:1521:orcl").option("user", "scott").option("password", "scott").option("numPartitions", 20).option("dbtable", "EMP").load()
// Register the DataFrame as a temporary view named t
dfOracle.createTempView("t")
// Log which optimizer rules are applied to the plan
spark.sql("SET spark.sql.planChangeLog.level=WARN")
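The SET command above logs every rule that changes a plan. As a sketch, assuming Spark 3.1 or later (where the keys are named spark.sql.planChangeLog.*), the log can be narrowed to a few rules of interest. The key spark.sql.planChangeLog.rules and the fully qualified rule names are assumptions based on Spark's internal configuration, so check them against your version. A rule shows up in the log only when it actually changes the plan.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell `spark` already exists; getOrCreate() then returns that session.
val spark = SparkSession.builder().master("local[1]").appName("plan-change-log").getOrCreate()

// Package that holds the logical optimizer rules (assumed unchanged in your version).
val pkg = "org.apache.spark.sql.catalyst.optimizer"
val rulesToLog = Seq("CombineFilters", "CollapseProject", "BooleanSimplification")
  .map(rule => s"$pkg.$rule")

// Same effect as the SET command, plus a filter on the rule names.
spark.conf.set("spark.sql.planChangeLog.level", "WARN")
spark.conf.set("spark.sql.planChangeLog.rules", rulesToLog.mkString(","))
```

Unsetting spark.sql.planChangeLog.rules with spark.conf.unset returns to logging all rules.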
II. Three rules
1. Rule understanding
- CombineFilters: merges two adjacent Filter operators into one by AND-ing their conditions. It goes hand in hand with predicate pushdown: filters on the same table end up together, e.g. sal > 1000 and sal < 2000 become 1000 < sal < 2000.
- CollapseProject: merges two adjacent Project operators into one. As I understand it, the effect is similar to column pruning: columns that are not needed are not projected.
- BooleanSimplification: simplifies Boolean expressions. A condition that is always true can be dropped, such as 1 = 1 in the SQL.
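To make the three rewrites concrete, here is a toy model in plain Scala. It is not Spark's Catalyst code: the types Pred, IntEq, BoolLit, And, Scan, Filter, Project and the object ToyOptimizer are hypothetical names invented for this sketch. It starts from a plan in which the two filters and the two projections are already adjacent, and it also folds 1 = 1 to true first, since that is the step that lets the Boolean simplification drop it.

```scala
// Toy expression and plan trees (hypothetical, far simpler than Catalyst).
sealed trait Expr
case class Pred(sql: String) extends Expr        // opaque predicate, e.g. "sal > 1000"
case class IntEq(left: Int, right: Int) extends Expr // comparison of two literals
case class BoolLit(value: Boolean) extends Expr
case class And(left: Expr, right: Expr) extends Expr

sealed trait Plan
case class Scan(table: String) extends Plan
case class Filter(cond: Expr, child: Plan) extends Plan
case class Project(cols: Seq[String], child: Plan) extends Plan

object ToyOptimizer {
  // Constant folding (toy): evaluate a comparison of two literals up front.
  def fold(e: Expr): Expr = e match {
    case IntEq(l, r) => BoolLit(l == r)
    case And(l, r)   => And(fold(l), fold(r))
    case other       => other
  }

  // Boolean simplification (toy): only `true AND x` and `x AND true`.
  def simplify(e: Expr): Expr = e match {
    case And(l, r) =>
      (simplify(l), simplify(r)) match {
        case (BoolLit(true), x) => x
        case (x, BoolLit(true)) => x
        case (x, y)             => And(x, y)
      }
    case other => other
  }

  // One bottom-up pass over the plan.
  def optimize(p: Plan): Plan = p match {
    case Filter(cond, child) =>
      optimize(child) match {
        // Combine filters: Filter(a, Filter(b, x)) becomes Filter(b AND a, x).
        case Filter(inner, grandChild) => Filter(simplify(fold(And(inner, cond))), grandChild)
        case other                     => Filter(simplify(fold(cond)), other)
      }
    case Project(cols, child) =>
      optimize(child) match {
        // Collapse projects: valid here because columns are plain names, no aliases.
        case Project(_, grandChild) => Project(cols, grandChild)
        case other                  => Project(cols, other)
      }
    case s: Scan => s
  }
}

// The query from this section, with filters and projections already adjacent.
val plan: Plan =
  Project(Seq("ename"),
    Project(Seq("ename", "sal"),
      Filter(Pred("sal < 2000"),
        Filter(And(IntEq(1, 1), Pred("sal > 1000")), Scan("t")))))

val optimized = ToyOptimizer.optimize(plan)
assert(optimized ==
  Project(Seq("ename"), Filter(And(Pred("sal > 1000"), Pred("sal < 2000")), Scan("t"))))
```

In spark-shell or the Scala REPL, paste the block with :paste so the sealed traits and their case classes are compiled together.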
2. Code implementation
val query: String = "select a.ename from (select ename, sal from t where 1 = 1 and sal > 1000) a where a.sal < 2000"
spark.sql(query).explain(true)
3. Log information

III. Five rules
1. Rule understanding
- ConstantFolding: constant folding. Expressions made up only of constants are evaluated in advance and replaced by their value, such as 1 = 1 in the example.
- PushDownPredicates: predicate pushdown. Filter conditions are pushed down toward the data source to reduce the amount of data read, such as sal > 1000 in the example.
- ReplaceDistinctWithAggregate: rewrites DISTINCT de-duplication as a GROUP BY aggregation, such as distinct in the example.
- ReplaceExceptWithAntiJoin: rewrites an EXCEPT set difference as a Left-Anti Join, such as except in the example.
- FoldablePropagation: foldable propagation. I do not fully understand this one yet …
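ReplaceDistinctWithAggregate and ReplaceExceptWithAntiJoin can be read as SQL-to-SQL rewrites. The sketch below checks, on a few made-up rows, that each original query and a hand-written rewrite return the same result. The view names emp_a and emp_b and the sample rows are hypothetical, and the rewritten SQL is my reading of what the rules do, not output copied from Spark.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell `spark` already exists; getOrCreate() then returns that session.
val spark = SparkSession.builder().master("local[1]").appName("rewrite-sketch").getOrCreate()

// Two tiny views built from inline VALUES (illustrative rows only).
spark.sql(
  """create or replace temporary view emp_a as
    |select * from values ('CLERK', 800), ('CLERK', 1100), ('MANAGER', 2975), ('ANALYST', 3000)
    |as v(job, sal)""".stripMargin)
spark.sql(
  """create or replace temporary view emp_b as
    |select * from values ('ANALYST', 3000) as v(job, sal)""".stripMargin)

// Run a single-column query and return its values sorted, duplicates kept.
def jobs(sql: String): List[String] =
  spark.sql(sql).collect().map(_.getString(0)).sorted.toList

// DISTINCT versus the equivalent GROUP BY.
val viaDistinct = jobs("select distinct job from emp_a")
val viaGroupBy  = jobs("select job from emp_a group by job")
assert(viaDistinct == viaGroupBy)

// EXCEPT versus DISTINCT over a left anti join with null-safe equality (<=>).
val viaExcept = jobs("select job from emp_a except select job from emp_b")
val viaAntiJoin = jobs(
  "select distinct a.job from emp_a a left anti join emp_b b on a.job <=> b.job")
assert(viaExcept == viaAntiJoin)
```

The null-safe operator <=> is used in the join condition because EXCEPT treats two NULL values as equal, while a plain = comparison would not.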
2. Code implementation
// In the Oracle database, create a new table (run this statement in Oracle, not in spark-shell)
create table emp1 as select * from emp where deptno = 20;
// Create a DataFrame for the new EMP1 table
val dfOracle1: DataFrame = spark.read.format("jdbc").option("driver", "oracle.jdbc.driver.OracleDriver").option("url", "jdbc:oracle:thin:@192.168.0.101:1521:orcl").option("user", "scott").option("password", "scott").option("numPartitions", 20).option("dbtable", "EMP1").load()
// Register it as a temporary view named t1
dfOracle1.createTempView("t1")
val query:String = "select distinct a.job from (select * from t where sal > 1000 and 1 = 1) a where a.deptno = 10 except select job from t1 where sal > 2000"
spark.sql(query).explain(true)
3. Log information

