Spark related FAQ summary
2022-07-24 20:30:00 【Slim of the Kobayashi family】
1. Spark SQL issues
Question 1: UnresolvedAddressException
While a Spark job is running, the following error is thrown:
Failed to bingdate001:33381, caused by: java.nio.channels.UnresolvedAddressException
Reason: The hostname is not configured in the hosts file, so it cannot be resolved.
Solution: Add the hostname mapping to the hosts file on the affected machine.
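Before editing the hosts file, it can help to confirm which names actually fail to resolve. The following is a minimal sketch, not part of the original post: the helper name unresolved_hosts is hypothetical, and it only checks name resolution from the machine where it runs.

```python
import socket


def unresolved_hosts(hostnames):
    """Return the hostnames that this machine cannot resolve to an address."""
    missing = []
    for name in hostnames:
        try:
            socket.gethostbyname(name)
        except OSError:
            # socket.gaierror is a subclass of OSError; any lookup
            # failure means the name is not resolvable from here.
            missing.append(name)
    return missing


# Usage: pass the hostnames of your cluster nodes (here only localhost).
print(unresolved_hosts(["localhost"]))
```

Run it on every node that reports the error, since each machine has its own hosts file.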
Question 2: IndexOutOfBoundsException
When Spark SQL operates on an ORC table, it throws:
java.lang.IndexOutOfBoundsException or java.lang.NullPointerException
Reason: The partition or table contains an empty ORC file. This bug was only fixed in Spark 2.3.0 and later.
Workaround: Change the ORC split strategy by setting hive.exec.orc.split.strategy=BI. ORC has three split strategies (ETL, BI, HYBRID). The default is HYBRID, a mixed mode that automatically chooses ETL or BI according to the size and number of files. BI mode generates splits according to the number of files.
Note: Spark 2.1.0 does not support permanent functions, because versions before Spark 2.2.0 cannot read JAR packages stored on HDFS.
Question 3: SocketTimeoutException: Read timed out
spark-sql and ThriftServer report the following error:
java.net.SocketTimeoutException: Read timed out
Reason: The Hive metastore is too busy or is paused by GC, so the connection times out.
Solution: For spark-sql, increase the hive.metastore.client.socket.timeout parameter. For ThriftServer, call DriverManager.setLoginTimeout(100) before obtaining a Connection.
Question 4: Spark SQL runs too slowly
When spark-sql executes, a very small file is split into 25 tasks, so the job runs too slowly.
Reason: When HadoopRDD generates its partitions, it takes the maximum of the parameter mapreduce.job.maps (or mapred.map.tasks, 25 in this case) and Spark's default number of partitions (2), so the result is 25.
Solution: Lower this parameter to bring the number of tasks down.
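The effect of this parameter can be illustrated with a small model. The sketch below is an approximation written for this article, not Spark or Hadoop code: the function names are hypothetical, it handles a single non-empty file, and it assumes the usual old-API FileInputFormat behaviour (goal size = file size / requested splits, capped by the block size, with a 10% slop on the last split).

```python
def min_splits(mapreduce_job_maps, default_min_partitions=2):
    # As described above: the larger of the two values is requested.
    return max(mapreduce_job_maps, default_min_partitions)


def estimate_splits(file_size, num_splits,
                    block_size=128 * 1024 * 1024, min_size=1):
    """Estimate how many splits one non-empty file of file_size bytes yields."""
    goal_size = file_size // max(num_splits, 1)
    split_size = max(min_size, min(goal_size, block_size))
    count = 0
    remaining = file_size
    # Keep cutting full splits while more than 1.1 splits' worth remains.
    while remaining / split_size > 1.1:
        remaining -= split_size
        count += 1
    if remaining > 0:
        count += 1  # the final, possibly slightly larger, split
    return count


# A 2500-byte file with mapreduce.job.maps=25 versus the Spark default of 2.
print(estimate_splits(2500, min_splits(25)))
print(estimate_splits(2500, min_splits(1)))
```

Under these assumptions, the requested split count drives the task count for small files far more than the file size does, which is why lowering the parameter helps.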
Question 5: StackOverflowError
If a SQL statement run in Spark SQL is too complex, the following exception occurs:
java.lang.StackOverflowError
Reason: The stack space the program needs at run time exceeds the stack size configured for the JVM.
Solution: Start spark-sql with the option --driver-java-options "-Xss10m".
Question 6: OutOfMemoryError
While Spark SQL is running, the Executor side throws:
java.lang.OutOfMemoryError: GC overhead limit exceeded
Reason: Most of the run time is spent in GC, which leads to the OOM.
Solution: Increase executor memory and change the GC strategy: spark.executor.extraJavaOptions -XX:+UseG1GC
2. Spark Core issues
Question 7: No space left on device
While using Spark, the following error is thrown:
java.io.IOException: No space left on device
Reason: Usually Spark's tmp directory is full.
Solution: Point spark.local.dir at a location with more space. The parameter accepts multiple directories separated by commas.
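To decide which directories to list, it may be useful to check how much free space each candidate has. This is a minimal sketch using only the Python standard library; the helper name free_space_gib is hypothetical, and the input is assumed to be a comma-separated string in the same form as spark.local.dir.

```python
import shutil
import tempfile


def free_space_gib(local_dirs):
    """Map each directory in a comma-separated list to its free space in GiB."""
    result = {}
    for path in local_dirs.split(","):
        path = path.strip()
        if path:
            result[path] = shutil.disk_usage(path).free / 1024 ** 3
    return result


# Usage: check the system temp directory, a common default location.
print(free_space_gib(tempfile.gettempdir()))
```

The directories must exist on every worker node, so the check is most informative when run on the workers rather than on the driver.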
Question 8: Java heap space
A common OOM:
java.lang.OutOfMemoryError: Java heap space
Reason: 1. There is too much data, and the requested Executor resources are not enough to handle it. 2. A single partition holds too much data, or there are so many partitions that the task and job information kept during execution grows too large and the Driver throws OutOfMemoryError.
Solution: 1. Avoid the collect operation where possible. 2. Check whether the data is skewed, increase shuffle parallelism, and increase Executor memory.
Question 9: ClassNotFoundException
When JAR package versions conflict:
java.lang.ClassNotFoundException: XXX
Reason: Usually a conflict between the user's JARs and Spark's JARs.
Solution: 1. Preferably use dependency versions that match the JARs Spark depends on. 2. If that is not possible, set the parameters spark.driver.userClassPathFirst and spark.executor.userClassPathFirst to true.
3. Streaming issues
Question 10: OffsetOutOfRangeException
When consuming from Kafka, reading messages fails with:
OffsetOutOfRangeException
Reason: The offsetRange being read lies outside the range of messages Kafka holds. If the offset is below the range, Kafka has already deleted those messages (log.retention.hours). Otherwise it is beyond the latest offset that exists in Kafka.
Solution: Correct the offset before reading: fetch the earliest and latest offsets (earliestOffset and latestOffset) and bring the offset back into that range.
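The correction step itself is a simple clamp. The sketch below assumes the stored, earliest, and latest offsets have already been fetched and are available as dictionaries keyed by partition number; the function name correct_offsets and this data layout are illustrative assumptions, and how the offsets are fetched depends on the Kafka client in use.

```python
def correct_offsets(stored, earliest, latest):
    """Clamp each partition's stored offset into [earliest, latest]."""
    corrected = {}
    for partition, offset in stored.items():
        low = earliest[partition]
        high = latest[partition]
        # Too small: the messages were already deleted, start at earliest.
        # Too large: the offset does not exist yet, start at latest.
        corrected[partition] = min(max(offset, low), high)
    return corrected


stored = {0: 5, 1: 150, 2: 900}
earliest = {0: 100, 1: 100, 2: 100}
latest = {0: 500, 1: 500, 2: 500}
print(correct_offsets(stored, earliest, latest))
```

Moving an offset up to the earliest available one means the deleted messages in between are skipped, so it is worth logging whenever a correction happens.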
Question 11: The first job consumes all existing Kafka messages
When consuming from Kafka, the first job reads all existing messages, so it takes too long to process or even fails.
Reason: auto.offset.reset is set to earliest, so consumption starts from the earliest offset, and the spark.streaming.kafka.maxRatePerPartition parameter is not set.
Solution: Start from the previously consumed position by setting offsetRange, set auto.offset.reset=latest, and set Spark's per-partition rate limit (spark.streaming.kafka.maxRatePerPartition).
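To choose a value for spark.streaming.kafka.maxRatePerPartition, it helps to work out the batch size it implies. The parameter limits the records read per second from each Kafka partition, so the cap per batch is the rate times the partition count times the batch interval. The sketch below shows that arithmetic; the function names and sample numbers are illustrative, and the backlog estimate assumes no new messages arrive while draining.

```python
import math


def max_records_per_batch(max_rate_per_partition, num_partitions,
                          batch_interval_seconds):
    """Upper bound on records one batch can read under the rate limit."""
    return max_rate_per_partition * num_partitions * batch_interval_seconds


def batches_to_drain(backlog_records, records_per_batch):
    """Batches needed to work through an existing backlog at the cap."""
    return math.ceil(backlog_records / records_per_batch)


cap = max_records_per_batch(1000, 3, 5)
print(cap)
print(batches_to_drain(1_000_000, cap))
```

With a cap in place, a large backlog is spread over many batches instead of landing in the first job.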