Spark: Counting Visits per Time Period in a Log (Beginner Level, Simple Implementation)
2022-07-24 14:32:00 【一个人的牛牛】
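Using one hour as the time period, count the number of visits in the log for each period and print the result to the console.
The code is below; you will need to find a log file yourself.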
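import java.text.SimpleDateFormat
import java.util.Date
import org.apache.spark.{SparkConf, SparkContext}

object RDD_Operator_Transform_groupBy_Test {
  def main(args: Array[String]): Unit = {
    // TODO Create the environment
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("RDD")
    val sc = new SparkContext(sparkConf)

    // TODO RDD operator: groupBy
    println("Number of visits in the log for each time period")
    val rdd = sc.textFile("datas/apache.log") // path to the log file
    val timeRDD = rdd.map(
      line => {
        val datas = line.split(" ") // split on spaces
        val time = datas(3) // index 3, i.e. the fourth field, holds the timestamp
        // Lowercase dd/yyyy and uppercase HH are required here: DD is day-of-year,
        // YYYY is week-year, and hh is the 12-hour clock, which would merge AM and PM hours.
        val sdf = new SimpleDateFormat("dd/MM/yyyy:HH:mm:ss")
        val date: Date = sdf.parse(time) // parse the timestamp
        val sdf1 = new SimpleDateFormat("HH") // the time period is the hour (00-23)
        val hour: String = sdf1.format(date) // format the parsed date as an hour
        (hour, 1)
      }
    ).groupBy(_._1)

    timeRDD.map {
      case (hour, iter) => // pattern matching
        (hour, iter.size) // the group size is the number of visits in that hour
    }.collect().foreach(println)

    // TODO Shut down the environment
    sc.stop()
  }
}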
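The hour-extraction step can also be tried on its own, without a Spark cluster. The following is a minimal sketch, not the author's code: the object name `HourlyCountSketch`, the helpers `hourOf` and `countByHour`, and the sample log lines are all hypothetical, and it assumes the fourth space-separated field looks like `17/05/2015:10:05:03`. It uses `java.time` instead of `SimpleDateFormat` and skips malformed lines rather than throwing.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

object HourlyCountSketch {
  // Assumed layout of the timestamp field, e.g. "17/05/2015:10:05:03"
  private val fmt = DateTimeFormatter.ofPattern("dd/MM/yyyy:HH:mm:ss")

  // Returns the two-digit hour (00-23) of a log line,
  // or None if the line is too short or the timestamp cannot be parsed.
  def hourOf(line: String): Option[String] = {
    val fields = line.split(" ")
    if (fields.length <= 3) None
    else Try(LocalDateTime.parse(fields(3), fmt)).toOption
      .map(t => f"${t.getHour}%02d")
  }

  // Counts lines per hour on a local collection (same idea as groupBy + size).
  def countByHour(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(l => hourOf(l))
      .groupBy(identity)
      .map { case (hour, hits) => (hour, hits.size) }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample lines, made up for illustration only
    val sample = Seq(
      "10.0.0.1 - - 17/05/2015:10:05:03 +0000 GET /index.html",
      "10.0.0.2 - - 17/05/2015:10:59:59 +0000 GET /a",
      "10.0.0.3 - - 17/05/2015:23:00:01 +0000 GET /b",
      "bad line"
    )
    countByHour(sample).toSeq.sorted.foreach(println)
  }
}
```

In the Spark job itself, the same helper could be applied with `rdd.flatMap(l => HourlyCountSketch.hourOf(l)).map((_, 1))` followed by `reduceByKey(_ + _)`, which sums the counts directly instead of first collecting every record of an hour into one group as `groupBy` does.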
