Spark: counting the requests in each time period of a log (beginner level, simple implementation)
2022-07-24 14:32:00 【一个人的牛牛】
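Count the number of requests in a log for each one-hour period and print the result to the console.
The code is below; you will need to supply your own log file.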
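import java.text.SimpleDateFormat
import java.util.Date

import org.apache.spark.{SparkConf, SparkContext}

object RDD_Operator_Transform_groupBy_Test {
  def main(args: Array[String]): Unit = {
    // TODO: create the environment
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("RDD")
    val sc = new SparkContext(sparkConf)

    // TODO: RDD operator: groupBy
    println("Requests per one-hour period in the log")
    val rdd = sc.textFile("datas/apache.log") // path to the log file

    val timeRDD = rdd.map(
      line => {
        val datas = line.split(" ") // split the line on spaces
        val time = datas(3)         // index 3, i.e. the fourth field, holds the timestamp
        // dd = day of month, yyyy = calendar year, HH = hour of day (00-23).
        // The patterns DD (day of year), YYYY (week year) and hh (12-hour clock)
        // parse without error but give wrong hours, e.g. 15:00 would be counted as 03.
        val sdf = new SimpleDateFormat("dd/MM/yyyy:HH:mm:ss")
        val date: Date = sdf.parse(time)       // parse the timestamp
        val sdf1 = new SimpleDateFormat("HH")  // the time period is one hour
        val hour: String = sdf1.format(date)   // format the parsed date as its hour
        (hour, 1)
      }
    ).groupBy(_._1) // group the (hour, 1) pairs by hour

    timeRDD.map {
      case (hour, iter) => // pattern match on (key, grouped values)
        (hour, iter.size)  // the group size is the number of requests in that hour
    }.collect().foreach(println)

    // TODO: close the environment
    sc.stop()
  }
}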
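
The parse-and-count step can also be tried without a Spark cluster. The following is a minimal sketch in plain Scala, not part of the original post. It assumes that the fourth space-separated field of each line is a timestamp such as `17/05/2015:10:05:03`. The object name `HourlyCountSketch`, the helpers `hourOf` and `countByHour`, and the sample lines are all hypothetical. It uses `java.time.DateTimeFormatter` in place of `SimpleDateFormat`, because the formatter is immutable and can be created once and shared.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

object HourlyCountSketch {
  // Assumed layout of the timestamp field, e.g. 17/05/2015:10:05:03
  private val inputFormat = DateTimeFormatter.ofPattern("dd/MM/yyyy:HH:mm:ss")
  private val hourFormat = DateTimeFormatter.ofPattern("HH")

  // Returns the two-digit hour (00-23) of a log line, or None when the line
  // has fewer than four fields or the timestamp does not parse.
  def hourOf(line: String): Option[String] = {
    val fields = line.split(" ")
    if (fields.length < 4) None
    else Try(LocalDateTime.parse(fields(3), inputFormat).format(hourFormat)).toOption
  }

  // Counts lines per hour on a local collection; unparseable lines are skipped.
  def countByHour(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => hourOf(line).toList)
      .groupBy(identity)
      .map { case (hour, hits) => (hour, hits.size) }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample lines, made up for illustration only
    val sample = Seq(
      "83.149.9.216 - - 17/05/2015:10:05:03 GET /index.html",
      "83.149.9.216 - - 17/05/2015:10:47:12 GET /about.html",
      "24.236.252.67 - - 17/05/2015:11:02:40 GET /favicon.ico"
    )
    // Prints (10,2) then (11,1)
    countByHour(sample).toSeq.sorted.foreach(println)
  }
}
```

On an RDD, a function like `hourOf` could be applied inside `map` to produce `(hour, 1)` pairs and then be followed by `reduceByKey(_ + _)`, which sums the counts without building a full group for each hour as `groupBy` does.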
