Final part of the web crawler series: sending targeted messages to 100,000 NetEase Cloud Music users
2022-06-26 21:32:00 【Romantic data analysis】
The goal of this article:
In the previous article we collected the IDs and home page addresses of the users who posted comments. This article shows how that data can be used for data analysis and marketing operations. In theory, once you have learned the method described here, you can send advertising messages on any web page. Because the method could be misused, this article is set as paid content. For comparison, a crawler course like this series costs about 1,200 yuan on NetEase Cloud Classroom; online courses are still hugely profitable.
The overall goal:
1. Use popular singers to collect song IDs.
2. Use the song IDs to collect the IDs of the users who commented.
3. Use the comment user IDs to send targeted push messages.
The previous two articles completed steps 1 and 2; this article completes step 3.
Conclusion: the difference between requests and selenium. requests fetches song IDs without rendering a page, so it is quite fast, but it is best suited to public pages that need no login. Once user login and authentication are required, requests on its own becomes impractical.
The advantage of selenium is that it fully imitates a person opening a web page in a browser, as if you had hired an assistant to do the work for you. It is very intuitive and is much less likely to be blocked. For sites that require user login (such as Weibo), selenium makes the troublesome verification step much easier to get through.
In the first part, we use MYSQL Store and crawl the user's home page information , This article will support error redoing , Each time a record is processed, a processing flag bit will be marked Y, Similar to our production system .
step 1: Query the user lD And home page tables
We need to check u