Crawling multiple pages of KFC restaurant locations with Requests
2022-07-24 07:30:00 【Can't fail I】
Crawling paginated data: KFC restaurant location information
1 Analysis
After a location is entered and the search is run, the address shown in the browser's address bar is the same as before.
This means that pressing the query button sends an Ajax request instead of loading a new page.
- The location information refreshed on the current page must therefore come from the data returned by that Ajax request


Use the browser's packet capture tool to locate the packet for the Ajax request. From this packet, obtain:
- The request URL
- The request method
- The parameters the request carries
- The response data
The capture was first filtered with All, but since the analysis shows that the request is an Ajax request, switch to the Fetch/XHR filter, which shows only Ajax requests.
Open the developer tools with F12, choose Fetch/XHR, and click the query button to see the result.
The request method turns out to be POST.
The response is in JSON format.
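
The items read from the captured packet can be written down and inspected offline before any request is sent. The following is a minimal sketch using only the standard library. The parameter values mirror the single-page example, the keyword is assumed to be the Chinese city name 北京 (Beijing), and the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# URL and method as read from the captured Fetch/XHR packet.
url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'
method = 'POST'

# Form parameters carried by the request (keyword value is an assumption).
form = {
    'cname': '',
    'pid': '',
    'keyword': '北京',
    'pageIndex': '1',
    'pageSize': '10',
}

parts = urlsplit(url)
query = parse_qs(parts.query)  # parameters that are part of the URL itself
body = urlencode(form)         # the form as it is encoded in the request body

print(method, parts.path)
print(query)
print(body)
```

Comparing the printed body with the form data shown in the developer tools is a quick way to confirm that every parameter has been copied correctly.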

2 Crawling one page of data
import requests

# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}
url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'
# Form parameters copied from the captured request
data = {
    'cname': '',
    'pid': '',
    'keyword': '北京',  # city to search for (Beijing)
    'pageIndex': '1',
    'pageSize': '10',
}
# The data argument carries the form parameters of the POST request
response = requests.post(url=url, headers=headers, data=data)
# Parse the JSON response into a dict
page_text = response.json()
for dic in page_text['Table1']:
    title = dic['storeName']
    addr = dic['addressDetail']
    print(title, addr)
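
The parsing step can be separated from the network call so that it can be checked without contacting the site. A minimal sketch follows: extract_stores is a hypothetical helper, and the sample payload is made up to mirror the Table1, storeName and addressDetail keys used above. It is not real response data.

```python
def extract_stores(payload):
    """Return (storeName, addressDetail) pairs from a parsed response dict."""
    stores = []
    # .get() with a default keeps the loop safe if 'Table1' is missing.
    for row in payload.get('Table1', []):
        stores.append((row.get('storeName', ''), row.get('addressDetail', '')))
    return stores


# Made-up payload with the same shape as the parsed JSON (not real data).
sample = {
    'Table1': [
        {'storeName': 'Store A', 'addressDetail': 'Address A'},
        {'storeName': 'Store B', 'addressDetail': 'Address B'},
    ]
}

for title, addr in extract_stores(sample):
    print(title, addr)
```

With the live request, extract_stores(response.json()) would take the place of the inline loop.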

3 Crawling multiple pages of data
Clicking the second page changes the pageIndex parameter of the request to 2, and clicking the third page changes it to 3.
So all pages can be crawled with a loop.

The only thing that changes in each iteration is the value of pageIndex. The captured request sends it as a string, so the loop converts the page number with str() to match.
import requests

# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}
url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'
# Request pages 1 through 8
for page in range(1, 9):
    data = {
        'cname': '',
        'pid': '',
        'keyword': '北京',  # city to search for (Beijing)
        'pageIndex': str(page),  # only this value changes between pages
        'pageSize': '10',
    }
    # The data argument carries the form parameters of the POST request
    response = requests.post(url=url, headers=headers, data=data)
    page_text = response.json()
    for dic in page_text['Table1']:
        title = dic['storeName']
        addr = dic['addressDetail']
        print(f'Page {page}:', title, addr)
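
The range(1, 9) bound is fixed by hand. One alternative, sketched here under the assumption that a page past the last one comes back with an empty Table1 list (not verified against the site), is to keep requesting until no rows are returned. crawl_all_pages, fetch_page and max_pages are hypothetical names. fetch_page stands in for the requests.post call above, and a fake version with made-up rows is used so the sketch runs offline.

```python
def crawl_all_pages(fetch_page, max_pages=100):
    """Collect rows page by page until a page returns no rows.

    fetch_page(page) must return the parsed JSON dict for that page.
    max_pages is a safety cap so the loop always terminates.
    """
    rows = []
    for page in range(1, max_pages + 1):
        table = fetch_page(page).get('Table1', [])
        if not table:
            break  # assumed end-of-data signal: an empty page
        rows.extend((page, r['storeName'], r['addressDetail']) for r in table)
    return rows


def fake_fetch(page):
    """Stand-in for the real POST request: two pages of made-up rows, then empty."""
    data = {
        1: [{'storeName': 'Store A', 'addressDetail': 'Address A'}],
        2: [{'storeName': 'Store B', 'addressDetail': 'Address B'}],
    }
    return {'Table1': data.get(page, [])}


for page, title, addr in crawl_all_pages(fake_fetch):
    print(f'Page {page}:', title, addr)
```

To crawl the real site, fetch_page would build the data dict with 'pageIndex': str(page) and return requests.post(url=url, headers=headers, data=data).json().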
