Why automatic IP allocation fails with a crawler proxy
2022-06-24 06:21:00 【User 6172015】
Recently, a friend ran into a problem while using a crawler proxy: requests sent through the proxy were not each assigned a different proxy IP automatically. Instead, all requests kept using the same proxy IP for a fixed 20 seconds and only then switched to a new one. What causes this? Part of the code the friend provided is shown below:
#! -*- encoding:utf-8 -*-
import random
import time

import requests

# Target page to visit
targetUrl = "http://httpbin.org/ip"
# Target HTTPS page to visit
# targetUrl = "https://httpbin.org/ip"

# Proxy server (product website: www.16yun.cn)
proxyHost = "t.16yun.cn"
proxyPort = "31111"

# Proxy authentication information
proxyUser = "username"
proxyPass = "password"

proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

# Use the HTTP proxy for both http and https requests
proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

# Set the IP-switching header
tunnel = random.randint(1, 10000)
headers = {
    'Connection': 'keep-alive',
    'Accept-Language': 'zh',
    "Proxy-Tunnel": str(tunnel),
}

for i in range(100):
    resp = requests.get(targetUrl, proxies=proxies, headers=headers)
    print(resp.status_code)
    print(resp.text)
    time.sleep(0.2)

After debugging and analysis, we found two main problems in the code above:
1. 'Connection': 'keep-alive' needs to be turned off
keep-alive is a convention between client and server for persistent connections. When it is enabled, the server does not close the TCP connection after returning a response, and the client does not close it after receiving the response either; the next HTTP request reuses the same connection. Because the TCP connection stays open, the crawler proxy's automatic IP switching does not take effect. One proxy IP keeps being used until its 20-second validity period expires; only then is the TCP connection closed and a new proxy IP assigned.
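The connection reuse described above can be observed locally. The following is a minimal sketch that does not contact the crawler proxy at all: it assumes a throwaway HTTP server on 127.0.0.1 that replies with the client's TCP source port, so identical ports mean the same TCP connection was reused. The names EchoPortHandler, ports_with_keep_alive and ports_with_close are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: replies with the client's TCP source port."""

    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default

    def do_GET(self):
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def ports_with_keep_alive(host, port, count):
    """Send `count` requests over one reused TCP connection."""
    conn = http.client.HTTPConnection(host, port)
    ports = []
    for _ in range(count):
        conn.request("GET", "/", headers={"Connection": "keep-alive"})
        ports.append(int(conn.getresponse().read().decode()))
    conn.close()
    return ports


def ports_with_close(host, port, count):
    """Open a fresh TCP connection for every request."""
    ports = []
    for _ in range(count):
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", "/", headers={"Connection": "close"})
        ports.append(int(conn.getresponse().read().decode()))
        conn.close()
    return ports


# Port 0 lets the operating system pick a free port for the local server.
server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

keep_alive_ports = ports_with_keep_alive(host, port, 3)
close_ports = ports_with_close(host, port, 3)

server.shutdown()
server.server_close()

print("keep-alive source ports:", keep_alive_ports)
print("close source ports:     ", close_ports)
```

With keep-alive, all three requests report the same source port, that is, a single TCP connection. With Connection: close, each request opens its own connection. According to the analysis above, the crawler proxy only gets a chance to switch IPs when a new connection is made, which is why the reused connection stays on one proxy IP.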
2. The tunnel parameter is set incorrectly
tunnel is the parameter that controls proxy IP switching. The crawler proxy checks the tunnel value of each request: requests with different values are each forwarded through a newly, randomly assigned proxy IP, while requests with the same value are forwarded through the same proxy IP. So, to have every HTTP request go through a different proxy IP, tunnel = random.randint(1,10000) should be executed inside the for loop, so that each HTTP request carries a different tunnel value.
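Putting both fixes together, the request loop could look roughly like the sketch below. It is not the vendor's reference code: the helper names build_proxies, build_headers and fetch_repeatedly are made up for illustration, and sending 'Connection': 'close' is an assumption about how to turn keep-alive off. The host, port, credentials and the Proxy-Tunnel header are taken from the code above.

```python
import random
import time

PROXY_HOST = "t.16yun.cn"
PROXY_PORT = "31111"
PROXY_USER = "username"  # placeholder credentials, as in the original code
PROXY_PASS = "password"


def build_proxies(user, password, host, port):
    """Route both http and https traffic through the same HTTP proxy."""
    proxy_url = "http://%s:%s@%s:%s" % (user, password, host, port)
    return {"http": proxy_url, "https": proxy_url}


def build_headers():
    """Create the headers for ONE request: no keep-alive, new tunnel value."""
    return {
        # Assumption: ask for the connection to be closed after each response.
        "Connection": "close",
        "Accept-Language": "zh",
        # Drawn per request, so consecutive requests normally differ.
        "Proxy-Tunnel": str(random.randint(1, 10000)),
    }


def fetch_repeatedly(target_url, count, delay=0.2):
    """Send `count` requests, building fresh headers for each one."""
    # Third-party library; imported here so the helpers above can be
    # used and tested without it being installed.
    import requests

    proxies = build_proxies(PROXY_USER, PROXY_PASS, PROXY_HOST, PROXY_PORT)
    for _ in range(count):
        resp = requests.get(target_url, proxies=proxies, headers=build_headers())
        print(resp.status_code)
        print(resp.text)
        time.sleep(delay)
```

Calling fetch_repeatedly("http://httpbin.org/ip", 100) would reproduce the original loop. Note that random.randint can occasionally return the same number twice in a row; if strictly distinct values matter, an incrementing counter is a simple alternative.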