Lesson 3 urllib
2022-06-25 20:43:00 【Osmanthus rice wine balls】
1. Wrapping a web page's source code in an object
import urllib.request
# Send a GET request
response = urllib.request.urlopen("http://www.baidu.com")  # the result is wrapped in a response object
print(response.read().decode('utf-8'))  # decode('utf-8') decodes the raw bytes so that Chinese characters display correctly, then the page source is printed
# Send a POST request (used, for example, to simulate a login with a user name and password)
# httpbin.org is used as the test server
import urllib.parse  # parser, used here to encode key/value pairs
data = bytes(urllib.parse.urlencode({"hello": "world"}), encoding="utf-8")  # form data: the key/value pairs are URL-encoded, then converted to bytes with encoding="utf-8"
response = urllib.request.urlopen("http://httpbin.org/post", data=data)
print(response.read().decode('utf-8'))
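To see what is actually sent as the form body, the encoding step can be run on its own, without any network access. This is a minimal sketch; the `user` field is a made-up example and is not part of the original lesson.

```python
import urllib.parse

# Hypothetical form fields; only "hello" comes from the lesson's example.
form = {"hello": "world", "user": "osmanthus rice"}

# urlencode() turns the dict into a query string and escapes unsafe
# characters (a space becomes "+").
query = urllib.parse.urlencode(form)
print(query)

# urlopen() expects bytes for its data argument, so the string is encoded.
data = bytes(query, encoding="utf-8")
print(data)
```

Passing a non-empty `data` argument is what makes `urlopen` send a POST instead of a GET.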
2. Timeouts
import urllib.error  # makes urllib.error.URLError available explicitly
try:
    response = urllib.request.urlopen("http://httpbin.org/post", timeout=0.01)  # give up if there is no response within 0.01 seconds
    print(response.read().decode('utf-8'))
except urllib.error.URLError as e:
    print("time out!")
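`URLError` is raised for any failed connection, not only for timeouts, so printing "time out!" for every `URLError` can be misleading. The sketch below shows one possible way to tell the cases apart. `is_timeout` is a hypothetical helper name, and the errors are simulated so that the example runs offline.

```python
import socket
import urllib.error


def is_timeout(error):
    """Return True if the exception looks like a timeout (hypothetical helper)."""
    # A timeout while waiting for the response can be raised directly.
    if isinstance(error, socket.timeout):
        return True
    # A timeout while connecting is wrapped in URLError; the cause is in .reason.
    if isinstance(error, urllib.error.URLError):
        return isinstance(error.reason, socket.timeout)
    return False


# Simulated errors, so the sketch runs without a network connection.
print(is_timeout(urllib.error.URLError(socket.timeout("timed out"))))
print(is_timeout(urllib.error.URLError("Name or service not known")))
```

In real code the helper would be called inside an `except (urllib.error.URLError, socket.timeout) as e:` block.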
3. Request headers (pretending to be a browser)
url = "https://httpbin.org/post"
headers = {"User-Agent": "……"}
data = bytes(urllib.parse.urlencode({"hello": "world"}), encoding="utf-8")
req = urllib.request.Request(url=url, data=data, headers=headers, method='POST')  # build a request object that imitates a real browser; HTTP method names are case-sensitive, so use upper-case POST
response = urllib.request.urlopen(req)  # send the prepared request
print(response.read().decode("utf-8"))
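Before sending anything, the prepared `Request` object can be inspected to check what it will send. This is a small offline sketch; the User-Agent string is a placeholder chosen for illustration, not a value from the lesson.

```python
import urllib.parse
import urllib.request

# Placeholder value, an assumption for illustration; use your browser's real string.
headers = {"User-Agent": "Mozilla/5.0 (placeholder)"}
data = urllib.parse.urlencode({"hello": "world"}).encode("utf-8")

req = urllib.request.Request(
    url="https://httpbin.org/post", data=data, headers=headers, method="POST"
)

# Nothing has been sent yet; these calls only read the prepared request.
print(req.get_method())              # HTTP verb that will be used
print(req.full_url)                  # target URL
print(req.data)                      # encoded form body
print(req.get_header("User-agent"))  # Request stores header names capitalized
```

Note that `get_header` is looked up with the name `User-agent`, because `Request` capitalizes header names when it stores them.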
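How to find the User-Agent (the key/value pair to put in headers):
Open the browser's developer tools, go to the Network tab, select any request, and copy the User-Agent value from its request headers.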
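Part of the reason for setting this header is that urllib identifies itself by default, which makes a script easy to recognize. The sketch below prints the default headers of a standard opener and makes no network request.

```python
import urllib.request

# build_opener() creates the same kind of opener that urlopen() uses by default.
opener = urllib.request.build_opener()

# addheaders holds the headers added to every request sent through this opener.
for name, value in opener.addheaders:
    print(name, value)
```

The printed value starts with `Python-urllib/`, followed by the Python version.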
4. Getting the data
# Crawl the pages
def getData(baseurl):
    datalist = []
    for i in range(0, 10):  # call the page-fetching function 10 times
        url = baseurl + str(i * 25)
        html = askURL(url)  # save the source code of the page
    return datalist
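The loop builds one URL per page by appending an offset that grows in steps of 25. The URL construction can be checked separately from the network calls. In this sketch, `build_page_urls` and the base URL are hypothetical and used only for illustration.

```python
def build_page_urls(baseurl, pages=10, step=25):
    """Return the URL of every page (hypothetical helper, not in the original)."""
    return [baseurl + str(i * step) for i in range(pages)]


# Placeholder base URL, assumed for illustration only.
urls = build_page_urls("https://example.com/list?start=")
print(len(urls))
print(urls[0])
print(urls[-1])
```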
# Get the content of a given URL
def askURL(url):
    head = {
        "User-Agent": "……"
    }  # disguise: imitate a browser's header information
    request = urllib.request.Request(url, headers=head)  # visit the url with the headers attached
    html = ""  # returned unchanged if the request fails
    try:
        response = urllib.request.urlopen(request)  # get the whole page
        html = response.read().decode("utf-8")  # read the page source
    except urllib.error.URLError as e:  # catch the error
        if hasattr(e, "code"):
            print(e.code)  # print the HTTP status code to see what went wrong
        if hasattr(e, "reason"):
            print(e.reason)  # print the reason for the failure
    return html
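The two `hasattr` checks are needed because `HTTPError` (a subclass of `URLError`) carries a status code, while a plain `URLError` has only a reason. The sketch below demonstrates this with simulated errors, so it runs offline. `describe_error` and the example URL are hypothetical.

```python
import email.message
import urllib.error


def describe_error(error):
    """Summarize a urllib error as text (hypothetical helper)."""
    parts = []
    if hasattr(error, "code"):      # only HTTPError carries a status code
        parts.append("code=" + str(error.code))
    if hasattr(error, "reason"):    # both URLError and HTTPError have a reason
        parts.append("reason=" + str(error.reason))
    return ", ".join(parts)


# Simulated errors, so the sketch runs offline.
http_error = urllib.error.HTTPError(
    "https://example.com/missing", 404, "Not Found", email.message.Message(), None
)
url_error = urllib.error.URLError("connection refused")

print(describe_error(http_error))
print(describe_error(url_error))
```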