A high-quality "crawler" should, of course, crawl "high-quality" wallpapers

2022-06-23 21:29:00 Charlie is not a dog

One. Foreword

Every day my wallpaper is Windows' own sky blue. It's really boring, boring, boring ~

So, as everyone knows, I am a blogger who likes high quality, so of course I'm going to get myself a batch of high-quality wallpapers. I don't mean anything else by it.

Okay, no more chatter. Let's start today's high-quality journey ~

Two. Preparation

Get all of these set up:

Python 3.6
PyCharm
requests
parsel

Three. Crawling process

======================================================================

1) Finding the data source:

1. Identify the target: crawl HD wallpaper images (from the Netbian site, www.netbian.com)

Use the browser's developer tools (F12, or right-click and choose Inspect) to find where the image URL comes from;

Requesting a wallpaper's detail page and reading its page source gives the image URL (one image per detail page);

Requesting the list page gives the detail-page URL and the title of each wallpaper;

2) Code implementation:

1. Send a request

Wallpaper list page URL: http://www.netbian.com/1920x1080/index.htm
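
The URL above is the first list page, while the crawler code in this article requests later pages as index_2.htm, index_3.htm, and so on. A minimal sketch of building both forms is shown here. The helper name list_page_url is hypothetical, and treating index.htm as page 1 is an assumption based on those two URL patterns.

```python
# List-page folder taken from the article's URL
BASE = "http://www.netbian.com/1920x1080/"


def list_page_url(page):
    """Return the list-page URL for a 1-based page number.

    Assumption: page 1 is index.htm and page n (n >= 2) is index_n.htm.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    return BASE + ("index.htm" if page == 1 else f"index_{page}.htm")


for page in range(1, 4):
    print(list_page_url(page))
```

With a helper like this, a loop that starts at 1 would also cover the first page, which a loop over index_{page}.htm alone does not reach.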

2. Get the data

The page source code, i.e. response.text (the web page's text data)
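
Text data is only readable if the response bytes are decoded with the right codec, which is why the crawler code sets the encoding before reading response.text. The sketch below illustrates the effect with the standard library only. It assumes the page is GBK-encoded (the article's commented-out decode('gbk') line suggests this), and the sample string is made up rather than taken from the site.

```python
# Bytes as a server might send them for a GBK-encoded page (sample text, not real site data)
raw = "高清壁纸".encode("gbk")

# Wrong codec: each byte is mapped to an unrelated character, so the text is garbled
wrong = raw.decode("latin-1")

# Correct codec: the original text comes back
right = raw.decode("gbk")

print(repr(wrong))
print(right)
```

Setting response.encoding = response.apparent_encoding in requests serves the same purpose: it asks the library to guess the codec from the bytes instead of relying on the response headers.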

3. Parse the data

Available tools: CSS selectors, XPath, bs4, re

What to extract: 1. the wallpaper detail page URL, e.g. /desk/23397.htm 2. the wallpaper title
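
The detail-page path is relative, so it has to be joined with the site address before it can be requested. Here is a rough sketch of this step using re, one of the tools listed above, together with urljoin from the standard library. The HTML fragment and the function name parse_list_page are hypothetical, and the real page markup may differ. The article's own code uses parsel CSS selectors instead.

```python
import re
from urllib.parse import urljoin

BASE_URL = "http://www.netbian.com"

# Hypothetical list-page fragment; the real markup may differ.
sample_html = """
<div class="list"><ul>
<li><a href="http://example.com/ad"><img src="ad.jpg"></a></li>
<li><a href="/desk/23397.htm" title="x"><img src="a.jpg"><b>Sample wallpaper</b></a></li>
</ul></div>
"""


def parse_list_page(html):
    """Return (title, absolute detail-page URL) pairs, one per titled <li>."""
    results = []
    # Look at each <li> separately so a link is never paired with another item's title
    for item in re.findall(r"<li\b[^>]*>(.*?)</li>", html, flags=re.S):
        href = re.search(r'<a\b[^>]*\bhref="([^"]+)"', item)
        title = re.search(r"<b>([^<]+)</b>", item)
        if href and title:  # items without a title (e.g. ads) are skipped
            results.append((title.group(1).strip(), urljoin(BASE_URL, href.group(1))))
    return results


for title, url in parse_list_page(sample_html):
    print(title, url)
```

Regular expressions are brittle against markup changes, so a real HTML parser with CSS selectors or XPath is usually the more robust choice.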

4. Save the data

Images are saved as binary data (response.content), written to a file opened in 'wb' mode
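
Two things can go wrong when saving: the target folder may not exist yet, and a wallpaper title may contain characters that are not allowed in file names. The following sketch guards against both. The helper names safe_filename and save_image are hypothetical, the forbidden-character set is the one Windows uses, and the demo writes placeholder bytes to a temporary folder rather than a real image.

```python
import re
import tempfile
from pathlib import Path


def safe_filename(title):
    """Replace characters that Windows forbids in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    return cleaned or "untitled"


def save_image(folder, title, content):
    """Write binary image data to folder/<title>.jpg and return the path."""
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    path = folder / (safe_filename(title) + ".jpg")
    path.write_bytes(content)  # binary write, equivalent to open(..., 'wb')
    return path


# Demo with placeholder bytes in a temporary folder (not a real image)
with tempfile.TemporaryDirectory() as tmp:
    saved = save_image(Path(tmp) / "img", "sky: blue?", b"\xff\xd8placeholder")
    print(saved.name)
```

In the crawler, content would be the .content of the image response and title the text extracted from the list page.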

Dear readers: So that's it? Where's the actual code? What do you mean by not handing over the code?

Don't panic, here it comes.

Four. Code

I won't take it apart line by line. Combined with the steps in section three, I believe smart readers like you can follow it. If you really can't, I'll explain it in a video at the end.

import os  # Built-in module: creates the output folder
import re  # Built-in module: cleans up file names
import time  # Built-in module: timing

import parsel  # Data parsing module; third-party: pip install parsel
import requests  # HTTP request module; third-party: pip install requests

time_1 = time.time()
os.makedirs('img', exist_ok=True)  # open() fails if the folder does not exist
# Request headers: disguise the Python code as a browser sending a request to the server
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
}
# Pages 2 to 11; the first list page is index.htm and does not match this URL pattern
for page in range(2, 12):
    print(f'==================== Crawling the data on page {page} ====================')
    url = f'http://www.netbian.com/1920x1080/index_{page}.htm'
    response = requests.get(url=url, headers=headers)
    # What if the text comes back garbled? It needs transcoding.
    # html_data = response.content.decode('gbk')
    response.encoding = response.apparent_encoding  # Automatic transcoding
    # response.text is the page source (web page text data)
    # print(response.text)
    # Parse the data
    selector = parsel.Selector(response.text)
    # CSS selectors extract data according to the page's tags
    # First extraction: all li tags in the list
    lis = selector.css('.list li')
    for li in lis:
        # Detail page example: http://www.netbian.com/desk/23397.htm
        title = li.css('b::text').get()
        href = li.css('a::attr(href)').get()
        if not title or not href:
            continue  # Skip entries without a title or a link
        response_1 = requests.get(url='http://www.netbian.com' + href, headers=headers)
        selector_1 = parsel.Selector(response_1.text)
        img_url = selector_1.css('.pic img::attr(src)').get()
        if not img_url:
            continue  # No image found on the detail page
        img_content = requests.get(url=img_url, headers=headers).content
        # Replace characters that Windows does not allow in file names
        file_name = re.sub(r'[\\/:*?"<>|]', '_', title).strip()
        with open(os.path.join('img', file_name + '.jpg'), mode='wb') as f:
            f.write(img_content)
        print('Saved:', title)

use_time = int(time.time() - time_1)
print(f'Total time taken: {use_time} seconds')
Copyright notice
This article was written by [Charlie is not a dog]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2021/12/202112231047351932.html