Douban top250
2022-06-26 05:04:00 【Rain and dew touch the real king】
from lxml import etree
import time
import random
import requests

# Browser-like User-Agent so the request is less likely to be rejected
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'}


def preprocessing(strs):
    # Remove all whitespace from each text fragment and join the results
    s = ''
    for n in strs:
        n = ''.join(n.split())
        s = s + n
    return s


def get_movie_info(url):
    response = requests.get(url=url, headers=headers)
    html = etree.HTML(response.text)
    # Each movie entry sits in a <div class="info">
    div_all = html.xpath('//div[@class="info"]')
    for div in div_all:
        names = div.xpath('./div[@class="hd"]/a//span/text()')
        name = preprocessing(names)
        infos = div.xpath('./div[@class="bd"]/p/text()')
        info = preprocessing(infos)
        score = div.xpath('./div[@class="bd"]/div/span[2]/text()')
        evaluation = div.xpath('./div[@class="bd"]/div/span[4]/text()')
        summary = div.xpath('./div[@class="bd"]/p[@class="quote"]/span/text()')
        print('Movie name:', name)
        print('Director and cast:', info)
        print('Rating:', score)
        print('Number of ratings:', evaluation)
        print('Summary:', summary)
        print('_________________')


if __name__ == '__main__':
    # 10 pages of 25 movies each: start=0, 25, ..., 225
    for i in range(0, 250, 25):
        url = 'https://movie.douban.com/top250?start={page}&filter='.format(page=i)
        get_movie_info(url)
        # Pause 1 to 3 seconds between requests
        time.sleep(random.randint(1, 3))
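The two parts of the script that need no network access, the whitespace-stripping helper and the paging URLs, can be checked offline. The following is a minimal sketch using only the standard library; the names `collapse_whitespace` and `page_urls` and the sample strings are hypothetical and are not part of the original script.

```python
def collapse_whitespace(fragments):
    """Remove all whitespace inside each fragment, then join the fragments."""
    return ''.join(''.join(fragment.split()) for fragment in fragments)


def page_urls(total=250, page_size=25):
    """Build one URL per page, using the same start offsets as the script."""
    base = 'https://movie.douban.com/top250?start={page}&filter='
    return [base.format(page=start) for start in range(0, total, page_size)]


if __name__ == '__main__':
    # Made-up fragments, shaped like the padded strings an XPath text() query can return
    sample = ['\n   The Shawshank \n', '   Redemption  ']
    print(collapse_whitespace(sample))

    urls = page_urls()
    print(len(urls))
    print(urls[0])
    print(urls[-1])
```

Because the helper removes every whitespace character, spaces between English words are dropped as well, so titles or names containing spaces come out run together.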