Notes on writing a scrapy-redis project
2022-07-24 11:42:00 【Fan zhidu】
Spider file:
from scrapy.spiders import Rule
from scrapy.linkextractors import LinkExtractor
from scrapy_redis.spiders import RedisCrawlSpider


class MyCrawler(RedisCrawlSpider):
    name = 'mycrawler_redis'
    # Redis key the spider reads its start URLs from
    redis_key = 'mycrawler:start_urls'

    # Crawl rules
    rules = (
        # Follow all links
        Rule(LinkExtractor(), callback='parse_page', follow=True),
    )

    # Key point: allowed_domains must be converted to a list
    def __init__(self, *args, **kwargs):
        # Dynamically define the allowed domains list.
        domain = kwargs.pop('domain', '')
        self.allowed_domains = list(filter(None, domain.split(',')))
        super(MyCrawler, self).__init__(*args, **kwargs)

    def parse_page(self, response):
        return {
            'name': response.css('title::text').extract_first(),
            'url': response.url,
        }

In the settings file, add the following code:
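# Redis connection URL
REDIS_URL = 'redis://root:@127.0.0.1:6379'
# Keep request fingerprints in Redis so deduplication is shared
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
# Log every filtered duplicate request, not only the first one
DUPEFILTER_DEBUG = True
# Schedule requests through Redis
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
# Keep the Redis queues when the spider closes, so a crawl can be resumed
SCHEDULER_PERSIST = True
# Order pending requests by priority
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderPriorityQueue'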
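Going back to the spider's `__init__`: the point the note stresses is that `allowed_domains` has to end up as a real list. The standalone sketch below isolates that one line so its behaviour can be checked without Scrapy or Redis installed. The helper name `parse_allowed_domains` and the `example.com` / `example.org` domains are placeholders introduced for illustration; they are not part of the original project.

```python
def parse_allowed_domains(domain=''):
    """Split a comma-separated domain string into a clean list.

    Hypothetical helper that mirrors the expression used in
    MyCrawler.__init__; it is not part of scrapy or scrapy-redis.
    """
    # ''.split(',') yields [''], and stray commas yield empty strings
    # as well; filter(None, ...) drops those falsy entries.
    # In Python 3, filter() returns a lazy iterator, so list() is needed
    # to get a real list that can be stored and iterated more than once.
    return list(filter(None, domain.split(',')))


if __name__ == '__main__':
    print(parse_allowed_domains('example.com,example.org'))
    print(parse_allowed_domains('example.com,,'))
    print(parse_allowed_domains(''))
```

In the spider itself the string arrives through the `domain` keyword argument, which Scrapy fills from a spider argument on the command line, for example `scrapy crawl mycrawler_redis -a domain=example.com,example.org` (domains again being placeholders). If no `domain` argument is given, the list is empty.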