Selenium crawl notes
2022-06-24 20:36:00 【Yu Xu】
Import the third-party selenium library:
import selenium
from selenium import webdriver
Download the driver that corresponds to your browser:
edge: https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/
chrome: https://code.google.com/p/chromedriver/downloads/list
firefox: https://github.com/mozilla/geckodriver/releases/
IE: NuGet Gallery, Selenium.WebDriver.IEDriver 4.0.0
The download is a compressed archive. Open it and you will find an msedgedriver.exe file. Copy this file to a drive other than C:, then add its location to the system environment variables of this computer.
To reach the setting, go to This PC > right-click > Properties > About > Advanced system settings > Advanced > Environment Variables > System variables > Path.
Add the folder that contains msedgedriver.exe to Path, then click OK.
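Once Path has been edited, it can help to check whether Python can actually see the driver. The following is a minimal sketch using only the standard library; the helper name `find_driver` is hypothetical and is not part of Selenium.

```python
import shutil
from typing import Optional


def find_driver(name: str = "msedgedriver") -> Optional[str]:
    """Return the full path of the driver executable if it is on PATH, else None."""
    # shutil.which searches the directories listed in the PATH environment
    # variable; on Windows it also tries the extensions in PATHEXT (such as .exe).
    return shutil.which(name)


if __name__ == "__main__":
    location = find_driver()
    if location is None:
        print("msedgedriver was not found on PATH")
    else:
        print("msedgedriver found at", location)
```

If this prints "not found" right after Path was changed, a likely cause is that the terminal or IDE was started before the change; a process keeps the environment it was launched with, so restarting it is worth trying.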
# Create a browser object. I am using Edge here; if you use Chrome or Firefox, replace Edge with Chrome or Firefox. The first letter must be capitalized!
driver = webdriver.Edge()
driver.get('https://www.taobao.com/?spm=a21bo.jianhua.201857.1.5af911d9NTiGPH')
# Maximize the browser window
driver.maximize_window()
Running the code at this point, the line driver = webdriver.Edge() raises an error:
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the specified file .
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\learn\ test .py", line 4, in <module>
driver = webdriver.Edge()
File "D:\ Study \pycharm practice \learn\lib\site-packages\selenium\webdriver\edge\webdriver.py", line 62, in __init__
super(WebDriver, self).__init__(DesiredCapabilities.EDGE['browserName'], "ms",
File "D:\ Study \pycharm practice \learn\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 90, in __init__
self.service.start()
File "D:\ Study \pycharm practice \learn\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'msedgedriver' executable needs to be in PATH. Please download from https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/
The message says the driver executable needs to be on PATH, yet I thought I had already configured the path. Later I found out that the path has to be passed to webdriver.Edge() in the form of an object.
So the code has to be changed to this:
# Of course, change edge here to the name of your own browser; the module name is lowercase
from selenium.webdriver.edge.service import Service
# Service() wraps the driver path and the result is assigned to the variable s. The r prefix makes this a raw string (not a regular expression), so the backslash is kept literally
s = Service(r'D:\msedgedriver.exe')
# service is a parameter of Edge(). To see how it is used, hold Ctrl and left-click Edge in the editor, and the source file that defines it will open
driver = webdriver.Edge(service=s)
driver.get('https://www.taobao.com/?spm=a21bo.jianhua.201857.1.5af911d9NTiGPH')
# Maximize the browser window
driver.maximize_window()
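The "executable needs to be in PATH" failure can be caught earlier by checking that the driver file exists before handing it to Service. This is a minimal sketch, assuming Selenium 4 is installed and the driver is stored at D:\msedgedriver.exe as in the code above; the helper names `check_driver_path` and `open_edge` are hypothetical.

```python
from pathlib import Path


def check_driver_path(driver_path: str) -> Path:
    """Return driver_path as a Path, or raise FileNotFoundError if it is not a file."""
    path = Path(driver_path)
    if not path.is_file():
        raise FileNotFoundError(f"driver executable not found: {driver_path}")
    return path


def open_edge(driver_path: str):
    """Start Edge with an explicitly given driver path (Selenium 4 style)."""
    # Imported inside the function so that check_driver_path can be used
    # even on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.edge.service import Service

    service = Service(str(check_driver_path(driver_path)))
    return webdriver.Edge(service=service)
```

Calling open_edge(r'D:\msedgedriver.exe') should then return a driver object that can be used with driver.get() and driver.maximize_window() exactly as above, while a wrong path fails immediately with a message that names the missing file.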
Run the code and Taobao opens. One thing to note here: the site behaves differently depending on whether a person or code is browsing:
1. When a person visits the site, they can search, select items, and only when the purchase is being finalized does the login window for the user account appear;
2. When code drives the browser, the login window pops up directly as soon as the preset product is entered in the search bar.
Let's first write the code for the search.
One more point:
Normally find_element_by_xpath() is used to get page elements, but my PyCharm showed this warning at the bottom:
find_element_by_* commands are deprecated. Please use find_element() instead
# A different method is needed here; add from selenium.webdriver.common.by import By
# find_element_by_xpath() is deprecated; use find_element() instead
# That is, find_element_by_xpath('the element you are looking for') == find_element(By.XPATH, 'the element you are looking for')
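The deprecated finders map one-to-one onto locator strategies. The lookup table below is a sketch in plain Python; the string values are the ones Selenium's By class defines (for example, By.XPATH is the string "xpath"), and the helper name `to_locator` is hypothetical.

```python
# Legacy method suffix -> locator strategy string (the value of the matching
# By constant in selenium.webdriver.common.by).
LEGACY_TO_STRATEGY = {
    "id": "id",                                # By.ID
    "name": "name",                            # By.NAME
    "xpath": "xpath",                          # By.XPATH
    "link_text": "link text",                  # By.LINK_TEXT
    "partial_link_text": "partial link text",  # By.PARTIAL_LINK_TEXT
    "tag_name": "tag name",                    # By.TAG_NAME
    "class_name": "class name",                # By.CLASS_NAME
    "css_selector": "css selector",            # By.CSS_SELECTOR
}


def to_locator(legacy_method: str, value: str) -> tuple:
    """Translate e.g. ('find_element_by_xpath', '//a') into ('xpath', '//a')."""
    prefix = "find_element_by_"
    if not legacy_method.startswith(prefix):
        raise ValueError(f"not a legacy finder: {legacy_method}")
    suffix = legacy_method[len(prefix):]
    return (LEGACY_TO_STRATEGY[suffix], value)
```

The resulting pair can be unpacked straight into the new call, as in driver.find_element(*to_locator("find_element_by_xpath", "//a")), which is one way to migrate old scripts mechanically.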
Here XPath is used to locate the search box element on the page, followed by a random delay of 1 to 3 seconds.
import random
import time
from selenium.webdriver.common.by import By
# The XPath below selects the button inside the form whose id is J_TSearchForm, then clicks it
driver.find_element(By.XPATH, '//*[@id="J_TSearchForm"]/div[1]/button').click()
time.sleep(random.randint(1, 3))
Then get the search button, and again set a random delay of 1 to 3 seconds.
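Since the same random delay is needed after each step, it can be wrapped in a small helper. Note that random.randint(1, 3) only ever returns 1, 2 or 3; the sketch below uses random.uniform instead so the pause can be any fraction of a second in the range. The name `random_pause` is hypothetical.

```python
import random
import time


def random_pause(min_seconds: float = 1.0, max_seconds: float = 3.0) -> float:
    """Sleep for a random duration between min_seconds and max_seconds; return it."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay
```

Calling random_pause() after each find_element(...).click() keeps the script readable, and the returned value can be logged if you want to see how long each pause was.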