Selenium crawl notes

2022-06-24 20:36:00 Yu Xu

        Import the third-party library selenium.

import selenium
from selenium import webdriver

        Download the driver that matches your browser:

  edge:https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/

  chrome:https://code.google.com/p/chromedriver/downloads/list

  firefox:https://github.com/mozilla/geckodriver/releases/

  IE:NuGet Gallery | Selenium.WebDriver.IEDriver 4.0.0

        The download is a zip archive. Open it and you will find a file named msedgedriver.exe. Copy this file to a drive other than C:, then add its location to the system environment variables of this computer.

        The environment variable setting is found under "This PC > right-click Properties > About > Advanced system settings > Advanced > Environment Variables > System variables > Path".

        Add the folder that contains msedgedriver.exe to Path, then click OK.
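Before launching the browser, it can help to check whether the current Python process actually sees the driver on PATH. The following is a minimal sketch using only the standard library; `find_driver` is a hypothetical helper name, not part of Selenium. One possible reason for a miss is that an already-open PyCharm or terminal was started before Path was edited and has not been restarted since.

```python
import shutil
from typing import Optional


def find_driver(name: str = "msedgedriver",
                search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of the driver executable, or None if not found.

    search_path=None means "search this process's PATH". On Windows,
    shutil.which also tries the extensions listed in PATHEXT, so the
    name can be given without ".exe".
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    location = find_driver()
    if location is None:
        print("msedgedriver is not on PATH for this Python process")
    else:
        print(f"msedgedriver found at {location}")
```

If this prints that the driver is not on PATH even though Path was edited, restarting the IDE or terminal is worth trying before changing the code.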

# Create a browser object. I am using the Edge browser here. If you use Chrome, replace Edge with Chrome; the same goes for Firefox. The first letter must be capitalized!
driver = webdriver.Edge()
driver.get('https://www.taobao.com/?spm=a21bo.jianhua.201857.1.5af911d9NTiGPH')
# Maximize the browser window
driver.maximize_window()

        Running the code at this point, the line driver = webdriver.Edge() raises an error.

hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\learn\ test .py", line 4, in <module>
    driver = webdriver.Edge()
  File "D:\ Study \pycharm  practice \learn\lib\site-packages\selenium\webdriver\edge\webdriver.py", line 62, in __init__
    super(WebDriver, self).__init__(DesiredCapabilities.EDGE['browserName'], "ms",
  File "D:\ Study \pycharm  practice \learn\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 90, in __init__
    self.service.start()
  File "D:\ Study \pycharm  practice \learn\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
    raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'msedgedriver' executable needs to be in PATH. Please download from https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/

        

        The error says the driver executable needs to be on PATH, yet I thought I had already configured the path. Later I found that the path can instead be passed to webdriver.Edge() as an object.

        So the code has to be changed as follows.

# Replace edge in the module path with the name of your own browser; lowercase is correct here
from selenium.webdriver.edge.service import Service

# Use Service() to wrap the driver path and assign it to the variable s. The r prefix makes this a raw string, so the backslash is not treated as an escape character
s = Service(r'D:\msedgedriver.exe')
# service is a parameter of Edge(). To see how it is used, select Edge in the editor, hold Ctrl and left-click it; the source file that defines it will open
driver = webdriver.Edge(service=s)
driver.get('https://www.taobao.com/?spm=a21bo.jianhua.201857.1.5af911d9NTiGPH')
# Maximize the browser window
driver.maximize_window()
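Since the original failure was a file that could not be found, one option is to validate the path before handing it to Service(), so that a wrong path fails with a clearer message. This is only a sketch; `checked_driver_path` is a hypothetical helper, not a Selenium function.

```python
from pathlib import Path


def checked_driver_path(path: str) -> str:
    """Return path unchanged if it points to an existing file, else raise."""
    if not Path(path).is_file():
        raise FileNotFoundError(f"driver executable not found: {path}")
    return path


# Hypothetical usage with the Service object from the snippet above:
#   s = Service(checked_driver_path(r'D:\msedgedriver.exe'))
#   driver = webdriver.Edge(service=s)
```

The check runs before any browser process is started, so a typo in the path shows up immediately instead of deep inside a Selenium traceback.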

        Run the code and Taobao opens. One thing to note: the site behaves differently depending on whether a person or code is browsing:

        1. If a person visits the site, searches, and selects items, the login window only pops up when the purchase is about to be finalized;

        2. If code drives the browser, the login window pops up right after the product is entered in the search bar.

        Let's first write the code for the search.

        One more thing to mention:

        Usually find_element_by_xpath() is used to get page elements, but my PyCharm flagged it:

# A different method is needed here. Add this import: from selenium.webdriver.common.by import By


# find_element_by_xpath() is no longer recommended; use find_element() instead
find_element_by_* commands are deprecated. Please use find_element() instead 

# In other words, find_element_by_xpath('the element you are looking for') == find_element(By.XPATH, 'the element you are looking for')
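The same substitution applies to the other find_element_by_* helpers. As a rough sketch, the table below maps each deprecated name to the locator strategy it corresponds to. In Selenium 4's Python bindings the By constants are plain strings (for example, By.XPATH is "xpath"), so the strings are written out directly here; `modern_locator` is a hypothetical helper used only for illustration.

```python
from typing import Tuple

# Deprecated helper name -> locator strategy string.
# These strings are the values of By.ID, By.NAME, By.XPATH, and so on.
_STRATEGY = {
    "find_element_by_id": "id",
    "find_element_by_name": "name",
    "find_element_by_xpath": "xpath",
    "find_element_by_link_text": "link text",
    "find_element_by_partial_link_text": "partial link text",
    "find_element_by_tag_name": "tag name",
    "find_element_by_class_name": "class name",
    "find_element_by_css_selector": "css selector",
}


def modern_locator(old_method: str, value: str) -> Tuple[str, str]:
    """Translate a deprecated call into (strategy, value) for find_element().

    Raises KeyError if old_method is not one of the known helper names.
    """
    return _STRATEGY[old_method], value


# Hypothetical usage with an existing driver object:
#   driver.find_element(*modern_locator("find_element_by_xpath", "//button"))
```

In everyday code it is clearer to write find_element(By.XPATH, ...) directly; the table is mainly useful as a lookup when migrating older scripts.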

        Here XPath is used to get the page element of the search box, followed by a random delay of 1 to 3 seconds.

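import random
import time

from selenium.webdriver.common.by import By

# Locate the element by XPath and click it
driver.find_element(By.XPATH, '//*[@id="J_TSearchForm"]/div[1]/button').click()
# Wait a random whole number of seconds (1, 2 or 3) before the next step
time.sleep(random.randint(1, 3))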
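Because the same random delay is repeated after each step, it could be wrapped in a small helper. The sketch below is one way to do it; `random_pause` is a hypothetical name. Note one deliberate difference from the code above: random.uniform gives a fractional delay anywhere between the bounds, whereas random.randint(1, 3) only ever yields 1, 2 or 3.

```python
import random
import time
from typing import Callable


def random_pause(low: float = 1.0, high: float = 3.0,
                 sleep: Callable[[float], None] = time.sleep) -> float:
    """Sleep for a random duration between low and high seconds.

    The sleep function is injectable so the helper can be exercised
    without actually waiting. Returns the delay that was chosen.
    """
    delay = random.uniform(low, high)
    sleep(delay)
    return delay


# Hypothetical usage after a click:
#   driver.find_element(By.XPATH, '...').click()
#   random_pause()
```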

        Next, get the search button, again with a random delay of 1 to 3 seconds.

Copyright notice
This article was written by [Yu Xu]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202211326421315.html