12 Initialization of the BeautifulSoup class
2022-06-22 00:55:00 【Andy Python】
beautifulsoup4 is usually abbreviated to bs4.
bs4 is a third-party Python library.
Its purpose is to extract data from HTML and XML documents.
bs4 is the library.
BeautifulSoup is a class in that library.
【Knowledge review】
By convention, class names start with a capital letter.
Class instantiation syntax: object = ClassName(arguments)
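The instantiation syntax can be illustrated with a minimal sketch. The Page class and its title parameter are hypothetical names chosen only for this example; they are not part of bs4.

```python
# Hypothetical class used only to illustrate instantiation; not part of bs4
class Page:
    def __init__(self, title):
        # Store the argument passed at instantiation on the new object
        self.title = title


# object = ClassName(arguments)
page = Page('Home')
print(page.title)
```

BeautifulSoup follows the same pattern: the class name is called with arguments and the result is assigned to a variable.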
12.1 How to initialize the BeautifulSoup class
1. Initialization steps

2. Initializing a BeautifulSoup object
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Pass in the two parameters markup and features to get an instantiated object
# object = ClassName(arguments)
# html_str is a placeholder for the HTML string to parse
html_str = '<p>Hello</p>'
soup = BeautifulSoup(markup=html_str, features='html.parser')
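As a runnable sketch of this call, the example below fills in both parameters. The sample HTML in html_str is invented for illustration, and 'html.parser' is chosen here only because it ships with Python; the keyword form and the positional form are expected to behave the same way.

```python
from bs4 import BeautifulSoup

# Sample HTML string, invented for this example
html_str = '<html><body><p>Hello, bs4</p></body></html>'

# Keyword form: both parameters are named explicitly
soup = BeautifulSoup(markup=html_str, features='html.parser')

# Positional form: markup comes first, features second
soup_positional = BeautifulSoup(html_str, 'html.parser')

# Both calls return a BeautifulSoup object built from the same markup
print(type(soup).__name__)
print(soup.p.get_text())
```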
12.2 Meaning of the BeautifulSoup parameters
1. The markup parameter
The markup parameter is the HTML to be parsed, passed either as a string or as an open file object.
1. Using a string variable
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is a string variable, usually the HTML code obtained in a previous step
soup = BeautifulSoup(html_str)
2. Using the open() function to open a file
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Open the file with open() to get a file object
# A file object can also be used as the initialization argument
# index.html is a file containing HTML code; the file name must be quoted
with open('index.html') as f:
    soup = BeautifulSoup(f)
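A self-contained sketch of the file-object approach follows. It first writes a small index.html into a temporary directory so that the example can run anywhere; the file contents, the temporary path, the UTF-8 encoding, and the explicit 'html.parser' argument are all assumptions made for this illustration.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Create a small HTML file so the example is self-contained;
# in practice index.html would already exist on disk
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'index.html')
with open(path, 'w', encoding='utf-8') as f:
    f.write('<html><head><title>Demo</title></head><body></body></html>')

# Pass the open file object as markup; the with block closes the file afterwards
with open(path, encoding='utf-8') as f:
    soup = BeautifulSoup(f, 'html.parser')

print(soup.title.string)
```

Using a with block is a design choice rather than a requirement: BeautifulSoup reads the file during initialization, so the file can be closed as soon as the object has been created.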
2. The features parameter
The features parameter specifies which parser to use.
1. Specifying the parser
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML code to parse (its data type is string)
# The parser is 'lxml'; note the quotation marks around the parser name
# object = ClassName(arguments)
soup = BeautifulSoup(html_str, 'lxml')
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML code to parse (its data type is string)
# The parser is 'html.parser'; note the quotation marks around the parser name
# object = ClassName(arguments)
soup = BeautifulSoup(html_str, 'html.parser')
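Because lxml is a separate package that may not be installed, while html.parser ships with Python, one possible pattern is to try lxml first and fall back to html.parser. The sketch below assumes that a missing parser raises bs4's FeatureNotFound exception; the sample markup and the parser_used variable are invented for this example.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Sample markup with an unclosed tag, invented for this example
html_str = '<p>Unclosed paragraph'

try:
    # lxml is a third-party parser and must be installed separately
    soup = BeautifulSoup(html_str, 'lxml')
    parser_used = 'lxml'
except FeatureNotFound:
    # html.parser is part of the Python standard library
    soup = BeautifulSoup(html_str, 'html.parser')
    parser_used = 'html.parser'

print(parser_used)
print(soup.p.get_text())
```

Different parsers can build slightly different trees from invalid markup, so naming the parser explicitly keeps results consistent across machines.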
2. Parser not specified: BeautifulSoup selects a default parser to parse the document and issues a warning
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML code to parse (its data type is string)
# No parser is given, so BeautifulSoup picks the best one installed
# (lxml if available, then html5lib, otherwise the built-in html.parser)
soup = BeautifulSoup(html_str)
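To see which parser was chosen by default, the following sketch records the warnings raised during initialization and inspects the builder attached to the resulting object. It assumes that the tree builder exposes its name through soup.builder.NAME and that the warning is a subclass of UserWarning; treat both as assumptions to check against the installed bs4 version. The sample markup is invented.

```python
import warnings

from bs4 import BeautifulSoup

# Sample HTML string, invented for this example
html_str = '<p>Hello</p>'

# Record warnings instead of printing them to stderr
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # No features argument: BeautifulSoup guesses a parser
    soup = BeautifulSoup(html_str)

# Name of the parser that was actually selected (depends on what is installed)
print(soup.builder.NAME)

# Categories of the warnings issued during initialization
for w in caught:
    print(w.category.__name__)
```

The selected parser depends on which packages are installed, which is the main reason to pass features explicitly in code that runs on more than one machine.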
12.3 Summary
