12 Initializing the BeautifulSoup Class
2022-06-23 14:41:00 [Andy's Python Study Notes (安迪python学习笔记)]
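beautifulsoup4 is abbreviated as bs4.
bs4 is a third-party Python library.
It is used to extract data from HTML and XML documents.
bs4 is the library (the package you import from).
BeautifulSoup is a class defined in that library.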
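[Knowledge review]
By convention, a class name starts with a capital letter.
Syntax for instantiating a class: object = ClassName(arguments)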
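As a quick refresher, the instantiation syntax can be sketched with a tiny class. The `Document` class and its attributes are hypothetical and exist only for illustration; they are not part of bs4.

```python
# Hypothetical class, used only to illustrate "object = ClassName(arguments)"
class Document:
    def __init__(self, markup, features="html.parser"):
        # Store the arguments passed at instantiation time
        self.markup = markup
        self.features = features


# Calling the class creates an instance and runs __init__
doc = Document("<p>Hello</p>", features="html.parser")
print(doc.markup, doc.features)
```

BeautifulSoup is instantiated the same way: the class is called with arguments and returns an object.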
12.1 How to initialize the BeautifulSoup class
1. Initialization steps

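2. Initializing a BeautifulSoup object
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Pass the two parameters markup and features to get an instance
# object = ClassName(arguments)
# html_str is a string containing the HTML to parse
soup = BeautifulSoup(markup=html_str, features='html.parser')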
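The initialization above can be tried end to end with a small sketch. The HTML in `html_str` is made-up sample data, and 'html.parser' is chosen only because it ships with Python and needs no extra install.

```python
from bs4 import BeautifulSoup

# Hypothetical sample HTML, used only for illustration
html_str = (
    "<html><head><title>Demo</title></head>"
    "<body><p class='intro'>Hello</p></body></html>"
)

# Pass both parameters by keyword; 'html.parser' is Python's built-in parser
soup = BeautifulSoup(markup=html_str, features="html.parser")

# Read data back out of the parsed document
title_text = soup.title.string
intro_text = soup.find("p", class_="intro").get_text()
print(type(soup).__name__, title_text, intro_text)
```

In practice the two arguments are usually passed by position, as in BeautifulSoup(html_str, 'html.parser'); the keyword form is used here only to show which argument is which.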
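12.2 Meaning of the BeautifulSoup parameters
1. The markup parameter
The markup parameter is the content to parse: either a string of HTML or an open file object.
1. Using a string variable
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is a string variable, usually the HTML obtained in the previous step
soup = BeautifulSoup(html_str)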
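2. Opening a file with the open() function
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Open the file with open() to get a file object
# A file object can also be passed as the markup argument
# 'index.html' is the path of the HTML file; it must be a quoted string
soup = BeautifulSoup(open('index.html'))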
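A slightly more defensive variant of the file-based form is sketched below. It first writes a throwaway file so the example is self-contained; the file content, the temporary directory, and the UTF-8 encoding are assumptions made for illustration.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Create a small throwaway HTML file (hypothetical content)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "index.html")
with open(path, "w", encoding="utf-8") as f:
    f.write("<html><body><h1>Index page</h1></body></html>")

# The with-statement closes the file as soon as parsing is done
with open(path, encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

heading = soup.h1.get_text()
print(heading)
```

Using a with-statement avoids leaving the file handle open, which the one-line open() form does not guarantee.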
2. The features parameter
The features parameter names the parser to use.
1. Specifying a parser
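# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML to parse (a string)
# The parser is 'lxml'; note the quotes around the parser name
# lxml is a third-party package and must be installed separately (pip install lxml)
# object = ClassName(arguments)
soup = BeautifulSoup(html_str, 'lxml')
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML to parse (a string)
# The parser is 'html.parser', which ships with Python; note the quotes around the name
# object = ClassName(arguments)
soup = BeautifulSoup(html_str, 'html.parser')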
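The choice of parser matters most for malformed HTML, because each parser repairs it differently. The sketch below feeds the same broken snippet to both parsers; `broken_html` is sample input, and the lxml call is wrapped in a try block because lxml may not be installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Deliberately malformed markup: a closing </p> with no opening <p>
broken_html = "<a></p>"

# html.parser ships with Python, so this call needs no extra install
builtin_result = str(BeautifulSoup(broken_html, "html.parser"))
print("html.parser:", builtin_result)

# lxml is optional; bs4 raises FeatureNotFound when it is not installed
try:
    lxml_result = str(BeautifulSoup(broken_html, "lxml"))
except FeatureNotFound:
    lxml_result = None
    print("lxml is not installed")
else:
    print("lxml:", lxml_result)
```

With html.parser the stray </p> is dropped and the result is a bare <a></a>, while lxml additionally wraps the fragment in <html> and <body> tags. Naming the parser explicitly keeps results consistent across machines.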
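2. No parser specified: BeautifulSoup picks a default parser to parse the document
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML to parse (a string)
# No parser is given, so bs4 picks the best one installed
# (lxml first, then html5lib, then html.parser) and issues a warning
soup = BeautifulSoup(html_str)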
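To see which parser was picked when none is specified, the selected tree builder can be inspected. This is a minimal sketch: `html_str` is sample markup, and the result depends on which parsers are installed on the machine. The warning about the guessed parser is typically shown when the code runs from a script file and may not appear in an interactive session.

```python
import warnings

from bs4 import BeautifulSoup

html_str = "<p>Hello</p>"  # hypothetical sample markup

# Record any warnings raised while bs4 guesses the parser
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    soup = BeautifulSoup(html_str)

# builder.NAME reports which parser bs4 actually selected
chosen_parser = soup.builder.NAME
print("Parser chosen:", chosen_parser)
for w in caught:
    print("Warning category:", w.category.__name__)
```

Because the chosen parser can differ between environments, passing features explicitly is the safer habit.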
12.3 Summary
