12 Initializing the BeautifulSoup Class
2022-06-23 17:38:00 【Andy Python learning notes】
beautifulsoup4 is usually abbreviated as bs4, which is also the name of the package you import.
bs4 is a third-party Python library.
Its job is to extract data from HTML and XML documents.
bs4 is the library.
BeautifulSoup is a class in that library.
【Knowledge review】
By convention, a class name starts with a capital letter.
Class instantiation syntax: object = ClassName(arguments)
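As a quick illustration of the instantiation syntax, here is a minimal sketch using a made-up class. Book and its title attribute are hypothetical names chosen only for this example; they are not part of bs4.

```python
# Book is a hypothetical class used only to illustrate the syntax.
class Book:
    def __init__(self, title):
        # __init__ runs when the class is instantiated
        self.title = title


# object = ClassName(arguments)
book = Book("Python learning notes")

print(type(book).__name__)  # name of the class the object was created from
print(book.title)           # value passed in at instantiation
```

Creating a BeautifulSoup object follows the same pattern: the class name, then the arguments in parentheses.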
12.1 How to initialize the BeautifulSoup class
1. Initialization steps

2. Initializing a BeautifulSoup object
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is a sample HTML string to parse
html_str = '<p>Hello</p>'
# Pass in the two parameters markup and features to get an instantiated object
# object = ClassName(arguments)
soup = BeautifulSoup(markup=html_str, features='html.parser')
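A minimal runnable sketch of the initialization, with both parameters passed by keyword and a quick look at the resulting object. The HTML in html_str is invented for this example, and 'html.parser' is chosen only because it ships with Python and needs no extra install.

```python
from bs4 import BeautifulSoup

# Sample HTML invented for this example
html_str = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

# Pass both parameters by keyword: markup is the HTML, features names the parser
soup = BeautifulSoup(markup=html_str, features="html.parser")

print(type(soup))         # the BeautifulSoup class from bs4
print(soup.title.string)  # text inside <title>
print(soup.p.get_text())  # text inside the first <p>
```

In everyday code the two arguments are usually passed positionally, as in BeautifulSoup(html_str, 'html.parser').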
12.2 What the BeautifulSoup parameters mean
1. The markup parameter
The markup parameter is the HTML to be parsed, given either as a string or as an open file object.
1. Using a string variable
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is a string variable, usually the HTML code obtained in the previous step
soup = BeautifulSoup(html_str)
2. Using the open() function to open a file
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Use the open function to open the file and get a file object
# A file object can also be passed as the markup argument
# 'index.html' is a file containing HTML code; note the quotation marks around the file name
with open('index.html') as f:
    soup = BeautifulSoup(f)
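The two ways of supplying markup can be compared side by side. This is a sketch under stated assumptions: it first writes a small index.html into a temporary directory so that it runs on its own, and the file contents and the variable names soup_from_str and soup_from_file are invented for illustration.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Write a small HTML file so the example is self-contained
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "index.html")
with open(path, "w", encoding="utf-8") as f:
    f.write("<html><body><h1>Index</h1></body></html>")

# 1. markup as a string: read the file first, then pass the text
with open(path, encoding="utf-8") as f:
    html_str = f.read()
soup_from_str = BeautifulSoup(html_str, "html.parser")

# 2. markup as an open file object: BeautifulSoup reads it for us
with open(path, encoding="utf-8") as f:
    soup_from_file = BeautifulSoup(f, "html.parser")

print(soup_from_str.h1.string)
print(soup_from_file.h1.string)
```

Both calls should produce the same parsed document; the with block simply makes sure the file is closed afterwards.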
2. The features parameter
The features parameter specifies which parser to use.
1. Specifying the parser
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML code to parse (a string)
# The parser is 'lxml'; note the quotation marks around the parser name
# 'lxml' is a separate third-party package and must be installed before it can be used
# object = ClassName(arguments)
soup = BeautifulSoup(html_str, 'lxml')
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML code to parse (a string)
# The parser is 'html.parser', which is built into Python; note the quotation marks around the parser name
# object = ClassName(arguments)
soup = BeautifulSoup(html_str, 'html.parser')
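Since 'lxml' is a separate package that may not be installed, one possible pattern is to try it first and fall back to the built-in 'html.parser'. The sketch below assumes that approach; make_soup is a hypothetical helper name, not part of bs4, while FeatureNotFound is the exception bs4 raises when the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(html_str, preferred="lxml"):
    """Hypothetical helper: try the preferred parser, else fall back."""
    try:
        return BeautifulSoup(html_str, preferred)
    except FeatureNotFound:
        # Raised when the requested parser is not installed
        return BeautifulSoup(html_str, "html.parser")


# Sample HTML invented for this example
html_str = "<p>Hello, <b>bs4</b></p>"
soup = make_soup(html_str)

print(soup.builder.NAME)  # name of the parser that was actually used
print(soup.b.string)      # text inside <b>
```

Different parsers can build slightly different trees from the same input, especially for incomplete or malformed HTML, so it is worth keeping the parser choice consistent within a project.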
2. Parser not specified: BeautifulSoup chooses a parser itself to parse the document
# Import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# html_str is the HTML code to parse (a string)
# No parser is given, so BeautifulSoup picks the best HTML parser that is installed
# and may warn that no parser was specified
soup = BeautifulSoup(html_str)
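To see which parser was picked when none is specified, the builder attached to the soup object can be inspected. This is only a sketch: the result depends on which parsers are installed on your machine, and the sample html_str is invented for illustration. Any warning is recorded rather than printed so the output stays readable.

```python
import warnings

from bs4 import BeautifulSoup

# Sample HTML invented for this example
html_str = "<p>Hello</p>"

# Record warnings instead of letting them print to the console
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    soup = BeautifulSoup(html_str)  # no parser specified

print(soup.builder.NAME)  # the parser BeautifulSoup chose
for w in caught:
    print(w.category.__name__)  # category of any warning that was raised
```

Because the chosen parser can differ between machines, naming the parser explicitly keeps results reproducible.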
12.3 Summary
