13 `bs_duixiang.tag`: get a Tag object
2022-06-25 07:23:00 【Andy Python learning notes】
13.1 Extracting data with the BeautifulSoup class
What selectors do: operate on the BeautifulSoup object to find and locate elements and extract their data.

13.2 Node selector
1. What is a node
<h1>First-level heading</h1>
<h2>Second-level heading</h2>
<h3>Third-level heading</h3>
<p>I am a paragraph</p>
The code above is a piece of HTML. h stands for a heading tag. There are 6 heading tags, h1 through h6; the larger the number, the smaller the font size. p is the paragraph tag: it marks off a paragraph, which takes up its own line on the web page.
In HTML, h and p are called HTML tags.
In Python, we call h and p node tags, that is, tag objects.
【Note】 This wording is meant to make nodes easier for beginners to understand. It differs from many official textbooks and is for reference only.
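To make the link between HTML tags and nodes concrete, here is a minimal sketch that parses the snippet above and reads two of its tags back as node objects. It uses the attribute syntax explained in the next item; the variable names h1_node and p_node are made up for this illustration.

```python
from bs4 import BeautifulSoup

html_str = """
<h1>First-level heading</h1>
<h2>Second-level heading</h2>
<h3>Third-level heading</h3>
<p>I am a paragraph</p>
"""

bs_duixiang = BeautifulSoup(html_str, "html.parser")

# Each HTML tag can be reached as an attribute named after the tag.
h1_node = bs_duixiang.h1
p_node = bs_duixiang.p

print(h1_node)        # <h1>First-level heading</h1>
print(p_node)         # <p>I am a paragraph</p>
print(type(p_node))   # <class 'bs4.element.Tag'>
```

Both the heading and the paragraph come back as the same kind of object, a Tag, which is why this article treats "HTML tag" and "node" as two names for the same thing.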
2. Extracting a node
Syntax: bs_object.tag_name
tag_name is the name of the node.
Return value: a node (Tag) object.
Extracting the h2 node with the html.parser parser
html_str = """
<h2>Farewell My Concubine</h2>
<span>Mainland China, Hong Kong (China)</span>
<span>171 minutes</span>
<h3>Synopsis</h3>
<a href="https://maoyan.com/"></a>
"""
# Step 1: import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Step 2: pass in the arguments to instantiate the BeautifulSoup class
# Argument 1 is the HTML string to parse
# Argument 2 is the parser (html.parser here)
# Instantiation returns a BeautifulSoup object
# type(bs_duixiang) is <class 'bs4.BeautifulSoup'>
bs_duixiang = BeautifulSoup(html_str, 'html.parser')
print("Parsing produces a BeautifulSoup object:")
print(type(bs_duixiang), '\n')
# bs_object.tag_name extracts the node
print("Extracting the h2 node:")
print(bs_duixiang.h2, '\n')
print("The extracted node is a Tag object:")
print(type(bs_duixiang.h2))
【Terminal output】
Parsing produces a BeautifulSoup object:
<class 'bs4.BeautifulSoup'>

Extracting the h2 node:
<h2>Farewell My Concubine</h2>

The extracted node is a Tag object:
<class 'bs4.element.Tag'>
After running the code, the h2 node is output successfully.
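One case the example above does not cover is a tag name that does not occur in the HTML. In that situation the attribute lookup returns None rather than a Tag object, so it is worth checking the result before using it. A minimal sketch, with a shortened html_str and a variable name (missing) chosen only for this illustration:

```python
from bs4 import BeautifulSoup

html_str = "<h2>Farewell My Concubine</h2><h3>Synopsis</h3>"
bs_duixiang = BeautifulSoup(html_str, "html.parser")

# There is no <h4> in html_str, so the lookup finds nothing.
missing = bs_duixiang.h4
print(missing)   # None

# Guard before using the result: None has none of the Tag attributes.
if missing is None:
    print("no <h4> node found")
```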
Extracting the span node with the lxml parser
html_str = """
<h2>Farewell My Concubine</h2>
<span>Mainland China, Hong Kong (China)</span>
<span>171 minutes</span>
<h3>Synopsis</h3>
<a href="https://maoyan.com/"></a>
"""
# Step 1: import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Step 2: pass in the arguments to instantiate the BeautifulSoup class
# Argument 1 is the HTML string to parse
# Argument 2 is the parser (lxml here; the lxml package must be installed)
# Instantiation returns a BeautifulSoup object
# type(bs_duixiang) is <class 'bs4.BeautifulSoup'>
bs_duixiang = BeautifulSoup(html_str, 'lxml')
print("Parsing produces a BeautifulSoup object:")
print(type(bs_duixiang), '\n')
# bs_object.tag_name extracts the node
print("Extracting the span node:")
print(bs_duixiang.span, '\n')
print("The extracted node is a Tag object:")
print(type(bs_duixiang.span))
【Terminal output】
Parsing produces a BeautifulSoup object:
<class 'bs4.BeautifulSoup'>

Extracting the span node:
<span>Mainland China, Hong Kong (China)</span>

The extracted node is a Tag object:
<class 'bs4.element.Tag'>
The html_str string above contains 2 span tags.
After the code runs, only the span holding the region text is extracted,
that is, the first of the 2 span tags.
This is because when the HTML contains several nodes with the same name, the node selector extracts only the first one.
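The first-match behaviour described above can be checked directly. The sketch below compares the node selector with BeautifulSoup's find() method, which also returns the first match, and uses find_all() to collect every span. These two methods are not covered in this article and are shown only for comparison; the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

html_str = """
<span>Mainland China, Hong Kong (China)</span>
<span>171 minutes</span>
"""
bs_duixiang = BeautifulSoup(html_str, "html.parser")

first_span = bs_duixiang.span             # node selector: first match only
same_span = bs_duixiang.find("span")      # find() also returns the first match
all_spans = bs_duixiang.find_all("span")  # find_all() returns every match

print(first_span == same_span)   # True
print(len(all_spans))            # 2
for span in all_spans:
    print(span)
```

When every matching node is needed rather than just the first, find_all() is the method to reach for; the node selector is best kept for tags expected to appear once.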
13.3 Summary
