13 `bs_duixiang.tag`: getting a Tag object
2022-06-24 00:50:00 【Andy Python】
13.1 Extracting data with the BeautifulSoup class
What a selector does: it operates on a BeautifulSoup object to find and locate elements and extract their data.

13.2 Node selector
1. What is a node
<h1>First-level heading</h1>
<h2>Second-level heading</h2>
<h3>Third-level heading</h3>
<p>I am a paragraph</p>
The code above is a snippet of HTML. h denotes a heading tag. There are six heading tags, h1 through h6; the larger the number, the smaller the font. p denotes a paragraph tag, which marks off a paragraph and occupies its own line on the page.
In HTML, h and p are called HTML tags.
In Python, we call h and p node tags, that is, tags.
【Note】 This wording is meant to make nodes easier for beginners to understand. It differs from many official textbooks and is for reference only.
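To make the "tag = node" idea concrete, here is a minimal sketch that parses the four-line snippet above and lists the name of every node. The variable names `soup` and `tag_names` are chosen for illustration only, and `html.parser` is used so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

html_str = """
<h1>First-level heading</h1>
<h2>Second-level heading</h2>
<h3>Third-level heading</h3>
<p>I am a paragraph</p>
"""

soup = BeautifulSoup(html_str, "html.parser")

# find_all(True) matches every tag in the document, in document order
tag_names = [tag.name for tag in soup.find_all(True)]
print(tag_names)
```

Each entry in the list is the name of one node, which is the same name used after the dot in the node selector.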
2. Extracting a node
Syntax: bs_object.tag_name
tag_name is the name of the node.
Return value: a node (Tag) object.
Extracting the h2 node with the html.parser parser
html_str = """
<h2>Farewell My Concubine</h2>
<span>Mainland China, Hong Kong (China)</span>
<span>171 minutes</span>
<h3>Synopsis</h3>
<a href="https://maoyan.com/"></a>
"""
# Step 1: import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Step 2: pass in the arguments and instantiate the BeautifulSoup class
# Argument 1 is the HTML string to parse
# Argument 2 is the parser (here, html.parser)
# Instantiation returns a BeautifulSoup object,
# so type(bs_duixiang) is <class 'bs4.BeautifulSoup'>
bs_duixiang = BeautifulSoup(html_str, 'html.parser')
print("After parsing, we get a BeautifulSoup object:")
print(type(bs_duixiang), '\n')
# bs_object.tag_name extracts a node
print("Extracting the h2 node:")
print(bs_duixiang.h2, '\n')
print("The extracted node is a Tag object:")
print(type(bs_duixiang.h2))
【Terminal output】
After parsing, we get a BeautifulSoup object:
<class 'bs4.BeautifulSoup'>

Extracting the h2 node:
<h2>Farewell My Concubine</h2>

The extracted node is a Tag object:
<class 'bs4.element.Tag'>
Running the code prints the h2 node.
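Once a node has been extracted, the data inside it can be read from the Tag object. The following is a minimal sketch, not part of the original example: it uses the standard `name`, `get_text()` and attribute-lookup features of a Tag, and the variable names (`soup`, `h2_tag`, `missing` and so on) are illustrative. It also shows that asking for a tag that does not exist returns `None` rather than raising an error.

```python
from bs4 import BeautifulSoup

html_str = """
<h2>Farewell My Concubine</h2>
<span>171 minutes</span>
<a href="https://maoyan.com/"></a>
"""

soup = BeautifulSoup(html_str, "html.parser")

h2_tag = soup.h2
h2_name = h2_tag.name        # the node name
h2_text = h2_tag.get_text()  # the text inside the node
link_href = soup.a["href"]   # an attribute value, read like a dict key
missing = soup.h4            # there is no <h4> in html_str

print(h2_name, h2_text, link_href, missing)
```

Because a missing tag yields `None`, it is worth checking the result before reading text or attributes from it.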
Extracting the span node with the lxml parser
html_str = """
<h2>Farewell My Concubine</h2>
<span>Mainland China, Hong Kong (China)</span>
<span>171 minutes</span>
<h3>Synopsis</h3>
<a href="https://maoyan.com/"></a>
"""
# Step 1: import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup
# Step 2: pass in the arguments and instantiate the BeautifulSoup class
# Argument 1 is the HTML string to parse
# Argument 2 is the parser (here, lxml, which must be installed separately)
# Instantiation returns a BeautifulSoup object,
# so type(bs_duixiang) is <class 'bs4.BeautifulSoup'>
bs_duixiang = BeautifulSoup(html_str, 'lxml')
print("After parsing, we get a BeautifulSoup object:")
print(type(bs_duixiang), '\n')
# bs_object.tag_name extracts a node
print("Extracting the span node:")
print(bs_duixiang.span, '\n')
print("The extracted node is a Tag object:")
print(type(bs_duixiang.span))
【Terminal output】
After parsing, we get a BeautifulSoup object:
<class 'bs4.BeautifulSoup'>

Extracting the span node:
<span>Mainland China, Hong Kong (China)</span>

The extracted node is a Tag object:
<class 'bs4.element.Tag'>
The html_str string above contains two <span> tags.
After the code runs, only one of them is extracted: the <span> that holds the region,
that is, the first of the two <span> tags.
This is because when the HTML contains several nodes with the same name, the node selector extracts only the first one.
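If every matching node is needed rather than just the first, BeautifulSoup's `find_all()` method returns all of them. The sketch below is an addition to the original tutorial; the variable names are illustrative, and `html.parser` is used only so that the example runs without lxml installed.

```python
from bs4 import BeautifulSoup

html_str = """
<span>Mainland China, Hong Kong (China)</span>
<span>171 minutes</span>
"""

soup = BeautifulSoup(html_str, "html.parser")

# The node selector returns only the first <span>
first_span = soup.span

# find_all() returns every <span>, in document order
all_spans = soup.find_all("span")
span_texts = [span.get_text() for span in all_spans]

print(first_span)
print(span_texts)
```

Writing `soup.span` gives the same result as `soup.find("span")`, so the node selector is best suited to tags that appear once or whose first occurrence is the one wanted.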
13.3 Summary
