14 bs object: using .name, .attrs, and .string to get a node's name, attributes, and content

2022-06-25 07:23:00 【Andy Python learning notes】

14.1 Methods for extracting a node's name, attributes, and content

tag [tæɡ]: tag (label).
attr: attribute.
string [strɪŋ]: string.

1. Get node name

Syntax: bs_object.node_name.name
Return type: string (str)

from bs4 import BeautifulSoup
html_str = """<p align="center"><strong> It should be green, fat, red and thin .</strong></p>"""
bs_duixiang = BeautifulSoup(html_str,"lxml")

# Get the name of the p node
print(bs_duixiang.p.name)
print(type(bs_duixiang.p.name))

【Terminal output】

p
<class 'str'>

After running the code, the output p is the node name, and its data type is string (str).
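
To see that .name works the same way on any node, here is a minimal sketch. It assumes the standard-library "html.parser" in place of lxml (so nothing extra needs to be installed), and the variable names p_name, strong_name, and doc_name are only illustrative.

```python
from bs4 import BeautifulSoup

html_str = '<p align="center"><strong>Example text</strong></p>'
# Assumption: "html.parser" is swapped in for "lxml" to avoid an extra dependency
soup = BeautifulSoup(html_str, "html.parser")

p_name = soup.p.name              # name of the first <p> node
strong_name = soup.p.strong.name  # name of the <strong> node nested inside <p>
doc_name = soup.name              # the BeautifulSoup object itself also has a name

print(p_name)       # p
print(strong_name)  # strong
print(doc_name)     # [document]
```

The BeautifulSoup object reports the special name [document], which is a quick way to tell it apart from an ordinary tag.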

2. Get node attributes

Syntax: bs_object.node_name.attrs
Return type: dictionary (dict)

from bs4 import BeautifulSoup
html_str = """<p align="center"><strong> It should be green, fat, red and thin .</strong></p>"""
bs_duixiang = BeautifulSoup(html_str,"lxml")

# Get the attributes of the p node
print(bs_duixiang.p.attrs)
print(type(bs_duixiang.p.attrs))

【Terminal output】

{'align': 'center'}
<class 'dict'>

After running the code, the output {'align': 'center'} is the node's attributes, and its data type is dictionary (dict).

align [əˈlaɪn]: alignment.
center [ˈsentə]: centered.

align is the attribute name.
center is the attribute value.
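
Since .attrs is an ordinary dictionary, a single attribute can also be read directly from the tag. The sketch below is only an illustration: it reuses the a tag from the practice HTML, assumes "html.parser" instead of lxml, and the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

html_str = '<a href="https://www.diyifanwen.com/m" target="_blank" class="print-link">link</a>'
# Assumption: "html.parser" is swapped in for "lxml" to avoid an extra dependency
soup = BeautifulSoup(html_str, "html.parser")
a_tag = soup.a

href = a_tag["href"]       # dictionary-style access to one attribute value
missing = a_tag.get("id")  # .get() returns None when the attribute is absent
classes = a_tag["class"]   # class can hold several values, so it comes back as a list

print(href)     # https://www.diyifanwen.com/m
print(missing)  # None
print(classes)  # ['print-link']
```

Using a_tag["id"] here would raise a KeyError, so .get() is the safer choice when an attribute may be missing.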

3. Get node content

Syntax: bs_object.node_name.string
Return type: NavigableString object (bs4.element.NavigableString)

from bs4 import BeautifulSoup
html_str = """<p align="center"><strong> It should be green, fat, red and thin .</strong></p>"""
bs_duixiang = BeautifulSoup(html_str,"lxml")

# Get the content of the p node
print(bs_duixiang.p.string)
print(type(bs_duixiang.p.string))

【Terminal output】

 It should be green, fat, red and thin .
<class 'bs4.element.NavigableString'>
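
One detail worth knowing: .string only returns text when the node has exactly one child (it follows a single nested child, as with the strong inside p above). With several children it returns None. The sketch below illustrates this under the assumption that "html.parser" is used instead of lxml; the HTML snippets and variable names are invented for the example.

```python
from bs4 import BeautifulSoup

# Assumption: "html.parser" is swapped in for "lxml" to avoid an extra dependency
single = BeautifulSoup("<p><strong>only child</strong></p>", "html.parser")
mixed = BeautifulSoup("<p><strong>first</strong><em>second</em></p>", "html.parser")

single_text = single.p.string  # one nested child, so the text is returned
mixed_text = mixed.p.string    # two children, so .string is None
joined = mixed.p.get_text()    # get_text() concatenates all text inside the node
is_str = isinstance(single_text, str)  # NavigableString is a subclass of str

print(single_text)  # only child
print(mixed_text)   # None
print(joined)       # firstsecond
print(is_str)       # True
```

When a node may contain more than one child, get_text() is usually the more forgiving way to read its content.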

14.2 Practice

# Declare a string variable that stores a fragment of HTML code
html_str = """ <div id="ArtContent"> <h1> Appreciation of classical poems by Li Qingzhao ——《 Like a dream 》</h1> </div> <p align="center"><strong> Last night, it was windy ,</strong></p> <p align="center"><strong> Deep sleep does not eliminate the wine ,</strong></p> <p align="center"><strong> Let's ask the roller shutter ,</strong></p> <p align="center"><strong> But the Begonia is still .</strong></p> <p align="center"><strong> To know whether ,</strong></p> <p align="center"><strong> To know whether ,</strong></p> <p align="center"><strong> It should be green, fat, red and thin .</strong></p> <a href="https://www.diyifanwen.com/m" target="_blank" class="print-link"> """
# Step 1: import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup

# Step 2: pass in the arguments and instantiate the BeautifulSoup class
# Argument 1 is the HTML string to be parsed
# Argument 2 is the parser (here we use the lxml parser)
# Instantiation returns a BeautifulSoup object
# type(bs_duixiang) is <class 'bs4.BeautifulSoup'>
bs_duixiang = BeautifulSoup(html_str, 'lxml')
print("After parsing, we get a BeautifulSoup object:")
print(type(bs_duixiang), '\n')

# Step 3: bs_object.tag_name gets a Tag object
print("The extracted node is a Tag object:")
print("By default, the first p node is extracted:")
print(bs_duixiang.p, '\n')

# Step 4: bs_object.node_name.name extracts the node's tag name
print("The name of the p node is:")
print(bs_duixiang.p.name, '\n')

# Step 5: bs_object.node_name.attrs extracts the node's attributes
print("The attributes of the p node are:")
print(bs_duixiang.p.attrs, '\n')

# Step 6: bs_object.node_name.string extracts the node's content
print("The content of the p node is:")
print(bs_duixiang.p.string, '\n')

print("The data type of name is:", type(bs_duixiang.p.name))
print("The data type of attrs is:", type(bs_duixiang.p.attrs))
print("The data type of string is:", type(bs_duixiang.p.string))

【Terminal output】

After parsing, we get a BeautifulSoup object:
<class 'bs4.BeautifulSoup'> 

The extracted node is a Tag object:
By default, the first p node is extracted:
<p align="center"><strong> Last night, it was windy ,</strong></p> 

The name of the p node is:
p 

The attributes of the p node are:
{'align': 'center'} 

The content of the p node is:
 Last night, it was windy , 

The data type of name is: <class 'str'>
The data type of attrs is: <class 'dict'>
The data type of string is: <class 'bs4.element.NavigableString'>
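
The practice above only reaches the first p node. As a possible next step, the sketch below reads the same three properties from every p node using find_all. It is an illustration only: the shortened HTML, the "html.parser" choice (instead of lxml), and the rows variable are all assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical HTML; the third <p> has no attributes on purpose
html_str = """
<p align="center"><strong>line one</strong></p>
<p align="center"><strong>line two</strong></p>
<p><strong>line three</strong></p>
"""
# Assumption: "html.parser" is swapped in for "lxml" to avoid an extra dependency
soup = BeautifulSoup(html_str, "html.parser")

# Collect (name, attrs, string) for every <p> node, not just the first one
rows = [(p.name, p.attrs, p.string) for p in soup.find_all("p")]

for name, attrs, text in rows:
    print(name, attrs, text)
```

A node without attributes simply yields an empty dictionary, so the same loop handles every p node.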