14 BeautifulSoup object: get a node's name, attributes, and content with .name, .attrs, and .string

2022-06-25 07:23:00 Andy Python learning notes

14.1 How to extract a node's name, attributes, and content

tag [tæɡ]: a label (an HTML tag).
attr: short for attribute.
string [strɪŋ]: a character string.

1. Get the node name

Syntax: bs_object.node_name.name
Return type: string (str)

from bs4 import BeautifulSoup
html_str = """<p align="center"><strong>It should be green, fat, red and thin.</strong></p>"""
bs_duixiang = BeautifulSoup(html_str, "lxml")

# Get the name of the p node
print(bs_duixiang.p.name)
print(type(bs_duixiang.p.name))

【Terminal output】

p
<class 'str'>

After running the code, the output p is the node name, and its data type is string (str).
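The same attribute works on nested nodes, and it is worth guarding against nodes that do not exist. The following is only a minimal sketch: it assumes Python's built-in html.parser instead of lxml (to avoid an extra dependency), and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html_str = '<p align="center"><strong>It should be green, fat, red and thin.</strong></p>'
# Assumption: html.parser is used here instead of lxml so no extra package is needed
soup = BeautifulSoup(html_str, "html.parser")

p_name = soup.p.name              # name of the first p node
strong_name = soup.p.strong.name  # name of the strong node nested inside p
missing = soup.h1                 # there is no h1 node, so this is None

print(p_name)
print(strong_name)

# Reading .name on None would raise AttributeError, so check first
if missing is None:
    print("no h1 node found")
else:
    print(missing.name)
```

Accessing a tag name that is not in the document returns None rather than raising an error, which is why the check comes before reading .name.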

2. Get the node attributes

Syntax: bs_object.node_name.attrs
Return type: dictionary (dict)

from bs4 import BeautifulSoup
html_str = """<p align="center"><strong>It should be green, fat, red and thin.</strong></p>"""
bs_duixiang = BeautifulSoup(html_str, "lxml")

# Get the attributes of the p node
print(bs_duixiang.p.attrs)
print(type(bs_duixiang.p.attrs))

【Terminal output】

{'align': 'center'}
<class 'dict'>

After running the code, the output {'align': 'center'} holds the node's attributes, and its data type is dictionary (dict).

align [əˈlaɪn]: alignment.
center [ˈsentə]: centered.

align is the attribute name.
center is the attribute value.
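Because .attrs is a dictionary, a single attribute value can also be read by its name. The sketch below is illustrative only: the HTML fragment is a hypothetical combination of tags from this article, and html.parser is assumed instead of lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: a p node wrapping an a node, for illustration only
html_str = (
    '<p align="center">'
    '<a href="https://www.diyifanwen.com/m" target="_blank" class="print-link">link</a>'
    '</p>'
)
soup = BeautifulSoup(html_str, "html.parser")

align = soup.p["align"]      # read one attribute value by name
p_id = soup.p.get("id")      # .get() returns None when the attribute is absent
href = soup.a.attrs["href"]  # .attrs is an ordinary dict
css = soup.a["class"]        # class can hold several values, so it is a list

print(align)
print(p_id)
print(href)
print(css)
```

Using .get() avoids the KeyError that square-bracket access raises for a missing attribute.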

3. Get the node content

Syntax: bs_object.node_name.string
Return type: a NavigableString object.

from bs4 import BeautifulSoup
html_str = """<p align="center"><strong>It should be green, fat, red and thin.</strong></p>"""
bs_duixiang = BeautifulSoup(html_str, "lxml")

# Get the content of the p node
print(bs_duixiang.p.string)
print(type(bs_duixiang.p.string))

【Terminal output】

It should be green, fat, red and thin.
<class 'bs4.element.NavigableString'>
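One caveat: .string only returns text when the node has a single child (or a chain of single children, as with p and strong above). When a node mixes text and other tags, .string is None. The sketch below illustrates this with a hypothetical two-paragraph fragment, again assuming html.parser; get_text() is shown as one possible alternative.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: the ids "single" and "mixed" exist only for this example
html_str = (
    '<p id="single"><strong>green, fat, red and thin</strong></p>'
    '<p id="mixed">It should be <strong>green</strong></p>'
)
soup = BeautifulSoup(html_str, "html.parser")

single = soup.find(id="single")
mixed = soup.find(id="mixed")

single_string = single.string   # p has one child (strong), which has one string
mixed_string = mixed.string     # p has two children, so .string is None
mixed_text = mixed.get_text()   # get_text() joins all text inside the node
plain = str(single_string)      # convert NavigableString to a plain str

print(single_string)
print(mixed_string)
print(mixed_text)
print(type(plain))
```

Converting with str() is useful when the text needs to outlive the parsed document, since a NavigableString keeps a reference to the tree it came from.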

14.2 Practice

# Declare a string variable that stores a fragment of HTML code
html_str = """
<div id="ArtContent">
<h1>Appreciation of classical poems by Li Qingzhao: 《Like a Dream》</h1>
</div>
<p align="center"><strong>Last night, it was windy,</strong></p>
<p align="center"><strong>Deep sleep does not eliminate the wine,</strong></p>
<p align="center"><strong>Let's ask the roller shutter,</strong></p>
<p align="center"><strong>But the Begonia is still.</strong></p>
<p align="center"><strong>To know whether,</strong></p>
<p align="center"><strong>To know whether,</strong></p>
<p align="center"><strong>It should be green, fat, red and thin.</strong></p>
<a href="https://www.diyifanwen.com/m" target="_blank" class="print-link">
"""
# Step 1: import the BeautifulSoup class from the bs4 library
from bs4 import BeautifulSoup

# Step 2: pass in the arguments and instantiate the BeautifulSoup class
# Argument 1 is the HTML string to be parsed
# Argument 2 is the parser (here we use the lxml parser)
# Instantiation returns a BeautifulSoup object
# type(bs_duixiang) is <class 'bs4.BeautifulSoup'>
bs_duixiang = BeautifulSoup(html_str, 'lxml')
print("After parsing, we get a BeautifulSoup object:")
print(type(bs_duixiang), '\n')

# Step 3: bs_object.tag_name gets a Tag object
print("The extracted node is a Tag object:")
print("By default, the first p node is extracted:")
print(bs_duixiang.p, '\n')

# Step 4: bs_object.node_name.name extracts the node's tag name
print("The name of the p node is:")
print(bs_duixiang.p.name, '\n')

# Step 5: bs_object.node_name.attrs extracts the node's attributes
print("The attributes of the p node are:")
print(bs_duixiang.p.attrs, '\n')

# Step 6: bs_object.node_name.string extracts the node's content
print("The content of the p node is:")
print(bs_duixiang.p.string, '\n')

print("The data type of name is:", type(bs_duixiang.p.name))
print("The data type of attrs is:", type(bs_duixiang.p.attrs))
print("The data type of string is:", type(bs_duixiang.p.string))

【Terminal output】

After parsing, we get a BeautifulSoup object:
<class 'bs4.BeautifulSoup'> 

The extracted node is a Tag object:
By default, the first p node is extracted:
<p align="center"><strong>Last night, it was windy,</strong></p> 

The name of the p node is:
p 

The attributes of the p node are:
{'align': 'center'} 

The content of the p node is:
Last night, it was windy, 

The data type of name is: <class 'str'>
The data type of attrs is: <class 'dict'>
The data type of string is: <class 'bs4.element.NavigableString'>
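Since bs_object.p only reaches the first p node, the remaining lines of the poem need a different approach. One option, not covered in this chapter, is find_all(), which returns every matching node. The sketch below uses a shortened fragment of the practice HTML and assumes html.parser instead of lxml; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Shortened fragment of the practice HTML, for illustration only
html_str = """
<p align="center"><strong>Last night, it was windy,</strong></p>
<p align="center"><strong>Deep sleep does not eliminate the wine,</strong></p>
<p align="center"><strong>Let's ask the roller shutter,</strong></p>
"""
soup = BeautifulSoup(html_str, "html.parser")

# find_all("p") returns all p nodes, not just the first one
p_nodes = soup.find_all("p")

# Read name, attrs, and string from each node in turn
for p in p_nodes:
    print(p.name, p.attrs, p.string)

# Collect the content of every p node as plain strings
lines = [str(p.string) for p in p_nodes]
```

Each element returned by find_all() is a Tag object, so .name, .attrs, and .string behave exactly as they do on bs_object.p.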

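14.3 Summary

bs_object.node_name.name returns the node name as a string (str).
bs_object.node_name.attrs returns the node attributes as a dictionary (dict).
bs_object.node_name.string returns the node content as a NavigableString object.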
