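Python tag objects with multiple tags and attributes: getting an object's (tag's) name, attributes, content, and comments with the Python scraping library BeautifulSoup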
Date: 2024-03-25 14:39:28
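You can use the `find()` or `find_all()` methods of the BeautifulSoup library to get tag objects. On a tag object, `.name` returns the tag name, `.attrs` returns all attributes as a dictionary, and `.string` returns the text content (it is `None` when the tag has more than one child). There is no `.comment` attribute: a comment is a `Comment` object, a subclass of `NavigableString`, and can be found by passing a function to the `string` argument of `find()` or `find_all()`.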
Here is an example:
```python
from bs4 import BeautifulSoup, Comment
html = """
<html>
<head>
<title>BeautifulSoup Demo</title>
</head>
<body>
<p class="intro">Welcome to BeautifulSoup tutorial</p>
<div id="content">
<h1>Python</h1>
<p>Python is a programming language.</p>
<p>It is widely used in data science and web development.</p>
</div>
<!-- This is a comment -->
</body>
</html>
"""
soup = BeautifulSoup(html, 'html.parser')
# Get all <p> tags
p_tags = soup.find_all('p')
for p in p_tags:
    print('Tag:', p.name)
    print('Attributes:', p.attrs)
    print('Content:', p.string)
# Get the comment: match the first string node that is a Comment instance
comment = soup.find(string=lambda text: isinstance(text, Comment))
# strip() removes the spaces that surround the text inside <!-- ... -->
print('Comment:', comment.strip())
```
The output is:
```
Tag: p
Attributes: {'class': ['intro']}
Content: Welcome to BeautifulSoup tutorial
Tag: p
Attributes: {}
Content: Python is a programming language.
Tag: p
Attributes: {}
Content: It is widely used in data science and web development.
Comment: This is a comment
```