Code to scrape the text of the poem 《渔家傲·塞下秋来风景异》 from the page https://www.shicimingju.com/shicimark/songcisanbaishou.html
Time: 2024-09-30 20:10:03 Views: 35
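Scraping a web page usually takes a library that fetches the page (such as Python's `requests` or `selenium`) together with an HTML parsing library (such as BeautifulSoup or lxml). The following is a simple Python example that uses `requests` and `beautifulsoup4` to fetch the poem text from the given URL: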
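```python
import requests
from bs4 import BeautifulSoup


def get_poem_content(url):
    # Send an HTTP GET request (the timeout keeps the call from hanging indefinitely)
    response = requests.get(url, timeout=10)
    # Check whether the request succeeded
    if response.status_code == 200:
        # Parse the HTML content
        soup = BeautifulSoup(response.text, 'html.parser')
        # Find the element that holds the poem (this depends on the site's structure).
        # The class and id below are assumptions; inspect the page and adjust them.
        poem_text = soup.find('div', class_='poem-content') or soup.find('p', id='song-text')
        # Extract and return the poem text
        if poem_text:
            return poem_text.get_text()
        else:
            print("Poem content not found; check the HTML structure or whether the site has changed.")
    else:
        print(f"Request failed, status code: {response.status_code}")
    # Reached only when the request failed or no matching element was found
    return None


url = "https://www.shicimingju.com/shicimark/songcisanbaishou.html"
poem = get_poem_content(url)
if poem is not None:
    print(poem)
```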
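The URL in the question appears to be the index page of the anthology rather than the page of a single poem, so one plausible extra step is to find the link to 《渔家傲·塞下秋来风景异》 first and then fetch that page. The sketch below illustrates the idea with the standard library's `html.parser` instead of BeautifulSoup, so it runs without extra installs. The names `LinkCollector`, `find_poem_url`, and `SAMPLE_INDEX` are hypothetical, and the sample markup is made up for illustration; the real page's structure and link paths will differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._chunks = []    # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._chunks = []

    def handle_data(self, data):
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._chunks).strip()))
            self._href = None


def find_poem_url(index_html, base_url, keyword):
    """Return the absolute URL of the first link whose text contains keyword, else None."""
    collector = LinkCollector()
    collector.feed(index_html)
    for href, text in collector.links:
        if keyword in text:
            return urljoin(base_url, href)
    return None


# Made-up markup for illustration only; the paths are not the site's real ones.
SAMPLE_INDEX = """
<ul>
  <li><a href="/example/poem-1.html">苏幕遮·碧云天</a></li>
  <li><a href="/example/poem-2.html">渔家傲·塞下秋来风景异</a></li>
</ul>
"""

BASE_URL = "https://www.shicimingju.com/shicimark/songcisanbaishou.html"
print(find_poem_url(SAMPLE_INDEX, BASE_URL, "塞下秋来风景异"))
```

Matching on the first line of the poem (`塞下秋来风景异`) rather than the full title sidesteps differences in how the separator between the tune name and the first line is written (`.` versus `·`). The returned URL could then be passed to a function such as `get_poem_content`, once its selectors have been adjusted to the real page.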