
Code to scrape the text of the poem 《渔家傲.塞下秋来风景异》 from the web page https://www.shicimingju.com/shicimark/songcisanbaishou.html

Time: 2024-09-30 11:10:03 Views: 48

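Scraping web page content usually requires an HTTP client such as Python's `requests` (or a browser-automation tool such as `selenium` for pages rendered with JavaScript), together with an HTML parsing library such as BeautifulSoup or lxml. The following is a simple Python example that uses `requests` and `beautifulsoup4` to fetch the poem content from the given URL:

```python
import requests
from bs4 import BeautifulSoup


def get_poem_content(url):
    # Send an HTTP GET request; the timeout keeps the call from hanging indefinitely
    response = requests.get(url, timeout=10)
    # Check whether the request succeeded
    if response.status_code != 200:
        print(f"Request failed, status code: {response.status_code}")
        return None
    # Parse the HTML content
    soup = BeautifulSoup(response.text, 'html.parser')
    # Find the element that contains the poem. The class and id used here are
    # assumptions and depend on the site's actual HTML structure.
    poem_text = soup.find('div', class_='poem-content') or soup.find('p', id='song-text')
    if poem_text is None:
        print("Poem content not found; check the HTML structure or whether the site has changed.")
        return None
    # Extract and return the poem text
    return poem_text.get_text()


url = "https://www.shicimingju.com/shicimark/songcisanbaishou.html"
poem = get_poem_content(url)
print(poem)
```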
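The URL in the question appears to be an anthology index page rather than the page of the poem itself, so one extra step may be needed: find the link whose text contains the poem title, then fetch that page. The following is a minimal sketch of that lookup, assuming the index lists each poem as an `<a>` link. It uses only the standard library so it can run without a network connection; `LinkCollector`, `find_poem_url`, and the sample markup with its `/poems/...` paths are hypothetical names and data made up for illustration, not taken from the site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_poem_url(index_html, base_url, keyword):
    """Return the absolute URL of the first link whose text contains keyword, or None."""
    parser = LinkCollector()
    parser.feed(index_html)
    for href, text in parser.links:
        if keyword in text:
            # Resolve relative links against the index page URL
            return urljoin(base_url, href)
    return None


# Made-up sample markup for illustration only; the real page may differ.
sample_html = """
<ul>
  <li><a href="/poems/1.html">苏幕遮·碧云天</a></li>
  <li><a href="/poems/2.html">渔家傲·塞下秋来风景异</a></li>
</ul>
"""
index_url = "https://www.shicimingju.com/shicimark/songcisanbaishou.html"
print(find_poem_url(sample_html, index_url, "塞下秋来风景异"))
```

In real use, `index_html` would be the `response.text` of the index page, and the returned URL could then be passed to `get_poem_content`. Matching on part of the title ("塞下秋来风景异") avoids depending on whether the site writes the separator as "." or "·".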
