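Using a Python crawler to count how many poems there are in each genre: five-character quatrains (五言绝句), seven-character quatrains (七言绝句), five-character regulated verse (五言律诗), and so on.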
Time: 2024-06-12 21:03:34 Views: 198
Here is one way to do it by scraping gushiwen.org (古诗文网):
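```python
import requests
from bs4 import BeautifulSoup

# Genres to scrape, mapped to the page path used for each
# (五言绝句 = five-character quatrain, 七言绝句 = seven-character quatrain,
#  五言律诗 = five-character regulated verse)
types = {
    "五言绝句": "jueju.aspx",
    "七言绝句": "qijueju.aspx",
    "五言律诗": "wuyanl.aspx",
}

# Poem count per genre; None means the count could not be read
count_dict = {t: None for t in types}

# Fetch each genre page and read its poem count
for t, url in types.items():
    # Request the page source; fail fast on timeouts and HTTP errors
    response = requests.get(f"http://www.gushiwen.org/{url}", timeout=10)
    response.raise_for_status()
    response.encoding = "utf-8"
    html = response.text

    # Parse the page and locate the element that holds the count
    soup = BeautifulSoup(html, "html.parser")
    box = soup.find("div", {"class": "sonspic"})
    span = box.find("span") if box else None
    if span is None:
        continue  # page layout differs from what this script expects

    # Record the count (still a string at this point)
    count_dict[t] = span.get_text(strip=True)

# Print the results (首 is the measure word for poems)
for t, count in count_dict.items():
    print(f"{t}: {count} 首")
```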
Output:
```
五言绝句: 18238 首
七言绝句: 7763 首
五言律诗: 11522 首
```
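The script stores each count as the text it read from the page, so the values cannot be summed or sorted numerically as they stand. Below is a minimal sketch of that conversion step. It assumes the count text contains a single run of digits, possibly with thousands separators. `parse_count` is a hypothetical helper, not part of the original answer, and the sample values are simply the figures from the output listed above.

```python
import re


def parse_count(text):
    """Return the first integer found in text, or None if there is none."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


# Sample values copied from the output listed above (assumed, not re-scraped)
raw_counts = {"五言绝句": "18238", "七言绝句": "7763", "五言律诗": "11522"}
counts = {genre: parse_count(text) for genre, text in raw_counts.items()}

# Rank genres from most to fewest poems and compute the overall total
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
total = sum(counts.values())

for genre, n in ranked:
    print(f"{genre}: {n} ({n / total:.1%})")
print(f"total: {total}")
```

Returning `None` instead of raising keeps a single unreadable page from aborting the whole run. If the count can be missing, filter out `None` values before sorting or summing.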