
Scraping Classical Chinese Poetry with Python

Time: 2024-11-12 11:17:03 Views: 51

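Scraping classical Chinese poetry with Python is a web-scraping task. It typically relies on libraries such as requests and BeautifulSoup, or on the more full-featured Scrapy framework. A simple workflow looks like this:

1. **Import the libraries**: Install the required packages first (`pip install requests beautifulsoup4`). `requests` sends HTTP requests and `BeautifulSoup` parses HTML.

   ```python
   import requests
   from bs4 import BeautifulSoup
   ```

2. **Send the request**: Send a GET request to the URL of the page that contains the poems.

   ```python
   url = "https://www.example.com/classic_poems"  # replace with the poetry site you want to scrape
   response = requests.get(url, timeout=10)  # time out rather than hang on a slow server
   response.raise_for_status()  # raise an exception on an HTTP error status
   ```

3. **Parse the HTML**: Use BeautifulSoup to parse the returned HTML and locate the parts that contain the poems.

   ```python
   soup = BeautifulSoup(response.text, 'html.parser')
   poem_elements = soup.find_all('div', class_='poem')  # adjust the selector to the page's actual structure
   ```

4. **Extract the poem text**: Loop over the poem elements and pull out each title and body. Note that `find` returns `None` when no matching tag exists, so the tag names and classes below must match the real page.

   ```python
   poems = []
   for poem in poem_elements:
       title = poem.find('h2', class_='title').text
       content = poem.find('p', class_='content').text
       poems.append({
           'title': title,
           'content': content
       })
   ```

5. **Store the data**: Save the scraped poems to a file or a database.

   ```python
   with open('poems.txt', 'w', encoding='utf-8') as f:
       for poem in poems:
           # read the fields from each dict; the loop variables from step 4 hold only the last poem
           f.write(f"{poem['title']} - {poem['content']}\n")
   ```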
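Step 5 mentions a database as an alternative to a text file. One possible way to do that is with Python's built-in `sqlite3` module, sketched below under the assumption that `poems` is a list of dicts with `title` and `content` keys, as built in step 4. The `save_poems` helper, the `poems.db` file name, and the table layout are hypothetical choices made for illustration, and the sample list stands in for real scraped results.

```python
import sqlite3


def save_poems(poems, db_path="poems.db"):
    """Store {'title', 'content'} dicts in a SQLite table and return the row count.

    Hypothetical helper: the table name and columns are illustrative only.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "CREATE TABLE IF NOT EXISTS poems ("
                "id INTEGER PRIMARY KEY, "
                "title TEXT NOT NULL, "
                "content TEXT NOT NULL, "
                "UNIQUE(title, content))"
            )
            # Named placeholders are filled from each dict; duplicates are skipped
            conn.executemany(
                "INSERT OR IGNORE INTO poems (title, content) "
                "VALUES (:title, :content)",
                poems,
            )
        return conn.execute("SELECT COUNT(*) FROM poems").fetchone()[0]
    finally:
        conn.close()


# Sample data standing in for the scraped results from step 4
sample = [
    {"title": "静夜思", "content": "床前明月光，疑是地上霜。"},
    {"title": "春晓", "content": "春眠不觉晓，处处闻啼鸟。"},
]

# ":memory:" keeps the demo from writing a file; pass a path to persist the data
print(save_poems(sample, ":memory:"))
```

Because of the `UNIQUE(title, content)` constraint combined with `INSERT OR IGNORE`, running the scraper again against the same database file should not create duplicate rows.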
