Scraping Qidian (起点中文网) information into a CSV file
Date: 2023-11-08 09:17:20 Views: 97
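Sure, here is a simple scraper that saves novel information from Qidian (起点中文网) to a CSV file.
Make sure Python is installed along with the third-party libraries requests and beautifulsoup4 (for example via `pip install requests beautifulsoup4`). The csv module is part of Python's standard library and needs no installation.
Here is the code: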
```python
import csv

import requests
from bs4 import BeautifulSoup


# Fetch the listing page and extract each novel's title, author, and link
def get_novels(url):
    novels = []
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop early on an HTTP error status
    soup = BeautifulSoup(response.text, "html.parser")
    # These CSS selectors depend on the page markup and may need updating
    items = soup.select(".all-img-list li")
    for item in items:
        title = item.select(".book-mid-info h4 a")[0].text.strip()
        author = item.select(".book-mid-info p")[0].text.strip()
        novel_url = item.select(".book-img-box a")[0]["href"]
        novels.append([title, author, novel_url])
    return novels


# Save the novel information to a CSV file
def save_to_csv(novels):
    with open("novels.csv", mode="w", encoding="utf-8", newline="") as file:
        writer = csv.writer(file)
        writer.writerow(["title", "author", "novel_url"])
        writer.writerows(novels)


if __name__ == "__main__":
    url = "https://www.qidian.com/all"
    novels = get_novels(url)
    save_to_csv(novels)
```
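The code first uses the requests library to fetch Qidian's listing page of all novels, then parses the page with beautifulsoup4 to extract each novel's title, author, and link, and finally saves everything to a CSV file.
To use it, copy the code above into a Python environment and run it. The results are written to novels.csv in the current working directory.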
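The save step can also be tried without any network access. The following is a minimal sketch using only the standard library, assuming the scraped `href` values may be relative or protocol-relative (an assumption about the page markup, not something verified here). The helper names `normalize_rows` and `rows_to_csv_text` and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from urllib.parse import urljoin

# Assumed base URL: the same listing page the scraper requests
BASE_URL = "https://www.qidian.com/all"


def normalize_rows(rows, base_url=BASE_URL):
    """Strip whitespace and resolve each link against base_url."""
    cleaned = []
    for title, author, link in rows:
        cleaned.append([title.strip(), author.strip(), urljoin(base_url, link.strip())])
    return cleaned


def rows_to_csv_text(rows):
    """Render the rows as CSV text with the same header the scraper writes."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["title", "author", "novel_url"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows standing in for scraped data
sample = [
    [" Example Novel ", "Example Author", "//book.qidian.com/info/1/"],
    ["Title, with comma", "Another Author", "/info/2/"],
]

csv_text = rows_to_csv_text(normalize_rows(sample))
print(csv_text)
```

Resolving links with `urljoin` leaves already-absolute URLs unchanged, so the step is harmless if the page turns out to emit full URLs. The csv module quotes fields that contain commas, which is why a title with a comma still round-trips as a single column.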