Python crawler code for scraping a novel
Posted: 2023-03-27 10:01:54
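You can scrape a novel with Python's requests and BeautifulSoup libraries. The code below is one way to do it; the URL, tag names, and class names are placeholders that you need to adapt to the target site's actual page structure: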
import requests
from bs4 import BeautifulSoup
# Set request headers to mimic a browser
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
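# Send the request and fetch the novel page
url = 'https://www.xxxxx.com/xxxxx.html'
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on HTTP errors such as 404
response.encoding = 'utf-8'  # change this if the site uses another encoding, e.g. 'gbk'
html = response.text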
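# Parse the page and extract each chapter's title and text
soup = BeautifulSoup(html, 'html.parser')
chapter_list = soup.find_all('div', class_='chapter')
# Append every chapter to the output file
with open('novel.txt', 'a', encoding='utf-8') as f:
    for chapter in chapter_list:
        title_tag = chapter.find('h3')
        content_tag = chapter.find('div', class_='content')
        # Skip blocks that lack a title or a body instead of crashing on None
        if title_tag is None or content_tag is None:
            continue
        chapter_title = title_tag.get_text(strip=True)
        chapter_content = content_tag.get_text()
        f.write(chapter_title + '\n')
        f.write(chapter_content + '\n\n')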
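
Text extracted from novel pages often contains non-breaking spaces, full-width spaces, and stray blank lines that the site uses for paragraph indentation. Below is a minimal sketch of a cleanup step you could apply to the chapter text before writing it to the file. The function name clean_chapter_text is hypothetical and not part of requests or BeautifulSoup; it uses only the standard library and assumes one paragraph per line is the layout you want.

```python
import re


def clean_chapter_text(raw: str) -> str:
    """Normalize text pulled from a chapter element (hypothetical helper)."""
    # Non-breaking (\xa0) and full-width (\u3000) spaces are commonly used as indentation
    text = raw.replace('\xa0', ' ').replace('\u3000', ' ')
    # Collapse runs of spaces and tabs inside each line, then trim the line
    lines = (re.sub(r'[ \t]+', ' ', line).strip() for line in text.splitlines())
    # Drop blank lines so each paragraph ends up on its own line
    return '\n'.join(line for line in lines if line)


if __name__ == '__main__':
    sample = '\xa0\xa0\xa0\xa0First paragraph.\r\n\r\n\u3000\u3000Second   paragraph.\n'
    print(clean_chapter_text(sample))
```

To use it with the code above, you could write clean_chapter_text(chapter_content) to the file instead of the raw chapter_content.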