Writing a CSDN crawler in Python

Time: 2024-05-03 21:23:12 Views: 180

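Below is a basic CSDN crawler example that uses Python with the requests and BeautifulSoup libraries:

```python
import requests
from bs4 import BeautifulSoup

# Set the request headers so the request resembles one sent by a browser
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}

# Set the link of the page to crawl
url = 'https://blog.csdn.net/'

# Send the request; stop early on a timeout or an HTTP error status
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()

# Parse the page
soup = BeautifulSoup(response.content, 'html.parser')

# Get the article list
articles = soup.find_all('div', class_='article-item-box csdn-tracking-statistics')

# Iterate over the article list and print each title and link
for article in articles:
    anchor = article.h4.a if article.h4 else None
    if anchor is None:
        continue  # skip containers without an <h4><a> title link
    title = anchor.text.strip()
    link = anchor['href']
    print(title)
    print(link)
```

This crawler requests the CSDN blog home page and prints the title and link of every article container it finds. The selector depends on the class names in CSDN's page markup, so if no element matches, the loop prints nothing. You can modify the code as needed: for example, change the URL to crawl articles in a specific category, or add further functionality to collect more information.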
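One possible way to make the crawler easier to extend is to separate parsing from fetching, so the extraction logic can be tried on saved HTML without sending a request. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function name `extract_articles`, the constant `ARTICLE_CLASS`, and the sample markup are hypothetical and introduced only for illustration, and the container class is simply the one used in the selector above.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: the same container class as the selector used above.
ARTICLE_CLASS = 'article-item-box csdn-tracking-statistics'


def extract_articles(html, base_url='https://blog.csdn.net/'):
    """Return a list of (title, absolute_link) tuples found in the HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for box in soup.find_all('div', class_=ARTICLE_CLASS):
        anchor = box.select_one('h4 a')
        if anchor is None or not anchor.get('href'):
            continue  # skip containers without a usable title link
        # Collapse runs of whitespace inside the title text
        title = ' '.join(anchor.get_text().split())
        # Resolve relative links against the base URL
        results.append((title, urljoin(base_url, anchor['href'])))
    return results


# Hypothetical markup for illustration only; real pages may differ.
sample_html = '''
<div class="article-item-box csdn-tracking-statistics">
  <h4><a href="/demo/article/details/1"> First   post </a></h4>
</div>
<div class="article-item-box csdn-tracking-statistics"><h4>No link here</h4></div>
'''

articles = extract_articles(sample_html)
for title, link in articles:
    print(title)
    print(link)
```

With a live page, the same function could be called as `extract_articles(response.text, url)` after the request step. Keeping the parser free of network calls also makes it simple to adjust the selector when the page markup changes.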
