
Write a CSDN crawler program in Python

Time: 2024-04-29 15:24:13 Views: 181

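The following is a simple example program that scrapes a CSDN blog post:

```python
import requests
from bs4 import BeautifulSoup

# Set request headers to mimic a browser visit
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}

# URL of the blog post to scrape (replace the placeholders with a real post)
url = "https://blog.csdn.net/xxxxxx/article/details/xxxxxx"

# Send the request; fail early on HTTP errors instead of parsing an error page
res = requests.get(url, headers=headers, timeout=10)
res.raise_for_status()
res.encoding = 'utf-8'
html = res.text

# Parse the HTML
soup = BeautifulSoup(html, 'html.parser')
title_tag = soup.find('title')
content_tag = soup.find('div', class_='markdown_views')

# find() returns None when nothing matches, so guard before reading the text
title = title_tag.get_text(strip=True) if title_tag else ''
content = content_tag.get_text() if content_tag else ''

# Print the results
print("Title:", title)
print("Content:", content)
```

The code above uses the requests and BeautifulSoup libraries to send the request and parse the HTML. Note that the request headers must be set to mimic a browser; otherwise CSDN may block the request. Because `find()` returns `None` when a selector matches nothing, the script falls back to an empty string rather than crashing if the page has no `markdown_views` container. Depending on what you want to scrape, change the selectors in the code to extract the corresponding information.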
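Before pointing a new selector at the live site, it can help to check offline which tag and class pair actually holds the text you want. The sketch below is one way to do that. It deliberately swaps BeautifulSoup for the standard library's `html.parser`, so it runs without extra packages. The names `ClassTextExtractor`, `extract_text`, and `SAMPLE_HTML` are hypothetical, and the sample markup (including the `theme-dark` and `article-header` classes) is made up for illustration rather than copied from a real CSDN page. It also assumes reasonably well-formed markup with closing tags.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside the first <tag> whose class list contains class_name."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.depth = 0          # > 0 while inside the matched element
        self.matched = False    # True once a matching element was seen
        self.done = False       # True once the matched element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1     # nested element inside the match
        elif tag == self.tag:
            classes = (dict(attrs).get("class") or "").split()
            if self.class_name in classes:
                self.matched = True
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_text(html, tag, class_name):
    """Return the stripped text of the first match, or None if nothing matches."""
    parser = ClassTextExtractor(tag, class_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip() if parser.matched else None


# Made-up sample page for offline experiments (not real CSDN markup).
SAMPLE_HTML = """
<html><head><title>Sample post</title></head>
<body>
  <div class="article-header"><h1>Sample post</h1></div>
  <div class="markdown_views theme-dark"><p>Hello <b>CSDN</b></p></div>
</body></html>
"""

if __name__ == "__main__":
    print(extract_text(SAMPLE_HTML, "div", "markdown_views"))
    print(extract_text(SAMPLE_HTML, "div", "no-such-class"))
```

Once a tag and class pair returns the expected text on a saved copy of the page, the same pair can be passed to `soup.find(tag, class_=...)` in the script above. A `None` result here corresponds to the case where `find()` would also return `None`.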
