
Generate code to scrape CSDN data

Time: 2023-10-02 14:12:58 Views: 101

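Below is an example that uses Python with the requests and BeautifulSoup libraries to scrape data from CSDN:

```python
import requests
from bs4 import BeautifulSoup

url = "https://blog.csdn.net/"
# Send a browser-like User-Agent and set a timeout so the request cannot hang forever.
headers = {"User-Agent": "Mozilla/5.0"}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on HTTP errors such as 403 or 500
soup = BeautifulSoup(response.text, "html.parser")

def text_of(parent, name, class_name=None):
    """Return the stripped text of the first matching tag, or "" if it is missing."""
    tag = parent.find(name, class_=class_name) if class_name else parent.find(name)
    return tag.text.strip() if tag else ""

# These class names come from the original example and may not match the current page markup.
articles = soup.find_all("div", {"class": "article-item-box csdn-tracking-statistics"})
for article in articles:
    title = text_of(article, "h4")
    author = text_of(article, "a", "nickname")
    date = text_of(article, "span", "date")
    views = text_of(article, "span", "read-num")
    print(f"Title: {title}\nAuthor: {author}\nDate: {date}\nViews: {views}\n")
```

This code requests the CSDN blog home page and, for every article block that matches the selector, prints the title, author, date, and view count. If the page markup differs from these class names, `find_all` returns an empty list and nothing is printed, and a field missing from a block is printed as an empty string instead of raising an error. You can adapt the code to your needs, for example by changing the URL to scrape other pages or by using regular expressions to extract more information. Respect the site's terms of use and privacy policy, and do not scrape excessively.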
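The two closing suggestions, extracting extra details with regular expressions and not scraping excessively, can be sketched with the standard library alone. This is only an illustration under stated assumptions: `parse_count`, `build_robots`, and `polite_urls` are hypothetical helper names, the one-second default delay is an arbitrary choice, and the robots.txt rules are passed in as lines that you have already downloaded yourself.

```python
import re
import time
from urllib.robotparser import RobotFileParser


def parse_count(text):
    """Return the first integer found in text (commas ignored), or None."""
    match = re.search(r"\d[\d,]*", text)
    return int(match.group().replace(",", "")) if match else None


def build_robots(lines):
    """Build a robots.txt checker from already-downloaded rule lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def polite_urls(urls, robots, user_agent="*", delay=1.0):
    """Yield only the URLs that robots.txt permits, pausing between them."""
    first = True
    for url in urls:
        if not robots.can_fetch(user_agent, url):
            continue  # skip pages the site asks crawlers not to fetch
        if not first:
            time.sleep(delay)  # assumed fixed pause between requests
        first = False
        yield url
```

In a scraper, you would loop over `polite_urls(...)` and call `requests.get` inside the loop, then pass a scraped string such as the view-count text to `parse_count` to get a number you can sort or filter on.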
