Scrape all monthly cumulative nationwide epidemic data from https://news.sina.cn/zt_d/yiqing0121 and output it correctly

Time: 2023-12-23 15:02:49

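Sure. I'll use Python's requests and BeautifulSoup libraries to scrape the data, and pandas to process and output it. First, install requests, BeautifulSoup and pandas with the following commands:

```
pip install requests
pip install beautifulsoup4
pip install pandas
```

Then we can write the code:

```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "https://news.sina.cn/zt_d/yiqing0121"

# Send the request and fetch the page content
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error
html = response.content.decode("utf-8")

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")

# Find the cumulative-data table for each month
tables = soup.find_all("table")

# Walk through every table and collect its rows
data = []
for table in tables:
    # The month comes from the <h2> heading before the table, if there is one
    heading = table.find_previous_sibling("h2")
    month = heading.text.strip() if heading else ""
    # Read the data rows, skipping the header row
    rows = table.find_all("tr")
    for row in rows[1:]:
        cells = row.find_all("td")
        if len(cells) < 4:
            continue  # skip rows that lack the four expected columns
        province = cells[0].text.strip()
        confirmed = cells[1].text.strip()
        cured = cells[2].text.strip()
        dead = cells[3].text.strip()
        # Append the record to the list
        data.append([month, province, confirmed, cured, dead])

# Convert the data to a DataFrame and write it to a CSV file
# (columns: month, province, cumulative confirmed, cured, dead)
df = pd.DataFrame(data, columns=["月份", "省份", "累计确诊", "治愈", "死亡"])
df.to_csv("全国疫情每月份累计数据.csv", index=False)
print("Data written to the CSV file successfully!")
```

This code sends a request to fetch the page, parses it with BeautifulSoup, and looks for the cumulative-data table of each month. It then walks through each table and appends every data row to a list. Finally, it converts the list to a DataFrame and writes it to a CSV file. Note that the code assumes the page serves its data as static HTML tables, each preceded by an `<h2>` heading that names the month; if the live page is laid out differently, the resulting CSV will contain no rows.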
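To check the parsing logic without depending on the live page, the same steps can be sketched against a small inline HTML sample. This is only an illustration under stated assumptions: the `<h2>`-then-`<table>` layout, the helper name `parse_monthly_tables`, the English column names, and all values in `SAMPLE_HTML` are made up for the example and are not taken from the real Sina page.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical sample: placeholder provinces and numbers, NOT real data.
SAMPLE_HTML = """
<h2>2020-01</h2>
<table>
  <tr><th>Province</th><th>Confirmed</th><th>Cured</th><th>Dead</th></tr>
  <tr><td>Province A</td><td>10</td><td>8</td><td>1</td></tr>
  <tr><td>Province B</td><td>5</td><td>5</td><td>0</td></tr>
</table>
<h2>2020-02</h2>
<table>
  <tr><th>Province</th><th>Confirmed</th><th>Cured</th><th>Dead</th></tr>
  <tr><td>Province A</td><td>20</td><td>15</td><td>2</td></tr>
  <tr><td>incomplete row</td></tr>
</table>
"""

COLUMNS = ["month", "province", "confirmed", "cured", "dead"]


def parse_monthly_tables(html):
    """Collect one record per data row from every <table> in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for table in soup.find_all("table"):
        # Assumed layout: the nearest preceding <h2> sibling names the month.
        heading = table.find_previous_sibling("h2")
        month = heading.get_text(strip=True) if heading else "unknown"
        for row in table.find_all("tr"):
            cells = [td.get_text(strip=True) for td in row.find_all("td")]
            # Header rows contain only <th> cells and malformed rows are
            # short, so both are skipped by this length check.
            if len(cells) < 4:
                continue
            records.append([month] + cells[:4])
    df = pd.DataFrame(records, columns=COLUMNS)
    # Convert the count columns from text to numbers; bad values become NaN.
    for col in ["confirmed", "cured", "dead"]:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    return df


df = parse_monthly_tables(SAMPLE_HTML)
print(df)
```

Keeping the parsing in a function that takes an HTML string makes it possible to test the row handling separately from the network request. If the function returns an empty DataFrame for the HTML downloaded from the real URL, the layout assumption does not hold for that page and the selectors need to be adjusted.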
