Scraping statistical yearbook data with a Python crawler

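Statistical yearbook data can be scraped with Python's Requests and BeautifulSoup libraries. First, find the URL of the page that holds the target data. Taking the 2019 China Statistical Yearbook as an example, the page URL is http://www.stats.gov.cn/tjsj/tjcbw/2019/indexch.htm. Then use Requests to send the request and fetch the page content:

```python
import requests

url = 'http://www.stats.gov.cn/tjsj/tjcbw/2019/indexch.htm'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on HTTP errors
response.encoding = response.apparent_encoding  # the page may not be UTF-8
content = response.text
```

Next, use BeautifulSoup to parse the page and extract the data you need. Taking the "全国居民人均可支配收入" (national per capita disposable income of residents) figure as an example, the HTML elements that hold the data look like this:

```html
<td width="23%" height="22" align="center" valign="bottom" bgcolor="#FFFFFF">
  <a href="indexch.htm#8">8</a>
</td>
<td width="23%" height="22" align="center" valign="bottom" bgcolor="#FFFFFF">
  全国居民人均可支配收入
</td>
<td width="23%" height="22" align="center" valign="bottom" bgcolor="#FFFFFF">
  2018
</td>
<td width="23%" height="22" align="center" valign="bottom" bgcolor="#FFFFFF">
  30359
</td>
```

The following code extracts that figure. It looks for rows with exactly four cells and matches the indicator name in the second cell:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(content, 'html.parser')
# Assumes the table sits inside <div class="nj_con">; fall back to the whole page if not.
data_table = soup.find('div', {'class': 'nj_con'}) or soup
for row in data_table.find_all('tr'):
    cols = row.find_all('td')
    if len(cols) == 4 and '全国居民人均可支配收入' in cols[1].text:
        year = cols[2].text.strip()
        value = cols[3].text.strip()
        print(f'National per capita disposable income in {year}: {value} yuan')
```

The complete code is as follows:

```python
import requests
from bs4 import BeautifulSoup

url = 'http://www.stats.gov.cn/tjsj/tjcbw/2019/indexch.htm'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on HTTP errors
response.encoding = response.apparent_encoding  # the page may not be UTF-8
content = response.text

soup = BeautifulSoup(content, 'html.parser')
# Assumes the table sits inside <div class="nj_con">; fall back to the whole page if not.
data_table = soup.find('div', {'class': 'nj_con'}) or soup
for row in data_table.find_all('tr'):
    cols = row.find_all('td')
    if len(cols) == 4 and '全国居民人均可支配收入' in cols[1].text:
        year = cols[2].text.strip()
        value = cols[3].text.strip()
        print(f'National per capita disposable income in {year}: {value} yuan')
```

If the page contains the row shown above, the output is:

```
National per capita disposable income in 2018: 30359 yuan
```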
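Because the live page layout may change, it can help to check the extraction logic offline before pointing it at the site. The sketch below is one way to do that. It wraps the row-matching step in a function and runs it against an inline sample modeled on the table cells quoted above. The function name `extract_indicator` and the `SAMPLE_HTML` wrapper markup (the `div`, `table`, and extra row) are assumptions made for illustration, not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the four cells quoted above, wrapped in a table so
# the parser sees complete rows. The real page structure may differ.
SAMPLE_HTML = """
<div class="nj_con">
  <table>
    <tr>
      <td><a href="indexch.htm#8">8</a></td>
      <td> 全国居民人均可支配收入 </td>
      <td> 2018 </td>
      <td> 30359 </td>
    </tr>
    <tr>
      <td>a row with too few cells is skipped</td>
    </tr>
  </table>
</div>
"""


def extract_indicator(html, indicator):
    """Return (year, value) pairs for 4-cell rows whose second cell contains `indicator`."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all("tr"):
        cols = row.find_all("td")
        if len(cols) == 4 and indicator in cols[1].get_text():
            year = cols[2].get_text(strip=True)
            value = cols[3].get_text(strip=True)
            results.append((year, value))
    return results


rows = extract_indicator(SAMPLE_HTML, "全国居民人均可支配收入")
print(rows)
```

Returning a list of tuples instead of printing inside the loop makes the result easy to assert on, and the same function can later be called with the text fetched by Requests.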
