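The dataset exercise2_employee has five fields: 指标 (indicator, which holds the year), 就业人员(万人) (employed persons, in units of 10,000), 第一产业就业人员(万人) (primary industry), 第二产业就业人员(万人) (secondary industry), and 第三产业就业人员(万人) (tertiary industry). Based on this data, complete the following tasks. (1) Use pandas to read the employment data for the three industries; (2) draw a scatter plot of employment in each industry for 2000-2019 and briefly describe the distribution; (3) draw a line chart of employment in each industry for 2000-2019 and briefly describe the trend; (4) draw a stacked bar chart of employment in each industry for 2000-2019 with a brief analysis; (5) draw a pie chart of employment by industry for 2019; (6) draw a box plot of total employment across the industries for 2000-2019 and briefly describe its features.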

Time: 2024-01-06 12:04:36

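1. Read the employment data for the three industries with pandas

```python
import pandas as pd

# Change the encoding (for example to 'gbk') if the file is not UTF-8.
employee_df = pd.read_csv('exercise2_employee.csv', encoding='utf-8')
industry1 = employee_df['第一产业就业人员(万人)']
industry2 = employee_df['第二产业就业人员(万人)']
industry3 = employee_df['第三产业就业人员(万人)']
```

2. Scatter plot of employment in each industry, 2000-2019, with a brief look at the distribution

```python
import matplotlib.pyplot as plt

years = employee_df['指标']  # the '指标' column holds the year
plt.scatter(years, industry1, c='red', label='Primary industry')
plt.scatter(years, industry2, c='blue', label='Secondary industry')
plt.scatter(years, industry3, c='green', label='Tertiary industry')
plt.xlabel('Year')
plt.ylabel('Employed persons (10,000)')
plt.legend()
plt.show()
```

The scatter plot shows each industry's employment by year. Check any claim against the plotted points: in the official National Bureau of Statistics series, the primary industry is the largest of the three at the start of the period and shrinks steadily, while the tertiary industry grows fastest and becomes the largest in the early 2010s. The tertiary industry is therefore not the highest throughout.

3. Line chart of employment in each industry, 2000-2019, with a brief look at the trend

```python
plt.plot(years, industry1, c='red', label='Primary industry')
plt.plot(years, industry2, c='blue', label='Secondary industry')
plt.plot(years, industry3, c='green', label='Tertiary industry')
plt.xlabel('Year')
plt.ylabel('Employed persons (10,000)')
plt.legend()
plt.show()
```

The line chart makes the direction of each series easy to read. The three industries do not all rise: in the official series, primary-industry employment falls over the period, secondary-industry employment rises until the early 2010s and then eases, and tertiary-industry employment rises throughout and grows fastest. Confirm the turning points against the chart.

4. Stacked bar chart of employment in each industry, 2000-2019, with a brief analysis

```python
import numpy as np

ind = np.arange(len(years))
width = 0.5
p1 = plt.bar(ind, industry1, width)
p2 = plt.bar(ind, industry2, width, bottom=industry1)
# The third layer must sit on top of the first two combined.
p3 = plt.bar(ind, industry3, width, bottom=industry1 + industry2)
plt.xlabel('Year')
plt.ylabel('Employed persons (10,000)')
plt.xticks(ind, years, rotation=45)
plt.legend((p1[0], p2[0], p3[0]),
           ('Primary industry', 'Secondary industry', 'Tertiary industry'))
plt.show()
```

In the stacked bar chart the full height of each bar is the total across the three industries for that year, and the segments show how that total is split. The changing segment heights show the shift in composition over time: the tertiary segment grows while the primary segment shrinks.

5. Pie chart of employment by industry for 2019

```python
# Assumes the '指标' column stores years as integers; compare with '2019'
# (or the exact label used in the file) if they are stored as strings.
year_2019 = employee_df[employee_df['指标'] == 2019]
industry1_2019 = year_2019['第一产业就业人员(万人)'].values[0]
industry2_2019 = year_2019['第二产业就业人员(万人)'].values[0]
industry3_2019 = year_2019['第三产业就业人员(万人)'].values[0]
labels = ['Primary industry', 'Secondary industry', 'Tertiary industry']
sizes = [industry1_2019, industry2_2019, industry3_2019]
colors = ['red', 'blue', 'green']
plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=90)
plt.axis('equal')
plt.show()
```

The pie chart shows each industry's share of employment in 2019; the tertiary industry has the largest share.

6. Box plot of total employment across the industries, 2000-2019, with a brief look at its features

```python
# Total employment per year, summed across the three industries.
total_industry = industry1 + industry2 + industry3
# A single box summarizes the distribution of the yearly totals.
plt.boxplot(total_industry)
plt.xticks([1], ['2000-2019'])
plt.ylabel('Total employed persons (10,000)')
plt.show()
```

The box plot summarizes the yearly totals as one distribution: the line inside the box is the median, the box spans the interquartile range, and any points beyond the whiskers are outliers. Passing one value per year together with one label per year, as in the original call, does not produce a box for each year, because each year contributes only a single number. A box plot also carries no time order, so it cannot show a year-on-year trend or a changing spread; use the line chart for that.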
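The trend and share statements in the analysis can be checked numerically instead of being read off the charts. The following is a minimal sketch, assuming a DataFrame with the same column names as the dataset; `summarize_trends` is a hypothetical helper, and the numbers in `demo` are made-up placeholders, not values from exercise2_employee.

```python
import pandas as pd

COLS = ['第一产业就业人员(万人)', '第二产业就业人员(万人)', '第三产业就业人员(万人)']


def summarize_trends(df, year_col='指标', cols=COLS):
    """Hypothetical helper: net change, yearly shares (%), and largest sector per year."""
    data = df.sort_values(year_col).set_index(year_col)[cols]
    # Last year minus first year, per industry (negative means a decline).
    net_change = data.iloc[-1] - data.iloc[0]
    # Each industry's percentage of the three-industry total, per year.
    shares = data.div(data.sum(axis=1), axis=0) * 100
    # Column name of the largest industry in each year.
    largest = data.idxmax(axis=1)
    return net_change, shares, largest


# Placeholder numbers for illustration only (NOT from the real dataset).
demo = pd.DataFrame({
    '指标': [2000, 2001, 2002],
    COLS[0]: [300, 250, 200],
    COLS[1]: [100, 120, 140],
    COLS[2]: [200, 260, 330],
})

net_change, shares, largest = summarize_trends(demo)
print(net_change)
print(shares.round(1))
print(largest)
```

Running `summarize_trends(employee_df)` on the real data would show whether each industry rose or fell over the period and in which year, if any, the largest industry changes.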
