Scraping Weather Data and Visualizing It
Date: 2023-12-06 21:38:46
The following steps scrape weather data and visualize it:
1. Scrape the weather data
```python
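import csv

import requests
from bs4 import BeautifulSoup

# Fetch the page (weather.com.cn forecast page for city code 101010100)
url = 'http://www.weather.com.cn/weather/101010100.shtml'
res = requests.get(url, timeout=10)
res.raise_for_status()
res.encoding = 'utf-8'
soup = BeautifulSoup(res.text, 'html.parser')

# City name, taken from the third breadcrumb link
city = soup.select('.crumbs a')[2].text.strip()


def first_text(node, selector):
    """Return the stripped text of the first match, or '' if nothing matches."""
    found = node.select_one(selector)
    return found.text.strip() if found else ''


# One entry per forecast day; these selectors depend on the page's markup
# and may need adjusting if the site layout differs
days = soup.select('.t .clearfix')
data = []
for day in days:
    date = first_text(day, '.day')
    # Drop the unit so the temperatures can be averaged later
    high_temp = first_text(day, '.tem span').replace('℃', '')
    low_temp = first_text(day, '.tem i').replace('℃', '')
    weather = first_text(day, '.wea')
    wind = first_text(day, '.win')
    data.append([date, high_temp, low_temp, weather, wind])

# Save the rows to a CSV file named after the city
with open(f'{city}.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    # Columns: date, high, low, weather, wind
    writer.writerow(['日期', '最高温', '最低温', '天气', '风向'])
    writer.writerows(data)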
```
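The scraped fields are plain strings, so they have to become numbers before any averaging or filtering. The sketch below shows one way to do that with the standard library only. The helper names `parse_temperature` and `parse_wind_level` are hypothetical, and the sample formats (`'-2℃'`, `'3-4级'`) are assumptions about what the page returns rather than something the original confirms.

```python
import re


def parse_temperature(text):
    """Return the first number in text as a float, or None if there is none."""
    match = re.search(r'-?\d+(?:\.\d+)?', text)
    return float(match.group()) if match else None


def parse_wind_level(text):
    """Return the highest wind level mentioned in text, or None if there is none."""
    levels = [int(n) for n in re.findall(r'\d+', text)]
    return max(levels) if levels else None


# Assumed sample formats, not values taken from the page
print(parse_temperature('-2℃'))   # -2.0
print(parse_wind_level('3-4级'))  # 4
```

Returning `None` for text without a number keeps missing values distinguishable from a real reading of 0.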
2. Plot the mean temperature trend as a line chart
```python
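import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV written in step 1
df = pd.read_csv(f'{city}.csv', encoding='utf-8')

# Temperatures may still carry a '℃' suffix, so convert them to numbers first
for col in ['最高温', '最低温']:  # high / low
    df[col] = pd.to_numeric(
        df[col].astype(str).str.replace('℃', '', regex=False), errors='coerce')

# Daily mean temperature
df['平均温度'] = (df['最高温'] + df['最低温']) / 2

# Line chart; the chart covers only the days present in the CSV.
# A CJK-capable font must be configured for the Chinese city name to render.
plt.plot(df['日期'], df['平均温度'], label=city)
plt.legend()
plt.xlabel('Date')
plt.ylabel('Mean temperature (°C)')
plt.title(f'{city} mean temperature trend')
plt.show()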
```
3. Count the days of each weather type and draw a bar chart
```python
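# Count the days for each weather type
weather_count = df['天气'].value_counts()

# Travel-suitability index: weighted count of cloudy (多云), sunny (晴),
# and overcast (阴) days. .get() yields 0 for a type that never occurs
# instead of raising KeyError.
weather_index = (weather_count.get('多云', 0) * 0.3
                 + weather_count.get('晴', 0) * 0.4
                 + weather_count.get('阴', 0) * 0.3)

# Bar chart
weather_count.plot(kind='bar')
plt.xlabel('Weather')
plt.ylabel('Days')
plt.title(f'{city}: days per weather type')
plt.show()
print(f'Travel-suitability index for {city}: {weather_index}')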
```
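To make the index arithmetic easy to check by hand, here is a minimal standard-library sketch of the same weighted count. The function name `travel_index` and the sample list are illustrative assumptions; the weights are the ones used in the formula above.

```python
from collections import Counter

# Weights from the formula above: cloudy (多云) 0.3, sunny (晴) 0.4, overcast (阴) 0.3
WEIGHTS = {'多云': 0.3, '晴': 0.4, '阴': 0.3}


def travel_index(weather_per_day):
    """Weighted count of the days whose weather type carries a weight."""
    counts = Counter(weather_per_day)  # missing types count as 0
    return sum(counts[kind] * weight for kind, weight in WEIGHTS.items())


# Illustrative sample, not scraped data: 2 sunny, 1 cloudy, 1 light rain, 1 overcast
sample = ['晴', '晴', '多云', '小雨', '阴']
print(round(travel_index(sample), 2))  # 1.4
```

Weather types without a weight, such as the rainy day in the sample, contribute nothing to the index.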
4. Compute the mean temperature for each month and draw a line chart
```python
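# Derive the month from the date column. This needs full dates (e.g. 2023-12-06);
# values that cannot be parsed become NaT and are left out of the grouping.
df['月份'] = pd.to_datetime(df['日期'], errors='coerce').dt.month

# Mean temperature per month
month_avg_temp = df.groupby('月份')['平均温度'].mean().dropna()

# Line chart
plt.plot(month_avg_temp.index, month_avg_temp.values, label=city)
plt.legend()
plt.xlabel('Month')
plt.ylabel('Mean temperature (°C)')
plt.title(f'{city}: mean temperature per month')
plt.show()

# Take the month with the highest mean temperature as the best month to travel
if month_avg_temp.empty:
    print('No parsable dates, so no monthly averages are available')
else:
    best_month = int(month_avg_temp.idxmax())
    print(f'The best month to visit {city} is month {best_month}')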
```
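The monthly grouping only works when every row carries a full date. As a small worked example, the sketch below groups hand-written records by month using the standard library. The function name `monthly_means`, the `YYYY-MM-DD` date format, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def monthly_means(records):
    """Map month number -> mean temperature for (date string, temperature) pairs."""
    buckets = defaultdict(list)
    for date_text, temp in records:
        month = datetime.strptime(date_text, '%Y-%m-%d').month
        buckets[month].append(temp)
    return {month: sum(temps) / len(temps) for month, temps in buckets.items()}


# Illustrative values, not scraped data
sample = [('2023-06-01', 24.0), ('2023-06-02', 26.0), ('2023-12-06', 2.0)]
means = monthly_means(sample)
warmest = max(means, key=means.get)
print(means)    # {6: 25.0, 12: 2.0}
print(warmest)  # 6
```

Picking the warmest month mirrors the `idxmax()` call above; a different rule, such as the month closest to a target temperature, would only change the `key` function.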
5. Count the days with a mean temperature of 18 to 25 °C and wind below level 5, and judge which city is the better place to live
```python
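import re
# Hypothetical second city (Shanghai); scrape it with step 1 first so its CSV exists
city2 = '上海'
df2 = pd.read_csv(f'{city2}.csv', encoding='utf-8')

def comfortable_days(frame):
    """Count days with a mean temperature of 18 to 25 °C and wind below level 5."""
    t = frame[['最高温', '最低温']].astype(str).apply(
        lambda s: pd.to_numeric(s.str.replace('℃', '', regex=False), errors='coerce'))
    calm = frame['风向'].astype(str).map(  # '微风' (breeze) also counts as calm
        lambda w: '微风' in w or max(map(int, re.findall(r'\d+', w)), default=9) < 5)
    return int((((t['最高温'] + t['最低温']) / 2).between(18, 25) & calm).sum())

# The city with more qualifying days is judged the better place to live
if comfortable_days(df) >= comfortable_days(df2):
    print(f'{city} is the better place to live')
else:
    print(f'{city2} is the better place to live')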
```
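The comparison can also be tried without any scraped files. The following sketch applies the same two conditions to hand-written numbers; the function names `comfortable_days` and `better_city`, the `(high, low, wind_level)` tuple layout, and the sample values are assumptions for illustration only.

```python
def comfortable_days(days):
    """Count days with a mean temperature of 18 to 25 °C and wind below level 5.

    Each day is a (high, low, wind_level) tuple of numbers.
    """
    return sum(
        1
        for high, low, wind_level in days
        if 18 <= (high + low) / 2 <= 25 and wind_level < 5
    )


def better_city(name_a, days_a, name_b, days_b):
    """Return the city with more comfortable days (ties go to the first city)."""
    return name_a if comfortable_days(days_a) >= comfortable_days(days_b) else name_b


# Illustrative values, not scraped data
city_a = [(26, 18, 3), (30, 22, 2), (24, 16, 6)]
city_b = [(25, 17, 2), (27, 19, 4), (23, 15, 3)]
print(comfortable_days(city_a), comfortable_days(city_b))  # 1 3
print(better_city('A', city_a, 'B', city_b))                # B
```

In city A only the first day qualifies: the second is too warm (mean 26 °C) and the third too windy (level 6).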