Scraping LeBron James's career stats with Python and visualizing them
Date: 2023-12-06 18:29:12
Sure. The following code scrapes LeBron James's career statistics and visualizes them:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt
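# Request the page and parse it
url = 'https://www.basketball-reference.com/players/j/jamesle01.html'
res = requests.get(url, timeout=10)
res.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
soup = BeautifulSoup(res.text, 'html.parser')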
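# Locate the per-game career stats table
table = soup.find('table', {'id': 'per_game'})
if table is None:
    raise RuntimeError("Table with id 'per_game' not found; the page layout may have changed")
rows = table.find_all('tr')
# Read the column names from the header row, keeping the first one ('Season')
headers = [th.get_text() for th in rows[0].find_all('th')]
# Collect the data rows
data = []
for row in rows[1:]:
    # The season label is assumed to sit in a <th> cell, so read both <th> and <td>
    cells = [cell.get_text() for cell in row.find_all(['th', 'td'])]
    # Keep empty cells so columns stay aligned; skip rows that do not match the header
    if len(cells) == len(headers):
        data.append(cells)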
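# Convert to a DataFrame
df = pd.DataFrame(data, columns=headers)
# Clean the data: keep only per-season rows (labels such as '2003-04'),
# which also drops the 'Career' summary row
df = df[df['Season'].str.match(r'\d{4}-\d{2}$')].copy()
# Convert the plotted columns to numbers; unparseable values become NaN
for col in ['PTS', 'TRB', 'AST']:
    df[col] = pd.to_numeric(df[col], errors='coerce')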
# Plot the data
plt.figure(figsize=(10, 6))
plt.plot(df['Season'], df['PTS'], label='Points')
plt.plot(df['Season'], df['TRB'], label='Rebounds')
plt.plot(df['Season'], df['AST'], label='Assists')
plt.legend()
plt.title('LeBron James Career Stats')
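plt.xlabel('Season')
plt.ylabel('Per-game average')
plt.xticks(rotation=45)  # season labels overlap when drawn horizontally
plt.tight_layout()
plt.show()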
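The table-parsing step is the part most likely to go wrong, so here is a small offline sketch of it. It assumes each data row keeps its season label in a `<th>` cell and its stats in `<td>` cells, and that some cells can be empty. The sample HTML, the `TableRows` class, and the values in it are hypothetical and made up for illustration; they are not real statistics. It uses only the standard library's `html.parser`, so it runs without network access.

```python
from html.parser import HTMLParser

# Hypothetical sample mimicking the assumed layout:
# the season label sits in <th>, the stats in <td>, and one cell is empty.
SAMPLE_HTML = """
<table id="per_game">
  <tr><th>Season</th><th>Tm</th><th>PTS</th><th>3P%</th></tr>
  <tr><th>Year 1</th><td>AAA</td><td>20.5</td><td></td></tr>
  <tr><th>Year 2</th><td>AAA</td><td>27.0</td><td>.351</td></tr>
</table>
"""


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Keep empty cells as "" so columns stay aligned with the header.
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


parser = TableRows()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]
for record in records:
    print(record)
```

Reading only the `<td>` cells would lose the season label, and dropping empty cells would shift every later value one column to the left, which is why the sketch keeps both.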
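The scraping script above uses the requests and BeautifulSoup libraries to fetch and parse the page, pandas to clean and convert the data, and matplotlib to plot it. The result is shown in the figure below: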