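All subsequent statistics cover only World Cup finals matches, i.e. rows where tournament == 'FIFA World Cup'. Filter these matches and save them as a DataFrame named w. Then count how many countries have taken part in the World Cup finals and save the number in the variable countrynum. Next, find the five teams with the highest win rate in World Cup matches and save the team names and win rates in the list top5. Convention: win rate = wins / total matches, where total matches = wins + draws + losses; solve this in two ways, with a loop plus dictionaries and with pandas. Then find which country has won the most matches and store it in the variable wincountry. Then count how many goals each country has scored at the World Cup and store the result in a Series named scountry. Note that goals are stored in two columns, home_score (goals by the home team) and away_score (goals by the away team), and the two must be combined. Finally, find the country against which Brazil has its lowest World Cup win rate and store its name in the variable country.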
Posted: 2024-02-27 20:52:40 Views: 113
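As the question requires, all statistics are restricted to World Cup finals matches ('FIFA World Cup'). Assuming the full results table is already loaded into a DataFrame named `df`, the following code filters those matches and saves them as the DataFrame `w`: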
```python
# Keep only World Cup finals matches
w = df[df['tournament'] == 'FIFA World Cup'].reset_index(drop=True)
```
Here `reset_index(drop=True)` renumbers the rows from 0 and discards the old index.
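To try the filtering step without the full dataset, a minimal sketch on a small hand-made table may help. The rows below are invented purely for illustration (they are not real match results); only the column names `home_team`, `away_team`, `home_score`, `away_score` and `tournament` come from the question.

```python
import pandas as pd

# Hypothetical sample rows, invented for illustration only
df = pd.DataFrame({
    'home_team': ['A', 'B', 'A', 'C'],
    'away_team': ['B', 'C', 'C', 'A'],
    'home_score': [2, 1, 0, 3],
    'away_score': [1, 1, 0, 1],
    'tournament': ['FIFA World Cup', 'Friendly',
                   'FIFA World Cup', 'FIFA World Cup'],
})

# Same filter as in the answer: keep only World Cup finals matches
w = df[df['tournament'] == 'FIFA World Cup'].reset_index(drop=True)

print(len(w))         # number of matches that survive the filter
print(list(w.index))  # index is renumbered from 0 after reset_index
```

Without `reset_index(drop=True)` the surviving rows would keep their original labels (0, 2, 3 in this sample), which can be surprising when rows are later accessed by position.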
The number of countries that have played in the World Cup finals can be counted as follows:
```python
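# Count every team that appears as either the home or the away side;
# looking at home_team alone would miss teams that only ever played away
teams = pd.concat([w['home_team'], w['away_team']]).unique()
countrynum = len(teams)
print('Number of countries that have played in the World Cup finals:', countrynum)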
```
Here `unique()` returns the distinct values of a Series.
The five teams with the highest win rate can be found in the following two ways:
(1) Loop plus dictionaries
```python
# Loop-and-dictionary approach to the win rate
win_dict = {}    # team -> number of wins
total_dict = {}  # team -> matches played (wins + draws + losses)
for _, row in w.iterrows():
    home, away = row['home_team'], row['away_team']
    # Every match counts once towards each side's total
    total_dict[home] = total_dict.get(home, 0) + 1
    total_dict[away] = total_dict.get(away, 0) + 1
    # Credit the win to whichever side scored more; a draw credits nobody
    if row['home_score'] > row['away_score']:
        win_dict[home] = win_dict.get(home, 0) + 1
    elif row['home_score'] < row['away_score']:
        win_dict[away] = win_dict.get(away, 0) + 1
# Win rate = wins / total matches (teams without a win get 0)
win_rate_dict = {team: win_dict.get(team, 0) / total
                 for team, total in total_dict.items()}
# Take the five teams with the highest win rate
top5 = sorted(win_rate_dict.items(), key=lambda x: x[1], reverse=True)[:5]
print('Top 5 teams by win rate:', top5)
```
(2) pandas approach
```python
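# pandas approach to the win rate
# (Series.append was removed in pandas 2.0, so pd.concat is used instead)
totals = pd.concat([w['home_team'], w['away_team']]).value_counts()
home_wins = w.loc[w['home_score'] > w['away_score'], 'home_team']
away_wins = w.loc[w['away_score'] > w['home_score'], 'away_team']
wins = pd.concat([home_wins, away_wins]).value_counts()
# Teams that never won are missing from wins, so fill them in with 0
win_rate = wins.reindex(totals.index, fill_value=0) / totals
# Take the five teams with the highest win rate and store them as a list
top5 = win_rate.sort_values(ascending=False).head(5)
top5 = list(zip(top5.index, top5.values))
print('Top 5 teams by win rate:', top5)
```
Here `value_counts()` counts how often each team appears (once per match for `totals`, once per win for `wins`), and `sort_values` orders the resulting win rates.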
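Another way to look at the pandas approach is to reshape the table so that each match contributes one row per team. The sketch below is only an illustration: the sample rows are invented, and the helper `to_long` and the column names `gf`, `ga` and `win` are hypothetical names that do not appear in the original answer.

```python
import pandas as pd

def to_long(w):
    """Return one row per (team, match) with goals for/against and a win flag."""
    home = pd.DataFrame({
        'team': w['home_team'], 'opponent': w['away_team'],
        'gf': w['home_score'], 'ga': w['away_score'],
    })
    away = pd.DataFrame({
        'team': w['away_team'], 'opponent': w['home_team'],
        'gf': w['away_score'], 'ga': w['home_score'],
    })
    long = pd.concat([home, away], ignore_index=True)
    long['win'] = long['gf'] > long['ga']
    return long

# Hypothetical sample rows, invented for illustration only
w = pd.DataFrame({
    'home_team': ['A', 'B', 'A'],
    'away_team': ['B', 'C', 'C'],
    'home_score': [2, 1, 0],
    'away_score': [1, 1, 0],
})

long = to_long(w)
# The mean of a boolean column is the share of True values: wins / matches
win_rate = long.groupby('team')['win'].mean().sort_values(ascending=False)
top5 = list(zip(win_rate.index, win_rate.values))[:5]
print(top5)
```

Because the mean of a boolean column equals the fraction of `True` values, `groupby('team')['win'].mean()` yields wins divided by total matches directly, with draws and losses both counting as non-wins.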
The country with the most wins can be found with the following code:
```python
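# Find the country with the most wins (count wins, not goals scored)
home_wins = w.loc[w['home_score'] > w['away_score'], 'home_team']
away_wins = w.loc[w['away_score'] > w['home_score'], 'away_team']
wincountry = pd.concat([home_wins, away_wins]).value_counts().idxmax()
print('Country with the most wins:', wincountry)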
```
Here `idxmax()` returns the index label of the largest value.
The number of goals each country has scored at the World Cup can be computed with the following code:
```python
# Count the goals each country has scored at the World Cup
home_goals = w.groupby('home_team')['home_score'].sum()
away_goals = w.groupby('away_team')['away_score'].sum()
# fill_value=0 keeps teams that appear in only one of the two Series
scountry = home_goals.add(away_goals, fill_value=0)
print('Goals scored by each country at the World Cup:\n', scountry)
```
Here `add` sums two Series label by label, and the `fill_value` parameter supplies a value for labels that are missing from one of them.
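The effect of `fill_value` is easiest to see on a tiny example. The sketch below uses invented goal totals for three hypothetical teams and compares the plain `+` operator with `add(..., fill_value=0)`.

```python
import pandas as pd

# Hypothetical goal totals: 'C' never played at home, 'A' never played away
home_goals = pd.Series({'A': 3, 'B': 1})
away_goals = pd.Series({'B': 2, 'C': 4})

# Plain addition yields NaN for labels present in only one Series
plain = home_goals + away_goals
# With fill_value=0 the missing side is treated as 0 goals
filled = home_goals.add(away_goals, fill_value=0)

print(plain)
print(filled)
```

With the plain operator, teams `A` and `C` end up as `NaN` and their goals are lost; with `fill_value=0` every team keeps its total.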
The opponent against which Brazil has its lowest World Cup win rate can be found with the following code:
```python
# Find the opponent against which Brazil has its lowest World Cup win rate
brazil = w[(w['home_team'] == 'Brazil') | (w['away_team'] == 'Brazil')]
is_home = brazil['home_team'] == 'Brazil'
# Opponent and goals, seen from Brazil's side of each match
opponent = brazil['away_team'].where(is_home, brazil['home_team'])
brazil_goals = brazil['home_score'].where(is_home, brazil['away_score'])
opponent_goals = brazil['away_score'].where(is_home, brazil['home_score'])
brazil_won = brazil_goals > opponent_goals
# Mean of a boolean Series per opponent = wins / matches against that opponent
win_rate_vs = brazil_won.groupby(opponent).mean()
# If several opponents tie for the lowest rate, the first in sorted order is returned
country = win_rate_vs.idxmin()
print('Opponent against which Brazil has its lowest win rate:', country)
```
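Here the `brazil` DataFrame holds only the matches Brazil played; Brazil's win rate is then computed per opponent and the opponent with the lowest rate is selected.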
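Since an opponent met only once can easily sit at a 0% or 100% win rate, it can help to look at the number of matches next to the rate. The following is a minimal sketch on invented results for hypothetical opponents `X` and `Y`; the `summary` table and its column names are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical results, invented for illustration only
w = pd.DataFrame({
    'home_team': ['Brazil', 'X', 'Brazil', 'Y'],
    'away_team': ['X', 'Brazil', 'Y', 'Brazil'],
    'home_score': [2, 1, 1, 0],
    'away_score': [0, 1, 0, 3],
})

brazil = w[(w['home_team'] == 'Brazil') | (w['away_team'] == 'Brazil')]
is_home = brazil['home_team'] == 'Brazil'
opponent = brazil['away_team'].where(is_home, brazil['home_team'])
# Goal difference from Brazil's point of view
diff = (brazil['home_score'] - brazil['away_score']).where(
    is_home, brazil['away_score'] - brazil['home_score'])

# Per-opponent table: matches played, Brazil wins, and the resulting win rate
summary = (diff > 0).groupby(opponent).agg(matches='count', wins='sum')
summary['win_rate'] = summary['wins'] / summary['matches']
country = summary['win_rate'].idxmin()

print(summary)
print(country)
```

In this sample Brazil beats `X` once and draws once, and beats `Y` twice, so `X` is the opponent with the lowest win rate.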