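For the given dataset pandas_data.csv, complete the following tasks (excluding the pe score): (1) Compute the average of the scores and add it to the original data as a new column. (2) Count the number of people scoring 75 or above in each subject. Python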
Time: 2024-10-27 19:02:49 Views: 24
For the given pandas_data.csv dataset, you can proceed as follows:
**Step 1: Import the required library and load the data**
```python
import pandas as pd
# Read the CSV data
pandas_data = pd.read_csv('pandas_data.csv')
```
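Because the task does not list the column names, it may help to inspect the file before going further. The sketch below uses a small in-memory sample in place of pandas_data.csv; the column names (`name`, `math`, `english`, `pe`) and the values are made up for illustration and may not match the real file.

```python
import io
import pandas as pd

# Hypothetical sample standing in for pandas_data.csv; the real columns may differ
sample_csv = io.StringIO(
    "name,math,english,pe\n"
    "Ann,80,90,70\n"
    "Bob,60,76,88\n"
)
sample = pd.read_csv(sample_csv)

# Column names and dtypes show which columns hold numeric scores
print(sample.columns.tolist())
print(sample.dtypes)

# Numeric columns are the candidates for subject scores
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

On the real file, the same calls on `pandas_data` should reveal which columns to treat as subjects and whether a `pe` column is present.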
**Step 2: Compute the average score and create a new column**
```python
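# Assumption: every numeric column except 'pe' is a subject score; adjust if the file has other numeric columns (e.g. an ID)
score_cols = pandas_data.select_dtypes(include='number').columns.drop('pe', errors='ignore')
# axis=1 averages across the subject columns, giving one average per row (per student)
pandas_data['Average_Score'] = pandas_data[score_cols].mean(axis=1)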
```
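For a per-student average, the mean has to be taken across the subject columns of each row (`axis=1`) rather than down a single column. A minimal sketch on made-up data, assuming each subject is its own numeric column and `pe` is excluded (all names and values here are hypothetical):

```python
import pandas as pd

# Hypothetical data; column names and scores are made up for illustration
df = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "math": [80, 60],
    "english": [90, 76],
    "pe": [70, 88],
})

# Every numeric column except 'pe' is treated as a subject score
score_cols = df.select_dtypes(include="number").columns.drop("pe")

# axis=1 averages across columns, giving one value per row (per student)
df["Average_Score"] = df[score_cols].mean(axis=1)
print(df)
```

Here Ann's average is (80 + 90) / 2 = 85.0 and Bob's is (60 + 76) / 2 = 68.0; the `pe` values do not enter the calculation.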
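**Step 3: Count the number of people scoring 75 or above in each subject**
The question does not name the subject columns, so this step assumes each subject has its own numeric column (for example 'Math', 'English'); replace these with the actual column names if they differ. The pe column is skipped, as the task requires. We can then loop over the subject columns: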
```python
# Assumption: subject columns are all numeric columns except 'pe' and the average column
subject_cols = pandas_data.select_dtypes(include='number').columns.drop(['pe', 'Average_Score'], errors='ignore')
subject_counts = {}
for subject in subject_cols:
    # Boolean mask: True where the score is 75 or above
    high_scores_mask = pandas_data[subject] >= 75
    # Summing a boolean mask counts the True values
    subject_counts[subject] = int(high_scores_mask.sum())
# Turn the dict into a two-column DataFrame
subject_counts_dict = pd.DataFrame.from_dict(subject_counts, orient='index').reset_index()
subject_counts_dict.columns = ['Subject', 'High_Score_Count']
```
`subject_counts_dict` now holds, for each subject, the number of people scoring 75 or above.
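The same count can also be written without an explicit loop, since comparing a DataFrame to a number yields a boolean table whose column sums are the counts. A minimal sketch on made-up data; the column names and scores are hypothetical:

```python
import pandas as pd

# Hypothetical data; column names and scores are made up for illustration
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cat"],
    "math": [80, 60, 75],
    "english": [90, 74, 50],
    "pe": [70, 88, 95],
})

# Hypothetical subject columns, with 'pe' left out as the task requires
subject_cols = ["math", "english"]

# Element-wise comparison gives booleans; sum() counts the True values per column
counts = (df[subject_cols] >= 75).sum()

# Reshape into the same two-column layout used in the loop version
result = counts.rename_axis("Subject").reset_index(name="High_Score_Count")
print(result)
```

With this sample, `math` has two scores of 75 or above (80 and 75) and `english` has one (90). The loop version may still be easier to adapt if each subject needs different handling.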
The complete code example is as follows:
```python
import pandas as pd
# Read the CSV data
pandas_data = pd.read_csv('pandas_data.csv')
# Assumption: every numeric column except 'pe' is a subject score; adjust to the real file
subject_cols = pandas_data.select_dtypes(include='number').columns.drop('pe', errors='ignore')
# Compute each student's average across the subject columns and add it as a new column
pandas_data['Average_Score'] = pandas_data[subject_cols].mean(axis=1)
# Count, per subject, the scores of 75 or above
subject_counts = {}
for subject in subject_cols:
    high_scores_mask = pandas_data[subject] >= 75
    subject_counts[subject] = int(high_scores_mask.sum())
# Convert the result to a DataFrame and display it
subject_counts_dict = pd.DataFrame.from_dict(subject_counts, orient='index').reset_index()
subject_counts_dict.columns = ['Subject', 'High_Score_Count']
print(subject_counts_dict)
```