Write a MapReduce program in Java that computes each student's average grade point from their scores. Each input record has the form student ID:course ID:score, and the output should be student ID:average GPA, where the GPA is calculated according to the rules of 上海理工大学 (University of Shanghai for Science and Technology). The input and output value types should implement the Writable interface. Sample input: S3:C5:95 S4:C7:89 S3:C6:82 S2:C1:68 S1:C3:92 S4:C9:87 S2:C4:90 S1:C2:85 S3:C2:88 S1:C5:76 S4:C3:91 S4:C9:87 S2:C8:78 S3:C2:88 S3:C5:95 S4:C7:89 S2:C4:90 S1:C3:92 S1:C5:76 S3:C6:82 S2:C1:68 S2:C8:78 S4:C3:91 S3:C5:95 S1:C3:92 S4:C7:89 S1:C5:76 S3:C6:82 S2:C1:68 S2:C8:78 S4:C3:91 S1:C2:85 S3:C2:88 S4:C9:87 S2:C4:90
This is a fairly typical MapReduce problem. We use the MapReduce framework to process every student's score records and end up with each student's average GPA.
First, we define a custom data type to hold a score record. For each record we need the student ID, the course ID, the score, and the credit of that course, so we define a Student class:
```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class Student implements Writable {
    private String studentId; // student ID
    private String courseId;  // course ID
    private int score;        // score (0-100)
    private int credit;       // course credit

    // Hadoop needs a public no-arg constructor to deserialize the object
    public Student() {
    }

    // setters/getters used by the Mapper and Reducer below
    public void setStudentId(String studentId) { this.studentId = studentId; }
    public void setCourseId(String courseId) { this.courseId = courseId; }
    public void setScore(int score) { this.score = score; }
    public void setCredit(int credit) { this.credit = credit; }
    public int getCredit() { return credit; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(studentId);
        out.writeUTF(courseId);
        out.writeInt(score);
        out.writeInt(credit);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        studentId = in.readUTF();
        courseId = in.readUTF();
        score = in.readInt();
        credit = in.readInt();
    }

    // Map the score to a grade point on a 4.0 scale
    public double calculateGPA() {
        double gpa = 0.0;
        if (score >= 90) {
            gpa = 4.0;
        } else if (score >= 85) {
            gpa = 3.7;
        } else if (score >= 82) {
            gpa = 3.3;
        } else if (score >= 78) {
            gpa = 3.0;
        } else if (score >= 75) {
            gpa = 2.7;
        } else if (score >= 72) {
            gpa = 2.3;
        } else if (score >= 68) {
            gpa = 2.0;
        } else if (score >= 64) {
            gpa = 1.5;
        } else if (score >= 60) {
            gpa = 1.0;
        }
        return gpa;
    }
}
```
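As a quick sanity check of the thresholds above, a small (hypothetical) driver such as the one below prints the grade points for a few scores taken from the sample input; it only assumes the Student class above with the setScore setter that the Mapper also uses:
```java
public class StudentGpaCheck {
    public static void main(String[] args) {
        // A few scores taken from the sample input in the question
        int[] scores = {95, 89, 82, 68, 60};
        for (int s : scores) {
            Student st = new Student();
            st.setScore(s);
            // Expected: 95 -> 4.0, 89 -> 3.7, 82 -> 3.3, 68 -> 2.0, 60 -> 1.0
            System.out.println(s + " -> " + st.calculateGPA());
        }
    }
}
```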
Next, we implement the Map and Reduce functions. In the Map function, each input record is parsed into a Student object and emitted with the student ID as the key and the score information as the value. In the Reduce function, the credit-weighted average GPA of each student is computed and emitted with the student ID as the key and the average GPA as the value.
```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ScoreAverage {

    public static class ScoreMap extends Mapper<Object, Text, Text, Student> {
        private Text studentId = new Text();
        private Student student = new Student();

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            // A line may hold one record or several space-separated records
            // such as "S3:C5:95 S4:C7:89", so split on whitespace first.
            for (String record : value.toString().trim().split("\\s+")) {
                String[] fields = record.split(":");
                studentId.set(fields[0]);
                student.setStudentId(fields[0]);
                student.setCourseId(fields[1]);
                student.setScore(Integer.parseInt(fields[2]));
                // Assign a credit by course ID (the credits are assumptions,
                // since the input does not carry them)
                if (fields[1].startsWith("C1")) {
                    student.setCredit(4);
                } else if (fields[1].startsWith("C2")) {
                    student.setCredit(3);
                } else if (fields[1].startsWith("C3")) {
                    student.setCredit(2);
                } else {
                    student.setCredit(1);
                }
                context.write(studentId, student);
            }
        }
    }

    public static class ScoreReduce extends Reducer<Text, Student, Text, DoubleWritable> {
        private DoubleWritable result = new DoubleWritable();

        @Override
        public void reduce(Text key, Iterable<Student> values, Context context)
                throws IOException, InterruptedException {
            int totalCredit = 0;
            double totalGPA = 0.0;
            // Walk through this student's records and accumulate
            // credit-weighted grade points
            for (Student student : values) {
                totalCredit += student.getCredit();
                totalGPA += student.calculateGPA() * student.getCredit();
            }
            // Credit-weighted average GPA
            double averageGPA = totalGPA / totalCredit;
            result.set(averageGPA);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "ScoreAverage");
        job.setJarByClass(ScoreAverage.class);
        job.setMapperClass(ScoreMap.class);
        job.setReducerClass(ScoreReduce.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Student.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```
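One small detail: the default TextOutputFormat writes the key and value separated by a tab, so each output line is the student ID and the GPA separated by a tab rather than by the colon asked for in the question. If a colon is wanted, the separator can be set on the Configuration before the job is created; the property name below is the Hadoop 2.x/3.x name and should be verified against your Hadoop version:
```java
Configuration conf = new Configuration();
// Use ':' instead of the default tab between key and value in the output file
// (mapreduce.output.textoutputformat.separator is the Hadoop 2.x/3.x property name)
conf.set("mapreduce.output.textoutputformat.separator", ":");
Job job = Job.getInstance(conf, "ScoreAverage");
```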
Finally, we package the code above into a jar file and run it on Hadoop to get each student's average GPA.
Run command:
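For reference, one possible way to compile and package is sketched below; it assumes the Hadoop client is installed (so that hadoop classpath works) and that Student.java and ScoreAverage.java sit in the current directory:
```
mkdir -p classes
javac -cp $(hadoop classpath) -d classes Student.java ScoreAverage.java
jar cf ScoreAverage.jar -C classes .
```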
```
$HADOOP_HOME/bin/hadoop jar ScoreAverage.jar ScoreAverage input output
```
Here ScoreAverage.jar is the packaged jar file, ScoreAverage is the main class, input is the input file path, and output is the output directory path.