Programmatically read student scores from HBase and compute the average score of each student and of each course, using MapReduce and the Java API
Time: 2024-02-25 08:58:12
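Sure. Here is example code that uses the Java API and MapReduce to read student scores from HBase and compute average scores. As written, it outputs the average score per student: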
```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HBaseScoreAvg extends Configured implements Tool {

    // Reads one HBase row per call and emits (student ID, average of that row's score_* columns).
    public static class ScoreMapper extends TableMapper<Text, DoubleWritable> {
        private final Text studentId = new Text();
        private final DoubleWritable score = new DoubleWritable();

        @Override
        public void map(ImmutableBytesWritable row, Result columns, Context context)
                throws IOException, InterruptedException {
            // Default to the row key so that a value left over from the previous row
            // is never reused when the student_id column is missing.
            studentId.set(Bytes.toString(row.get(), row.getOffset(), row.getLength()));
            double totalScore = 0;
            int numScores = 0;
            for (Cell cell : columns.rawCells()) {
                String columnName = Bytes.toString(CellUtil.cloneQualifier(cell));
                if (columnName.startsWith("score_")) {
                    // Assumes scores are stored as text (e.g. "87.5"), not as binary doubles.
                    totalScore += Double.parseDouble(Bytes.toString(CellUtil.cloneValue(cell)));
                    numScores++;
                } else if (columnName.equals("student_id")) {
                    studentId.set(Bytes.toString(CellUtil.cloneValue(cell)));
                }
            }
            // Skip rows that contain no score columns.
            if (numScores > 0) {
                score.set(totalScore / numScores);
                context.write(studentId, score);
            }
        }
    }

    // Averages all values that arrive under the same key.
    public static class AvgReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        private final DoubleWritable result = new DoubleWritable();

        @Override
        public void reduce(Text key, Iterable<DoubleWritable> values, Context context)
                throws IOException, InterruptedException {
            double totalScore = 0;
            int numScores = 0;
            for (DoubleWritable value : values) {
                totalScore += value.get();
                numScores++;
            }
            // reduce() is only called for keys with at least one value, so numScores > 0.
            result.set(totalScore / numScores);
            context.write(key, result);
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("Usage: HBaseScoreAvg <output path>");
            return 2;
        }
        // getConf() returns the configuration that ToolRunner set from main(),
        // including any -D options given on the command line.
        Configuration conf = getConf();
        Job job = Job.getInstance(conf, "HBaseScoreAvg");
        job.setJarByClass(HBaseScoreAvg.class);

        Scan scan = new Scan();
        scan.setCacheBlocks(false); // a full-table scan should not fill the block cache

        TableMapReduceUtil.initTableMapperJob(
                "scores",            // input HBase table name
                scan,
                ScoreMapper.class,
                Text.class,          // mapper output key type
                DoubleWritable.class, // mapper output value type
                job
        );
        job.setReducerClass(AvgReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));

        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(HBaseConfiguration.create(), new HBaseScoreAvg(), args);
        System.exit(exitCode);
    }
}
```
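In the code above, the `ScoreMapper` class reads each row of the HBase table and computes the average of that row's `score_*` columns, that is, one student's average. The `AvgReducer` class averages all values that share a key. Because the only key the mapper emits is the student ID, the job's output is the per-student average, not the per-course average. To get per-course averages as well, the mapper would also have to emit each score under a key derived from its column name (for example a column such as `score_math`), so that the same reducer can average them per course. Note too that if one student's scores are spread over several rows, the reducer averages the row averages, which differs from the true mean when the rows hold different numbers of scores. In the `run` method, `TableMapReduceUtil.initTableMapperJob` registers `ScoreMapper` as the mapper with `scores` as the input table, `AvgReducer` is set as the reducer, the output key and value types are declared as `Text` and `DoubleWritable`, and the output path is taken from `args[0]`. Finally, `waitForCompletion` submits the job and blocks until it finishes.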
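Note that this sample shows only one way to read student scores from HBase and compute averages. The concrete implementation may differ depending on your actual setup.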
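To make the per-student and per-course keying concrete, here is a minimal sketch in plain Java with no Hadoop or HBase dependencies. It only illustrates the idea of emitting every score twice, once under a student key and once under a course key, and then averaging per key. The class name `ScoreAverageSketch`, the `student:` and `course:` key prefixes, and the sample data are all hypothetical and not part of the original answer. The `score_` column prefix follows the sample code above.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the keying scheme; no Hadoop or HBase classes involved.
class ScoreAverageSketch {

    // "Map" step: each score_* column of one row yields two pairs,
    // one keyed by student and one keyed by course (hypothetical key prefixes).
    static List<Map.Entry<String, Double>> emit(String studentId, Map<String, Double> columns) {
        List<Map.Entry<String, Double>> out = new ArrayList<>();
        for (Map.Entry<String, Double> col : columns.entrySet()) {
            if (!col.getKey().startsWith("score_")) {
                continue; // ignore non-score columns
            }
            String course = col.getKey().substring("score_".length());
            out.add(new SimpleEntry<>("student:" + studentId, col.getValue()));
            out.add(new SimpleEntry<>("course:" + course, col.getValue()));
        }
        return out;
    }

    // "Reduce" step: group the pairs by key and average the values of each group.
    static Map<String, Double> averageByKey(List<Map.Entry<String, Double>> pairs) {
        Map<String, double[]> acc = new TreeMap<>(); // key -> {sum, count}
        for (Map.Entry<String, Double> p : pairs) {
            double[] sumCount = acc.computeIfAbsent(p.getKey(), k -> new double[2]);
            sumCount[0] += p.getValue();
            sumCount[1] += 1;
        }
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    // Made-up sample rows: student s1 has two scores, student s2 has one.
    static Map<String, Double> demo() {
        Map<String, Double> s1 = new LinkedHashMap<>();
        s1.put("score_math", 90.0);
        s1.put("score_english", 80.0);
        Map<String, Double> s2 = new LinkedHashMap<>();
        s2.put("score_math", 70.0);

        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        pairs.addAll(emit("s1", s1));
        pairs.addAll(emit("s2", s2));
        return averageByKey(pairs);
    }

    public static void main(String[] args) {
        demo().forEach((key, avg) -> System.out.println(key + "\t" + avg));
    }
}
```

With the sample data, `student:s1` averages to 85.0, `student:s2` to 70.0, and both `course:math` and `course:english` to 80.0. In a real job, the equivalent change would presumably be for the mapper to call `context.write` twice per score cell with raw scores rather than a row average, after which a reducer that averages per key can serve both kinds of key.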