private static class Step4Reducer extends TableReducer<Text, IntWritable, NullWritable> { @Override protected void reduce(Text key, Iterable<IntWritable> values, Reducer<Text, IntWritable, NullWritable, Mutation>.Context context) throws IOException, InterruptedException { int sum = 0; String name = "", classId = ""; // int count = 0; for (IntWritable v : values) { sum += v.get(); // count ++; } byte[] rk = Bytes.toBytes(key.toString()); byte[] fm = Bytes.toBytes("info"); byte[] c1 = Bytes.toBytes("name"); byte[] c2 = Bytes.toBytes("classId"); byte[] v1 = Bytes.toBytes(name); byte[] v2 = Bytes.toBytes(classId); byte[] c3 = Bytes.toBytes("sum"); // byte[] c2 = Bytes.toBytes("avg"); Put put = new Put(rk); put.addColumn(fm, c1, v1); put.addColumn(fm, c2, v2); put.addColumn(fm, c3, Bytes.toBytes(sum)); // put.addColumn(fm, c2, Bytes.toBytes(sum * 1.0/count)); // 输出 context.write(NullWritable.get(), put); } }

时间: 2024-02-14 13:20:13 浏览: 70

这是一个 MapReduce 中的 Reduce 阶段的代码，它的作用是将来自 Map 阶段的数据进行汇总计算，并将结果写出到 HBase 数据库中。具体来说，它会将相同 key 的数据进行聚合，计算它们的 sum 值，并将 key 和对应的聚合值写入到 HBase 中。其中，rk 表示 HBase 中的 rowkey，fm 表示 column family，c1、c2、c3 分别表示 column family 中的不同列，v1、v2、sum 分别表示对应列的值。在实际使用中，这段代码需要根据具体需求进行修改。

public class IntSortReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> { private static final IntWritable lineNumber = new IntWritable(0); @Override protected void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException { // ignore values: they'll always be 1 (as long as paired with IntMapper) synchronized (lineNumber) { /* TODO: Advance line number / context.write(/ TODO: Write new, reduced <K,V> pair to context */); } } }

补全后的代码如下： ``` public class IntSortReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> { private static final IntWritable lineNumber = new IntWritable(0); private int count = 0; @Override protected void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException { // ignore values: they'll always be 1 (as long as paired with IntMapper) synchronized (lineNumber) { /* Advance line number */ count++; lineNumber.set(count); /* Write new, reduced <K,V> pair to context */ context.write(lineNumber, key); } } } ``` 在补全的代码中，我们通过使用 count 计数器和线程安全的 synchronized 关键字来实现了一个自增的 lineNumber，然后将新生成的键值对通过调用 context.write() 写入到 context 中。

public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> { private Map<String, Integer> counts = new HashMap<>(); @Override protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException { int sum = 0; for (IntWritable value : values) { sum += value.get(); } counts.put(key.toString(), sum); } @Override protected void cleanup(Context context) throws IOException, InterruptedException { // 对 counts 中的键值对按照要求排序 List<Map.Entry<String, Integer>> list = new ArrayList<>(counts.entrySet()); Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() { @Override public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) { if (o1.getValue() .equals(o2.getValue()) ) { return o2.getKey().compareTo(o1.getKey()); } else { return o2.getValue().compareTo(o1.getValue()); } } }); // 将排序后的结果写入输出文件中 for (Map.Entry<String, Integer> entry : list) { Text word = new Text(entry.getKey()); IntWritable count = new IntWritable(entry.getValue()); context.write(word, count); } } }分析一下这段代码

这段代码是一个 MapReduce 的 Reducer 类，用于对输入数据进行聚合操作。它的输入键值对类型为 Text 和 IntWritable，输出键值对类型也是 Text 和 IntWritable。该类定义了一个名为 counts 的 Map 对象，用于保存每个 key 对应的总和。在 reduce() 方法中，它遍历了所有的 values，将其相加得到了一个 sum 值，并将其保存到 counts 中。在 cleanup() 方法中，它首先将 counts 中的键值对存入一个 ArrayList 中，并按照排序规则进行排序，排序规则是：先按照值降序排序，如果值相同则按照键升序排序。排序后，它遍历排序后的 ArrayList，将其中的键值对写入输出文件中。这段代码的作用是对输入的数据进行聚合操作，并按照一定规则排序后输出结果。

阅读全文

相关推荐

async-reduce: Python异步协程性能优化的减速器

自定义Hadoop OutputFormat：单Reducer生成多输出文件

提案：引入@@reduced符号优化JavaScript的Array.prototype.reduce方法

详解MapReduce框架中的Reducer组件

MapReduce中的Combiner与Reducer选择策略：如何判断何时使用Combiner

【性能调优必读】：Combiner与Reducer的区别及策略选择

MapReduce进阶必读：掌握Reduce阶段的核心技术

掌握Reduce端数据合并：MapReduce中的WordCount技巧解析

MapReduce框架探秘：WordCount案例中的Reduce任务深度探讨

根据网站每日访问次数的统计需求，分析Map阶段和Reduce阶段的处理逻辑，编写Mapper模块、Reducer模块和Driver模块的代码。定义一个dailyAccessCount类，封装Mapper模块、Reducer模块和Driver模块的实现

Mapper Reducer计算每日的访客量JAVA代码

基于eclipse map reduce hadoop的销售统计代码

理解Hadoop MapReduce执行机制

硬核类型友好的Redux减速器工具hard-reducer介绍

大家在看

基于python+opencv实现柚子缺陷识别检测源码+详细代码注释.zip

(信息图)eAPP610 快速入门(3GPP)(V100R005C10-01).zip

C语言第四次作业ppt课件.ppt

C4.5算法在列车轨道故障检测上的应用研究

基于机器视觉的工件识别和定位文献综述.docx

最新推荐

CarSim、MATLAB、PreScan，提供车辆动力学、运动控制联合仿真软件安装激活服务，可远程 内容包括： MATLAB R2018b win64 MATLAB R2020a win64 Pre

包含300个可选插件rails git macOS hub docker homebrew node php pyth.zip

Terraform AWS ACM 59版本测试与实践

【HS1101湿敏电阻全面解析】：从基础知识到深度应用的完整指南

MATLAB在一个图形窗口中创建一行两列的子图的代码

Doks Hugo主题：打造安全快速的现代文档网站

E9流程表单前端接口API(V5)：前端与后端协同开发的黄金法则

c#获取路径 Microsoft.Win32.SaveFileDialog saveFileDialog = new Microsoft.Win32.SaveFileDialog();

CRMSeguros-crx插件：扩展与保险公司CRM集成

揭秘E9流程表单前端接口API(V5)：掌握接口设计与安全性的最佳实践

CarSim、MATLAB、PreScan，提供车辆动力学、运动控制联合仿真软件安装激活服务，可远程内容包括： MATLAB R2018b win64 MATLAB R2020a win64 Pre