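Use an MR (MapReduce) program to count the number of employees hired in each year. The final result must meet the following requirements:

1. Output format: Year: 1980 Count: xxx, Year: 1981 Count: xxx, ...
2. Two partitions: partition 0 stores hire years < 1982; partition 1 stores hire years >= 1982.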

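The following steps implement the per-year hire count with MapReduce:

1. Mapper stage: the mapper reads each line of the input file, extracts the hire year as the key, sets the value to 1, and emits the key-value pair.
2. Reducer stage: the reducer receives the key-value pairs from the mapper, sums the values for each year, and writes one formatted line per year.
3. Partitioner stage: the partitioner runs on the map output, before the reduce phase, and routes each record to partition 0 or 1 according to the hire year.

Sample code follows.

Mapper class:

```
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class EmployeeMapper extends Mapper<LongWritable, Text, IntWritable, IntWritable> {
    private final IntWritable year = new IntWritable();
    private final IntWritable one = new IntWritable(1);

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Assumes comma-separated input with the hire year as a plain integer in the second field.
        String[] fields = value.toString().split(",");
        int yearValue = Integer.parseInt(fields[1]);
        year.set(yearValue);
        context.write(year, one);
    }
}
```

Reducer class:

```
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class EmployeeReducer extends Reducer<IntWritable, IntWritable, Text, NullWritable> {
    private final Text result = new Text();

    @Override
    public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        // Emit one formatted line per year. The labels are rendered in English here;
        // change the literals if different labels are required.
        result.set("Year: " + key.get() + " Count: " + sum);
        context.write(result, NullWritable.get());
    }
}
```

Partitioner class:

```
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Partitioner;

public class EmployeePartitioner extends Partitioner<IntWritable, IntWritable> {
    @Override
    public int getPartition(IntWritable key, IntWritable value, int numPartitions) {
        // Partition 0: hire year before 1982; partition 1: 1982 or later.
        return key.get() < 1982 ? 0 : 1;
    }
}
```

Driver program:

```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EmployeeCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "employee count");
        job.setJarByClass(EmployeeCount.class);
        job.setMapperClass(EmployeeMapper.class);
        job.setPartitionerClass(EmployeePartitioner.class);
        job.setReducerClass(EmployeeReducer.class);
        // The map output types differ from the final output types, so declare both.
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        // Two reduce tasks, one per partition.
        job.setNumReduceTasks(2);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

When running the program, pass the input file and the output directory as program arguments. For example:

```
hadoop jar EmployeeCount.jar input/employee.csv output
```

If the jar's manifest does not name a main class, pass `EmployeeCount` as the first argument after the jar name. The output directory must not exist before the job starts.

The results are saved in the output folder and can be viewed with the cat command:

```
hadoop fs -cat output/part-r-00000
hadoop fs -cat output/part-r-00001
```

Here part-r-00000 is the output of partition 0 and part-r-00001 is the output of partition 1.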
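The counting and partitioning rules can be checked locally before submitting a job. The following is a minimal sketch in plain Java with no Hadoop dependency. The class `LocalEmployeeCount`, its helper methods, and the sample rows are hypothetical and only for illustration. It assumes the same input layout as the mapper above (comma-separated, hire year in the second field), applies the same `< 1982` rule as `EmployeePartitioner`, and renders the output labels in English.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local stand-in for the job; not part of the Hadoop program.
public class LocalEmployeeCount {

    // Same rule as EmployeePartitioner: years before 1982 go to partition 0.
    public static int partitionOf(int year) {
        return year < 1982 ? 0 : 1;
    }

    // Stand-in for map + reduce: count records per year (year assumed in field 1).
    public static Map<Integer, Integer> countByYear(List<String> lines) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            int year = Integer.parseInt(fields[1].trim());
            counts.merge(year, 1, Integer::sum);
        }
        return counts;
    }

    // One output line per year, labels rendered in English.
    public static String formatLine(int year, int count) {
        return "Year: " + year + " Count: " + count;
    }

    // Group formatted lines by partition index, mimicking part-r-00000 and part-r-00001.
    public static List<List<String>> partitionedOutput(List<String> lines) {
        List<List<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>());
        parts.add(new ArrayList<>());
        for (Map.Entry<Integer, Integer> e : countByYear(lines).entrySet()) {
            parts.get(partitionOf(e.getKey())).add(formatLine(e.getKey(), e.getValue()));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows in the form name,year.
        List<String> sample = List.of("SMITH,1980", "ALLEN,1981", "WARD,1981", "KING,1982");
        List<List<String>> parts = partitionedOutput(sample);
        for (int p = 0; p < parts.size(); p++) {
            System.out.println("partition " + p + ": " + parts.get(p));
        }
    }
}
```

Running `main` prints the lines each partition would receive, which makes it easy to confirm that 1981 lands in partition 0 and 1982 in partition 1 before moving to the cluster.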
