23.1.3. CRC ...............................................................................................................................256
23.1.4. MD5 ...............................................................................................................................256
24. Distributed Cache ......................................................................257
24.1.1. Cache Avalanche ...............................................................257
24.1.2. Cache Penetration ...............................................................257
24.1.3. Cache Warm-up ...............................................................257
24.1.4. Cache Update ...............................................................257
24.1.5. Cache Degradation ...............................................................257
25. HADOOP..................................................................................................................................259
25.1.1. Concepts ...................................................................259
25.1.2. HDFS .............................................................................................................................259
25.1.2.1. Client.............................................................................................................259
25.1.2.2. NameNode....................................................................................................259
25.1.2.3. Secondary NameNode...........................................................................................259
25.1.2.4. DataNode....................................................................................................................259
25.1.3. MapReduce ...................................................................................................................260
25.1.3.1. Client.............................................................................................................................260
25.1.3.2. JobTracker...................................................................................................................260
25.1.3.3. TaskTracker.................................................................................................................261
25.1.3.4. Task ...............................................................................................................................261
25.1.3.5. Reduce Task Execution Process .............................................................................................261
25.1.4. Hadoop MapReduce Job Lifecycle .......................................262
1. Job submission and initialization ........................................................262
2. Task scheduling and monitoring ........................................................262
3. Task runtime environment preparation ........................................................262
4. Task execution ................................................................262
5. Job completion ..............................................................262
26. SPARK .....................................................................................................................................263
26.1.1. Concepts ...................................................................263
26.1.2. Core Architecture ...............................................................263
Spark Core ............................................................................................................................................263
Spark SQL.............................................................................................................................................263
Spark Streaming....................................................................................................................................263
MLlib .......................................................................................................................................................263
GraphX ..................................................................................................................................................263
26.1.3. Core Components ...............................................................264
Cluster Manager: controls the entire cluster and monitors workers .......................................................................................................264
Worker node: responsible for controlling compute nodes ..............................................................................................................................264
Driver: runs the Application's main() function .....................................................264
Executor: a process launched on a worker node on behalf of an Application .....................264
26.1.4. SPARK Programming Model ........................................................264
26.1.5. SPARK Computation Model ........................................................265
26.1.6. SPARK Execution Flow ........................................................266
1. Build the runtime environment for the Spark Application and start the SparkContext......................................267
2. SparkContext requests Executor resources from the resource manager (Standalone, Mesos, or YARN) and starts StandaloneExecutorBackend ..........................................................267
3. Executors request Tasks from the SparkContext....................................................................267
4. SparkContext distributes the application code to the Executors ...........................................................267
5. SparkContext builds the DAG, splits it into Stages, and sends each TaskSet to the Task Scheduler, which finally dispatches the Tasks to Executors to run ...........................................267
6. Tasks run on the Executors and release all resources when finished ............................267
26.1.7. SPARK RDD Flow ........................................................267
26.1.8. SPARK RDD..................................................................................................................267
(1) Ways to create an RDD ........................................................267