Spark Autotuning: 藏经阁 File Optimization Handbook

Updated 2023-11-24 · PDF, 1.59 MB

The "藏经阁-Spark Autotuning.pdf" file covers three topics: the motivation for using Spark, the difficulty of tuning Spark configurations by hand, and future enhancements to Spark Autotuning. It notes that Spark is used at every stage of data processing, including ETL, feature engineering, model training, and model scoring. Data scientists nevertheless have to set the size and number of drivers, executors, and partitions manually, which pushes them into trial and error. The file describes this process as time-consuming and inefficient: a job can run for hours before it fails with an OOM (Out of Memory) error. Manual tuning is less of a problem for operationalized jobs that run the same analysis unchanged on new data every hour, day, or week, because there the time and resources spent on fine-tuning pay off.

For everything else, the file argues that Spark tuning should be automated and proposes Spark Autotuning as the way to do it, with future enhancements expected to address the remaining challenges.

Inefficient

•  Manual Spark tuning is a time consuming and inefficient process
   –  Frequently results in “trial-and-error” approach
   –  Can be hours (or more) before OOM fails occur
•  Less problematic for unchanging operationalized jobs
   –  Run same analysis every hour/day/week on new data
   –  Worth spending time & resources to fine tune
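
To make the manual settings mentioned above concrete, here is a minimal sketch of what hand tuning tends to look like in practice. The property names are standard Spark configuration keys, but every value, the `job.py` script name, and both helper functions are assumptions made for illustration; none of them come from the file. The second helper only enumerates candidate settings, to show how quickly a trial-and-error search grows.

```python
from itertools import product

# Hand-picked values of the kind the summary describes. The numbers are
# illustrative assumptions, not recommendations from the source.
MANUAL_CONF = {
    "spark.driver.memory": "4g",
    "spark.executor.instances": "10",
    "spark.executor.cores": "4",
    "spark.executor.memory": "8g",
    "spark.sql.shuffle.partitions": "200",
}


def to_submit_args(conf):
    """Render a config dict as a list of spark-submit --conf arguments."""
    args = []
    for key in sorted(conf):
        args.extend(["--conf", f"{key}={conf[key]}"])
    return args


def trial_grid(base_conf, executor_memories, partition_counts):
    """Enumerate the configs a manual trial-and-error search would walk through."""
    trials = []
    for memory, partitions in product(executor_memories, partition_counts):
        conf = dict(base_conf)  # copy so the base settings stay untouched
        conf["spark.executor.memory"] = memory
        conf["spark.sql.shuffle.partitions"] = str(partitions)
        trials.append(conf)
    return trials


if __name__ == "__main__":
    # "job.py" is a hypothetical application script.
    print(" ".join(["spark-submit"] + to_submit_args(MANUAL_CONF) + ["job.py"]))
    grid = trial_grid(MANUAL_CONF, ["4g", "8g", "16g"], [100, 200, 400])
    print(len(grid), "candidate runs")
```

Varying only two of the five settings across three values each already yields nine candidate runs. If each run can take hours before an OOM failure shows up, as the slide notes, the cost of searching by hand is easy to see.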


