300-Slide Deck on Advanced Spark DevOps Techniques

DEVOPS ADVANCED CLASS
June 2015: Spark Summit West 2015
http://training.databricks.com/devops.pdf
www.linkedin.com/in/blueplastic

making big data simple
Databricks Cloud:
“A unified platform for building Big Data pipelines
– from ETL to Exploration and Dashboards, to
Advanced Analytics and Data Products.”
• Founded in late 2013
• by the creators of Apache Spark
• Original team from UC Berkeley AMPLab
• Raised $47 Million in 2 rounds
• ~55 employees
• We’re hiring! (http://databricks.workable.com)
• Level 2/3 support partnerships with Hortonworks, MapR, and DataStax

The Databricks team contributed more than 75% of the code added to Spark in the past year

AGENDA
Before Lunch
• History of Spark
• RDD fundamentals
• Spark Runtime Architecture
  Integration with Resource Managers (Standalone, YARN)
• GUIs
• Lab: DevOps 101
After Lunch
• Memory and Persistence
• Jobs -> Stages -> Tasks
• Broadcast Variables and Accumulators
• PySpark
• DevOps 102
• Shuffle
• Spark Streaming

Some slides will be skipped
Please keep Q&A low during class
(5pm – 5:30pm for Q&A with instructor)
2 anonymous surveys: Pre and Post class
Lunch: noon – 1pm
2 breaks (before lunch and after lunch)