Deep Dive into Spark SQL with Advanced Performance Tuning
Spark SQL is a highly scalable and efficient relational processing engine with easy-to-use APIs and mid-query fault tolerance. It is a core module of Apache Spark. Spark SQL can process, integrate, and analyze data from diverse data sources (e.g., Hive, Cassandra, Kafka, and Oracle) and file formats (e.g., Parquet, ORC, CSV, and JSON). This talk dives into the technical details of Spark SQL across the entire lifecycle of a query execution. The audience will gain a deeper understanding of Spark SQL and learn how to tune its performance.

Deep Dive Into Spark SQL
with Advanced Performance Tuning
Xiao Li & Wenchen Fan
Spark Summit | SF | Jun 2018

About Us
• Software Engineers at Databricks
• Apache Spark Committers and PMC Members
Xiao Li (GitHub: gatorsmile)
Wenchen Fan (GitHub: cloud-fan)

Databricks’ Unified Analytics Platform
DATABRICKS RUNTIME
COLLABORATIVE NOTEBOOKS
Delta SQL Streaming
Powered by
Data Engineers
Data Scientists
CLOUD NATIVE SERVICE
Unifies Data Engineers and Data Scientists
Unifies Data and AI Technologies
Eliminates infrastructure complexity

Spark SQL
A highly scalable and efficient relational processing engine with easy-to-use APIs and mid-query fault tolerance.

Run Everywhere
Processes, integrates, and analyzes the data from diverse data sources (e.g., Cassandra, Kafka, and Oracle) and file formats (e.g., Parquet, ORC, CSV, and JSON).