Learning Spark.pdf, published 2015, high resolution (learningspark Chinese edition)

10.69 MB PDF, updated 2023-03-16


PROGRAMMING LANGUAGES/SPARK

Learning Spark

ISBN: 978-1-449-35862-4

US $39.99 CAN $45.99

“Learning Spark is at the top of my list for anyone needing a gentle guide to the most popular framework for building big data applications.”

—Ben Lorica

Chief Data Scientist, O’Reilly Media

Twitter: @oreillymedia

facebook.com/oreilly

Data in all domains is getting bigger. How can you work with it efficiently? This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala.

Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.

■ Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
■ Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
■ Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
■ Learn how to deploy interactive, batch, and streaming applications
■ Connect to data sources including HDFS, Hive, JSON, and S3
■ Master advanced topics like data partitioning and shared variables

Holden Karau, a software development engineer at Databricks, is active in open source and the author of Fast Data Processing with Spark (Packt Publishing).

Andy Konwinski, co-founder of Databricks, is a committer on Apache Spark and co-creator of the Apache Mesos project.

Patrick Wendell is a co-founder of Databricks and a committer on Apache Spark. He also maintains several subsystems of Spark’s core engine.

Matei Zaharia, CTO at Databricks, is the creator of Apache Spark and serves as its Vice President at Apache.

Learning Spark: Lightning-Fast Data Analysis

Holden Karau, Andy Konwinski, Patrick Wendell & Matei Zaharia

Table of Contents

Foreword

Preface

1. Introduction to Data Analysis with Spark
    What Is Apache Spark?
    A Unified Stack
        Spark Core
        Spark SQL
        Spark Streaming
        MLlib
        GraphX
        Cluster Managers
    Who Uses Spark, and for What?
        Data Science Tasks
        Data Processing Applications
    A Brief History of Spark
    Spark Versions and Releases
    Storage Layers for Spark

2. Downloading Spark and Getting Started
    Downloading Spark
    Introduction to Spark’s Python and Scala Shells
    Introduction to Core Spark Concepts
    Standalone Applications
        Initializing a SparkContext
        Building Standalone Applications
    Conclusion
