Learning Spark: Lightning-Fast Big Data Analysis 1st Edition (English, Paperback, KARAU)

Price: Not Available

Currently Unavailable

Author

KARAU

Highlights

Language: English
Binding: Paperback
Publisher: Shroff/O'Reilly
ISBN: 9789351109945, 9351109941
Edition: 1st Edition, 2015
Pages: 296

Description

Data in all domains is getting bigger. How can you work with it efficiently? This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala.

Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
* Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
* Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
* Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
* Learn how to deploy interactive, batch, and streaming applications
* Connect to data sources including HDFS, Hive, JSON, and S3
* Master advanced topics like data partitioning and shared variables

About the Authors

Holden Karau is a software development engineer at Databricks and is active in open source. She is the author of an earlier Spark book. Prior to Databricks, she worked on a variety of search and classification problems at Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a Bachelor of Mathematics in Computer Science. Outside of software, she enjoys playing with fire, welding, and hula hooping.

Most recently, Andy Konwinski co-founded Databricks. Before that he was a PhD student and then postdoc in the AMPLab at UC Berkeley, focused on large scale distributed computing and cluster scheduling.

He co-created and is a committer on the Apache Mesos project. He also worked with systems engineers and researchers at Google on the design of Omega, their next generation cluster scheduling system. More recently, he developed and led the AMP Camp Big Data Bootcamps and first Spark Summit, and has been contributing to the Spark project.

Patrick Wendell is an engineer at Databricks as well as a Spark committer and PMC member. In the Spark project, Patrick has acted as release manager for several Spark releases, including Spark 1.0. Patrick also maintains several subsystems of Spark's core engine. Before helping start Databricks, Patrick obtained an M.S. in Computer Science at UC Berkeley. His research focused on low-latency scheduling for large-scale analytics workloads. He holds a B.S.E. in Computer Science from Princeton University.

Matei Zaharia is the creator of Apache Spark and CTO at Databricks. He holds a PhD from UC Berkeley, where he started Spark as a research project. He now serves as its Vice President at Apache. Apart from Spark, he has made research and open source contributions to other projects in the cluster computing area, including Apache Hadoop (where he is a committer) and Apache Mesos (which he also helped start at Berkeley).

Specifications

Book Details

Publication Year

2015

Contributors

Author

KARAU

Ratings & Reviews

4.5★

164 Ratings & 9 Reviews

The Best Book For Self Learners

I was already trained in Hadoop, so with that knowledge I wanted to learn Spark on my own. I have gone through many materials, including the official documentation page and AMPLab. I have also taken a couple of video trainings on Pluralsight. But my confidence level was low with those materials. This book perfectly matched my need and addressed the core of Spark.

* Covers the basics & core in detail
* Focuses on all three languages (Python, Scala & Java)
* Covers the advanced top...

Saravanan Subramanian

Certified Buyer, Chennai

Jan, 2016

Excellent

Nice book 👍

Flipkart Customer

Certified Buyer, New Delhi

Sep, 2021

Perfect product!

It is a great book for Spark. For beginners with zero knowledge, this book will help you get a good knowledge of Spark.
This book has to be supplemented with the definitive guide to Spark.

Siddrameshwar

Certified Buyer, Bangalore

Aug, 2020

Useless product

Paper quality is poor and it looks like a photocopy.

Shailendra Verma

Certified Buyer, Gurugram

Jan, 2020

Best in the market!

The book is good. It (Spark) has been written with the help of Java, Python, and Scala.

Sudarshan INDIA

Certified Buyer, Bangalore

Feb, 2019

Just wow! Nice book

very nice book

Avinash waghole

Certified Buyer, Pune

Jan, 2019

Worth every penny

The book arrived well in time and in well-packaged condition.

Flipkart Customer

Certified Buyer, Gautam Buddha Nagar

May, 2018

Just wow!

Good

Ayyappa Mandadi

Certified Buyer, Hyderabad

Apr, 2018

Review from Rimondi

This is a very informative book available on flipkart.

Rimondi Ram Muvva

Certified Buyer, Hyderabad

Aug, 2016

Questions and Answers

Q: This version is from 2015. Isn't it too old now, keeping in mind that Spark is moving from RDDs to DataFrames? Does it have significant coverage of Spark SQL and DataFrames?

A: No, Spark SQL is not elaborated; just an intro is given.

Sudarshan INDIA

Certified Buyer
