Apache Spark Ecosystem

Introduction to Apache Spark

Apache Spark™ is a unified analytics engine for large-scale data processing, developed at UC Berkeley in 2009. It has been adopted rapidly across a wide range of industries, especially those that process data at massive scale. Apache Spark can process multiple petabytes of data on clusters of over 8,000 nodes. It is an open source project supported by over 1,000 contributors from over 250 organizations.

Image from Databricks

Slides

slides - pdf

slides - pptx

Apache Spark is known for:

Credit: Apache Spark
