Apache Spark Architecture

Apache Spark is an open-source, general-purpose distributed cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Spark can run workloads up to 100 times faster in memory, and up to 10 times faster on disk, than Hadoop MapReduce. It is a general-purpose distributed computing engine used for processing and …
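To make "implicit data parallelism" concrete, here is a toy sketch in plain Python. It is not Spark's API and involves no cluster: the helpers `split_into_partitions` and `parallel_map_reduce` are hypothetical names invented for this illustration, and a local thread pool stands in for cluster workers. The point is only the shape of the idea: the caller supplies a per-element function and a combining function, and the partitioning and scheduling happen behind the scenes.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_partitions(data, num_partitions):
    """Split a non-empty list into contiguous chunks (hypothetical helper)."""
    size = -(-len(data) // num_partitions)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_map_reduce(data, map_fn, reduce_fn, num_partitions=4):
    """Apply map_fn to every element and combine the results with reduce_fn.

    The caller never mentions threads or partitions explicitly; that is the
    "implicit" part. Assumes data is non-empty and reduce_fn is associative.
    """
    partitions = split_into_partitions(data, num_partitions)

    def process(partition):
        # Each worker maps and partially reduces its own partition.
        return reduce(reduce_fn, (map_fn(x) for x in partition))

    # A thread pool stands in for the cluster's workers in this sketch.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(process, partitions))

    # Combine the per-partition results into the final answer.
    return reduce(reduce_fn, partials)


# Sum of squares of 1..100, computed partition by partition.
total = parallel_map_reduce(list(range(1, 101)), lambda x: x * x, lambda a, b: a + b)
print(total)
```

In a real Spark program the same pattern would be expressed through Spark's own interfaces, with the engine distributing partitions across machines and recomputing lost ones to provide fault tolerance, which this sketch does not attempt.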