Month: June 2019

Deep Understanding of SparkContext & Application’s Driver Process

DRIVER: A Spark driver is a JVM process that hosts the SparkContext for a Spark application. It acts as the master of the application. The driver (an application's driver process) splits a Spark application into tasks and schedules them to run on executors. It is the driver's responsibility to coordinate with workers and also manage …
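To make the split-and-schedule idea concrete, here is a toy sketch in plain Python. It does not use Spark or any Spark API: the names `split_into_partitions` and `run_job` are hypothetical, and a thread pool stands in for executors. It only illustrates the general pattern of a driver cutting a job into one task per partition, handing the tasks to workers, and combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Slice the input into roughly equal partitions (one task per partition)."""
    size, extra = divmod(len(data), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        # The first `extra` partitions each take one additional element.
        end = start + size + (1 if i < extra else 0)
        partitions.append(data[start:end])
        start = end
    return partitions


def run_job(data, task_fn, num_partitions=4, num_executors=2):
    """Toy 'driver': build one task per partition and run them on a worker pool.

    The thread pool is only a stand-in for executors; real Spark executors are
    separate JVM processes, usually on other machines.
    """
    partitions = split_into_partitions(data, num_partitions)
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        # map() returns results in partition order, whichever worker finishes first.
        return list(pool.map(task_fn, partitions))


# Each task sums its own partition; the 'driver' then combines the partial sums.
partial_sums = run_job(list(range(1, 11)), sum)
total = sum(partial_sums)
print(partial_sums, total)
```

The point of the sketch is the division of labour: the caller describes what to compute per partition, while the driver-like function decides how the data is split and where each task runs.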


Apache Spark Architecture

Apache Spark is an open-source, general-purpose distributed cluster-computing framework. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Spark's performance is reported to be up to 100 times faster in memory and 10 times faster on disk when compared to Hadoop MapReduce. It is a general-purpose distributed computing engine used for processing and …
