Spark

Calculating executor memory, number of Executors & Cores per executor for a Spark Application

To get good performance from a Spark application, it is important to understand resource allocation and the Spark tuning process. This article helps you understand how to calculate the number of executors, the executor memory, and the number of cores per executor that your application needs. Below is a sample spark-submit command: ./bin/spark-submit --class <class_name> …
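The excerpt above is cut off before it shows the calculation, so the following is only a minimal sketch of one widely used rule of thumb for YARN clusters, not necessarily the method the full article describes. The function name `plan_executors`, the `ExecutorPlan` type, and the default values (5 cores per executor, 1 core and 1 GB per node reserved for the OS and daemons, 10% memory overhead, one executor slot given up for the application master) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExecutorPlan:
    num_executors: int
    cores_per_executor: int
    executor_memory_gb: int


def plan_executors(nodes, cores_per_node, mem_per_node_gb,
                   cores_per_executor=5, reserved_cores=1,
                   reserved_mem_gb=1, overhead_fraction=0.10):
    """Hypothetical helper: rule-of-thumb sizing, all defaults are assumptions."""
    # Leave some cores on every node for the OS and cluster daemons.
    usable_cores = cores_per_node - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("not enough cores per node for one executor")

    # Give up one executor slot cluster-wide for the application master.
    num_executors = nodes * executors_per_node - 1

    # Split the usable node memory evenly across the executors on that node.
    slot_gb = (mem_per_node_gb - reserved_mem_gb) / executors_per_node

    # The slot must hold the heap plus off-heap overhead (a fraction of the heap).
    heap_gb = int(slot_gb / (1 + overhead_fraction))

    return ExecutorPlan(num_executors, cores_per_executor, heap_gb)


def to_submit_flags(plan):
    # Render the plan as the corresponding spark-submit options.
    return (f"--num-executors {plan.num_executors} "
            f"--executor-cores {plan.cores_per_executor} "
            f"--executor-memory {plan.executor_memory_gb}G")


if __name__ == "__main__":
    # Example cluster (assumed): 10 nodes, 16 cores and 64 GB of RAM each.
    print(to_submit_flags(plan_executors(10, 16, 64)))
```

For the assumed 10-node cluster, each node has 15 usable cores, which fits 3 executors of 5 cores; 30 slots minus one gives 29 executors, and each 21 GB slot leaves about 19 GB of heap after overhead. Treat these numbers as a starting point for tuning rather than a fixed answer.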


What is a Spark Executor?

An executor is a distributed agent that is responsible for executing tasks. Executors are managed by an executor backend (ExecutorBackend is a pluggable interface that TaskRunners use to send task status updates to a scheduler). Executors report heartbeats to the HeartbeatReceiver RPC endpoint on the driver. Executors provide in-memory storage for RDDs via the BlockManager. BlockManager is …


Deep Understanding of SparkContext & Application’s Driver Process

Driver: A Spark driver is a JVM process that hosts the SparkContext for a Spark application. It is the master node in a Spark application. The driver (an application’s driver process) splits a Spark application into tasks and schedules them to run on executors. It is the driver's responsibility to coordinate with workers and also manage …


Apache Spark Architecture

Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. For some workloads, Spark can run up to 100 times faster in memory and 10 times faster on disk than Hadoop MapReduce. It is a general-purpose distributed computing engine used for processing and …
