Diagram Raymond Raymond

Spark Memory Management Overview

2022-03-27

This diagram gives an overview of Spark memory management when running on YARN. It helps you understand how Spark executor memory is allocated and how it is used.

A Spark executor uses two types of memory:

  • Execution memory - refers to that used for computation in shuffles, joins, sorts and aggregations;
  • Storage memory - refers to that used for caching and propagating internal data across the cluster.

Execution and storage share a unified region (M). When no storage memory is used, execution can use all of the available memory in M, and vice versa.

The sizes of these regions are controlled by two configuration items:

  • spark.memory.fraction expresses the size of M as a fraction of the (JVM heap space - 300MiB) (default 0.6). The rest of the space (40%) is reserved for user data structures, internal metadata in Spark, and safeguarding against OOM errors.
  • spark.memory.storageFraction expresses the size of R as a fraction of M (default 0.5). R is the storage space within M where cached blocks are immune to being evicted by execution.
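
The arithmetic behind these two settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name unified_memory and the 4096 MiB heap are hypothetical, and the heap size that the JVM actually reports is usually a little smaller than the configured executor memory, so real numbers will differ slightly.

```python
RESERVED_MIB = 300  # fixed amount subtracted from the JVM heap before the fractions apply


def unified_memory(heap_mib, memory_fraction=0.6, storage_fraction=0.5):
    """Return the sizes (in MiB) of M, R and the remaining user space.

    heap_mib         -- JVM heap space of the executor (assumed value, in MiB)
    memory_fraction  -- value of spark.memory.fraction
    storage_fraction -- value of spark.memory.storageFraction
    """
    usable = heap_mib - RESERVED_MIB
    if usable <= 0:
        raise ValueError("heap must be larger than the 300 MiB reservation")
    m = usable * memory_fraction   # unified execution + storage region
    r = m * storage_fraction       # part of M protected from eviction by execution
    user = usable - m              # user data structures, internal metadata, OOM safeguard
    return {"M": m, "R": r, "user": user}


# Hypothetical executor with a 4096 MiB heap and the default fractions.
sizes = unified_memory(4096)
for name, mib in sizes.items():
    print(f"{name}: {mib:.1f} MiB")
# M: 2277.6 MiB
# R: 1138.8 MiB
# user: 1518.4 MiB
```

With the defaults, M is 60% of (heap - 300MiB) and R is half of M, so raising spark.memory.fraction grows both regions while raising spark.memory.storageFraction only grows the protected storage part inside M.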