I'm new to PySpark and I'm trying to use PySpark 2.3.1 on my local machine with Jupyter Notebook.
I want to set spark.driver.memory to 9 GB by doing this:
    from pyspark.sql import SparkSession, SQLContext

    spark = SparkSession.builder \
        .master("local[2]") \
        .appName("test") \
        .config("spark.driver.memory", "9g") \
        .getOrCreate()

    sc = spark.sparkContext
    sqlContext = SQLContext(sc)

    spark.sparkContext._conf.getAll()  # check the config
It returns
    [('spark.driver.memory', '9g'),
     ('spark.driver.cores', '4'),
     ('spark.rdd.compress', 'True'),
     ('spark.driver.port', '15611'),
     ('spark.serializer.objectStreamReset', '100'),
     ('spark.app.name', 'test'),
     ('spark.executor.id', 'driver'),
     ('spark.submit.deployMode', 'client'),
     ('spark.ui.showConsoleProgress', 'true'),
     ('spark.master', 'local[2]'),
     ('spark.app.id', 'local-xyz'),
     ('spark.driver.host', '0.0.0.0')]
This is quite weird, because when I look at the documentation, it says:

"Note: In client mode, this config must not be set through the SparkConf directly in your application, because the driver JVM has already started at that point. Instead, please set this through the --driver-memory command line option or in your default properties file." (document here)
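If I read that correctly, spark.driver.memory has to be fixed before the driver JVM launches, so in a notebook I would expect to have to set it through something like PYSPARK_SUBMIT_ARGS before any session is created. This is just a sketch of my understanding, not something I've confirmed behaves differently:

    import os

    # Must run before any SparkSession/SparkContext exists in this kernel,
    # since the flag only takes effect when the driver JVM is launched.
    os.environ["PYSPARK_SUBMIT_ARGS"] = "--driver-memory 9g pyspark-shell"

    from pyspark.sql import SparkSession
    spark = SparkSession.builder.master("local[2]").appName("test").getOrCreate()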
But, as you can see in the result above, spark.sparkContext._conf.getAll() returns

    ('spark.driver.memory', '9g')
Even when I open the Spark web UI (port 4040, Environment tab), it still shows spark.driver.memory as 9g.
I tried one more time with 'spark.driver.memory' set to '10g'. Both the web UI and spark.sparkContext._conf.getAll() returned '10g'. I'm quite confused about this.

My questions are:
1. Is the documentation right that spark.driver.memory must not be set through SparkConf in client mode?
2. If the documentation is right, is there a proper way to check spark.driver.memory after configuring it? I tried spark.sparkContext._conf.getAll() as well as the Spark web UI, but both seem to lead to the wrong answer. The only cross-check I could come up with is asking the driver JVM directly, sketched below.
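For reference, this is the check I mean: it asks the driver JVM for its actual max heap through py4j, bypassing the conf values entirely. Note that _jvm is an internal handle, so I'm treating this as a diagnostic hack rather than a proper API:

    # Ask the driver JVM for its real maximum heap size, in bytes.
    # Runtime.maxMemory() reflects roughly the -Xmx the JVM actually started
    # with, so it should reveal whether the .config() call had any effect.
    runtime = spark.sparkContext._jvm.java.lang.Runtime.getRuntime()
    print("max heap: %.2f GB" % (float(runtime.maxMemory()) / 1024 ** 3))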