I have a config file that holds all the configuration, a shell script that contains the spark-submit command, and the actual PySpark script. I want to apply a filter on a field of a DataFrame in the Python script.
Example

Config file:
sub_list = 'math', 'eng'... etc., something like this.

PySpark script:
Df = spark.read.parquet(hdfspath)
Df.select("Id", "sub", "name").filter(df.id.isin(sub_list))
In the config file, I want a parameter called sub_list that holds multiple values. It will eventually be read by the shell script, which issues the spark-submit command to run the PySpark script.
I tried this in the config file, but the filtering is not applied:
sub_list = "'abc', 'def', 'ghi'"
What is the exact format for passing a list as an argument in the config file?
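One possible direction, sketched under assumptions rather than offered as a confirmed fix: a value read from a config file and passed through spark-submit reaches the Python script as a single string, and isin() given one string compares the column against that whole string instead of against three separate values. The helper below, parse_sub_list, is a hypothetical name, and the idea that the value arrives as the first command-line argument is also an assumption. It splits the string on commas and strips quotes so the result is a real Python list.

```python
def parse_sub_list(raw):
    """Turn a string such as "'abc', 'def', 'ghi'" or "abc,def,ghi" into a list of values."""
    items = []
    for part in raw.split(","):
        # Drop surrounding whitespace and any single or double quotes.
        value = part.strip().strip("'\"").strip()
        if value:
            items.append(value)
    return items


# Assumption: the shell script passes the config value as the first argument,
# e.g.  spark-submit script.py "$sub_list"
# in which case the script would read it with:  raw = sys.argv[1]
raw = "'abc', 'def', 'ghi'"
sub_list = parse_sub_list(raw)

# The list could then be handed to isin(), for example (column name is an assumption):
#   df.select("Id", "sub", "name").filter(df.sub.isin(sub_list))
```

With this kind of parsing, the config entry can stay a plain comma-separated string such as sub_list="abc,def,ghi", and the quoting inside the value no longer matters.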