Below is the data in my CSV file:
empid,empname,empsal,empdept,empblock
1,abc,2000,cse,A
2,def,1000,ece,C
3,ghi,8000,eee,D
4,jkl,4000,ece,B
5,mno,3000,itd,F
6,pqr,6000,mec,C
1) Running the statement below creates one job in the Spark UI, apparently to determine the column names from the header, even though load() is not an action as far as I know. Attached below is the job created in the Spark UI.
df1 = spark.read.format("csv").option("header", True).load("csv_file_location")
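As an aside, my understanding is that supplying an explicit schema should avoid this job, since Spark would no longer need to read the file up front to determine the column names. A minimal sketch (the field types here are my guesses from the sample data above):

from pyspark.sql.types import StructType, StructField, IntegerType, StringType

schema = StructType([
    StructField("empid", IntegerType(), True),
    StructField("empname", StringType(), True),
    StructField("empsal", IntegerType(), True),
    StructField("empdept", StringType(), True),
    StructField("empblock", StringType(), True),
])

# header=True still makes Spark skip the header row; with an explicit
# schema, no job should be triggered at load time
df1 = spark.read.format("csv").option("header", True).schema(schema).load("csv_file_location")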
2) Running the statement below does not create any job at this point, since it is a transformation and is evaluated lazily.
from pyspark.sql.functions import avg, col

x = df1.groupBy("empblock").agg(avg("empsal").alias("avgsal")).filter(col("avgsal") > 2000).orderBy("empblock")
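If I understand correctly, explain() only compiles the query plan without executing it, so it can be used to confirm that nothing has run yet:

x.explain()  # prints the physical plan; no new job appears in the Spark UI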
3) When I run the statement below, it creates 2 jobs. Isn't one action supposed to create one job? What is the reason for multiple jobs being created? Does the number of jobs not depend on the number of actions called?
x.show()
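For reference, here is a sketch of how I counted the jobs programmatically rather than eyeballing the Spark UI (this assumes the default job group, i.e. no setJobGroup() calls anywhere in the session):

tracker = spark.sparkContext.statusTracker()
before = len(tracker.getJobIdsForGroup())  # job ids seen so far in the default group
x.show()
after = len(tracker.getJobIdsForGroup())
print("show() triggered", after - before, "job(s)")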