Spark : in-memory computation. Spark extracts data from the hard disk and stores it in RAM; the intermediate result of each step stays in RAM, and only after the job completes is the output written back to the hard disk.
Hadoop MapReduce : performs an operation and writes the result to the hard drive at every step. Because it reads from and writes to the hard disk at every step of the job, latency is high.
Lazy execution : when we apply a function to read the data, Spark does not actually read it, because we are not performing any operation yet. It does not read the data until we perform some computation (an action). By contrast, in pandas, pd.read_csv reads the data immediately and stores it in RAM. (See the sketches after these notes.)
Parallel processing : the data is distributed across the cluster and stored on different nodes, so it can be processed in parallel.
Batch processing and real-time processing : e.g. credit card transactions, classified as genuine or fake in real time.
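
A minimal sketch of the lazy-execution note above, assuming a local PySpark setup. The file name "transactions.csv" and the columns txn_id / amount are hypothetical, used only for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("lazy-execution-demo").getOrCreate()

# Hypothetical schema; giving it explicitly means Spark does not need to
# scan the file to infer column types.
schema = StructType([
    StructField("txn_id", StringType()),
    StructField("amount", DoubleType()),
])

# Transformations only: Spark records an execution plan but does not
# read the data yet.
df = spark.read.csv("transactions.csv", schema=schema, header=True)
large = df.filter(F.col("amount") > 1000)   # still lazy, nothing executed

# Action: only now does Spark actually read the file and run the computation.
print(large.count())

spark.stop()
```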
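
For contrast, a pandas read is eager, as the notes say: pd.read_csv loads the whole file into RAM as soon as it is called (same hypothetical file and column as above).

```python
import pandas as pd

pdf = pd.read_csv("transactions.csv")   # data is already fully in memory here
print((pdf["amount"] > 1000).sum())     # count of large transactions
```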