Affiliations: Information Science Dept., Acharya Doctor Sarvepalli Radhakrishnan Rd, Bengaluru, Karnataka 560107, India.
DOI: 10.22362/ijcert/2017/v4/i6/xxxx [UNDER PROCESS]
Big data is growing rapidly in volume, variability, and velocity, which makes it difficult to capture, process, and analyze. Hadoop processes data with MapReduce, which has two phases, Map and Reduce, whereas Spark uses Resilient Distributed Datasets (RDDs) and a Directed Acyclic Graph (DAG) to process large datasets. Both use the Hadoop Distributed File System (HDFS) for storage. This paper presents the architecture and working of Hadoop and Spark, brings out the differences between them, describes the challenges MapReduce faces when processing large datasets, and explains how Spark runs on Hadoop YARN.
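To illustrate the two MapReduce phases named in the abstract, the following is a minimal single-machine sketch in plain Python. It does not use the Hadoop or Spark APIs, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

In a real cluster the map and reduce steps would run in parallel on separate nodes, with the intermediate pairs moved between them by the framework; here everything runs in one process purely to show the data flow.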
Priya Dahiya et al., “Survey on Big Data using Apache Hadoop and Spark”, International Journal of Computer Engineering In Research Trends, 4(6), pp. 195-201, June 2017.
Keywords: Big data, Spark, Hadoop, HDFS, MapReduce, YARN