
A Survey on Data Mapping Strategy for Data Stored in the Storage Cloud

Swathi K

Abstract


In the recent past, the volume of data processed over the internet has grown exponentially, making it difficult to store such large amounts of data and computationally inefficient to analyze them. There is currently considerable enthusiasm for the MapReduce paradigm for large-scale data analysis. Inspired by functional programming, it allows distributed computations over massive amounts of data to be expressed. It is designed for large-scale data processing and runs on clusters of commodity hardware. As a prominent parallel data processing tool, MapReduce is gaining significant momentum in both industry and academia as the volume of data to analyze grows rapidly. In this paper we propose a method to process huge amounts of data over the internet. The method involves storing the data to be processed in the cloud and processing it in a Hadoop multi-cluster environment.
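To illustrate the functional-programming idea behind the paradigm, the following is a minimal single-machine sketch of a word count written as map, shuffle, and reduce steps. It is not the method proposed in the paper and does not use Hadoop; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: fold the values of one key into a single count."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run the three steps in sequence on an in-memory list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

In a real cluster the map and reduce calls would run in parallel on different nodes, with the framework handling the shuffle; here everything runs in one process so that only the data flow is visible.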




