PROGRESSIVE DUPLICATE DETECTION

Mr. BETKAR AKSHAY SURESH; Mrs. N. SUJATHA

Published: Jun 28, 2016

Keywords:

Data Duplicity Detection, Progressive deduplication, PSNM, Data Mining

Abstract

Duplicate detection is a difficult problem in applications such as personal details management, customer relationship management, and data mining. This survey covers duplicate record detection techniques for both small and large datasets. To detect duplicates in less execution time and without degrading dataset quality, techniques such as Progressive Blocking and the Progressive Sorted Neighborhood Method are used. The Progressive Sorted Neighborhood Method (PSNM) is used in this model to detect duplicates in a parallel approach. The Progressive Blocking algorithm targets large datasets, where finding duplicates takes a very long time. These algorithms are used to improve the duplicate detection system. With them, efficiency can be doubled compared with conventional duplicate detection. Several data analysis techniques are considered here, with different approaches to duplicate detection.
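
The progressive idea behind PSNM can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the usual description of the method (sort the records by a key, then compare records at sort-order distance 1 first, then distance 2, and so on, so that the most likely duplicates are reported early) and leaves out partitioning, look-ahead, and parallel execution. The names progressive_sorted_neighborhood, similar, max_window, and the 0.8 threshold are hypothetical choices made for this illustration.

```python
from difflib import SequenceMatcher


def progressive_sorted_neighborhood(records, key, is_duplicate, max_window):
    """Yield index pairs of likely duplicates, closest sort-order neighbours first.

    Simplified sketch: the whole dataset is assumed to fit in memory.
    """
    # Sort record indices by the sorting key so similar records end up adjacent.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    # Grow the rank distance step by step: distance 1 compares direct
    # neighbours, distance 2 compares records two positions apart, etc.
    for distance in range(1, max_window):
        for pos in range(len(order) - distance):
            a, b = order[pos], order[pos + distance]
            if is_duplicate(records[a], records[b]):
                yield (min(a, b), max(a, b))


def similar(a, b, threshold=0.8):
    """Hypothetical similarity test based on difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


names = ["John Smith", "Alice Brown", "Jon Smith", "Alice Browne", "Bob Stone"]
pairs = list(
    progressive_sorted_neighborhood(
        names, key=str.lower, is_duplicate=similar, max_window=3
    )
)
for a, b in pairs:
    print(names[a], "<->", names[b])
```

Because the function is a generator, a caller can stop consuming results at any time and still keep the pairs found so far, which is the property that makes the approach progressive.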

How to Cite

[1] Mr. BETKAR AKSHAY SURESH and Mrs. N. SUJATHA, “PROGRESSIVE DUPLICATE DETECTION”, Int. J. Comput. Eng. Res. Trends, vol. 3, no. 6, pp. 284–288, Jun. 2016.

Issue

Vol. 3 No. 6 (2016): June (2016) Issue

Section

Research Articles

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

IJCERT Policy:

The published work presented in this paper is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This means that the content of this paper can be shared, copied, and redistributed in any medium or format, as long as the original author is properly attributed. Additionally, any derivative works based on this paper must also be licensed under the same terms. This licensing agreement allows for broad dissemination and use of the work while maintaining the author's rights and recognition.

By submitting this paper to IJCERT, the author(s) agree to these licensing terms and confirm that the work is original and does not infringe on any third-party copyright or intellectual property rights.
