Classification of Concept Drifting Data Streams Using Adaptive Novel-Class Detection
Ms. Aparna Yeshwantrao Ladekar, Dr. M.Y. Joshi
Affiliations: MGM's College of Engineering, SRTMUN University, Nanded
Data stream classification poses several problems for the data mining community. This work addresses four major ones: concept-drift, infinite length, feature-evolution and concept-evolution. Concept-drift, which is common in data streams, occurs when the underlying concept changes. Because data streams are effectively infinite in length, it is not practical to store all of the data and reuse it for training whenever required. Feature-evolution occurs frequently in text streams, where new features such as words or phrases appear as the stream progresses. Concept-evolution occurs when new classes emerge in the data stream. Most existing data stream classification techniques consider only the first two challenges and ignore the latter two. Classification of concept-drifting data streams using an adaptive novel-class detection approach addresses the concept-drift and concept-evolution problems by maintaining a novel-class detector alongside the classifier. The novel-class detector is made more adaptive to dynamic, evolving data streams, and it can detect more than one novel class simultaneously. The approach addresses the feature-evolution problem with a feature set homogenization technique. Experiments on a Twitter data set show a reduced ERR (error rate) and an increased detection rate. The approach is very effective compared with existing data stream classification techniques.
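The abstract does not say how feature set homogenization is carried out. As a rough illustration only, the sketch below assumes one common variant: take the union of the model's feature set and the incoming instance's feature set, and treat any feature missing on either side as zero, so that both can be compared in the same space. The function names, the centroid representation and the sample data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def unified_features(model_features: List[str], instance: Dict[str, float]) -> List[str]:
    """Union of both feature sets: model features first, then unseen ones (sorted)."""
    known = set(model_features)
    unseen = sorted(f for f in instance if f not in known)
    return list(model_features) + unseen


def to_vector(values: Dict[str, float], features: List[str]) -> List[float]:
    """Project a sparse feature->value mapping onto `features`, using 0.0 where absent."""
    return [values.get(f, 0.0) for f in features]


# Hypothetical data: a centroid learned on earlier text, and a later tweet
# that contains words the model has never seen (feature-evolution).
model_features = ["flood", "rain", "storm"]
centroid = {"flood": 0.4, "rain": 0.7, "storm": 0.2}
tweet = {"rain": 1.0, "selfie": 2.0, "hashtag": 1.0}

# Assumption: homogenize by taking the union, so no feature is discarded.
features = unified_features(model_features, tweet)
centroid_vec = to_vector(centroid, features)
tweet_vec = to_vector(tweet, features)
```

After this step `centroid_vec` and `tweet_vec` have the same length and ordering, so a distance-based classifier or outlier test can compare them directly. Keeping the unseen features, rather than projecting the tweet onto the model's features only, is a design choice in this sketch: new words are often exactly what distinguishes a novel class.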
Aparna Yeshwantrao Ladekar et al., "Classification of Concept Drifting Data Streams Using Adaptive Novel-Class Detection", International Journal of Computer Engineering In Research Trends, Volume 3, Issue 9, September-2016, pp. 514-520
Keywords: Concept-drift, concept-evolution, data streams, novel-class, outlier