A mapreduce fuzzy techniques of big data classification

El Bakry M.; Safwat S.; Hegazy O.

doi:https://doi.org/10.1109/SAI.2016.7555971

dc.Affiliation	October University for Modern Sciences and Arts (MSA)
dc.contributor.author	El Bakry M.
dc.contributor.author	Safwat S.
dc.contributor.author	Hegazy O.
dc.contributor.other	Faculty of Computer Science, October University for Modern Sciences and Arts, Giza, Egypt
dc.contributor.other	Faculty of Computers and Information, Cairo University, Giza, Egypt
dc.date.accessioned	2020-01-09T20:41:34Z
dc.date.available	2020-01-09T20:41:34Z
dc.date.issued	2016
dc.description	Scopus
dc.description.abstract	As data volumes grow, efficient analysis with traditional techniques becomes difficult. Big data poses many challenges because of characteristics such as volume, velocity, variety, variability, value and complexity. There is now a need not only for efficient data mining techniques that can process large volumes of data, but also for a means of meeting the computational requirements of doing so. The objective of this research is to implement a MapReduce paradigm using fuzzy and crisp techniques, and to provide a comparative study between the results of the proposed systems and the methods reviewed in the literature. In this paper, four proposed systems are implemented using the MapReduce paradigm to process big data. In the mapper, two techniques are used: the fuzzy k-nearest neighbor (KNN) method as a fuzzy technique and the support vector machine (SVM) as a non-fuzzy technique. In the reducer, three techniques are used: the mode, fuzzy soft labels and the Gaussian fuzzy membership function. The first proposed system uses fuzzy KNN in the mapper and the mode in the reducer; the second uses the SVM in the mapper and the mode in the reducer; the third uses the SVM in the mapper and soft labels in the reducer; and the fourth uses the SVM in the mapper and the Gaussian fuzzy membership function in the reducer. Results on different data sets show that the proposed fuzzy methods outperform the proposed crisp method and the method reviewed in the literature. © 2016 IEEE.	en_US
dc.description.uri	https://www.scimagojr.com/journalsearch.php?q=21100780803&tip=sid&clean=0
dc.identifier.doi	https://doi.org/10.1109/SAI.2016.7555971
dc.identifier.isbn	9.78E+12
dc.identifier.other	https://doi.org/10.1109/SAI.2016.7555971
dc.identifier.uri	https://t.ly/LX1KX
dc.language.iso	English	en_US
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	en_US
dc.relation.ispartofseries	Proceedings of 2016 SAI Computing Conference, SAI 2016
dc.subject	October University for Modern Sciences and Arts
dc.subject	جامعة أكتوبر للعلوم الحديثة والآداب
dc.subject	University of Modern Sciences and Arts
dc.subject	MSA University
dc.subject	Big data	en_US
dc.subject	Classification	en_US
dc.subject	Fuzzy k-nearest neighbor	en_US
dc.subject	Hadoop	en_US
dc.subject	MapReduce	en_US
dc.subject	Support vector machine	en_US
dc.subject	Classification (of information)	en_US
dc.subject	Data mining	en_US
dc.subject	Membership functions	en_US
dc.subject	Motion compensation	en_US
dc.subject	Nearest neighbor search	en_US
dc.subject	Support vector machines	en_US
dc.subject	Computational requirements	en_US
dc.subject	Data classification	en_US
dc.subject	Fuzzy k nearest neighbor (FKNN)	en_US
dc.subject	Fuzzy membership function	en_US
dc.subject	Gaussian membership function	en_US
dc.subject	Map-reduce	en_US
dc.subject	Traditional techniques	en_US
dc.title	A mapreduce fuzzy techniques of big data classification	en_US
dc.type	Conference Paper	en_US
dcterms.isReferencedBy	Zhang, J., A survey of recent technologies and challenges in big data utilizations (2015) Information and Communication Technology Convergence (ICTC), 2015 International Conference on, IEEE; Lu, W., Efficient processing of k nearest neighbor joins using MapReduce (2012) Proceedings of the VLDB Endowment, 5 (10), pp. 1016-1027; Liu, Y., HSim: A MapReduce simulator in enabling cloud computing (2013) Future Generation Computer Systems, 29 (1), pp. 300-308; Río, S., A mapreduce approach to address big data classification problems based on the fusion of linguistic fuzzy rules (2015) International Journal of Computational Intelligence Systems, 8 (3), pp. 422-437; Xu, K., A mapreduce based parallel SVM for email classification (2014) Journal of Networks, 9 (6), pp. 1640-1647; Liu, Z., Li, H., Miao, G., MapReduce-based backpropagation neural network over large scale mobile data (2010) Natural Computation (ICNC), 2010 Sixth International Conference on, IEEE; Lu, K., Unbinds data and tasks to improving the Hadoop performance (2014) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), 2014 15th IEEE/ACIS International Conference on, IEEE; Tong, H., Big data classification (2014) Data Classification: Algorithms and Applications, p. 275; Wu, G., MReC4.5: C4.5 ensemble classification with MapReduce (2009) ChinaGrid Annual Conference, 2009. ChinaGrid'09. Fourth, IEEE; Lee, K.-H., Parallel data processing with MapReduce: A survey (2012) ACM SIGMOD Record, 40 (4), pp. 11-20; Koturwar, P., Girase, S., Mukhopadhyay, D., (2015) A Survey of Classification Techniques in the Area of Big Data, arXiv preprint arXiv:1503.07477; Wang, X., Pardalos, P.M., A survey of support vector machines with uncertainties (2015) Annals of Data Science, 1 (3-4), pp. 293-309; Keller, J.M., Gray, M.R., Givens, J.A., A fuzzy k-nearest neighbor algorithm (1985) Systems, Man and Cybernetics, IEEE Transactions on, (4), pp. 580-585; El Gayar, N., Schwenker, F., Palm, G., A study of the robustness of KNN classifiers trained using soft labels (2006) ANNPR, Springer; Klir, G., Yuan, B., (1995) Fuzzy Sets and Fuzzy Logic, 4. , Prentice Hall New Jersey; Triguero, I., (2015) MRPR: A MapReduce Solution for Prototype Reduction in Big Data Classification. Neurocomputing, 150, pp. 331-345; Kopczynski, M., Grzes, T., Stepaniuk, J., Computation of cores in big datasets: An FPGA approach (2015) Rough Sets and Knowledge Technology, pp. 153-163, Springer
dcterms.source	Scopus
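
Illustrative sketch (not part of the record): the abstract's first proposed system pairs a fuzzy KNN mapper with a mode reducer. The Python below is a minimal single-process approximation of that idea, not the authors' implementation. The function names (`fuzzy_knn_memberships`, `mapper`, `reducer`, `run_job`), the parameters `k` and `m`, the use of crisp training labels, and the inverse-distance weighting are all assumptions made for illustration; the abstract does not specify these details, and no Hadoop API is used.

```python
from collections import Counter, defaultdict
import math


def fuzzy_knn_memberships(train, x, k=3, m=2.0):
    """Class memberships of x from its k nearest neighbours in `train`.

    train: list of (features, label) pairs with crisp labels (an assumption).
    Each neighbour is weighted by 1 / d**(2 / (m - 1)); weights are then
    normalised so the memberships sum to 1.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    weights = defaultdict(float)
    for features, label in nearest:
        d = math.dist(features, x)
        # Small constant avoids division by zero when x equals a neighbour.
        weights[label] += 1.0 / (d ** (2.0 / (m - 1.0)) + 1e-9)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def mapper(partition, test_points, k=3):
    """Emit (test_id, label) using only this partition's training data."""
    for test_id, x in enumerate(test_points):
        memberships = fuzzy_knn_memberships(partition, x, k=k)
        yield test_id, max(memberships, key=memberships.get)


def reducer(test_id, labels):
    """Combine the per-partition labels for one test point by taking the mode."""
    return test_id, Counter(labels).most_common(1)[0][0]


def run_job(partitions, test_points, k=3):
    """Simulate the map, shuffle and reduce phases in a single process."""
    grouped = defaultdict(list)
    for partition in partitions:
        for test_id, label in mapper(partition, test_points, k=k):
            grouped[test_id].append(label)
    return dict(reducer(tid, labels) for tid, labels in sorted(grouped.items()))
```

Calling `run_job(partitions, test_points)` with a list of training partitions and a list of feature tuples returns a dictionary from test-point index to the label most partitions agreed on. Swapping the mapper for an SVM, or the reducer for a soft-label or Gaussian-membership combiner, would correspond to the other three systems named in the abstract.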

Collections

Faculty of Computer Science Research Paper