Suite of decision tree-based classification algorithms on cancer gene expression data

dc.AffiliationOctober University for modern sciences and Arts (MSA)
dc.contributor.authorAl Khlil, Ibrahim Ali
dc.contributor.authorBadran, Khaled
dc.contributor.authorEl-DeeB, Mohamed
dc.contributor.authorAl Snousy, Mohmad Badr
dc.date.accessioned2019-12-24T14:10:04Z
dc.date.available2019-12-24T14:10:04Z
dc.date.issued2011-07
dc.descriptionAccession Number: WOS:000216412600002en_US
dc.description.abstractOne of the major challenges in microarray analysis, especially in cancer gene expression profiles, is to determine genes or groups of genes that are highly expressed in cancer cells but not in normal cells. Supervised machine learning techniques are used with microarray datasets to build classification models that improve the diagnosis of different diseases. In this study, we compare the classification accuracy of nine decision tree methods, divided into two main categories. The first category is single decision trees: C4.5, CART, Decision Stump, Random Tree, and REPTree. The second category is ensemble decision trees, such as Bagging (C4.5 and REPTree), AdaBoost (C4.5 and REPTree), ADTree, and Random Forests. In addition to these comparative analyses, we evaluate the behavior of these methods with and without attribute selection (A.S.) techniques such as Chi-square attribute selection and Gain Ratio attribute selection. In most cases, the ensemble learning methods (bagging, boosting, and Random Forest) improved on the classification accuracy of a single decision tree, owing to their mechanism of generating several classifiers from one dataset and voting on the classification decision. The improvement ranged from 4.99% to 6.19%. For the majority of datasets and classification methods, Gain Ratio attribute selection slightly improved classification accuracy (by approximately 1.05%) because it concentrates on the most promising genes, those whose information gain best discriminates the dataset. Chi-square attribute evaluation, by contrast, slightly decreased the classification accuracy of ensemble classifiers because it eliminated some informative genes. (C) 2011 Faculty of Computers and Information, Cairo University. Production and hosting by Elsevier B.V. All rights reserved.en_US
dc.description.urihttps://www.scimagojr.com/journalsearch.php?q=19700182731&tip=sid&clean=0
dc.identifier.doihttps://doi.org/10.1016/j.eij.2011.04.003
dc.identifier.issn1110-8665
dc.identifier.otherhttps://doi.org/10.1016/j.eij.2011.04.003
dc.identifier.urihttps://www.sciencedirect.com/science/article/pii/S1110866511000223
dc.language.isoenen_US
dc.publisherCAIRO UNIVen_US
dc.relation.ispartofseriesEGYPTIAN INFORMATICS JOURNAL;Volume: 12 Issue: 2 Pages: 73-82
dc.relation.urihttps://t.ly/2GMYk
dc.subjectMICROARRAYen_US
dc.subjectAttribute selectionen_US
dc.subjectEnsemble decision treeen_US
dc.subjectDecision treesen_US
dc.subjectClassificationen_US
dc.subjectCanceren_US
dc.subjectDNA microarrayen_US
dc.titleSuite of decision tree-based classification algorithms on cancer gene expression dataen_US
dc.typeArticleen_US
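
The comparison described in the abstract (single decision tree versus ensemble methods, with and without attribute selection) can be sketched roughly as follows. This is only an illustrative sketch under several assumptions, not the authors' setup: it uses scikit-learn rather than the tools used in the study, a CART-style `DecisionTreeClassifier` stands in for the single-tree methods (C4.5 is not available in scikit-learn), the data are synthetic rather than real microarray data, and the helper names `build_models` and `evaluate`, the fold count, and the number of selected genes are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a microarray dataset: few samples, many "genes".
# (Assumption: real expression data would be loaded here instead.)
X, y = make_classification(
    n_samples=80, n_features=500, n_informative=15, n_redundant=0, random_state=0
)


def build_models(seed=0):
    """Return one single-tree model and two tree ensembles (hypothetical choices)."""
    return {
        "single tree (CART-style)": DecisionTreeClassifier(random_state=seed),
        "bagged trees": BaggingClassifier(
            DecisionTreeClassifier(random_state=seed),
            n_estimators=25,
            random_state=seed,
        ),
        "random forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }


def evaluate(X, y, k_best=None, folds=5):
    """Mean cross-validated accuracy per model, optionally with chi-square selection."""
    results = {}
    for name, model in build_models().items():
        if k_best is not None:
            # chi2 needs non-negative inputs, so scale to [0, 1] first.
            # Selection sits inside the pipeline so it is refit on each
            # training fold and never sees the held-out samples.
            model = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=k_best), model)
        scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        results[name] = scores.mean()
    return results


without_selection = evaluate(X, y)
with_selection = evaluate(X, y, k_best=50)

for name in without_selection:
    print(
        f"{name}: all genes {without_selection[name]:.3f}, "
        f"chi-square top 50 {with_selection[name]:.3f}"
    )
```

Accuracies on synthetic data say nothing about the figures reported in the abstract; the sketch only shows the shape of such a comparison. Gain Ratio attribute selection is not included because scikit-learn has no built-in equivalent.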
