Classification of tumors based on distinguishing possibilistic biclusters
Date
2023-07
Authors
Journal Title
Journal ISSN
Volume Title
Type
Article
Publisher
IEEE
Series Info
1st International Conference of Intelligent Methods, Systems and Applications, IMSA 2023; Pages 333 - 338; 2023
Abstract
Applying machine learning techniques to very high-dimensional gene expression data in order to uncover the underlying biological mechanisms of complex diseases is one of the most crucial tasks in bioinformatics. This research develops a multi-class classifier that classifies tumors and identifies relevant genes based on distinguishing fuzzy biclusters. The proposed technique starts by computing k overlapping clusters of genes. For each cluster of genes C, there are c biclusters, where c is the number of classes, such that, for class t, the bicluster is B_t = (C, samples of class t). The clustering technique uses the average mean square residue (MSR): for each class t, MSR(B_t) is minimized while the cluster size is simultaneously maximized. A possibilistic technique is then applied to each cluster to compute a membership for each gene in its cluster. The assigned memberships maximize the weighted average of the fuzzy residue of each row in B_t' when it is added to B_t, while minimizing both the number of genes and the weighted average of the fuzzy residue of each row in B_t within its own bicluster; they are obtained by finding the zeros of the derivative of the proposed objective function. The vector of possibilistic memberships associated with each cluster C_t works as a multi-class classifier. The decision space can be made more discriminative by combining the outcomes predicted by the k separate models using stacking or weighted majority voting. To classify an unseen sample x_u, the relative increase in the weighted average of the squared fuzzy residue of B_t when x_u is added to it is computed for each class t. Then x_u is assigned to the class t for which this relative increase is smallest. Experimental results indicate that the performance and the computational complexity of the proposed technique are comparable to those of algorithms that combine metaheuristics with SVM.
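The classification rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Cheng and Church mean squared residue, and it uses a simple membership-weighted average over genes as a stand-in for the paper's fuzzy residue, whose exact form is not given here. The names `weighted_msr`, `classify`, `memberships`, and the toy data are hypothetical.

```python
import numpy as np

def weighted_msr(block, memberships=None):
    """Membership-weighted mean squared residue of a genes-by-samples block.

    Assumption: the residue of entry (i, j) is a_ij - rowmean_i - colmean_j
    + overall mean, and each gene's mean squared residue is weighted by its
    membership. The paper's fuzzy residue may be defined differently.
    """
    block = np.asarray(block, dtype=float)
    if memberships is None:
        memberships = np.ones(block.shape[0])  # crisp case: all genes count equally
    row_mean = block.mean(axis=1, keepdims=True)
    col_mean = block.mean(axis=0, keepdims=True)
    residue = block - row_mean - col_mean + block.mean()
    per_gene = (residue ** 2).mean(axis=1)
    return float(np.average(per_gene, weights=memberships))

def classify(sample, class_blocks, memberships=None, eps=1e-12):
    """Assign `sample` to the class whose bicluster residue grows the least."""
    increases = {}
    for label, block in class_blocks.items():
        before = weighted_msr(block, memberships)
        # Add the unseen sample as one extra column of the bicluster.
        after = weighted_msr(np.column_stack([block, sample]), memberships)
        increases[label] = (after - before) / (before + eps)
    return min(increases, key=increases.get), increases

# Toy data: 4 genes in one cluster, 3 training samples per class.
class_blocks = {
    "A": np.array([[1.0, 2.1, 2.9], [2.0, 3.0, 4.1], [3.1, 4.0, 5.0], [4.0, 4.9, 6.0]]),
    "B": np.array([[4.0, 5.1, 5.9], [3.0, 4.0, 5.1], [2.1, 3.0, 4.0], [1.0, 1.9, 3.0]]),
}
sample = np.array([1.5, 2.6, 3.4, 4.5])  # follows the rising pattern of class A
label, increases = classify(sample, class_blocks)
print(label, increases)
```

In this toy setup the unseen sample follows the gene pattern of class A, so adding it to the class A bicluster barely changes the residue, while adding it to the class B bicluster inflates it. With k gene clusters, one such prediction per cluster could be combined by weighted majority voting, as the abstract suggests.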
Description
Keywords
Biclustering; Biomarkers; Cancer Diagnosis; Classification; Feature Selection; Gene Expressions