Faculty Of Computer Science Research Paper
Permanent URI for this collection: http://185.252.233.37:4000/handle/123456789/304
Recent Submissions
Item Fusing CNNs and attention-mechanisms to improve real-time indoor Human Activity Recognition for classifying home-based physical rehabilitation exercises (Elsevier Ltd, 2025-01-01) Moamen Zaher; Amr S. Ghoneim; Laila Abdelhamid; Ayman Atia
Physical rehabilitation plays a critical role in enhancing health outcomes globally. However, the shortage of physiotherapists, particularly in developing countries where the ratio is approximately ten physiotherapists per million people, poses a significant challenge to effective rehabilitation services. The existing literature on rehabilitation often falls short in data representation and the employment of diverse modalities, limiting the potential for advanced therapeutic interventions. To address this gap, this study integrates Computer Vision and Human Activity Recognition (HAR) technologies to support home-based rehabilitation, exploring various modalities and proposing a framework for data representation. We introduce a novel framework that leverages both the Continuous Wavelet Transform (CWT) and Mel-Frequency Cepstral Coefficients (MFCC) for skeletal data representation. CWT is particularly valuable for capturing the time-frequency characteristics of the dynamic movements involved in rehabilitation exercises, enabling a comprehensive depiction of both temporal and spectral features. This dual capability is crucial for accurately modelling the complex and variable nature of rehabilitation exercises. In our analysis, we evaluate 20 CNN-based models and one Vision Transformer (ViT) model. Additionally, we propose 12 hybrid architectures that combine CNN-based models with ViT in bi-model and tri-model configurations. These models are rigorously tested on the UI-PRMD and KIMORE benchmark datasets using key evaluation metrics, including accuracy, precision, recall, and F1-score, with 5-fold cross-validation. Our evaluation also considers real-time performance, model size, and efficiency on low-power devices, emphasising practical applicability. The proposed fused tri-model architectures outperform both single architectures and bi-model configurations, demonstrating robust performance across both datasets and making the fused models the preferred choice for rehabilitation tasks. Our proposed hybrid model, DenMobVit, consistently surpasses state-of-the-art methods, achieving accuracy improvements of 2.9% and 1.97% on the UI-PRMD and KIMORE datasets, respectively. These findings highlight the effectiveness of our approach in advancing rehabilitation technologies and bridging the gap in physiotherapy services.
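The CWT-based skeletal representation described in the entry above can be illustrated with a short, self-contained sketch. This is not the authors' code: it assumes the PyWavelets library, a Morlet wavelet, and a synthetic single joint-angle trajectory, and it shows how a 1-D motion signal becomes a 2-D scalogram that a CNN or ViT could consume.

```python
import numpy as np
import pywt

# Synthetic example: one joint-angle trajectory sampled at 30 Hz for 10 s.
fs = 30
t = np.arange(0, 10, 1 / fs)
joint_angle = np.sin(2 * np.pi * 0.5 * t) + 0.1 * np.random.randn(t.size)

# Continuous Wavelet Transform: rows correspond to scales, columns to time.
scales = np.arange(1, 65)
coeffs, freqs = pywt.cwt(joint_angle, scales, "morl", sampling_period=1 / fs)

# The magnitude of the coefficients is a 2-D time-frequency "image".
scalogram = np.abs(coeffs)
print(scalogram.shape)  # (64, 300)
```

An MFCC representation of the same signal could be produced analogously with an audio-style feature extractor (for example, librosa.feature.mfcc) and fused with the scalogram, which is broadly the kind of dual representation the abstract describes.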
Item An efficient deep learning prognostic model for remaining useful life estimation of high speed CNC milling machine cutters (Elsevier B.V, 2024-11-16) Hamdy K. Elminir; Mohamed A. El-Brawany; Dina Adel Ibrahim; Hatem M. Elattar; E.A. Ramadan
CNC machines are engaged in numerous industries, including critical ones like the aerospace, automotive, and military sectors, among others. Sensor data are time series that may suffer from complex interconnections between variables and dynamic features. Long Short-Term Memory (LSTM) networks excel in dynamic feature extraction, and Autoencoders (AE) have great capabilities in nonlinear deep modelling of time-series variables. In this work, we propose a model for tool wear prediction of CNC milling machine cutters as a type of time-series data, taking advantage of the LSTM and AE capabilities. The framework consists of several steps, including extracting multi-domain features and a correlation analysis to select the features most correlated with tool wear. New features are added, such as entropy and interquartile range (IQR), which proved to be highly correlated with cutter tool wear. An LSTM-AE model is then trained, validated, and tested on this feature map to predict the target tool wear value. The model is provided with degradation (Run-To-Failure) data for CNC machine cutters, the PHM10 dataset, to predict the tool wear values. The predicted tool wear value is compared against the wear curve to estimate RUL values. The predicted RUL values mostly underestimate the real values, which helps schedule maintenance or equipment replacement before failure. The experimental results show that the proposed framework outperforms state-of-the-art DL methods in tool wear prediction, with accuracy approaching 98% as well as improved MAE and RMSE on the test set, reaching 2.6 ± 0.3222E-3 and 3.1 ± 0.6146E-3, respectively.

Item Deep Learning-Assisted Compound Bioactivity Estimation Framework (Egyptian Informatics Journal, 2024-12) Yasmine Eid Mahmoud Yousef; Ayman El-Kilany; Yassin M. Nissan; Ehab E. Hassanein
Drug discovery is a highly complicated process. On average, it takes six to twelve years to manufacture a new drug and have the product released in the market. It is of utmost importance to find methods that would accelerate the manufacturing process. This significant challenge in drug development can be addressed using deep learning techniques. The aim of this paper is to propose a deep learning-based framework that can help chemists examine compound biological activity more accurately. The proposed framework employs an autoencoder for representing the compound data, which is then classified using a deep neural network, followed by a customized deep regression model that estimates an accurate value of the compound bioactivity. The proposed framework achieved a reconstruction accuracy of 89% for the autoencoder, a classification accuracy of 79.01%, and an MAE of 2.4 when predicting compound bioactivity with the deep regression model.
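To make the LSTM-AE idea in the CNC tool-wear entry above more concrete, here is a minimal tf.keras sketch of an LSTM autoencoder with an added wear-regression head. The window length, feature count, and layer sizes are placeholders, not the architecture published in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

TIMESTEPS, N_FEATURES = 64, 12   # hypothetical window of extracted features

inputs = layers.Input(shape=(TIMESTEPS, N_FEATURES))
# Encoder: compress the window into a fixed-length latent vector.
latent = layers.LSTM(32)(inputs)
# Decoder: repeat the latent vector and reconstruct the input sequence.
decoded = layers.RepeatVector(TIMESTEPS)(latent)
decoded = layers.LSTM(32, return_sequences=True)(decoded)
decoded = layers.TimeDistributed(layers.Dense(N_FEATURES), name="reconstruction")(decoded)
# Regression head: predict the tool-wear value from the latent code.
wear = layers.Dense(16, activation="relu")(latent)
wear = layers.Dense(1, name="tool_wear")(wear)

model = models.Model(inputs, [decoded, wear])
model.compile(optimizer="adam", loss=["mse", "mse"])
model.summary()
```

The predicted wear series would then be compared against a wear threshold or curve to read off the remaining useful life, as the abstract outlines.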
Item Hybrid Deep Learning Model Based on GAN and RESNET for Detecting Fake Faces (Institute of Electrical and Electronics Engineers Inc, 2024-06) Safwat, Soha; Mahmoud, Ayat; Fattoh, Ibrahim Eldesouky; Ali, Farid
While human brains have the ability to distinguish face characteristics, the use of advanced technology and artificial intelligence blurs the difference between actual and modified images. The evolution of digital editing applications has led to the fabrication of very lifelike false faces, making it harder for humans to discriminate between real and fabricated ones. Because of this, techniques like deep learning are increasingly being used to distinguish between real and artificial faces, producing more consistent and accurate results. To detect fraudulent faces, this paper introduces a pioneering hybrid deep learning model that merges the capabilities of Generative Adversarial Networks (GANs) and the Residual Neural Network (ResNet) architecture. By integrating GANs' generative strength with ResNet's discriminative abilities, the proposed model offers a novel approach to discerning real from artificial faces. Through a comparative analysis, the performance of the hybrid model is evaluated against established pre-trained models such as VGG16 and ResNet50. Results demonstrate the superior effectiveness of the hybrid model in accurately detecting fake faces, marking a notable advancement in facial image recognition and authentication. The findings on a benchmark dataset show that the proposed model obtains outstanding performance measures, including a precision of 0.79, recall of 0.88, F1-score of 0.83, accuracy of 0.83, and ROC AUC score of 0.825. The study's conclusions highlight the hybrid model's strong performance in identifying fake faces, especially when it comes to accuracy, precision, and memory economy. By combining the generative capacity of GANs with the discriminative capabilities of ResNet, this approach addresses the problems posed by increasingly sophisticated fake-face generation techniques. With significant potential for use in identity verification, social media content moderation, cybersecurity, and other areas, the study seeks to advance the field of false face identification. In these situations, being able to accurately discriminate between real and altered faces is crucial. Notably, our suggested model adds channel-wise attention mechanisms to ResNet50 at the feature extraction phase, which increases its effectiveness and boosts its overall performance.

Item Random Forest-Based Survival Analysis for Predicting the Future Progression of Brain Disorder from Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD) (Intelligent Networks and Systems Society, 2024-05) Zawawi, Nour; Negied, Nermin
The race to halt Alzheimer's disease (AD) in its tracks demands an early warning system. By predicting which mild cognitive impairment (MCI) patients are likely to decline into AD, clinicians can intervene while the window of opportunity remains open. But how can the MCI patients bound for AD be separated from those with more benign forms of impairment? The key lies in examining the factors that influence disease progression. While prior studies have scratched the surface, a comprehensive analysis has proven elusive. Enter the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, which tracks AD progression through a wealth of patient characteristics. Leveraging these rich data, our hybrid approach combines survival analysis with machine learning to generate dynamic predictions of time to AD onset. Rather than merely detecting AD early or diagnosing its current state, our model gazes into the future, forecasting progression from MCI to AD before the disease fully erupts. Among similar efforts, the proposed approach stands apart in scale and accuracy, validated on more patients and with higher predictive power than earlier attempts. Even cognitive tests or brain scans alone can foretell decline, with the proposed work achieving a remarkable C-index of 0.85 when evaluated on the whole ADNI dataset, not only a sample from it. By revealing who is likely to convert to AD and when, this work enables clinicians to intervene at the critical junction where MCI transitions to inevitable decline. The future of AD treatment may hinge on such early warnings.
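The channel-wise attention mentioned at the end of the fake-face entry above is commonly realised as a squeeze-and-excitation block. The sketch below is a generic illustration under that assumption, not the authors' implementation: it attaches one such block to ResNet50 features in tf.keras and ends with a binary real-versus-fake head.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

def channel_attention(x, reduction=16):
    """Squeeze-and-excitation style channel-wise attention."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)                # squeeze: one value per channel
    s = layers.Dense(channels // reduction, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)   # excitation: per-channel weights
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([x, s])                      # re-weight the feature maps

base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
x = channel_attention(base.output)
x = layers.GlobalAveragePooling2D()(x)
out = layers.Dense(1, activation="sigmoid")(x)            # real vs. fake

model = models.Model(base.input, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```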
Item Visual Engagement: Quantifying Campus Experiences in Urban Open Spaces Using a Computer Vision Model (2024-05) Mohareb, Nabil; Ashraf, Abdelaziz
Introduction: Addressing the gap in quantitative analysis of spatial experiences within academic environments, this study introduces a groundbreaking framework designed to measure and quantify the visual experiences of individuals in academic campus settings. Focused on analyzing the visual composition of the built environment, including aspects such as visible sky, greenery, and spatial enclosure, our framework aims to provide a quantitative reflection of the subjective spatial experiences of campus users. Methods: The methodology involves using mobile phones with digital cameras and GPS sensors to capture first-person visual data and track participants' movements as they freely traverse campus open spaces. Computer vision techniques, including instance segmentation and convolutional neural networks, categorize architectural and natural elements within each frame extracted from a recorded video, quantify proportional compositions, and analyze the relative amounts of greenery, open sky, walkways, buildings, and other built structures that participants visually experienced. The framework is translated into a Python model capable of producing quantitative outcomes. The analysis is further enriched by integrating Geographic Information Systems (GIS) for spatial analysis to identify navigation and visual engagement patterns. This comprehensive methodology quantifies the visual attributes of spaces and interprets their impact on the behavior and experiences of campus users. Results and conclusions: The study outcomes reveal relationships between students' navigation choices, visual experiences, and scene types. The results aim to guide urban designers in understanding university students' open-space needs based on their natural movement and viewing preferences, and complement other qualitative approaches.

Item Classification of DNA Sequence Based on a Non-gradient Algorithm: Pseudoinverse Learners (Springer, 2024-04) Mahmoud, Mohammed A. B
This chapter proposes a prototype-based classification approach for analyzing DNA barcodes that uses a spectral representation of DNA sequences and a non-gradient neural network. Biological sequences can be viewed as data components with high, non-fixed dimensions, which correspond to the lengths of the sequences. Numerical encoding, through computational procedures such as one-hot encoding (OHE), plays an important role in DNA sequence evaluation. However, the OHE method has some disadvantages: (1) it does not add any details that could result in an additional predictive variable, and (2) if the variable has many classes, OHE significantly expands the feature space. To address these shortcomings, this chapter proposes a computationally efficient framework for classifying the DNA sequences of living organisms in the image domain. A multilayer perceptron trained by a pseudoinverse learning autoencoder (PILAE) algorithm is used in the proposed strategy. The learning control parameters and the number of hidden layers do not have to be specified during the PILAE training process. As a result, the PILAE classifier outperforms other deep neural network (DNN) strategies such as the VGG-16 and Xception models.
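The defining feature of the pseudoinverse learners in the DNA-barcode entry above is that weights are obtained in closed form rather than by gradient descent. The toy sketch below is not the PILAE algorithm itself, only an illustration of the pseudoinverse idea it builds on: a random-projection layer whose decoder, and a final classification layer, are both solved with NumPy's pseudoinverse on made-up data.

```python
import numpy as np

def pinv_layer(X, n_hidden, rng):
    """One non-gradient layer: random projection + tanh, with the
    decoder weights solved in closed form (least-squares reconstruction)."""
    W_enc = rng.standard_normal((X.shape[1], n_hidden)) / np.sqrt(X.shape[1])
    H = np.tanh(X @ W_enc)              # hidden representation
    W_dec = np.linalg.pinv(H) @ X       # pseudoinverse instead of backprop
    return H, W_enc, W_dec

def pinv_classifier(H, Y_onehot):
    """Closed-form output weights for the classification layer."""
    return np.linalg.pinv(H) @ Y_onehot

# Hypothetical spectrally encoded DNA-barcode vectors and 4 taxa labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64))
Y = np.eye(4)[rng.integers(0, 4, 200)]
H, _, _ = pinv_layer(X, 32, rng)
W_out = pinv_classifier(H, Y)
pred = (H @ W_out).argmax(axis=1)       # predicted class per sample
```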
Item Unlocking the potential of RNN and CNN models for accurate rehabilitation exercise classification on multi-datasets (Springer Netherlands, 2024-04) Zaher, Moamen; Ghoneim, Amr S; Abdelhamid, Laila; Atia, Ayman
Physical rehabilitation is crucial in healthcare, facilitating recovery from injuries or illnesses and improving overall health. However, a notable global challenge stems from the shortage of professional physiotherapists, which is particularly acute in some developing countries, where the ratio can be as low as one physiotherapist per 100,000 individuals. To address these challenges and elevate patient care, the field of physical rehabilitation is progressively integrating Computer Vision and Human Activity Recognition (HAR) techniques. Numerous research efforts aim to explore methodologies that assist in rehabilitation exercises and evaluate patient movements, which is crucial as incorrect exercises can potentially worsen conditions. This study investigates applying various deep-learning models to classifying exercises using the benchmark KIMORE and UI-PRMD datasets. Employing Bi-LSTM, LSTM, CNN, and CNN-LSTM models, alongside a random search for architectural design and hyper-parameter tuning, our investigation reveals the CNN model as the top performer. After applying cross-validation, the technique achieves remarkable mean testing accuracy rates of 93.08% on the KIMORE dataset and 99.7% on the UI-PRMD dataset. This marks an improvement of 0.75% and 0.1%, respectively, compared to previous techniques. In addition, expanding beyond exercise classification, this study explores the KIMORE dataset's utility for disease identification, where the CNN model consistently demonstrates an outstanding accuracy of 89.87%, indicating its promising role in both exercise classification and disease identification within the context of physical rehabilitation.

Item ADVANCING DIABETIC FOOT ULCER DETECTION BASED ON RESNET AND GAN INTEGRATION (Little Lion Scientific, 2024-03) El-Kady, Ahmed Mostafa; Abbassy, Mohamed M; Ali, Heba Hamdy; Ali, Moussa Farid
Diabetes, characterized by the body's inability to effectively regulate sugar levels due to insulin complications, leads to various serious health issues. Among these, diabetic foot ulcer (DFU) stands out as a critical yet often ignored consequence. This condition, if not addressed in time, can result in severe outcomes including amputations, posing a substantial burden on both individuals and healthcare systems, particularly in areas where medical care is costly. Addressing this pressing issue, our research focused intensively on the analysis of medical images, with the goal of enhancing the accuracy of DFU diagnosis. We assessed two different models: the renowned ResNet50 model and a hybrid model that fuses ResNet50 with Generative Adversarial Networks. The findings were noteworthy; ResNet50 demonstrated commendable performance, achieving an average accuracy and precision of 0.76 and an F1-score of 0.75. However, the hybrid model surpassed these metrics, registering an average accuracy of 0.84, precision of 0.85, and an F1-score of 0.84. This research contributes to the evolving landscape of medical image analysis, offering a promising avenue for more precise and effective DFU diagnosis in clinical settings. The marked advancement in diagnostic precision afforded by the hybrid model suggests a significant stride forward in effectively managing and treating DFU.
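For the exercise-classification entry above ("Unlocking the potential of RNN and CNN models..."), one of the evaluated families is the CNN-LSTM. The sketch below is a generic, hypothetical configuration in tf.keras; the frame count, joint-coordinate count, class count, and layer sizes are placeholders rather than the tuned architecture the authors report.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical input: 100-frame skeleton sequences, 22 joints x 3 coordinates.
TIMESTEPS, N_FEATURES, N_CLASSES = 100, 66, 10

model = models.Sequential([
    layers.Input(shape=(TIMESTEPS, N_FEATURES)),
    # Convolutional stage: local temporal patterns across joint coordinates.
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(128, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(2),
    # Recurrent stage: longer-range ordering of the detected patterns.
    layers.LSTM(64),
    layers.Dropout(0.3),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```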
Item Predicting progression of Alzheimer's disease using new survival analysis approach (Institute of Advanced Engineering and Science (IAES), 2024-01) Zawawi, Nour Saad; Saber, Heba Gamal; Hashem, Mohamed
It is critical to determine the risk of Alzheimer's disease (AD) in people with mild cognitive impairment (MCI) so that treatment can begin early. AD development is affected by many factors, but how each factor affects it and how the disease worsens remain unclear. Nevertheless, an in-depth examination of these factors may provide a reasonable estimate of how long it will take for patients at various stages of the disease to develop Alzheimer's. The Alzheimer's Disease Neuroimaging Initiative (ADNI) database provided 900 subjects with 63 features drawn from magnetic resonance imaging (MRI), genetic, cognitive, demographic, and cerebrospinal fluid data. These characteristics are used to track AD progression. A hybrid approach for dynamic prediction in clinical survival analysis has been developed to track progression to AD. The method uses a random forest Cox regression approach to estimate how long it will take for MCI to progress to AD. The concordance index is used to evaluate the results; it measures the rank correlation between predicted risk scores and observed time points. The concordance index of the proposed work, 95.3%, was statistically significantly higher than that of previous approaches.

Item Naïve Bayes classifier assisted automated detection of cerebral microbleeds in susceptibility-weighted imaging brain images (National Research Council of Canada, 2023-12) Ateeq, Tayyab; Bin Faheem, Zaid; Ghoneimy, Mohamed; Ali, Jehad; Li, Yang; Baz, Abdullah
Cerebral microbleeds (CMBs) in the brain are essential indicators of critical brain disorders such as dementia and ischemic stroke. Generally, CMBs are detected manually by experts, which is an exhaustive task with limited productivity. Since CMBs have a complex morphological nature, manual detection is prone to errors. This paper presents a machine learning-based automated CMB detection technique for brain susceptibility-weighted imaging (SWI) scans based on statistical feature extraction and classification. The proposed method consists of three steps: (1) removal of the skull and extraction of the brain; (2) thresholding for the extraction of initial candidates; and (3) extracting features and applying classification models such as random forest and naïve Bayes classifiers for the detection of true positive CMBs. The proposed technique is validated on a dataset consisting of 20 subjects, divided into training data (14 subjects with 104 microbleeds) and testing data (6 subjects with 63 microbleeds). We were able to achieve 85.7% sensitivity using the random forest classifier with 4.2 false positives per CMB, and the naïve Bayes classifier achieved 90.5% sensitivity with 5.5 false positives per CMB. The proposed technique outperformed many state-of-the-art methods proposed in previous studies.
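Both survival-analysis entries above (the random-forest MCI-to-AD study and the random forest Cox regression study) report the concordance index. As a worked illustration of what that metric computes, here is a plain NumPy version evaluated on toy, made-up data; in practice an existing implementation such as lifelines' concordance_index would normally be used.

```python
import numpy as np

def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs whose predicted risk ordering
    agrees with the observed ordering of time to MCI-to-AD conversion.
    times: observed conversion (or censoring) time
    events: 1 if conversion was observed, 0 if censored
    risk_scores: higher score means earlier predicted conversion."""
    concordant, tied, comparable = 0.0, 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if subject i converted before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable

t = np.array([12, 24, 36, 18, 30])       # months to conversion / censoring
e = np.array([1, 1, 0, 1, 0])            # 1 = converted, 0 = censored
r = np.array([0.9, 0.5, 0.2, 0.7, 0.3])  # predicted risk scores
print(concordance_index(t, e, r))        # 1.0 for this perfectly ranked toy data
```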
Item Empowering Short Answer Grading: Integrating Transformer-Based Embeddings and BI-LSTM Network (MDPI AG, 2023-06) Gomaa, Wael H; Nagib, Abdelrahman E; Saeed, Mostafa M; Algarni, Abdulmohsen; Nabil, Emad
Automated scoring systems have been revolutionized by natural language processing, enabling the evaluation of students' diverse answers across various academic disciplines. However, this presents a challenge, as students' responses may vary significantly in length, structure, and content. To tackle this challenge, this research introduces a novel automated model for short answer grading. The proposed model uses pretrained transformer models, specifically T5, in conjunction with a Bi-LSTM architecture, which is effective in processing sequential data by considering both past and future context. This research evaluated several preprocessing techniques and different hyperparameters to identify the most efficient architecture. Experiments were conducted using a standard benchmark dataset named the North Texas Dataset, and the approach achieved a state-of-the-art correlation value of 92.5 percent. The proposed model's accuracy has significant implications for education, as it has the potential to save educators considerable time and effort while providing a reliable and fair evaluation for students, ultimately leading to improved learning outcomes.

Item Incorporating Connectivity in k-Nearest Neighbors Regression (2023-07) Mahfouz, Mohamed A
The standard k-nearest neighbors approach to regression analysis encounters a number of problems when used on datasets with varying density distributions. This paper proposes a kNNR-relative ensemble regressor based on connectivity. In each cross-validation round, the pipeline starts by clustering the input data using any partitioning algorithm. Then, a random sample of edges is selected from each partition, favoring edges with small distances. The selected edges are transformed into a dataset in which the feature values represent the amount of increase or decrease in each dimension relative to the source node's values, and the label of each feature vector is the difference between the labels of the source and destination nodes. A regressor is then built for each cluster based on the output of this transformation. To predict a label for an unseen object, the nearest centroid is identified, and the k nearest neighbors from the corresponding cluster are selected as source nodes. A vector representing the difference between the unseen object and each source node is computed and fed to the regressor model of the corresponding cluster; the output is the predicted difference, which is added to the label of the source node. The diversity between the suggested decision model and the traditional kNN regressor, termed kNNR, motivates including kNNR in the suggested ensemble; its k nearest neighbors are also selected from the nearest cluster. The weighted average of the predicted labels offered by the base models serves as the final output label. The sample size, the number of neighbors, and the number of clusters can all be fine-tuned via cross-validation. The ensemble is evaluated, and the results show that it achieves a significant increase in effectiveness compared to its base regressors and several related algorithms.
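The edge-difference transformation in the k-nearest neighbors regression entry above lends itself to a compact sketch. The code below is an interpretation under simplifying assumptions (uniform intra-cluster edge sampling instead of the distance-favouring sampling described in the abstract, a random forest as the per-cluster difference regressor, and synthetic data); it is not the author's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 4))
y = X[:, 0] ** 2 + np.sin(X[:, 1]) + 0.1 * rng.standard_normal(500)

# 1) Partition the data; one difference regressor is learned per cluster.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
models, members = {}, {}
for c in range(4):
    idx = np.where(km.labels_ == c)[0]
    members[c] = idx
    # 2) Sample intra-cluster edges (uniformly here, purely for brevity).
    src = rng.choice(idx, size=5 * len(idx))
    dst = rng.choice(idx, size=5 * len(idx))
    # 3) Edge -> difference vector; label -> label difference.
    models[c] = RandomForestRegressor(n_estimators=50, random_state=0).fit(
        X[dst] - X[src], y[dst] - y[src])

def predict(x_new, k=5):
    c = int(km.predict(x_new[None, :])[0])       # cluster of the nearest centroid
    idx = members[c]
    nn = NearestNeighbors(n_neighbors=k).fit(X[idx])
    _, nbrs = nn.kneighbors(x_new[None, :])
    src = idx[nbrs[0]]                           # k source nodes in that cluster
    diffs = models[c].predict(x_new[None, :] - X[src])
    return float(np.mean(y[src] + diffs))        # add predicted label differences

print(predict(np.array([1.0, 0.5, 0.0, -1.0])))
```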
Item Deep Learning-Based Alzheimer's Disease Classification: An Experimental Study (IEEE, 2023-07) Ayman, Mohamed; Darwish, Farah; Mohammed, Tukka; Mohammed, Ammar
Alzheimer's disease poses a significant challenge to healthcare professionals due to its prevalence in dementia cases. Accurate and timely diagnosis is essential for the effective management of patients. Magnetic resonance imaging (MRI) has emerged as a vital tool for diagnosing Alzheimer's disease. This paper evaluates the effectiveness of image classification models in detecting Alzheimer's disease from MRI images, with four categories ranging from no dementia to very mild, mild, and moderate dementia. The study employs and fine-tunes different CNN-based models, including VGG16, Inception, and ResNetV2. To ensure greater reliability and robustness of the results, we employ cross-validation during the experimentation phase, with different test splits. The experimental results demonstrate that fine-tuning VGG16 yields the highest accuracy of 98.810%. These findings suggest that further optimization and refinement of these models may lead to enhanced accuracy in MRI-based Alzheimer's disease diagnosis, potentially revolutionizing how this condition is managed.

Item Process Discovery Automation: Benefits and Limitations (IEEE, 2023-07) Montasser, Reem Kadry; Helal, Iman M. A
Process discovery algorithms incorporating domain knowledge can have varying levels of user involvement, ranging from fully automated algorithms to interactive approaches where the user makes critical decisions about the process model. Incorporating domain knowledge into process discovery techniques faces various challenges that can cause issues for existing approaches: acquiring domain knowledge from domain experts, integrating domain knowledge with process data, scaling to large and complex data sets, and ensuring data quality. In this survey, we assess recent work with varying levels of automation in process discovery to enhance the analysis and understanding of business processes within an organization. Current work can be classified into two categories: fully automated or semi-automated process discovery. We conclude that semi-automated process discovery gives a better opportunity for involving users, and that the use of deep learning algorithms in automation gives better performance than machine learning algorithms.
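For the MRI entry above ("Deep Learning-Based Alzheimer's Disease Classification"), a typical way to fine-tune VGG16 for the four dementia categories is sketched below. This is a generic transfer-learning pattern under assumed settings (input size, which block is unfrozen, head layers), not the configuration behind the reported 98.810% accuracy.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Four classes: no dementia, very mild, mild, and moderate dementia.
base = VGG16(weights="imagenet", include_top=False, input_shape=(176, 176, 3))
# Fine-tune only the last convolutional block; keep the earlier blocks frozen.
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

x = layers.Flatten()(base.output)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(4, activation="softmax")(x)

model = models.Model(base.input, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
```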
Item Classification of tumors based on distinguishing possibilistic biclusters (IEEE, 2023-07) Mahfouz, Mohamed A
The application of machine learning techniques to analyze very high-dimensional gene expression data in order to uncover the underlying biological mechanisms of complex diseases is one of the most crucial bioinformatics tasks. The focus of this research is the development of a multi-class classifier for the classification of tumors and the identification of relevant genes based on distinguishing fuzzy biclusters. The proposed technique starts by computing k overlapping clusters of genes. For each cluster of genes C, there are c biclusters, where c is the number of classes, such that for class t the bicluster is B_t = (C, class-t samples). The clustering technique uses the average mean square residue (MSR), such that for each class t the MSR(B_t) is minimized while the cluster size is simultaneously maximized. A possibilistic technique is then applied to each cluster to compute a membership for each gene in its cluster. The assigned memberships maximize the weighted average of the fuzzy residue of each row in B_t' when added to B_t, while minimizing the number of genes and the weighted average of the fuzzy residue of each row in B_t within its bicluster; the memberships are obtained by finding the zeros of the derivative of the proposed objective function. The vector of possibilistic memberships associated with each cluster C_t works as a multi-class classifier. The decision space can be made more discriminative by combining the outcomes predicted by the k separate models using stacking or weighted majority voting. To classify an unseen sample x_u, the relative increase in the weighted average of the squared fuzzy residue of B_t, for each class t, when x_u is added to it, is computed; x_u is then assigned to the class t for which this relative increase is minimal. Results from experiments indicate that the performance of the suggested technique and its computational complexity are comparable to algorithms that combine metaheuristic algorithms with SVM.

Item Empowering MBTI Personality Classification through Transformer-Based Summarization Model (IEEE, 2023-07) Elmoushy, Seif; Saeed, Mostafa; Gomaa, Wael H
The Myers-Briggs Type Indicator (MBTI) is a popular personality classification tool that builds on the work of Carl Jung. With individuals increasingly expressing themselves online rather than in person, social media has become a promising platform for predicting personality. However, predicting personality from online behavior is a challenging task that requires extensive data processing and modeling. To tackle this challenge, a novel approach is proposed that leverages a transformer-based summarization model to summarize the dataset records before applying the DistilBERT base classification model. The proposed method improves the MBTI classification task, achieving an accuracy of 0.96, which demonstrates the efficacy and robustness of the strategy. The study emphasizes the importance of transformer-based summarization in enhancing NLP tasks and the need to apply various optimization techniques to achieve optimal performance. The findings provide a foundation for future research in personality classification and NLP.

Item Early Rheumatoid Arthritis Detection by miRNA Data Analysis Using a Hybrid CNN-LSTM Deep Learning Model (IEEE, 2023-07) Ali, Nehal M
Transcriptomic data analysis has significantly evolved thanks to high-throughput data technology, and analyzing these data can notably support the early detection of several diseases. Rheumatoid arthritis (RA) is an autoimmune disease that causes critical joint damage over time and can lead to severe disabilities. This work introduces a hybrid CNN-LSTM deep learning model that combines the CNN's capability for higher-level feature extraction with the LSTM's efficiency in capturing long-term dependencies between transcriptomic terms. The model analyzes miRNA data from rheumatoid arthritis patients and healthy controls to provide an early detection model for this disease. In addition, this work studies the impact of the NEBNEXT and NEXTFLEX sample preparation kits on the accuracy of the miRNA sample classification. The studied dataset consists of 42 miRNA files of RA cases, healthy controls, and synthetic samples. The results of the proposed model were promising, with sensitivity, specificity, precision, accuracy, and F1-score values of 0.875, 0.884, 0.88, 0.883, and 0.872, respectively. Moreover, comparative experiments are conducted against related work from the literature.
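The mean square residue at the heart of the bicluster-based tumor classification entry above ("Classification of tumors based on distinguishing possibilistic biclusters") is easy to state in code. The snippet below computes the classical crisp MSR on a toy matrix; the paper's possibilistic, membership-weighted variant is not reproduced here.

```python
import numpy as np

def mean_squared_residue(B):
    """Mean squared residue of a bicluster B (genes x samples).
    A low MSR means the rows and columns vary coherently."""
    row_means = B.mean(axis=1, keepdims=True)
    col_means = B.mean(axis=0, keepdims=True)
    overall = B.mean()
    residues = B - row_means - col_means + overall
    return float((residues ** 2).mean())

# A toy bicluster with an additive (coherent) pattern has MSR = 0;
# adding noise raises the residue.
coherent = np.array([[1.0, 2.0, 3.0],
                     [2.0, 3.0, 4.0],
                     [4.0, 5.0, 6.0]])
noisy = coherent + np.random.default_rng(1).normal(0, 0.5, coherent.shape)
print(mean_squared_residue(coherent), mean_squared_residue(noisy))
```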
Item A Novel Two-Phase Approach for Enhancing Process Model Discovery in Processing Mining (IEEE, 2023-07) Abo Khedra, M. M; Mohammed, Ammar; Abdel-Hamid, Yasser
Process mining has shown great capabilities in many fields, aiming to automatically extract the nature of process models from "event logs". Businesses commonly use process mining to improve key performance indicators (KPIs). Positive records in an event log are used instead of negative ones to achieve KPIs. Fitness, simplicity, accuracy, and generalization are the four primary quality forces for process models, and current process discovery algorithms commonly take at most two of them into account. Thus, this paper introduces a novel two-phase approach. Phase one focuses on event log preprocessing by applying K-means clustering to divide the event logs into positive and negative groups according to established key performance indicators. The second phase addresses process discovery by balancing the four quality forces of the process model using the ETM process discovery algorithm. Using three publicly accessible real-life benchmark datasets, we run several experiments and measure the performance of the two-phase approach using the RapidProM workflow tool. The experimental findings reveal that the proposed two-phase approach derives significant value from the negative records, and the ETM process discovery algorithm performs well across the four primary quality forces.

Item Rooted Android Devices Risk Assessment using Analytic Hierarchy Process (IEEE, 2023-09) Elsersy, Wael; El-Fishawy, Nawal Ahmed; Zakaria, Ehab E
Attackers are targeting rooted Android mobile devices to gain access to confidential data such as credit card details and banking transactions. Despite the removal of rooting applications from the Google Play Store, attackers still provide easy rooting methods through third-party application stores. Previous studies have focused on rooting detection systems but have ignored Android rooting risk assessment, which impacts device security. This research introduces a risk assessment framework for Android devices named ARAS, which uses three risk criteria: system, privacy, and financial criteria. ARAS extracts Android static analysis features and adopts the Analytic Hierarchy Process (AHP) pairwise comparison methodology to decide the rooting risk level. The proposed scoring model is applied to a rooted-device dataset to demonstrate the risk level assessment. ARAS defines four risk levels, low, medium, high, and critical, providing a decision support system for allowing or denying rooted devices access to applications and confidential information.
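As an illustration of the AHP pairwise-comparison step used by ARAS in the final entry above, the sketch below derives criterion weights from a hypothetical 3x3 judgement matrix over the system, privacy, and financial criteria (the paper's actual judgements and weights are not reproduced here) and checks Saaty's consistency ratio.

```python
import numpy as np

# Hypothetical pairwise-comparison matrix on Saaty's 1-9 scale
# for the criteria (system, privacy, financial).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])

# Priority vector = principal eigenvector of A, normalised to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()

# Consistency check: CI = (lambda_max - n) / (n - 1), CR = CI / RI.
n = A.shape[0]
lambda_max = eigvals.real[k]
ci = (lambda_max - n) / (n - 1)
cr = ci / 0.58                    # RI = 0.58 is Saaty's random index for n = 3
print(weights, round(cr, 3))      # judgements are usually accepted if CR < 0.1
```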