Confidence-Credibility Aware Weighted Ensembles of Small LLMs Outperform Large LLMs in Emotion Detection

dc.Affiliation: October University for Modern Sciences and Arts (MSA)
dc.contributor.author: Menna Elgabry
dc.contributor.author: Ali Hamdi
dc.date.accessioned: 2026-06-02T07:21:43Z
dc.date.issued: 2026-05-01
dc.description: SJR 2025: 0.119 (Q4). H-Index: 40. Subject Area and Category: Computer Science (Computer Networks and Communications, Computer Science Applications, Information Systems); Engineering (Electrical and Electronic Engineering, Media Technology)
dc.description.abstract: This paper introduces a confidence-weighted, credibility-aware ensemble framework for text-based emotion detection, inspired by Condorcet’s Jury Theorem (CJT). Unlike conventional ensembles that often rely on homogeneous architectures, our approach combines architecturally diverse small transformer-based large language models (sLLMs): BERT, RoBERTa, DistilBERT, DeBERTa, and ELECTRA, each fully fine-tuned for emotion classification. To preserve error diversity, we minimize parameter convergence while taking advantage of the unique biases of each model. A dual-weighted voting mechanism integrates both global credibility (validation F1-score) and local confidence (instance-level probability) to dynamically weight model contributions. Experiments on the DAIR-AI dataset demonstrate that our credibility-confidence ensemble achieves a macro F1-score of 93.5%, surpassing state-of-the-art benchmarks and significantly outperforming large-scale LLMs, including Falcon, Mistral, Qwen, and Phi, even after task-specific Low-Rank Adaptation (LoRA). With only 595M parameters in total, our small LLMs ensemble proves more parameter-efficient and robust than models up to 7B parameters, establishing that carefully designed ensembles of small, fine-tuned models can outperform much larger LLMs in specialized natural language processing (NLP) tasks such as emotion detection.
dc.description.uri: https://www.scimagojr.com/journalsearch.php?q=21100975545&tip=sid&clean=0
dc.identifier.citation: Elgabry, M., & Hamdi, A. (2026). Confidence-Credibility Aware Weighted Ensembles of Small LLMs Outperform Large LLMs in Emotion Detection. Lecture Notes on Data Engineering and Communications Technologies, 170–179. https://doi.org/10.1007/978-3-032-23035-5_16
dc.identifier.doi: https://doi.org/10.1007/978-3-032-23035-5_16
dc.identifier.other: https://doi.org/10.1007/978-3-032-23035-5_16
dc.identifier.uri: https://repository.msa.edu.eg/handle/123456789/6775
dc.language.iso: en_US
dc.publisher: Springer International Publishing AG
dc.relation.ispartofseries: Lecture Notes on Data Engineering and Communications Technologies; Volume 293, Pages 170-179
dc.subject: Condorcet’s jury theorem
dc.subject: Emotion detection
dc.subject: Ensemble learning
dc.subject: Small LLMs
dc.subject: Weighted voting
dc.title: Confidence-Credibility Aware Weighted Ensembles of Small LLMs Outperform Large LLMs in Emotion Detection
dc.type: Book chapter
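
Illustrative note: the dual-weighted voting mentioned in the abstract can be sketched roughly as below. The abstract does not give the exact combination rule, so this sketch assumes that each model votes for its top class with a weight equal to its validation F1-score (global credibility) multiplied by its top-class probability (local confidence). The function name `dual_weighted_vote`, the three-model setup, and all numeric values are hypothetical and are not taken from the paper.

```python
def dual_weighted_vote(probs, credibility):
    """Combine per-model class probabilities into one prediction.

    probs:       list of per-model probability lists, one row per model
    credibility: list of per-model global scores (e.g. validation F1)

    Assumption (not from the paper): vote weight = credibility * confidence.
    """
    n_classes = len(probs[0])
    scores = [0.0] * n_classes
    for model_probs, cred in zip(probs, credibility):
        # Local confidence: the probability of the model's own top class.
        predicted = max(range(n_classes), key=lambda c: model_probs[c])
        confidence = model_probs[predicted]
        # The model's vote counts in proportion to both signals.
        scores[predicted] += cred * confidence
    winner = max(range(n_classes), key=lambda c: scores[c])
    return winner, scores


# Hypothetical example: two hesitant models pick class 0,
# one very confident model picks class 1.
probs = [
    [0.40, 0.35, 0.25],
    [0.45, 0.40, 0.15],
    [0.02, 0.97, 0.01],
]
credibility = [0.90, 0.91, 0.93]  # made-up validation F1-scores

label, scores = dual_weighted_vote(probs, credibility)
print(label, scores)
```

In this made-up case a plain majority vote would choose class 0, while the weighted vote chooses class 1 because the single dissenting model is both slightly more credible and far more confident.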

Files

Original bundle

Name: IMG-20231214-WA0000.jpg
Size: 16.8 KB
Format: Joint Photographic Experts Group/JPEG File Interchange Format (JFIF)

License bundle

Name: license.txt
Size: 51 B
Format: Item-specific license agreed upon to submission