HybridFormer: Data-Efficient Deep Learning for High-Dimensional Spatiotemporal Classification With Application to Neural Signal Processing

dc.AffiliationOctober University for Modern Sciences and Arts (MSA)
dc.contributor.authorGhada Abdelhady
dc.contributor.authorAbdulrahman Ghandoura
dc.contributor.authorAbdullah Alajmi
dc.contributor.authorZiad Ghazaly
dc.date.accessioned2026-04-05T12:16:24Z
dc.date.issued2026-03-20
dc.descriptionSJR 2024: 0.849 (Q1); H-Index: 290. Subject Area and Category: Computer Science (miscellaneous); Engineering (miscellaneous); Materials Science (miscellaneous)
dc.description.abstractWe present HybridFormer, a hybrid deep learning architecture for data-efficient multichannel spatiotemporal classification, validated on high-gamma EEG motor imagery. The key novelty is an ordered integration pipeline: spatial CNN features are compressed via 4:1 squeeze-and-excitation channel attention before the BiLSTM, and learned Q/K/V temporal self-attention operates after recurrent encoding. This differs from prior hybrids that apply attention only post-recurrence or as simple pooling. On the 128-channel High-Gamma Dataset, HybridFormer achieves 91.2 ± 2.8% within-subject and 78.5 ± 3.4% cross-subject accuracy using stratified 10-fold and leave-one-subject-out protocols, outperforming CNN-LSTM baselines by 6.7% and 5.4% (p < 0.001). Against transformer baselines—EEG-Transformer (BENDR), Transformer-LSTM, and EEG-Conformer—HybridFormer achieves 8.5%, 7.3%, and 6.1% higher accuracy with 2.3× fewer parameters (1.8M). A strict non-overlapping temporal split experiment without augmentation confirms a 5.1% advantage over the best baseline, ruling out information leakage. Cross-dataset validation across 22–128 channels shows consistent generalization. Ablation studies confirm significant contributions from each component (CNN: 8.7%, LSTM: 6.2%, attention: 4.3%). Attention maps correlate with motor cortex activation (r = 0.76, p < 0.001) and remain stable across random seeds (cosine similarity = 0.91 ± 0.03). Real-time inference (15 ms/sample) supports resource-constrained deployment.
dc.description.urihttps://www.scimagojr.com/journalsearch.php?q=21100374601&tip=sid&clean=0
dc.identifier.citationAbdelhady, G., Ghandoura, A., Alajmi, A., & Ghazaly, Z. (2026). HybridFormer: Data-Efficient Deep Learning for High-Dimensional Spatiotemporal Classification With Application to Neural Signal Processing. IEEE Access, 14, 42551–42568. https://doi.org/10.1109/access.2026.3674967
dc.identifier.doihttps://doi.org/10.1109/access.2026.3674967
dc.identifier.otherhttps://doi.org/10.1109/access.2026.3674967
dc.identifier.urihttps://repository.msa.edu.eg/handle/123456789/6692
dc.language.isoen_US
dc.publisherInstitute of Electrical and Electronics Engineers Inc.
dc.relation.ispartofseriesIEEE Access; Volume 14, Pages 42551–42568
dc.subjectattention mechanisms
dc.subjectcomputational efficiency
dc.subjectconvolutional neural networks
dc.subjectcross-domain generalization
dc.subjectdata-efficient learning
dc.subjecthybrid deep learning architectures
dc.subjectrecurrent neural networks
dc.subjectspatiotemporal modeling
dc.subjecttime-series classification
dc.titleHybridFormer: Data-Efficient Deep Learning for High-Dimensional Spatiotemporal Classification With Application to Neural Signal Processing
dc.typeArticle