Conditional GANs for human activity recognition: synthetic keypoint data generation
Publisher
Springer London
Series Info
Neural Computing and Applications; Volume 38, article number 156 (2026)
Abstract
Human activity recognition (HAR) relies heavily on high-quality datasets, yet data scarcity often limits model performance. This paper explores the use of Conditional Generative Adversarial Networks (CGANs) for keypoint-based motion generation to enhance activity recognition in three distinct domains: sports, driver behavior analysis, and general human actions (using the Weizmann dataset). The first experiment generated synthetic keypoints for three tennis strokes (forehand, backhand, and serve) to augment an existing dataset. The quality of the generated keypoints, measured with the Fréchet Inception Distance (FID), was promising, with average scores of 2.72 for forehand, 3.11 for backhand, and 2.86 for serve. Integrating the synthetic data into classical and deep learning models significantly improved stroke classification accuracy. Among the classical machine learning models evaluated, the Gradient Boosting Classifier (GBC) performed best, with accuracy increasing from 92.14% to 94.90% after the synthetic data was added. Among the deep learning models, the Convolutional Neural Network (CNN) trained with the Stochastic Gradient Descent (SGD) optimizer recorded the highest accuracy, improving from 96.03% to 99.15%, which made it the top-performing model overall. The second experiment addressed the recognition of rare driver behaviors by generating keypoints for the "rubbing head while driving" action, a gesture often linked to fatigue or discomfort. FID scores decreased steadily from 5.67 for the first video to 0.25 for the tenth, with an average of 1.04 across the 10 videos, indicating growing similarity between real and generated keypoints. The third experiment generated keypoints for three actions from the Weizmann Human Action Dataset: walking, running, and jumping jacks. Using CGANs, ten synthetic keypoint videos were generated for each action to simulate realistic full-body motion. The realism of the generated sequences was evaluated using FID, with average scores of 2.69 for walking, 3.99 for running, and 2.99 for jumping jacks. These results indicate that CGANs can generate realistic motion data, offering a way to expand human motion datasets without collecting large amounts of real-world video. This increases data variety and can improve the reliability of future models. The study demonstrates that synthetic keypoints effectively support HAR, particularly in scenarios where labeled data is limited or difficult to obtain.
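The FID scores quoted in the abstract compare the statistics of real and generated samples: each set of feature vectors is summarized by a mean and a covariance, and the score is the Fréchet distance between the two resulting Gaussians, ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2)). Lower values mean the two sets are statistically closer. The sketch below illustrates only that distance computation. It assumes feature vectors have already been extracted (one row per sample); the abstract does not say how the authors obtained features from keypoints, so the function name `frechet_distance` and the random stand-in data are hypothetical and not the paper's pipeline.

```python
import numpy as np


def frechet_distance(real_feats, gen_feats):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are 2-D arrays with one row per sample and one column
    per feature (an assumed layout, not taken from the paper).
    """
    mu_r = real_feats.mean(axis=0)
    mu_g = gen_feats.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(real_feats, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_feats, rowvar=False))

    # Tr((C_r C_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of C_r @ C_g, which are real and non-negative for
    # covariance matrices. Clip tiny negative values caused by rounding.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for real and generated feature vectors.
    real = rng.normal(size=(500, 8))
    generated = rng.normal(loc=0.3, size=(500, 8))
    print("Frechet distance:", frechet_distance(real, generated))
```

Two sanity checks follow from the formula: comparing a feature set with itself gives a distance of zero, and shifting every sample by a constant vector c gives ||c||^2, because the covariances are unchanged.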
Description
SJR 2024
1.102
Q1
H-Index
146
Subject Area and Category:
Computer Science
Artificial Intelligence
Software
Citation
Ramadan, E., Ebrahim, M., Ahmed, H., & Atia, A. (2026). Conditional GANs for human activity recognition: synthetic keypoint data generation. Neural Computing and Applications, 38(6). https://doi.org/10.1007/s00521-025-11831-0
