Camera-Based Human Activity Recognition for Productivity Monitoring using LSTM
Abstract
This paper proposes a camera-based Human Activity Recognition (HAR) framework for real-time productivity monitoring in workplace environments. The system uses MediaPipe Pose to extract skeletal landmarks, a Convolutional Neural Network (CNN) to encode spatial features, and a Long Short-Term Memory (LSTM) network to model temporal dynamics, recognizing activities such as typing, reading, writing, phone use, and idle behavior. The model was trained on a synthetically generated dataset of office actions, yielding 98% training accuracy and 96% validation accuracy. The TensorFlow Lite version of the model runs in real time at 20–25 FPS, so the lightweight model supports efficient on-device deployment while maintaining strong performance. A weighted productivity mapping module translates recognized activities into an engagement index from 0 to 1. Experimental results show that the CNN–LSTM–TFLite pipeline provides accurate, fast, and privacy-preserving activity recognition suitable for non-intrusive workplace productivity assessment.
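The weighted productivity mapping can be illustrated with a minimal sketch. The abstract does not give the weights or the aggregation rule, so the weight values, the label names, and the function `engagement_index` below are assumptions chosen for illustration only. The sketch averages a per-activity weight over a window of recognized labels, which keeps the result in the range 0 to 1.

```python
from collections import Counter

# Hypothetical per-activity weights in [0, 1]; the paper's actual values
# are not stated in the abstract.
ACTIVITY_WEIGHTS = {
    "typing": 1.0,
    "writing": 0.9,
    "reading": 0.8,
    "phone_use": 0.2,
    "idle": 0.0,
}


def engagement_index(labels, weights=ACTIVITY_WEIGHTS):
    """Average the activity weights over a window of predicted labels.

    Labels missing from `weights` count as 0.0 (an assumption).
    An empty window returns 0.0.
    """
    if not labels:
        return 0.0
    counts = Counter(labels)
    total = sum(counts.values())
    weighted = sum(weights.get(activity, 0.0) * n for activity, n in counts.items())
    return weighted / total


# Example window: one predicted label per classified clip.
window = ["typing", "typing", "idle", "idle"]
print(engagement_index(window))
```

Because each weight lies between 0 and 1 and the index is a frequency-weighted mean, the output cannot leave that range. A different aggregation, such as weighting by classifier confidence, would fit the same interface.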