

Chatbot for Video Summarization and Analysis Using Deep Learning
Abstract
The "Chatbot for Video Summarization and Analysis" project presents a chatbot that uses deep learning to automate video summarization. It is designed to help users extract important information from long videos without watching them in full, addressing the challenge posed by the enormous volume of video data generated every day. By combining Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), the chatbot identifies key scenes, detects notable events, and produces concise summaries that capture the core of the video. The system integrates video captioning, keyframe extraction, and Natural Language Processing (NLP) to deliver accurate, interactive, and user-friendly summaries suited to applications such as surveillance, education, and media management. By reducing viewing time and improving content navigation, it gives quick, useful insights to users with time constraints or specialized information needs.
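The abstract names keyframe extraction as one stage of the pipeline but does not say how it is performed. The following is a minimal sketch of one common baseline, frame differencing, and is not the authors' method. It assumes frames are already decoded into flattened grayscale pixel lists; the names `Frame`, `mean_abs_diff`, `select_keyframes`, and the `threshold` default are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of grayscale pixel values (0-255).
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: Sequence[Frame], threshold: float = 30.0) -> List[int]:
    """Return indices of frames that differ enough from the last selected keyframe."""
    if not frames:
        return []
    keyframes = [0]  # always keep the first frame as a reference
    for i in range(1, len(frames)):
        # Compare against the most recent keyframe, not the previous frame,
        # so slow gradual changes still trigger a new keyframe eventually.
        if mean_abs_diff(frames[i], frames[keyframes[-1]]) > threshold:
            keyframes.append(i)
    return keyframes


if __name__ == "__main__":
    # Toy "video": three dark frames followed by three bright frames, 4 pixels each.
    video = [[10, 10, 10, 10]] * 3 + [[200, 200, 200, 200]] * 3
    print(select_keyframes(video))
```

In a system like the one described, the selected frames would presumably be passed to the CNN for feature extraction and then to the captioning stage; the threshold would need tuning per video source.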