AI-Powered System for Automated Image Caption Creation and Recommendations
Abstract
With the increasing demand for intelligent automation and visual understanding, Artificial Intelligence (AI)-based image captioning systems have become important tools in content analysis, accessibility, and digital media management. This project presents an AI Image Caption Recommendation System that automatically generates meaningful, context-aware captions for images using deep learning and natural language processing (NLP). The proposed system combines a web front end (HTML, CSS, and JavaScript) and a Python (Flask) back end with TensorFlow/Keras, MySQL, and CNN–LSTM architectures, with the aim of producing accurate and linguistically relevant captions.
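The abstract does not describe how captions are decoded from the CNN–LSTM model. As an illustration only, the sketch below shows one common approach, greedy word-by-word decoding, under stated assumptions: the names `greedy_caption`, `next_word_scores`, and `toy_scorer` are hypothetical, and `toy_scorer` is a fixed lookup table standing in for a trained decoder that would score the next word from the CNN image features and the words generated so far.

```python
from typing import Callable, Dict, List

START, END = "<start>", "<end>"  # assumed sentence-boundary tokens


def greedy_caption(
    image_features: List[float],
    next_word_scores: Callable[[List[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_scorer(image_features: List[float], words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for the trained decoder: scores depend only on the last word."""
    table = {
        "<start>": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "<end>": 0.2},
        "dog": {"runs": 0.6, "<end>": 0.4},
        "runs": {"<end>": 1.0},
    }
    return table[words[-1]]


print(greedy_caption([0.0], toy_scorer))  # prints "a dog runs"
```

In a real system the scorer would call the trained model; a recommendation interface could also keep several high-scoring candidates instead of only the single best word at each step.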