
AI-Powered System for Automated Image Caption Creation and Recommendations

Dr. Devidas Thosar, Dr. Babaso Shinde, Tejaswini Thube, Sneha Nagargoje, Sapna Bhujbal, Khushi Mujawar

Abstract


With growing demand for intelligent automation and visual understanding, Artificial Intelligence (AI)-based image captioning has become important for content analysis, accessibility, and digital media management. This project presents an AI Image Caption Recommendation System that automatically generates meaningful, context-aware captions for images using deep learning and natural language processing (NLP). The proposed system combines a web front end (HTML, CSS, JavaScript) and a Python (Flask) back end with TensorFlow/Keras, MySQL, and a CNN–LSTM architecture to achieve high accuracy and linguistic relevance in caption generation.
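The abstract does not spell out the decoding procedure, so the following is only a minimal sketch of how an encoder-decoder captioner of this kind typically turns image features into a sentence: the decoder is asked repeatedly for next-word scores until it emits an end token. The names `greedy_caption`, `toy_decoder`, and the `<start>`/`<end>` tokens are assumptions made for illustration, and the toy lookup table stands in for a trained LSTM; none of this is taken from the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical sentence-boundary tokens; the paper does not name its tokens.
START, END = "<start>", "<end>"


def greedy_caption(
    features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    # Drop the start token before joining the words into a sentence.
    return " ".join(words[1:])


# Toy stand-in for a trained decoder: it ignores the image features and
# follows a fixed word-to-word table. A real LSTM decoder would condition
# on both the CNN features and the full word history.
_TOY_TABLE = {
    START: {"a": 0.9, "dog": 0.1},
    "a": {"dog": 0.8, END: 0.2},
    "dog": {"runs": 0.7, END: 0.3},
    "runs": {END: 1.0},
}


def toy_decoder(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    return _TOY_TABLE[words[-1]]


caption = greedy_caption([0.0, 0.0], toy_decoder)
print(caption)
```

In a full system, `features` would come from a CNN encoder and `next_word_scores` from the trained language model; replacing the greedy choice with beam search is a common refinement.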



