
An Implementation of Advanced NLP for High-Quality Text-To-Speech Synthesis

Sharmi Islam, Mustahid Hasan, Md. Ismail Jabiullah

Abstract


In this paper, we use Bengali voice data (from Bangladesh and Kolkata) and convert it into text. A speech-to-text conversion framework requires two key parts: an NLP (Natural Language Processing) stage, which operates on the information in the input speech, and a text generation stage, which produces the desired output. These two stages must exchange both data and commands to produce text. Because the task draws on many distinct scientific areas, any progress toward standardization can reduce the effort involved and improve the results. Advances in communication technology and AI (especially machine learning and deep learning) have led researchers to the convolutional neural network (CNN), which is attracting attention because of its high performance. Nonetheless, the most common issue with deep learning architectures such as CNNs is that they require a large amount of training data. This paper gives an overview of the NLP stage in the speech-to-text framework for the Bangla language built by our group, and describes its integration into the database.




