Evaluation of Speech Activity for Interview in NIST Speaker Recognition

Nirmal Kumar P., Venkatesh C.


Interview speech in recent NIST Speaker Recognition Evaluations (SREs) has called for the development of speech (voice) activity detectors (VADs) that can work under extremely low signal-to-noise ratio (SNR). This paper highlights the characteristics of interview speech files in NIST SREs and discusses the challenges of detecting speech/non-speech segments in these files. To address these challenges, this paper proposes a VAD that uses noise reduction as a pre-processing step. A strategy to avoid the undesirable effects of impulsive signals and sinusoidal background signals on the VAD is also proposed. The proposed VAD is compared with the VAD in the ETSI-AMR speech coder for removing silence regions from interview speech files. The results show that the proposed VAD is more effective in detecting speech segments under low SNR, leading to a significant performance gain in Common Conditions 1–4 of NIST 2008 SRE.


Keywords: Speech activity detection, far-field microphone, speaker verification, noise reduction, spectral subtraction, NIST speaker recognition evaluations
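
To make the idea of noise reduction as a pre-processing step concrete, the following is a minimal sketch of a generic spectral-subtraction front end followed by a simple energy threshold. It is not the detector proposed in this paper: the function names, the parameters (frame_len, hop, noise_frames, alpha, floor, margin_db), and the assumption that the first few frames contain only background noise are all illustrative choices, and the handling of impulsive and sinusoidal background signals is not shown.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def spectral_subtraction_vad(x, frame_len=256, hop=128, noise_frames=10,
                             alpha=2.0, floor=0.01, margin_db=5.0):
    """Return one boolean per frame: True where speech-like energy remains.

    Assumption (illustrative only): the first `noise_frames` frames hold
    background noise and no speech, so they can serve as the noise estimate.
    """
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Noise power spectrum estimated from the leading frames.
    noise_psd = power[:noise_frames].mean(axis=0)

    # Over-subtract the noise estimate, keeping a small spectral floor
    # so that no bin becomes negative.
    cleaned = np.maximum(power - alpha * noise_psd, floor * power)

    # Frame energy after noise reduction, in dB.
    energy_db = 10.0 * np.log10(cleaned.sum(axis=1) + 1e-12)

    # Threshold set a fixed margin above the residual noise level.
    noise_floor_db = np.median(energy_db[:noise_frames])
    return energy_db > noise_floor_db + margin_db


if __name__ == "__main__":
    # Synthetic example: white noise with a tone burst at about 0 dB SNR.
    rng = np.random.default_rng(0)
    fs = 8000
    x = rng.normal(0.0, 1.0, 16000)
    t = np.arange(6000, 10000) / fs
    x[6000:10000] += np.sqrt(2.0) * np.sin(2 * np.pi * 440.0 * t)
    flags = spectral_subtraction_vad(x)
    print("frames flagged as active:", int(flags.sum()), "of", len(flags))
```

The over-subtraction factor and the decision margin trade missed speech against false alarms; the values above are placeholders for the synthetic example, not settings taken from the paper.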

