Estimating the number of speakers by novel zig-zag nested microphone array based on wavelet packet and adaptive GCC method
Authors
Dehghan Firoozabadi, Ali
Irarrázaval, Pablo
Adasme, Pablo
Zabala-Blanco, David
Palacios Játiva, Pablo
Durney, Hugo
Sanhueza Olave, Miguel
Date
2022
Abstract
This paper proposes a new speaker counting algorithm that combines a novel zig-zag nested array (ZZNA) with an adaptive generalized cross-correlation (GCC) function, using phase transform (PHAT) and maximum likelihood (ML) weighting, the wavelet packet transform (WPT), and an agglomerative classification method with an Elbow decision criterion. First, a suitable ZZNA is introduced to cover the acoustic environment and eliminate spatial aliasing. Next, the WPT with different frequency resolutions is used to produce the frequency subbands. The adaptive GCC function, based on the PHAT and ML weighting filters, is then computed on the microphone pairs in each subband. Finally, an unsupervised agglomerative classification method with the Elbow criterion classifies the resulting information and counts the speakers. The proposed ZZNA-WAGC method is compared with the Hilbert envelope, the multi-channel correlational recurrent neural network using ambisonics features (AF-CRNN), and the estimating the number of speakers by density-based classification and clustering decision (ENS-DCCD) algorithms to show its superiority in adverse scenarios.
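To make the GCC step in the abstract concrete, the following is a minimal sketch of a plain GCC-PHAT delay estimate for a single microphone pair. It is not the authors' adaptive method: it omits the ML weighting, the adaptive combination, the WPT subband split, and the ZZNA geometry. The function name `gcc_phat` and its parameters (`fs`, `max_tau`) are illustrative assumptions, not names taken from the paper.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds, using PHAT weighting.

    Illustrative sketch only; names and defaults are assumptions.
    """
    n = len(x) + len(y)                  # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically plausible lags
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Usage: a white-noise signal and a copy delayed by 5 samples
rng = np.random.default_rng(0)
fs = 16000
x = rng.standard_normal(2048)
delay = 5
y = np.concatenate((np.zeros(delay), x[:-delay]))
tau = gcc_phat(x, y, fs)
print(round(tau * fs))
```

In a speaker counting pipeline of the kind the abstract describes, delay estimates like this one would be collected per microphone pair and per subband before being passed to a clustering stage; how the paper combines them is not specified here.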
Source
8th International Conference on Signal Processing and Communication (ICSC), Noida, India, pp. 358-363
DOI
doi.org/10.1109/ICSC56524.2022.10009025