Foreign Accents Classification of English and Urdu Languages, Design of Related Voice Data Base and A Proposed MLP based Speaker Verification System

Muhammad Ismail; Shahzad Ahmed Memon; Lachhman Das Dhomeja; Shahid Munir Shah

doi:10.22937/IJCSNS.2024.24.10.5

International Journal of Computer Science & Network Security

Volume 24, Issue 10 / Pages 43-52 / 2024 / 1738-7906 (pISSN)

Muhammad Ismail (Department of Computer Science, Karakoram International University (KIU));
Shahzad Ahmed Memon (IICT, University of Sindh);
Lachhman Das Dhomeja (IICT, University of Sindh);
Shahid Munir Shah (Faculty of Information Technology, Department of Computer Science, Barrett Hodgson University)

Received : 2024.10.05
Published : 2024.10.30


Abstract

A medium-scale database of Urdu and English speakers, covering multiple accents and dialects, has been developed for use in Urdu and English speaker verification systems and in accent and dialect verification systems. Urdu is the national language of Pakistan and English is the official language. The majority of people in all regions of Pakistan, and in the Gilgit-Baltistan (GB) region in particular, are non-native speakers of both Urdu and English. To support the design of Urdu and English speaker verification systems for security applications in general, and telephone banking in particular, two databases have been designed: one for foreign-accented Urdu and another for foreign-accented English. Voice data was collected from 180 speakers from the GB region of Pakistan who could speak both Urdu and English. The speakers include both males and females, with ages ranging from 18 to 69 years. Finally, using a subset of the data, a Multilayer Perceptron (MLP) based speaker verification system has been designed. The designed system achieved an overall accuracy of 83.4091% on the English dataset and 80.0454% on the Urdu dataset. These results are slightly lower (by 4.0% for English and 7.4% for Urdu) than the 87.5% recognition accuracy achieved by a recently proposed MLP-based SIS.
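The accuracy comparison in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and the three accuracy values are the only inputs taken from the abstract. It suggests the quoted gaps of 4.0 and 7.4 are percentage-point differences truncated to one decimal place.

```python
import math

# Accuracy figures quoted in the abstract, in percent.
BASELINE_SIS = 87.5  # recently proposed MLP-based SIS
PROPOSED = {"English": 83.4091, "Urdu": 80.0454}


def accuracy_gap(baseline: float, proposed: float) -> float:
    """Gap in percentage points, rounded to 4 decimals."""
    return round(baseline - proposed, 4)


def truncate_1dp(value: float) -> float:
    """Truncate (not round) a positive value to one decimal place."""
    return math.floor(value * 10) / 10


# Exact gaps and their one-decimal truncations for each dataset.
gaps = {lang: accuracy_gap(BASELINE_SIS, acc) for lang, acc in PROPOSED.items()}
truncated = {lang: truncate_1dp(gap) for lang, gap in gaps.items()}

for lang in PROPOSED:
    print(f"{lang}: {gaps[lang]:.4f} points below baseline "
          f"(truncated: {truncated[lang]:.1f})")
```

Under these assumptions the English gap is about 4.09 points and the Urdu gap about 7.45 points, which matches the abstract's figures when truncated rather than rounded.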

