Efficient DL Models for Voice Pathology Detection in Healthcare Applications using Sustained Vowels
Journal of Innovations in Computer Science and Engineering (JICSE)
Article 9, Volume 2, Special Issues 2 - Serial Number 4, April 2025, Pages 26-32
Article Type: Original Article
DOI: 10.48308/jicse.2025.239718.1068
Authors
Sahar Farazi*; Yaser Shekofteh
Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran
Abstract
Voice Pathology Detection (VPD) aims to identify voice impairments by analyzing speech signals, providing a foundation for diagnostic tools in advanced public healthcare services. This paper contributes to the development of efficient and accurate deep learning (DL) models for automatic VPD from sustained vowels. To this end, the study compares the efficacy of Mel-Frequency Cepstral Coefficients (MFCCs) and Linear Predictive Coding (LPC) as acoustic features extracted from the vowels /i/, /a/, and /u/. Using the AVFAD database, we applied and optimized a Convolutional Neural Network (CNN) to classify healthy and pathological voices, prioritizing both accuracy and computational efficiency for real-time applications. Our findings reveal that 20 MFCC features extracted from the vowel /i/ achieve the highest accuracy, with the optimal model reaching approximately 88% on test data.
Keywords
Voice Pathology Detection; Sustained Vowel; Feature Extraction; MFCC; LPC; CNN
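The abstract compares MFCC and LPC features but does not describe how either is computed. As a rough illustration of the LPC side only, the sketch below estimates linear prediction coefficients for a single frame using the standard autocorrelation method with the Levinson-Durbin recursion. The function names `autocorrelation` and `lpc`, the frame length, and the prediction order are assumptions chosen for illustration; they are not taken from the paper, which may use a different estimator or a library implementation.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate LPC coefficients via the autocorrelation method.

    Hypothetical helper, not the paper's implementation.
    Returns (a, err), where a[0] == 1.0 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the final
    prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        return a, 0.0  # all-zero (silent) frame: nothing to predict

    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err

        # Update coefficients from order i-1 to order i.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a

        err *= 1.0 - k * k
        if err <= 0.0:
            break  # signal is perfectly predictable at this order
    return a, err
```

In a pipeline like the one the abstract outlines, a sustained-vowel recording would typically be split into short windowed frames, `lpc` would be called on each frame, and the resulting coefficient vectors would be stacked into the input for a classifier. The framing, windowing, and any further processing are left out here because the abstract does not specify them.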