Exploring multicepstral features in a new classical machine learning-based framework for replay attack detection

dc.contributor.author: Contreras, Rodrigo Colnago
dc.contributor.author: Viana, Monique Simplicio
dc.contributor.author: Fonseca, Everthon Silva
dc.contributor.author: Bongarti, Marcelo Adriano dos Santos
dc.contributor.author: Toygar, Onsen
dc.contributor.author: Guido, Rodrigo Capobianco
dc.date.accessioned: 2026-02-06T18:37:32Z
dc.date.issued: 2025
dc.department: Doğu Akdeniz Üniversitesi
dc.description.abstract: The integration of Internet of Things (IoT) technologies has accelerated the adoption of recognition and authentication systems, offering seamless access across devices from smart homes to workplace systems. Among biometric traits, voice stands out due to its simplicity, cleanliness, low capture cost, uniqueness, and the extensive computational resources supporting it in the scientific literature. Recently, however, spoofing risks have emerged as a serious challenge to the security of voice-based systems. To counteract these threats without additional hardware, techniques analyzing inherent voice signal features have been developed. This paper introduces a new soft computing framework based on classical machine learning classifiers such as Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR), comprising Gaussian-noise-based data augmentation, extraction and fusion of multiple cepstral and non-cepstral features, and dimensionality reduction through Singular Value Decomposition (SVD). In particular, we explore eight distinct cepstral extraction techniques, exemplified by popular approaches such as MFCC and CQCC, and sixteen additional non-cepstral metrics such as Zero Crossing Rate (ZCR) and Harmonic-to-Noise Ratio (HNR). Additionally, we generalize cepstral pattern representation by proposing cepstral multiprojection, a novel strategy designed to systematically reduce the dimensionality and redundancy of multicepstral matrices, thereby enhancing discriminative power and computational efficiency. Evaluated with the ASVSpoof 2017 v2.0 competition benchmark, our approach achieved competitive results, reaching 5.14% equal error rate (EER) on the Dev set and 10.58% on the Eval set,
dc.identifier.doi: 10.1016/j.compeleceng.2025.110570
dc.identifier.issn: 0045-7906
dc.identifier.issn: 1879-0755
dc.identifier.orcid: 0000-0003-4003-7791
dc.identifier.orcid: 0000-0002-2960-8293
dc.identifier.orcid: 0000-0002-9027-7702
dc.identifier.orcid: 0000-0002-0924-8024
dc.identifier.orcid: 0000-0001-6202-0806
dc.identifier.orcid: 0000-0001-7402-9058
dc.identifier.scopus: 2-s2.0-105011523726
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.1016/j.compeleceng.2025.110570
dc.identifier.uri: https://hdl.handle.net/11129/12509
dc.identifier.volume: 127
dc.identifier.wos: WOS:001541574200001
dc.identifier.wosquality: Q1
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Pergamon-Elsevier Science Ltd
dc.relation.ispartof: Computers & Electrical Engineering
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_WoS_20260204
dc.subject: Voice Liveness Detection
dc.subject: Spoofing detection
dc.subject: Pattern recognition
dc.subject: Cepstral analysis
dc.subject: Machine learning
dc.title: Exploring multicepstral features in a new classical machine learning-based framework for replay attack detection
dc.type: Article
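
The abstract reports its results as equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. As a rough illustration of that metric only, and not the authors' evaluation code, the Python sketch below estimates EER from two lists of detector scores. The function name `equal_error_rate`, the convention that a higher score means "more likely genuine", and the toy scores are all assumptions made for this example.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Estimate (eer, threshold) from raw scores.

    Assumption for this sketch: a higher score means "more likely genuine".
    Only thresholds equal to an observed score are tried, so the result is
    a discrete approximation of the point where the two error rates cross.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: spoofed trials scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical toy scores, not data from the paper.
genuine = [0.9, 0.8, 0.7, 0.35]
spoof = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine, spoof)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On these toy scores one genuine trial and one spoofed trial are misclassified at the selected threshold, so both error rates equal 25%. Benchmark evaluations typically interpolate between thresholds instead of picking the nearest observed score, so this sketch should be read as a definition aid and not as a replacement for an official scoring tool.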
