Analytical evaluation of term weighting schemes for text categorization
| dc.contributor.author | Altincay, Hakan | |
| dc.contributor.author | Erenel, Zafer | |
| dc.date.accessioned | 2026-02-06T18:40:18Z | |
| dc.date.issued | 2010 | |
| dc.department | Doğu Akdeniz Üniversitesi | |
| dc.description.abstract | An analytical evaluation of six widely used term weighting techniques for text categorization is presented. The analysis depends on expressing the term weights using term occurrence probabilities in positive and negative categories. The weighting behaviors of the schemes considered are first clarified by analyzing the relation between the occurrence probabilities of terms which receive equal weights. Then, the weights are expressed in terms of the ratio and difference of term occurrence probabilities, where the similarities and differences among the schemes are revealed. Simulations show that the relative performance of different schemes can be explained by the ways they use the ratio and difference of term occurrence probabilities in generating the term weights. (C) 2010 Elsevier B.V. All rights reserved. | |
| dc.description.sponsorship | Ministry of Education and Culture of Northern Cyprus [MEKB-09-02] | |
| dc.description.sponsorship | We are grateful to the anonymous reviewers for their constructive suggestions. The numerical calculations reported in this paper were partly performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TR-Grid e-Infrastructure). This work was supported by the research grant MEKB-09-02 provided by the Ministry of Education and Culture of Northern Cyprus. | |
| dc.identifier.doi | 10.1016/j.patrec.2010.03.012 | |
| dc.identifier.endpage | 1323 | |
| dc.identifier.issn | 0167-8655 | |
| dc.identifier.issn | 1872-7344 | |
| dc.identifier.issue | 11 | |
| dc.identifier.scopus | 2-s2.0-77953131475 | |
| dc.identifier.scopusquality | Q1 | |
| dc.identifier.startpage | 1310 | |
| dc.identifier.uri | https://doi.org/10.1016/j.patrec.2010.03.012 | |
| dc.identifier.uri | https://hdl.handle.net/11129/13258 | |
| dc.identifier.volume | 31 | |
| dc.identifier.wos | WOS:000279834800011 | |
| dc.identifier.wosquality | Q2 | |
| dc.indekslendigikaynak | Web of Science | |
| dc.indekslendigikaynak | Scopus | |
| dc.language.iso | en | |
| dc.publisher | Elsevier | |
| dc.relation.ispartof | Pattern Recognition Letters | |
| dc.relation.publicationcategory | Article - International Peer-Reviewed Journal - Institutional Faculty Member | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.snmz | KA_WoS_20260204 | |
| dc.subject | Contour lines | |
| dc.subject | Term occurrence probability | |
| dc.subject | Term weighting | |
| dc.subject | Relative weights | |
| dc.subject | Text categorization | |
| dc.title | Analytical evaluation of term weighting schemes for text categorization | |
| dc.type | Article |










