Analytical evaluation of term weighting schemes for text categorization

dc.contributor.author: Altincay, Hakan
dc.contributor.author: Erenel, Zafer
dc.date.accessioned: 2026-02-06T18:40:18Z
dc.date.issued: 2010
dc.department: Doğu Akdeniz Üniversitesi
dc.description.abstract: An analytical evaluation of six widely used term weighting techniques for text categorization is presented. The analysis expresses the term weights using term occurrence probabilities in the positive and negative categories. The weighting behaviors of the schemes considered are first clarified by analyzing the relation between the occurrence probabilities of terms that receive equal weights. Then, the weights are expressed in terms of the ratio and difference of term occurrence probabilities, which reveals the similarities and differences among the schemes. Simulations show that the relative performance of the schemes can be explained by the ways they use the ratio and difference of term occurrence probabilities in generating the term weights. © 2010 Elsevier B.V. All rights reserved.
dc.description.sponsorship: Ministry of Education and Culture of Northern Cyprus [MEKB-09-02]
dc.description.sponsorship: We are grateful to the anonymous reviewers for their constructive suggestions. The numerical calculations reported in this paper were partly performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TR-Grid e-Infrastructure). This work was supported by the research grant MEKB-09-02 provided by the Ministry of Education and Culture of Northern Cyprus.
dc.identifier.doi: 10.1016/j.patrec.2010.03.012
dc.identifier.endpage: 1323
dc.identifier.issn: 0167-8655
dc.identifier.issn: 1872-7344
dc.identifier.issue: 11
dc.identifier.scopus: 2-s2.0-77953131475
dc.identifier.scopusquality: Q1
dc.identifier.startpage: 1310
dc.identifier.uri: https://doi.org/10.1016/j.patrec.2010.03.012
dc.identifier.uri: https://hdl.handle.net/11129/13258
dc.identifier.volume: 31
dc.identifier.wos: WOS:000279834800011
dc.identifier.wosquality: Q2
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Elsevier
dc.relation.ispartof: Pattern Recognition Letters
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_WoS_20260204
dc.subject: Contour lines
dc.subject: Term occurrence probability
dc.subject: Term weighting
dc.subject: Relative weights
dc.subject: Text categorization
dc.title: Analytical evaluation of term weighting schemes for text categorization
dc.type: Article
