Using the absolute difference of term occurrence probabilities in binary text categorization

dc.contributor.author: Altincay, Hakan
dc.contributor.author: Erenel, Zafer
dc.date.accessioned: 2026-02-06T18:34:18Z
dc.date.issued: 2012
dc.department: Doğu Akdeniz Üniversitesi
dc.description.abstract: In this study, the differences among widely used weighting schemes are studied by ordering terms according to their discriminative abilities, using a recently developed framework that expresses term weights in terms of the ratio and absolute difference of term occurrence probabilities. Having observed that the ordering of terms depends on the weighting scheme under consideration, it is emphasized that this can be explained by the way different schemes use term occurrence differences in generating term weights. Then, it is proposed that the relevance frequency, which has been shown to provide the best scores on several datasets, can be improved by taking into account the way absolute difference values are used in other widely used schemes. Experimental results on two different datasets have shown that improved F1 scores can be achieved.
dc.description.sponsorship: Ministry of Education and Culture of Northern Cyprus [MEKB-09-02]
dc.description.sponsorship: The numerical calculations reported in this paper were partly performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TR-Grid e-Infrastructure) in Turkey. This work was supported by the research grant MEKB-09-02 provided by the Ministry of Education and Culture of Northern Cyprus.
dc.identifier.doi: 10.1007/s10489-010-0250-3
dc.identifier.endpage: 160
dc.identifier.issn: 0924-669X
dc.identifier.issn: 1573-7497
dc.identifier.issue: 1
dc.identifier.scopus: 2-s2.0-84856280759
dc.identifier.scopusquality: Q1
dc.identifier.startpage: 148
dc.identifier.uri: https://doi.org/10.1007/s10489-010-0250-3
dc.identifier.uri: https://hdl.handle.net/11129/11735
dc.identifier.volume: 36
dc.identifier.wos: WOS:000298853200009
dc.identifier.wosquality: Q2
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Springer
dc.relation.ispartof: Applied Intelligence
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_WoS_20260204
dc.subject: Term occurrence probability
dc.subject: Term weighting
dc.subject: Relevance frequency
dc.subject: Mutual information
dc.subject: Chi-square
dc.subject: Odds ratio
dc.subject: Text categorization
dc.title: Using the absolute difference of term occurrence probabilities in binary text categorization
dc.type: Article
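
The abstract refers to the ratio and the absolute difference of term occurrence probabilities, and to relevance frequency. This record does not give the formulas, so the following is only a minimal sketch under stated assumptions: the occurrence probabilities are estimated as document fractions in the positive and negative classes, and relevance frequency follows its commonly cited definition, rf = log2(2 + a / max(1, c)). The function name `term_stats`, its parameters, and the example counts are hypothetical and do not come from the paper.

```python
import math


def term_stats(a, c, n_pos, n_neg):
    """Illustrative statistics for one term in a binary categorization task.

    a     -- number of positive-class documents containing the term
    c     -- number of negative-class documents containing the term
    n_pos -- total number of positive-class documents
    n_neg -- total number of negative-class documents
    """
    # Assumption: occurrence probabilities are estimated as document fractions.
    p_pos = a / n_pos
    p_neg = c / n_neg

    # The two quantities the abstract mentions: absolute difference and ratio.
    abs_diff = abs(p_pos - p_neg)
    ratio = p_pos / p_neg if p_neg > 0 else float("inf")

    # Relevance frequency in its commonly cited form (assumed, not taken
    # from this record); max(1, c) avoids division by zero.
    rf = math.log2(2 + a / max(1, c))

    return {"p_pos": p_pos, "p_neg": p_neg,
            "abs_diff": abs_diff, "ratio": ratio, "rf": rf}


# Made-up counts, purely for illustration.
stats = term_stats(a=30, c=5, n_pos=100, n_neg=400)
print(stats)
```

Note that rf depends only on the raw counts a and c, while the absolute difference also depends on the class sizes. That is one way to see how two schemes could order the same terms differently, which is the kind of effect the abstract describes.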
