Termset weighting by adapting term weighting schemes to utilize cardinality statistics for binary text categorization

dc.contributor.author: Badawi, Dima
dc.contributor.author: Altincay, Hakan
dc.date.accessioned: 2026-02-06T18:34:18Z
dc.date.issued: 2017
dc.department: Doğu Akdeniz Üniversitesi
dc.description.abstract: This study proposes a novel scheme for termset weighting based on cardinality statistics. Specifically, termsets are evaluated by considering the number of member terms that appear in a document. Existing term weighting schemes are adapted based on a recently verified hypothesis that the occurrence of a subset of terms may also convey useful information about class membership. Here, the weight of a given termset is computed as the product of two factors. The first is a function of the frequencies of the member terms that occur in the given document, and the second takes into account the numbers of positive and negative training documents in which the same number of members appear. By assigning a non-zero weight to a termset when only a subset of its member terms appears, the discriminative ability of different member term subsets is taken into consideration.
dc.identifier.doi: 10.1007/s10489-017-0911-6
dc.identifier.endpage: 472
dc.identifier.issn: 0924-669X
dc.identifier.issn: 1573-7497
dc.identifier.issue: 2
dc.identifier.scopus: 2-s2.0-85017186491
dc.identifier.scopusquality: Q1
dc.identifier.startpage: 456
dc.identifier.uri: https://doi.org/10.1007/s10489-017-0911-6
dc.identifier.uri: https://hdl.handle.net/11129/11737
dc.identifier.volume: 47
dc.identifier.wos: WOS:000407827300013
dc.identifier.wosquality: Q2
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Springer
dc.relation.ispartof: Applied Intelligence
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_WoS_20260204
dc.subject: Termsets
dc.subject: Termset cardinality
dc.subject: Termset weighting
dc.subject: Termset selection
dc.subject: Document representation
dc.subject: Text categorization
dc.title: Termset weighting by adapting term weighting schemes to utilize cardinality statistics for binary text categorization
dc.type: Article
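
The two-factor weighting described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so both factors here are assumptions chosen only for illustration: the frequency factor is a sum of log-scaled member term frequencies, and the cardinality factor borrows the form of the relevance frequency (RF) term weighting scheme, applied to the counts of positive and negative training documents in which exactly k members appear. The function names (members_present, cardinality_stats, termset_weight) are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def members_present(termset, doc_tokens):
    """Count how many distinct member terms of the termset occur in the document."""
    tokens = set(doc_tokens)
    return sum(1 for term in termset if term in tokens)


def cardinality_stats(termset, docs, labels):
    """For each k, count the positive and negative training documents
    in which exactly k member terms of the termset appear."""
    pos, neg = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        k = members_present(termset, tokens)
        if label == 1:
            pos[k] += 1
        else:
            neg[k] += 1
    return pos, neg


def termset_weight(termset, doc_tokens, pos, neg):
    """Weight = (frequency factor) * (cardinality factor).

    Both factors are illustrative assumptions, not the article's formulas.
    """
    tf = Counter(doc_tokens)
    k = sum(1 for term in termset if tf[term] > 0)
    if k == 0:
        return 0.0  # no member term appears, so the termset gets zero weight
    # Assumed frequency factor: log-scaled frequencies of the members present.
    freq_factor = sum(math.log2(1 + tf[term]) for term in termset)
    # Assumed cardinality factor: RF-style ratio for documents with exactly k members.
    card_factor = math.log2(2 + pos[k] / max(1, neg[k]))
    return freq_factor * card_factor


# Toy training set: label 1 is the positive class, 0 the negative class.
docs = [["cheap", "loan", "now"], ["cheap", "flight"], ["loan", "rates"], ["weather"]]
labels = [1, 0, 0, 0]
termset = ("cheap", "loan")

pos, neg = cardinality_stats(termset, docs, labels)
full_match = termset_weight(termset, ["cheap", "loan"], pos, neg)
partial_match = termset_weight(termset, ["cheap"], pos, neg)
no_match = termset_weight(termset, ["weather"], pos, neg)
```

In this toy example a document containing only one member term still receives a non-zero weight, which is the behaviour the abstract emphasises; a document containing no member term receives zero.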
