Analytical evaluation of term weighting schemes for text categorization

Publisher

Elsevier

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

An analytical evaluation of six widely used term weighting techniques for text categorization is presented. The analysis rests on expressing the term weights using term occurrence probabilities in the positive and negative categories. The weighting behavior of each scheme is first clarified by analyzing the relation between the occurrence probabilities of terms that receive equal weights. The weights are then expressed in terms of the ratio and difference of term occurrence probabilities, which reveals the similarities and differences among the schemes. Simulations show that the relative performance of the schemes can be explained by the ways they use the ratio and difference of term occurrence probabilities in generating the term weights. © 2010 Elsevier B.V. All rights reserved.
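
As a rough illustration of the quantities the abstract refers to, the sketch below estimates a term's occurrence probability in the positive and negative categories and then forms their ratio and difference. It is not taken from the paper: the function names, the set-of-terms document representation, and the `eps` smoothing constant are all assumptions made for this example, and none of the six schemes evaluated in the paper is reproduced here.

```python
from typing import List, Set, Tuple


def occurrence_probabilities(
    docs: List[Set[str]], labels: List[bool], term: str
) -> Tuple[float, float]:
    """Estimate P(term occurs | positive) and P(term occurs | negative).

    Each document is a set of terms; labels[i] is True for the positive
    category. This representation is an assumption of the sketch.
    """
    pos = [d for d, y in zip(docs, labels) if y]
    neg = [d for d, y in zip(docs, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative document")
    p_pos = sum(term in d for d in pos) / len(pos)
    p_neg = sum(term in d for d in neg) / len(neg)
    return p_pos, p_neg


def ratio_and_difference(
    p_pos: float, p_neg: float, eps: float = 1e-9
) -> Tuple[float, float]:
    """Return (ratio, difference) of the two occurrence probabilities.

    eps is a hypothetical smoothing constant that avoids division by zero
    when the term never occurs in the negative category.
    """
    return p_pos / (p_neg + eps), p_pos - p_neg


# Toy corpus: two positive and two negative documents.
docs = [{"a", "b"}, {"a"}, {"b"}, {"c"}]
labels = [True, True, False, False]

for term in ("a", "b"):
    p_pos, p_neg = occurrence_probabilities(docs, labels, term)
    ratio, diff = ratio_and_difference(p_pos, p_neg)
    print(term, p_pos, p_neg, ratio, diff)
```

In this toy corpus, term "a" occurs only in positive documents, so its ratio and difference are both large, while term "b" occurs equally often in both categories, so its difference is zero and its ratio is close to one.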

Keywords

Contour lines, Term occurrence probability, Term weighting, Relative weights, Text categorization

Journal or Series

Pattern Recognition Letters

Volume

31

Issue

11
