A Symmetric Term Weighting Scheme for Text Categorization Based on Term Occurrence Probabilities

Publisher

IEEE

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Term weighting schemes used in text categorization can be viewed as functions of term occurrence probabilities in the positive and negative classes. In this paper, widely used weighting schemes are first evaluated from this perspective. Then, a novel feature weighting scheme based on term occurrence probabilities is proposed. Experiments conducted using an SVM classifier on the Reuters-21578 ModApte Top10 dataset show that the proposed method outperforms other well-known measures such as CHI, IG, OR and RF in terms of macro-F1 and micro-F1 scores.
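
As an illustration of the perspective described in the abstract, the sketch below estimates a term's occurrence probabilities in the positive and negative classes and plugs them into the odds ratio (OR), one of the baseline measures named above. It is not the scheme proposed in the paper, which is not given here. The function names, the toy documents, and the `eps` clamping are assumptions made for this example only.

```python
def occurrence_probabilities(docs, labels, term):
    """Return (p_pos, p_neg): the fraction of positive and of negative
    documents that contain `term`. Each document is a set of tokens."""
    pos = [d for d, y in zip(docs, labels) if y == 1]
    neg = [d for d, y in zip(docs, labels) if y == 0]
    p_pos = sum(term in d for d in pos) / len(pos)
    p_neg = sum(term in d for d in neg) / len(neg)
    return p_pos, p_neg


def odds_ratio(p_pos, p_neg, eps=1e-6):
    """Odds ratio written as a function of the two occurrence probabilities.
    Clamping with eps is an assumption of this sketch to avoid division by zero."""
    p_pos = min(max(p_pos, eps), 1 - eps)
    p_neg = min(max(p_neg, eps), 1 - eps)
    return (p_pos * (1 - p_neg)) / ((1 - p_pos) * p_neg)


# Hypothetical toy corpus: label 1 marks the positive class, 0 the negative class.
docs = [
    {"wheat", "export"},
    {"wheat", "price"},
    {"oil", "price"},
    {"oil", "export"},
    {"bank", "rate"},
]
labels = [1, 1, 0, 0, 0]

p_pos, p_neg = occurrence_probabilities(docs, labels, "price")
weight = odds_ratio(p_pos, p_neg)
print(p_pos, p_neg, weight)
```

Here "price" occurs in one of two positive documents and one of three negative documents, so the weight depends only on the pair (p_pos, p_neg), which is the sense in which such measures are functions of term occurrence probabilities.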

Description

5th International Conference on Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control -- SEP 02-04, 2009 -- Famagusta, CYPRUS

Journal or Series

2009 Fifth International Conference on Soft Computing, Computing With Words and Perceptions in System Analysis, Decision and Control
