Feature extraction using single variable classifiers for binary text classification

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

The most popular approach to document representation is the bag-of-words model, in which terms serve as features. To compute the feature values, term frequencies are generally scaled by a collection frequency factor that accounts for the relative importance of different terms. The term frequencies can be regarded as raw data about the input document. This study proposes a novel feature extraction framework for binary text classification in which feature extraction is defined as a single variable classification problem. The term frequencies are the inputs, and the output of each classifier is used to define a triple of features for the corresponding term. The magnitude of the classifier output, which lies in the interval [0.5, 1], indicates the confidence of the classifier and is used in the document representation together with the term frequency and the collection frequency factor. © 2013 Springer-Verlag.
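The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which single variable classifier is used or how the triple is built, so everything below is an assumption made for illustration. The hypothetical `fit_single_variable_classifier` is a Laplace-smoothed estimate of the positive-class probability given whether the term occurs, `collection_frequency_factor` is a smoothed inverse document frequency standing in for the collection frequency factor, and the triple is taken to be (term frequency, collection frequency factor, confidence).

```python
import math
from collections import Counter


def fit_single_variable_classifier(tfs, labels):
    """Fit a one-input classifier on a single term's frequencies.

    Assumed model (not from the paper): Laplace-smoothed estimate of
    P(positive | term present) and P(positive | term absent).
    """
    # present? -> [negative count + 1, positive count + 1]
    counts = {True: [1, 1], False: [1, 1]}
    for tf, y in zip(tfs, labels):
        counts[tf > 0][y] += 1
    return {present: c[1] / (c[0] + c[1]) for present, c in counts.items()}


def confidence(model, tf):
    """Magnitude of the classifier output, folded into [0.5, 1]."""
    p = model[tf > 0]
    return max(p, 1.0 - p)


def collection_frequency_factor(tfs):
    """Smoothed inverse document frequency (an assumed choice of factor)."""
    n_docs = len(tfs)
    doc_freq = sum(1 for tf in tfs if tf > 0)
    return math.log((n_docs + 1) / (doc_freq + 1)) + 1.0


def term_triples(train_docs, labels, doc):
    """Return {term: (tf, collection frequency factor, confidence)} for doc."""
    vocab = sorted({term for d in train_docs for term in d})
    doc_tf = Counter(doc)
    triples = {}
    for term in vocab:
        tfs = [d.count(term) for d in train_docs]
        model = fit_single_variable_classifier(tfs, labels)
        tf = doc_tf[term]
        triples[term] = (tf, collection_frequency_factor(tfs), confidence(model, tf))
    return triples


# Toy corpus: label 1 = positive class, 0 = negative class.
train_docs = [
    ["cheap", "pills"],
    ["cheap", "offer"],
    ["meeting", "notes"],
    ["project", "meeting"],
]
labels = [1, 1, 0, 0]
triples = term_triples(train_docs, labels, ["cheap", "meeting", "cheap"])
```

In this sketch each term gets its own classifier trained only on that term's frequencies, so the confidence component reflects how well the term alone separates the two classes. How the three components are combined into a final document vector is left open here, since the abstract does not specify it.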

Description

26th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2013

Keywords

document classification, document representation, feature extraction, single variable classifiers, term weighting

Journal or Series

Lecture Notes in Computer Science

Volume

7906 LNAI
