Using Linear Regression Residual of Document Vectors in Text Categorization

Publisher

IEEE

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

This paper studies the use of linear regression residuals for binary text categorization. The main idea is to predict a given test vector from its k nearest neighbors in each of the positive and negative classes. The predicted vectors are the projections of the test vector onto the subspaces of the two classes. The differences between the test vector and these projections are the residual vectors. The magnitude of each residual indicates how well the neighbors from that class represent the test vector. The residuals obtained from the positive and negative classes are concatenated with the document vectors computed using the bag-of-words approach. Experimental results on three widely used datasets show that the residual vectors provide an improved document representation.
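The procedure described in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the authors' implementation: the function names `class_residual` and `augment`, the use of Euclidean distance to pick neighbors, the value of `k`, and the toy vectors are all assumptions introduced here, and the abstract does not say how the regression is solved (ordinary least squares via NumPy is used below).

```python
import numpy as np


def class_residual(x, class_docs, k):
    """Residual of x after regressing it on its k nearest neighbors in one class.

    Assumption: neighbors are chosen by Euclidean distance; the abstract
    does not name the distance measure.
    """
    dists = np.linalg.norm(class_docs - x, axis=1)
    neighbors = class_docs[np.argsort(dists)[:k]]  # shape (k, d)
    # Least-squares weights w minimizing ||neighbors.T @ w - x||.
    w, *_ = np.linalg.lstsq(neighbors.T, x, rcond=None)
    projection = neighbors.T @ w  # projection of x onto the neighbors' span
    return x - projection


def augment(x, pos_docs, neg_docs, k=2):
    """Concatenate the document vector with its positive and negative residuals."""
    r_pos = class_residual(x, pos_docs, k)
    r_neg = class_residual(x, neg_docs, k)
    return np.concatenate([x, r_pos, r_neg])


# Hypothetical toy data: 4-term bag-of-words vectors, three documents per class.
pos = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]])
neg = np.array([[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0]])
x = np.array([2.0, 1.0, 0.0, 0.0])

features = augment(x, pos, neg, k=2)
print(np.linalg.norm(features[4:8]), np.linalg.norm(features[8:]))
```

In this toy case the test vector lies in the span of its positive-class neighbors, so the positive residual is close to zero, while the negative-class neighbors cannot represent it at all and the negative residual equals the test vector itself. The augmented vector would then be passed to whatever classifier is in use.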

Description

21st Signal Processing and Communications Applications Conference (SIU) -- APR 24-26, 2013 -- CYPRUS

Keywords

linear regression, residual vector, document representation, text categorization

Journal or Series

2013 21st Signal Processing and Communications Applications Conference (SIU)
