Improving the precision-recall trade-off in undersampling-based binary text categorization using unanimity rule

Erenel, Zafer; Altincay, Hakan

doi:10.1007/s00521-012-1056-5

dc.contributor.author	Erenel, Zafer
dc.contributor.author	Altincay, Hakan
dc.date.accessioned	2026-02-06T18:34:14Z
dc.date.issued	2013
dc.department	Doğu Akdeniz Üniversitesi
dc.description.abstract	The distribution of documents over the two classes in binary text categorization problems is generally uneven, and resampling approaches have been shown to improve F-1 scores in this setting. The improvement is mainly due to a gain in recall, whereas precision may deteriorate. Since precision is the primary concern in some applications, achieving higher F-1 scores with a desired level of trade-off between precision and recall is important. In this study, we present an analytical comparison between the unanimity and majority voting rules. It is shown that the unanimity rule can provide better F-1 scores than majority voting when an ensemble of high-recall but low-precision classifiers is considered. Then, category-based undersampling is proposed to generate high-recall members. Experiments conducted on three datasets show that superior F-1 scores can be realized compared to a support vector machine (SVM)-based baseline system and to voting over a random undersampling-based ensemble.
dc.description.sponsorship	Ministry of Education and Culture of Northern Cyprus [MEKB-09-02]
dc.description.sponsorship	The numerical calculations reported in this paper were partly performed at the ULAKBIM High Performance Computing Center of the Turkish Scientific and Technical Research Council (TUBITAK). This work was supported by the research grant MEKB-09-02 provided by the Ministry of Education and Culture of Northern Cyprus.
dc.identifier.doi	10.1007/s00521-012-1056-5
dc.identifier.endpage	S100
dc.identifier.issn	0941-0643
dc.identifier.issn	1433-3058
dc.identifier.scopus	2-s2.0-84878018711
dc.identifier.scopusquality	Q1
dc.identifier.startpage	S83
dc.identifier.uri	https://doi.org/10.1007/s00521-012-1056-5
dc.identifier.uri	https://hdl.handle.net/11129/11673
dc.identifier.volume	22
dc.identifier.wos	WOS:000323413300008
dc.identifier.wosquality	N/A
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	Springer London Ltd
dc.relation.ispartof	Neural Computing & Applications
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/closedAccess
dc.snmz	KA_WoS_20260204
dc.subject	Class imbalance
dc.subject	Resampling
dc.subject	Classifier ensemble
dc.subject	Unanimity rule
dc.subject	Binary text categorization
dc.title	Improving the precision-recall trade-off in undersampling-based binary text categorization using unanimity rule
dc.type	Article
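
The abstract contrasts the unanimity rule with majority voting for ensembles of high-recall, low-precision classifiers. The Python sketch below is only an illustration of that contrast, not the authors' implementation: the function names, the three member vote vectors, and the labels are all hypothetical toy values chosen by hand, and no undersampling or SVM training is performed.

```python
from typing import List, Sequence, Tuple


def unanimity_vote(votes: Sequence[Sequence[int]]) -> List[int]:
    """Label a document positive only if every member votes positive."""
    return [int(all(column)) for column in zip(*votes)]


def majority_vote(votes: Sequence[Sequence[int]]) -> List[int]:
    """Label a document positive if more than half of the members vote positive."""
    return [int(2 * sum(column) > len(column)) for column in zip(*votes)]


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy data: 2 positive and 6 negative documents (imbalanced).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]

# Three hypothetical members, each with full recall but differing false positives.
member_votes = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
]

unanimity_pred = unanimity_vote(member_votes)
majority_pred = majority_vote(member_votes)

unanimity_scores = precision_recall_f1(y_true, unanimity_pred)
majority_scores = precision_recall_f1(y_true, majority_pred)

print("unanimity (P, R, F1):", unanimity_scores)
print("majority  (P, R, F1):", majority_scores)
```

In this hand-built case the members agree on the true positives but disagree on which negatives they mislabel, so requiring unanimity discards the false positives that a majority would still accept. Real ensembles will not separate this cleanly; unanimity can also lower recall whenever a single member misses a positive document.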

Collections

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
