Compact Representation of Documents Using Terms and Termsets

dc.contributor.author: Badawi, Dima
dc.contributor.author: Altinçay, Hakan
dc.date.accessioned: 2026-02-06T17:53:59Z
dc.date.issued: 2018
dc.department: Doğu Akdeniz Üniversitesi
dc.description: 14th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2018 -- 2018-07-15 through 2018-07-19 -- New York -- 216139
dc.description.abstract: This study addresses the computation of compact document vectors that use both terms and termsets for binary text categorization. In general, termsets are concatenated with all terms, leading to large document vectors. This study instead considers selecting a subset of terms and termsets that represents documents compactly yet effectively. Two methods are studied for this purpose. In the first method, combinations of terms and termsets in different proportions are evaluated. As an alternative approach, normalized ranking scores of terms and termsets are employed for subset selection. Experiments conducted on two widely used datasets have shown that termsets can effectively complement terms even when a small number of features is used to represent documents. © 2018, Springer International Publishing AG, part of Springer Nature.
dc.identifier.doi: 10.1007/978-3-319-96136-1_7
dc.identifier.endpage: 84
dc.identifier.isbn: 9789819698936
dc.identifier.isbn: 9789819698042
dc.identifier.isbn: 9789819698110
dc.identifier.isbn: 9789819698905
dc.identifier.isbn: 9783032004949
dc.identifier.isbn: 9789819512324
dc.identifier.isbn: 9783032026019
dc.identifier.isbn: 9783032008909
dc.identifier.isbn: 9783031915802
dc.identifier.isbn: 9789819698141
dc.identifier.issn: 0302-9743
dc.identifier.scopus: 2-s2.0-85050511013
dc.identifier.scopusquality: Q3
dc.identifier.startpage: 77
dc.identifier.uri: https://doi.org/10.1007/978-3-319-96136-1_7
dc.identifier.uri: https://search.trdizin.gov.tr/tr/yayin/detay/
dc.identifier.uri: https://hdl.handle.net/11129/7158
dc.identifier.volume: 10934 LNAI
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Springer Verlag
dc.relation.ispartof: Lecture Notes in Computer Science
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_Scopus_20260204
dc.subject: Compact representation
dc.subject: Different proportions
dc.subject: Document vectors
dc.subject: Subset selection
dc.subject: Text categorization
dc.subject: Binary sequences
dc.title: Compact Representation of Documents Using Terms and Termsets
dc.type: Conference Object
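
The abstract names two ways of choosing a compact feature subset from terms and termsets. The sketch below is only an illustration of how such selection could look, not the authors' implementation. The record does not say which ranking function or normalization the paper uses, so the precomputed scores, the min-max normalization, and the function names (`select_by_proportion`, `select_by_normalized_score`) are all assumptions made for this example.

```python
from typing import Dict, List


def top_n(scores: Dict[str, float], n: int) -> List[str]:
    """Return the n highest-scoring feature names, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:max(n, 0)]


def select_by_proportion(term_scores: Dict[str, float],
                         termset_scores: Dict[str, float],
                         k: int,
                         termset_fraction: float) -> List[str]:
    """First idea: fill a budget of k features with a fixed share of termsets.

    termset_fraction is a hypothetical knob in [0, 1]; the rest of the
    budget goes to individual terms.
    """
    n_termsets = min(round(k * termset_fraction), len(termset_scores))
    n_terms = min(k - n_termsets, len(term_scores))
    return top_n(term_scores, n_terms) + top_n(termset_scores, n_termsets)


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]. Min-max is an assumption, not from the paper."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {f: ((s - lo) / span if span else 1.0) for f, s in scores.items()}


def select_by_normalized_score(term_scores: Dict[str, float],
                               termset_scores: Dict[str, float],
                               k: int) -> List[str]:
    """Second idea: normalize each group separately, then rank them together.

    Feature names are assumed to be unique across the two groups.
    """
    merged = {**min_max(term_scores), **min_max(termset_scores)}
    return top_n(merged, k)
```

Normalizing each group on its own scale before merging keeps one group from dominating the ranking simply because its raw scores happen to be larger, which is one plausible motivation for the second method.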
