Image scene geometry recognition using low-level features fusion at multi-layer deep CNN

dc.contributor.authorKhan, Altaf
dc.contributor.authorChefranov, Alexander
dc.contributor.authorDemirel, Hasan
dc.date.accessioned2026-02-06T18:40:13Z
dc.date.issued2021
dc.departmentDoğu Akdeniz Üniversitesi
dc.description.abstractImage scene geometry recognition is an important step in reconstructing the 3D scene geometry of a single image. It is useful for computer vision applications such as 3D TV, video categorization, and robot navigation systems. A 3D scene geometry with a unique depth represents a rough structure of 2D images. Implementing 3D scene geometry recognition efficiently while achieving high recognition accuracy remains a significant challenge in the computer vision domain. Existing approaches use pre-trained deep convolutional neural network (CNN) models as feature extractors and also explore the benefits of multi-layer feature representations for small or medium-size datasets. However, these studies pay little attention to building a discriminative feature representation by fusing low-level features with multi-layer features from a single CNN model. To address this problem, we propose a novel model of image scene geometry recognition in which low-level handcrafted features are integrated with deep CNN multi-stage features (HF-MSF) using feature-fusion and score-level fusion strategies. The low-level features contain rich discriminative information about 3D scene geometry, including shape, color, and depth estimation. In feature fusion, the multi-layer features at different stages and the handcrafted features are fused at an early phase. In score-level fusion, the handcrafted features are integrated with the multi-layer features of a single CNN model at different stages, each stage is connected to a classifier, and the scores of these classifiers are then fused automatically to recognize the scene geometry type. For validation and comparison purposes, two well-known deep learning architectures, GoogLeNet and ResNet, are employed as the backbone of the proposed model.
Experimental results show that, by taking advantage of both types of fusion, the proposed HF-MSF model improves recognition accuracy by 12.21% and 4.96% over the G-MS2F model on the 12-Scene and 15-Scene image datasets, respectively. Similarly, it improves accuracy by 3.85% over the FTOTLM model on the 15-Scene dataset. (c) 2021 Elsevier B.V. All rights reserved.
dc.identifier.doi10.1016/j.neucom.2021.01.085
dc.identifier.endpage126
dc.identifier.issn0925-2312
dc.identifier.issn1872-8286
dc.identifier.orcid0000-0003-4116-520X
dc.identifier.scopus2-s2.0-85101800559
dc.identifier.scopusqualityQ1
dc.identifier.startpage111
dc.identifier.urihttps://doi.org/10.1016/j.neucom.2021.01.085
dc.identifier.urihttps://hdl.handle.net/11129/13202
dc.identifier.volume440
dc.identifier.wosWOS:000642408200010
dc.identifier.wosqualityQ1
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherElsevier
dc.relation.ispartofNeurocomputing
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.snmzKA_WoS_20260204
dc.subjectImage Scene Geometry recognition
dc.subjectMulti-layer CNN features
dc.subjectLow-level features
dc.subjectGoogLeNet
dc.subjectResNet
dc.titleImage scene geometry recognition using low-level features fusion at multi-layer deep CNN
dc.typeArticle
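
The abstract distinguishes two fusion strategies: feature fusion, where handcrafted and multi-stage CNN features are combined early, and score-level fusion, where per-stage classifier outputs are combined late. The sketch below illustrates the general idea only. The record does not state the fusion rules used in the paper, so concatenation of L2-normalized vectors and averaging of softmax probabilities are assumptions, and all function names and the toy inputs are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def feature_fusion(handcrafted, stage_features):
    """Early fusion (assumed rule): concatenate the normalized handcrafted
    vector with the normalized feature vector of every CNN stage."""
    fused = l2_normalize(handcrafted)
    for feats in stage_features:
        fused.extend(l2_normalize(feats))
    return fused


def softmax(scores):
    """Convert raw classifier scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score_level_fusion(stage_scores):
    """Late fusion (assumed rule): average the class probabilities of the
    per-stage classifiers and return (predicted class index, mean probabilities)."""
    probs = [softmax(s) for s in stage_scores]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    label = max(range(n_classes), key=mean.__getitem__)
    return label, mean


# Toy inputs: one handcrafted vector, two CNN stages, three stage classifiers.
fused = feature_fusion([3.0, 4.0], [[1.0, 0.0, 0.0], [0.0, 2.0]])
label, mean = score_level_fusion(
    [[2.0, 1.0, 0.0], [0.5, 2.5, 0.0], [0.0, 3.0, 1.0]]
)
print(len(fused), label)
```

In the early-fusion path a single classifier would see the whole concatenated vector, whereas in the late-fusion path each stage keeps its own classifier and only the class scores are merged.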
