Image scene geometry recognition using low-level features fusion at multi-layer deep CNN
| dc.contributor.author | Khan, Altaf | |
| dc.contributor.author | Chefranov, Alexander | |
| dc.contributor.author | Demirel, Hasan | |
| dc.date.accessioned | 2026-02-06T18:40:13Z | |
| dc.date.issued | 2021 | |
| dc.department | Doğu Akdeniz Üniversitesi | |
| dc.description.abstract | Image scene geometry recognition is an important element in reconstructing the 3D scene geometry of a single image. It is useful for computer vision applications such as 3D TV, video categorization, and robot navigation systems. A 3D scene geometry with a unique depth represents a rough structure of 2D images. Achieving an efficient implementation and high recognition accuracy for 3D scene geometry remains a significant challenge in the computer vision domain. Existing approaches attempt to use pre-trained deep convolutional neural network (CNN) models as feature extractors and also explore the benefits of multi-layer feature representations for small or medium-size datasets. However, these studies pay little attention to building a discriminative feature representation by fusing low-level features with multi-layer features from a single CNN model. To address this problem, we propose a novel model of image scene geometry recognition in which low-level handcrafted features are integrated with deep CNN multi-stage features (HF-MSF) using feature-fusion and score-level fusion strategies. The low-level features contain rich discriminative information about 3D scene geometry, including shape, color, and depth estimation. In feature-fusion, the multi-layer features at different stages and the handcrafted features are fused at an early phase. In score-level fusion, the handcrafted features are integrated with the multi-layer features of a single CNN model at different stages; each stage is connected to a classifier, and score-level fusion of these classifiers is then performed automatically to recognize the scene geometry type. For validation and comparison purposes, two well-known deep learning architectures, namely GoogLeNet and ResNet, are employed as the backbone of the proposed model.
Experimental results show that, by taking advantage of both types of fusion, the proposed HF-MSF model improves recognition accuracy by 12.21% and 4.96% over the G-MS2F model on the 12-Scene and 15-Scene image datasets, respectively. Similarly, it improves accuracy by 3.85% over the FTOTLM model on the 15-Scene dataset. (c) 2021 Elsevier B.V. All rights reserved. | |
| dc.identifier.doi | 10.1016/j.neucom.2021.01.085 | |
| dc.identifier.endpage | 126 | |
| dc.identifier.issn | 0925-2312 | |
| dc.identifier.issn | 1872-8286 | |
| dc.identifier.orcid | 0000-0003-4116-520X | |
| dc.identifier.scopus | 2-s2.0-85101800559 | |
| dc.identifier.scopusquality | Q1 | |
| dc.identifier.startpage | 111 | |
| dc.identifier.uri | https://doi.org/10.1016/j.neucom.2021.01.085 | |
| dc.identifier.uri | https://hdl.handle.net/11129/13202 | |
| dc.identifier.volume | 440 | |
| dc.identifier.wos | WOS:000642408200010 | |
| dc.identifier.wosquality | Q1 | |
| dc.indekslendigikaynak | Web of Science | |
| dc.indekslendigikaynak | Scopus | |
| dc.language.iso | en | |
| dc.publisher | Elsevier | |
| dc.relation.ispartof | Neurocomputing | |
| dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.snmz | KA_WoS_20260204 | |
| dc.subject | Image Scene Geometry recognition | |
| dc.subject | Multi-layer CNN features | |
| dc.subject | Low-level features | |
| dc.subject | GoogLeNet | |
| dc.subject | ResNet | |
| dc.title | Image scene geometry recognition using low-level features fusion at multi-layer deep CNN | |
| dc.type | Article |
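
The abstract distinguishes feature-fusion (handcrafted and multi-stage CNN features combined early) from score-level fusion (one classifier per stage, with the classifier scores combined afterwards). The following is a minimal sketch of that distinction only, not the authors' implementation. All function names are hypothetical, early fusion is assumed to be plain concatenation, and score-level fusion is assumed to be an unweighted average of per-stage softmax probabilities; the record does not state which combination rules the paper uses.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw classifier outputs into class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def feature_fusion(handcrafted: Sequence[float],
                   stage_features: Sequence[Sequence[float]]) -> List[float]:
    """Early fusion (assumed: concatenation) of handcrafted and stage features."""
    fused = list(handcrafted)
    for feats in stage_features:
        fused.extend(feats)
    return fused


def score_fusion(stage_probs: Sequence[Sequence[float]]) -> List[float]:
    """Late fusion (assumed: unweighted mean) of per-stage class probabilities."""
    n_stages = len(stage_probs)
    n_classes = len(stage_probs[0])
    return [sum(p[c] for p in stage_probs) / n_stages for c in range(n_classes)]


def predict(stage_logits: Sequence[Sequence[float]]) -> int:
    """Return the class index with the highest fused score."""
    fused = score_fusion([softmax(logits) for logits in stage_logits])
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Toy values only: two handcrafted features and two CNN stages.
    print(feature_fusion([0.1, 0.2], [[0.3], [0.4, 0.5]]))
    # Two stage classifiers scoring two hypothetical scene geometry classes.
    print(predict([[0.0, 2.0], [1.0, 0.0]]))
```

In this sketch, early fusion yields a single vector for one classifier, while score-level fusion keeps the stages separate until their probabilities are merged. A weighted average or a learned combiner would fit the same interface.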










