Patch Token Fusion in Vision Transformers for Brain Cancer Classification

Publisher

IEEE

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Accurate and robust image classification plays a critical role in advancing medical diagnostics, particularly in detecting complex conditions such as brain cancer. This study investigates the integration of multiple Vision Transformer (ViT) models for patch-token-based image classification, aiming to enhance diagnostic accuracy. By leveraging three pre-trained ViT architectures (TinyViT, SmallViT, and BaseViT), features from each model are dynamically extracted, aligned, and combined into a unified representation for classification. The proposed approach demonstrated significant improvements in accuracy, AUC, and F1-score when evaluated across various model combinations and configurations. The highest performance was observed with specific combinations, achieving an accuracy of 95.96%, AUC of 99.58%, and F1-score of 95.95% for the ViT-Tiny-based classifier.
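The extract, align, and combine steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The embedding widths (192, 384, 768), the patch count, the shared width `COMMON_DIM`, the class count, concatenation as the fusion rule, mean pooling, and the helper names `make_projections`, `fuse_patch_tokens`, and `classify` are all assumptions made for illustration. Random arrays stand in for the patch tokens of pre-trained backbones and for learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed embedding widths for the three backbones (illustrative only).
BACKBONE_DIMS = {"tiny": 192, "small": 384, "base": 768}
NUM_PATCHES = 196   # assumption: a 14x14 patch grid shared by all models
COMMON_DIM = 256    # hypothetical shared width after alignment
NUM_CLASSES = 4     # hypothetical number of tumor classes


def make_projections(dims, common_dim, rng):
    """One random linear map per backbone; stands in for learned weights."""
    return {
        name: rng.normal(0.0, d ** -0.5, size=(d, common_dim))
        for name, d in dims.items()
    }


def fuse_patch_tokens(tokens, projections):
    """Align each model's patch tokens to a common width, then concatenate.

    tokens: dict of name -> array of shape (num_patches, model_dim).
    Returns an array of shape (num_patches, n_models * common_dim).
    """
    aligned = [tokens[name] @ projections[name] for name in sorted(tokens)]
    return np.concatenate(aligned, axis=-1)


def classify(fused, head_w, head_b):
    """Mean-pool the fused tokens and apply a linear head with softmax."""
    pooled = fused.mean(axis=0)
    logits = pooled @ head_w + head_b
    logits = logits - logits.max()  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


# Placeholder patch tokens; a real pipeline would take these from the
# final transformer block of each pre-trained model.
tokens = {
    name: rng.normal(size=(NUM_PATCHES, d)) for name, d in BACKBONE_DIMS.items()
}
projections = make_projections(BACKBONE_DIMS, COMMON_DIM, rng)
fused = fuse_patch_tokens(tokens, projections)

head_w = rng.normal(0.0, 0.01, size=(fused.shape[1], NUM_CLASSES))
head_b = np.zeros(NUM_CLASSES)
probs = classify(fused, head_w, head_b)
```

Passing only a subset of the `tokens` dictionary to `fuse_patch_tokens` gives the fused representation for a two-model or single-model combination, which is one way the model combinations mentioned in the abstract could be compared. The classifier head must then be sized to match the narrower fused width.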

Description

33rd Conference on Signal Processing and Communications Applications (SIU), annual conference, June 25-28, 2025, Istanbul, Türkiye

Keywords

Brain cancer, Multi-model fusion, Patch token, Vision Transformer

Journal or Series

2025 33rd Signal Processing and Communications Applications Conference (SIU)