Comparison of Vision Transformer with Convolutional Neural Networks for Brain Cancer Classification
Abstract
Brain cancer is one of the deadliest illnesses. It causes abnormal cells to grow in the brain. Treatment planning and the prognosis of patients with brain tumors depend greatly on early diagnosis. Brain tumors vary in their characteristics, forms, and treatments. Consequently, detecting them manually is difficult, labor-intensive, and error-prone. Doctors use magnetic resonance imaging (MRI) to detect abnormal cells in the brain. With the growth of artificial intelligence, it has become possible to diagnose brain tumors from MRI images, for instance with convolutional neural networks or transformers. Transformers are models that implement the self-attention mechanism, which assigns a distinct weight to each component of the input data. Because transformers were originally designed for natural language processing, their use in image classification has been limited, and the majority of image classification research to date has employed convolutional neural networks. In this paper, six different pretrained convolutional neural networks and a vision transformer are used to classify four distinct brain tumor classes. The convolutional models are ResNet50, AlexNet, VGG16, InceptionV3, MobileNetV2, and FractalNet. The goal of this study is to compare the performance of these pretrained convolutional neural network models with that of the vision transformer, demonstrating that transformers can also be applied effectively to image classification tasks. The vision transformer achieves 84.39% accuracy on the classification problem, which is better than the other six architectures. © 2025 IEEE.
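
To make the self-attention idea mentioned in the abstract concrete, the following is a minimal sketch of scaled dot-product self-attention, in which every input component receives its own weight. It is an illustration only and not the implementation used in the paper: the function names `softmax` and `self_attention`, the toy `patches` values, and the use of identity projections (queries, keys, and values all equal to the input) are assumptions made here for brevity. A real vision transformer would use learned projections, multiple heads, and image patch embeddings.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Assumption: identity projections, so Q = K = V = tokens.
    d = len(tokens[0])
    weights = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights.append(softmax(scores))
    # Each output is a weighted average of all tokens.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights

# Hypothetical toy input: three "patch" embeddings of dimension 2.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(patches)
```

Each row of `weights` sums to 1 and describes how strongly one patch attends to every other patch. In this toy input the first patch weights the third patch more heavily than the second, because the first and third share a nonzero component.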










