Early identification and adequate treatment can help prevent lung disorders from becoming chronic, severe, and life-threatening. X-ray images are commonly used, and an automated, effective method based on deep learning techniques can contribute to quick and accurate diagnosis of lung disorders. However, in the study of medical imaging using deep learning, two obstacles limit applicability. One is the insufficient and imbalanced number of training samples in most medical datasets. The other is excessive training time. Although training time can be reduced by decreasing the number of pixels in the images, training with low-resolution images tends to result in poor performance. This study presents a solution that overcomes these impediments by balancing the number of images and reducing overall processing time while preserving accuracy. The dataset used in this research contains an unequal number of images in the different classes. The quantity of data in the classes is balanced by creating synthetic images based on the patterns and characteristics of the original images, using a Deep Convolutional Generative Adversarial Network (DCGAN). Unwanted regions are removed from the X-ray images, the brightness and contrast of the images are enhanced, and abnormalities are highlighted by applying different artifact removal, noise reduction, and enhancement techniques. We propose a Modified Compact Convolutional Transformer (MCCT) model that uses 32 × 32 images to categorize lung disorders into four classes. An ablation study of eleven cases is used to tune several hyperparameters and layer topologies, reducing training time while preserving accuracy. Six transfer learning models, VGG19, VGG16, ResNet152, ResNet50, ResNet50V2, and MobileNet, are applied with the same image size, and their performance is compared with that of the proposed MCCT model.
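The class-balancing step above can be illustrated with a minimal DCGAN generator that synthesizes 32 × 32 grayscale X-ray-like images. This is a hedged sketch in PyTorch: the latent dimension, layer widths, and kernel settings are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a DCGAN generator producing 32 x 32 grayscale images.
# All layer sizes and LATENT_DIM are illustrative assumptions.
import torch
import torch.nn as nn

LATENT_DIM = 100  # assumed size of the random noise vector


class Generator(nn.Module):
    def __init__(self, latent_dim=LATENT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            # latent vector (1 x 1) -> 4 x 4 feature map
            nn.ConvTranspose2d(latent_dim, 256, kernel_size=4, stride=1,
                               padding=0, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            # 4 x 4 -> 8 x 8
            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            # 8 x 8 -> 16 x 16
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            # 16 x 16 -> 32 x 32, single channel, pixel values in [-1, 1]
            nn.ConvTranspose2d(64, 1, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        # z: (batch, latent_dim) -> reshape to (batch, latent_dim, 1, 1)
        return self.net(z.view(z.size(0), -1, 1, 1))


if __name__ == "__main__":
    g = Generator()
    fake = g(torch.randn(8, LATENT_DIM))
    print(tuple(fake.shape))  # (8, 1, 32, 32)
```

In a full DCGAN, this generator would be trained adversarially against a mirrored convolutional discriminator; the synthetic images from the trained generator are then added to the minority classes until the class counts are equal.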
Our MCCT model achieves the highest test accuracy, 95.37%, with a short training time of 10-12 s/epoch, whereas the other models reach only moderate performance, with accuracies ranging from 43% to 79% and training times of 80-90 s/epoch. The robustness of the model with respect to the number of training samples is validated by training it multiple times while gradually reducing the number of training images from 49,621 to 6,204. Results suggest that even with a smaller dataset, performance is sustained. Our proposed approach may contribute to an effective CAD-based diagnostic system by addressing the issues of insufficient and imbalanced medical image datasets, excessive training time, and low-resolution images.