Recently, transforming windows files into images and its analysis using machine learning and deep learning have been considered as a state-of-the art works for malware detection and classification. This is mainly due to the fact that image-based malware detection and classification is platform independent, and the recent surge of success of deep learning model performance in image classification. Literature survey shows that convolutional neural network (CNN) deep learning methods are successfully employed for image-based windows malware classification. However, the malwares were embedded in a tiny portion in the overall image representation. Identifying and locating these affected tiny portions is important to achieve a good malware classification accuracy. In this work, a multi-headed attention based approach is integrated to a CNN to locate and identify the tiny infected regions in the overall image. A detailed investigation and analysis of the proposed method was done on a malware image dataset. The performance of the proposed multi-headed attention-based CNN approach was compared with various non-attention-CNN-based approaches on various data splits of training and testing malware image benchmark dataset. In all the data-splits, the attention-based CNN method outperformed non-attention-based CNN methods while ensuring computational efficiency. Most importantly, most of the methods show consistent performance on all the data splits of training and testing and that illuminates multi-headed attention with CNN model's generalizability to perform on the diverse datasets. With less number of trainable parameters, the proposed method has achieved an accuracy of 99% to classify the 25 malware families and performed better than the existing non-attention based methods. The proposed method can be applied on any operating system and it has the capability to detect packed malware, metamorphic malware, obfuscated malware, malware family variants, and polymorphic malware. In addition, the proposed method is malware file agnostic and avoids usual methods such as disassembly, de-compiling, de-obfuscation, or execution of the malware binary in a virtual environment in detecting malware and classifying malware into their malware family.