Newswise - Vehicle color recognition (VCR) plays a role in intelligent traffic management and in assisting criminal investigations. However, existing vehicle color datasets are limited to only 13 classes, which falls short of current demand. Moreover, VCR efforts face the challenge of class imbalance within the datasets.

To address these issues, a research team led by Mingdi HU has recently published their groundbreaking research in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

The team introduces a novel VCR method called Smooth Modulation Neural Network with Multi-Scale Feature Fusion (SMNN-MSFF). They also present a new VCR dataset, named Vehicle Color-24, which consists of 24 vehicle color classes. The SMNN-MSFF model combines multi-scale feature fusion and smooth modulation. The former extracts feature information from local to global levels, while the latter increases the importance of images belonging to underrepresented classes during learning, which addresses class imbalance. Extensive ablation studies show that each module of the proposed method is effective, with smooth modulation particularly aiding feature learning for minority, or tail, classes. A comprehensive experimental evaluation on Vehicle Color-24 and three earlier representative datasets demonstrates that the SMNN-MSFF model outperforms state-of-the-art VCR methods.
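
To make the idea of weighting underrepresented classes concrete, here is a minimal sketch of a generic re-weighting scheme. It is not the team's smooth modulation, whose formula is not given in this release; it swaps in plain inverse-frequency class weights applied to cross-entropy, and the image counts and function names are hypothetical.

```python
import math


def inverse_frequency_weights(class_counts):
    """Return one weight per class, proportional to 1 / count, with mean weight 1."""
    inverse = [1.0 / count for count in class_counts]
    scale = len(inverse) / sum(inverse)
    return [w * scale for w in inverse]


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


# Hypothetical image counts for a head, a middle, and a tail color class.
counts = [900, 90, 10]
weights = inverse_frequency_weights(counts)

# Two samples predicted with the same confidence (0.7) in their true class.
head_loss = weighted_cross_entropy([0.7, 0.2, 0.1], 0, weights)
tail_loss = weighted_cross_entropy([0.1, 0.2, 0.7], 2, weights)

print(weights)
print(head_loss, tail_loss)
```

With equal confidence in the true class, the tail-class sample contributes a larger loss than the head-class sample, so rare colors pull harder on the gradient during training.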

The team's new dataset, Vehicle Color-24, covers 24 distinct vehicle colors: red, dark-red, pink, orange, dark-orange, red-orange, yellow, lemon-yellow, earthy-yellow, green, dark-green, grass-green, cyan, blue, dark-blue, purple, black, white, silver-gray, gray, dark-gray, champagne, brown, and dark-brown. Vehicle Color-24 aims to address the practical needs of vehicle traffic management and criminal vehicle tracking applications.

The proposed SMNN-MSFF method has three main features. First, the algorithm accounts for the imbalance in color distribution inherent in any dataset: guided by ablation experiments, the team tuned the network's loss function to better capture the characteristics of small classes, and the result outperforms focal loss. Second, the network incorporates a feature pyramid network (FPN) module to extract edge and corner information, which helps capture vehicle shape features and local location information for recognition. Third, the backbone network has only 42 layers, making it lightweight, which reduces storage requirements and makes practical deployment more feasible.
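
For readers unfamiliar with the focal loss baseline mentioned above, the following sketch shows its standard form, which scales cross-entropy by (1 - p)^gamma so that well-classified samples count less. This is the baseline only, not the team's tuned loss, which this release does not specify; the sample probabilities are hypothetical and gamma = 2 is a commonly used setting.

```python
import math


def cross_entropy(p_true):
    """Standard cross-entropy for one sample, given the probability of its true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Focal loss: cross-entropy down-weighted by (1 - p_true) ** gamma."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


# Hypothetical predictions: an easy sample and a hard sample.
easy, hard = 0.95, 0.30

for name, p in (("easy", easy), ("hard", hard)):
    print(name, cross_entropy(p), focal_loss(p))
```

The easy sample keeps only a small fraction of its cross-entropy, while the hard sample keeps about half, so training focuses on the examples the model still gets wrong. Setting gamma to 0 recovers plain cross-entropy.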

The experimental results demonstrate that the proposed method achieves an mAP (mean Average Precision) of 94.96% in recognizing the 24 colors. The SMNN-MSFF model outperforms state-of-the-art VCR methods, fulfilling the requirements for accurate fine-grained classification of vehicle colors.
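
As a rough illustration of how an mAP figure can be computed for a classifier, the sketch below ranks samples by their score for each class, averages the precision at each correct hit, and then averages over classes. The exact evaluation protocol behind the 94.96% figure is not described in this release, so this is one common definition rather than the team's procedure; the scores and labels are made up.

```python
def average_precision(scores, is_positive):
    """AP for one class: mean of the precision values at the rank of each positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if is_positive[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_matrix, labels, num_classes):
    """score_matrix[i][c] is the score of sample i for class c."""
    aps = []
    for c in range(num_classes):
        scores = [row[c] for row in score_matrix]
        positives = [label == c for label in labels]
        aps.append(average_precision(scores, positives))
    return sum(aps) / num_classes


# Toy example: four samples, two classes, made-up scores.
toy_scores = [[0.9, 0.1], [0.4, 0.6], [0.8, 0.2], [0.3, 0.7]]
toy_labels = [0, 0, 1, 1]
toy_map = mean_average_precision(toy_scores, toy_labels, num_classes=2)
print(toy_map)
```

Because every class contributes equally to the mean, a rare color that is recognized poorly lowers mAP as much as a common one, which is why the metric suits imbalanced datasets.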

However, given the influence of unpredictable factors in real-world environments and the long-tail effect in vehicle color distribution, further work is needed to improve fine-grained vehicle color recognition. Because vehicle colors are diverse and any realistic vehicle color dataset will follow a long-tail distribution, future research will focus on addressing class imbalance.

 

###

Journal Link: Frontiers of Computer Science