Advanced Rice Grain Classification Using Hybrid Vision Transformer Models
DOI: https://doi.org/10.52700/scir.v7i1.179

Keywords: Rice Variety Classification, Vision Transformers (ViTs), Grain Image Recognition, Deep Learning, Data Augmentation

Abstract
Accurate classification of rice varieties is crucial for improving rice quality, setting premium pricing, and enhancing processing efficiency in agriculture. Traditional methods, including manual inspection and classical machine learning, often suffer from low accuracy, poor scalability, and difficulty distinguishing between similar grain types. To address these issues, we propose a hybrid classification approach that leverages the strengths of Vision Transformers (ViTs), attention mechanisms, and advanced data augmentation techniques, significantly improving the accuracy and robustness of rice grain classification. Using a dataset of 2,100 images of the Seela and Super rice types, along with 1,508 images of a third rice variety, the proposed system achieved an accuracy of 99.17% and outperformed traditional Convolutional Neural Network (CNN) models in precision, recall, and F1-score. The results are visualized to highlight the effectiveness of the model and underscore the potential of ViT-based techniques in agricultural quality control. This approach not only addresses long-standing issues in rice variety recognition but also paves the way for scalable, automated grain quality assessment systems, marking a significant step forward in modern agricultural practices and food supply chain management.
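As a concrete illustration of this kind of pipeline, the sketch below fine-tunes a pretrained ViT on a folder of rice grain images with standard augmentations. It is a minimal sketch, not the paper's implementation: the `vit_base_patch16_224` backbone, the augmentation policy, the `data/train` directory layout, and all hyperparameters are assumptions chosen for illustration, and the hybrid attention components described in the abstract are not reproduced here.

```python
# Minimal sketch: fine-tuning a pretrained Vision Transformer for
# rice variety classification. Backbone, augmentations, paths, and
# hyperparameters are illustrative assumptions, not the paper's setup.
import torch
import timm
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Data augmentation: flips, small rotations, and color jitter are
# common choices for grain images; the paper's exact policy may differ.
train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),
])

# Assumed layout: data/train/<variety_name>/<image>.jpg
train_ds = datasets.ImageFolder("data/train", transform=train_tf)
train_dl = DataLoader(train_ds, batch_size=32, shuffle=True)

# Pretrained ViT with a classification head sized to the number of
# rice varieties found in the dataset folders.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = timm.create_model(
    "vit_base_patch16_224", pretrained=True,
    num_classes=len(train_ds.classes),
).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in train_dl:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Evaluation against a CNN baseline would then report accuracy, precision, recall, and F1-score on a held-out split, for example via scikit-learn's `classification_report`.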