EDCST-Rain: Enhanced Density-Aware Cross-Scale Transformer for Robust Object Classification Under Diverse Rainfall Conditions
DOI:
https://doi.org/10.30871/jaic.v10i1.11590
Keywords:
Rain Degradation, Robust Classification, Vision Transformer, Weather-Aware Computer Vision, Autonomous Systems, Atmospheric Occlusion, Density-Aware Networks
Abstract
Rain degradation significantly impairs object classification systems, causing accuracy drops of 40-60% under severe conditions and limiting autonomous vehicle deployment. While preprocessing approaches attempt deraining before classification, they suffer from error propagation and computational overhead. This paper introduces EDCST-Rain, an Enhanced Density-Aware Cross-Scale Transformer designed specifically for robust classification under diverse rain conditions. The architecture comprises five integrated components: a Rain Density Encoding Module that captures rain streak density, accumulation, and orientation; a Swin-Tiny backbone for hierarchical feature extraction; and three rain-specific mechanisms, namely directional attention modules that adapt to rain streak orientation, accumulation-aware processing that handles lens droplet distortions, and adaptive cross-scale fusion that integrates multi-resolution information. We develop a comprehensive physics-based rain simulation framework covering four rain types (drizzle, moderate, heavy, storm) and implement a curriculum learning strategy that progressively introduces rain complexity during training. Extensive experiments on CIFAR-10 demonstrate that EDCST-Rain achieves 83.1% clean accuracy while maintaining 71.8% under severe rain (86.4% retention), a 10-percentage-point improvement over state-of-the-art methods. With 15.8 million parameters and a 14.3 ms GPU inference time that enables real-time operation, EDCST-Rain provides a practical, weather-robust perception framework for autonomous systems operating in adverse weather.
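As an illustration of the simulation and curriculum ideas summarized in the abstract, the following minimal Python/NumPy sketch shows how the four named rain types might be synthesized as streak overlays and scheduled during training. All constants, names (RAIN_TYPES, add_rain, sample_rain_type), and the linear schedule are assumptions made for exposition; the paper's actual physics-based simulator and curriculum parameters are not specified in this abstract.

    import numpy as np

    # Hypothetical per-type constants (streaks per pixel, streak length in
    # pixels, blend strength); the paper's actual values are not given here.
    RAIN_TYPES = {
        "drizzle":  {"density": 0.0005, "length": 4,  "alpha": 0.15},
        "moderate": {"density": 0.002,  "length": 8,  "alpha": 0.25},
        "heavy":    {"density": 0.006,  "length": 14, "alpha": 0.35},
        "storm":    {"density": 0.012,  "length": 20, "alpha": 0.45},
    }

    def add_rain(image, rain_type="moderate", angle_deg=-10.0, rng=None):
        """Overlay synthetic rain streaks on an HxWx3 float image in [0, 1]."""
        if rng is None:
            rng = np.random.default_rng()
        cfg = RAIN_TYPES[rain_type]
        h, w = image.shape[:2]
        layer = np.zeros((h, w), dtype=np.float32)
        theta = np.deg2rad(angle_deg)        # streak orientation from vertical
        dy, dx = np.cos(theta), np.sin(theta)
        for _ in range(int(cfg["density"] * h * w)):
            y0, x0 = rng.uniform(0, h), rng.uniform(0, w)
            for t in range(cfg["length"]):   # rasterise one streak
                y, x = int(y0 + t * dy), int(x0 + t * dx)
                if 0 <= y < h and 0 <= x < w:
                    layer[y, x] = 1.0
        # Additive blend: pixels under streaks brighten, as rain scatters light.
        return np.clip(image + cfg["alpha"] * layer[..., None], 0.0, 1.0)

    def sample_rain_type(epoch, total_epochs, rng):
        """Curriculum: shift sampling toward heavier rain as training advances."""
        p = epoch / total_epochs
        weights = np.array([max(1.0 - p, 0.05), 1.0, p, p * p]) + 1e-6
        return rng.choice(["drizzle", "moderate", "heavy", "storm"],
                          p=weights / weights.sum())

Under this assumed schedule, early epochs almost never draw "storm", while heavy and storm dominate near the end of training; each sampled type would be applied with add_rain to a batch before the classifier sees it.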
License
Copyright (c) 2026 Fiston OSHASHA OSHASHA, Djungu Ahuka Saint Jean, Mwamba Kande Franklin, Simboni Simboni Tege, Biaba Kuya Jirince, Muka Kabeya Arsene, Tietia Ndengo Tresor, Dumbi Kabangu Dieu merci

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).