Improving Efficient Ship Detection Performance Using Contextual Transformers for Maritime Surveillance
DOI: https://doi.org/10.30871/jaic.v9i6.11189

Keywords: Ship Detection, Deep Learning, Modified YOLO11, Contextual Transformer, Efficient Model

Abstract
Ship surveillance plays a crucial role in strengthening defense systems in coastal areas. An automatic vessel detection system is needed to accurately identify vessels and their categories, typically built on a reliable computer vision system. The nano variant of YOLO11 is an object detection method that officially provides lightweight computation, but it remains limited in extracting complex features. The Contextual Transformer (CoT) exploits long-range relationships efficiently, thereby enhancing feature discrimination. This study proposes a vessel detection system that modifies the YOLO11 architecture with a Contextual Transformer block. The work introduces YOLO11-Pico, a variant lighter than nano, with channel sizes reduced at certain stages for further efficiency. The proposed CoT block applies fewer multiplication mapping operations while still representing global features, yielding richer contextual information. The SeaShips dataset is used for model training and evaluation. Experimental results demonstrate that the proposed YOLO11-Pico-CoT outperforms prominent lightweight YOLO architectures, including the YOLO11n baseline, YOLOv5n, YOLOv10n, and the recent YOLOv12n. Integrating CoT improves the accuracy of ship category and location predictions, achieving 0.964 mAP50 and 0.714 mAP50:95. Efficiency evaluations further show that the proposed model is computationally lighter and has fewer parameters (1,711,250) while operating at 3.97 FPS, giving it an advantage over the comparison methods.
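The CoT block cited above (reference [14]) replaces isolated query-key dot products with context-aware key embeddings: a 3x3 convolution mines static neighborhood context from the keys, and the concatenation of that context with the query drives a dynamic, attention-like modulation of the values. A minimal PyTorch sketch of this idea follows; it is a simplified illustration, not the paper's exact module (sigmoid gating stands in for the original local-attention softmax, and the class name and channel sizes are illustrative):

```python
import torch
import torch.nn as nn


class CoTBlock(nn.Module):
    """Simplified Contextual Transformer block (after Li et al., 2023).

    Static context: 3x3 grouped conv over the input (keys).
    Dynamic context: attention weights produced from [static key, query]
    via two 1x1 convs, applied to a 1x1-embedded value map.
    """

    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.key_embed = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size, padding=kernel_size // 2,
                      groups=4, bias=False),
            nn.BatchNorm2d(dim),
            nn.ReLU(inplace=True),
        )
        self.value_embed = nn.Sequential(
            nn.Conv2d(dim, dim, 1, bias=False),
            nn.BatchNorm2d(dim),
        )
        reduction = 2  # bottleneck factor in the attention embedding
        self.attn = nn.Sequential(
            nn.Conv2d(2 * dim, 2 * dim // reduction, 1, bias=False),
            nn.BatchNorm2d(2 * dim // reduction),
            nn.ReLU(inplace=True),
            nn.Conv2d(2 * dim // reduction, dim, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        k_static = self.key_embed(x)          # static neighborhood context
        v = self.value_embed(x)               # value map
        fused = torch.cat([k_static, x], 1)   # key context + query
        w = self.attn(fused)                  # per-channel dynamic weights
        k_dynamic = torch.sigmoid(w) * v      # attention-modulated values
        return k_static + k_dynamic           # combine both contexts


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    out = CoTBlock(64)(x)
    print(out.shape)  # same shape as the input: (1, 64, 32, 32)
```

Because the block preserves the input's channel count and spatial size, it can be dropped into a YOLO backbone or neck in place of a standard convolutional block, which is the spirit of the modification proposed in the abstract.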
Downloads
References
[1] L. Qian et al., “A new method of inland water ship trajectory prediction based on long short-term memory network optimized by genetic algorithm,” Appl. Sci., vol. 12, no. 8, art. 4073, 2022.
[2] M. Zhu et al., “YOLO-HPSD: A high-precision ship target detection model based on YOLOv10,” PLOS ONE, vol. 20, no. 1, e0321863, Jan. 2025.
[3] L. Shen et al., “YOLO-LPSS: A Lightweight and Precise Detection Model for Small Sea Ships,” J. Mar. Sci. Eng., vol. 13, no. 2, pp. 115–127, Feb. 2025.
[4] Y. Li and S. Wang, “EGM-YOLOv8: A Lightweight Ship Detection Model with Efficient Global–Local Feature Fusion and Attention,” Proc. Int. Conf. Image Process., Oct. 2025, pp. 543–550.
[5] L. Min, F. Dou, Y. Zhang, D. Shao, L. Li and B. Wang, "CM-YOLO: Context Modulated Representation Learning for Ship Detection," in IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-14, 2025, Art no. 4202414, doi: 10.1109/TGRS.2025.3538848.
[6] L. Shen et al., “DSONet: A Dual-Scale Occlusion-Aware Framework for Multiscale Ship Detection,” IEEE Trans. Multimedia, vol. 27, no. 3, pp. 1023–1033, Mar. 2025.
[7] Y. Kumar and P. Lee, “Adaptive Head Pruning for Multi-Head Attention in Maritime Object Detection,” IEEE Trans. Neural Netw. Learn. Syst., vol. 35, no. 5, pp. 2410–2422, May 2024.
[8] X. Chen et al., “Spotlighting on Objects: Prior-Knowledge-Driven Maritime Image Dehazing and Object Detection Framework,” J. Mar. Sci. Eng., vol. 11, no. 4, pp. 339–351, Apr. 2024.
[9] A. M. Rekavandi, L. Xu, F. Boussaid, A.-K. Seghouane, S. Hoefs, and M. Bennamoun, “A Guide to Image- and Video-Based Small Object Detection Using Deep Learning: Case Study of Maritime Surveillance,” IEEE Trans. Intell. Transp. Syst., vol. 26, no. 3, pp. 2851–2869, Mar. 2025, doi: 10.1109/TITS.2025.3530678.
[10] S. Tan et al., “DAShip: A Large-Scale Annotated Dataset for Ship Detection Using Distributed Acoustic Sensing,” Remote Sens., vol. 16, no. 2, p. 210, Feb. 2024.
[11] Y. Yuan et al., “AFF-LightNet: A Lightweight Ship Detection Architecture Based on Attentional Feature Fusion,” J. Mar. Sci. Eng., vol. 13, art. 44, Jan. 2025.
[12] Y. Wang et al., “YOLO-StarLS: A Ship Detection Algorithm Based on Wavelet Transform and Multi-Scale Feature Extraction for Complex Environments,” Symmetry, vol. 17, no. 7, art. 1116, Jul. 2025.
[13] K. Lan et al., “High-Efficiency and High-Precision Ship Detection Algorithm Based on Improved YOLOv8n,” Mathematics, vol. 12, no. 7, art. 1072, Apr. 2024.
[14] Y. Li, T. Yao, Y. Pan, and T. Mei, “Contextual Transformer Networks for Visual Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 2, pp. 1489–1500, Feb. 2023.
[15] J. Hu, L. Shen and G. Sun, "Squeeze-and-Excitation Networks," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 7132-7141.
[16] W. Wu et al., “CGDU-DETR: An End-to-End Detection Model for Ship Detection in Day–Night Transition Environments,” J. Mar. Sci. Eng., vol. 13, no. 6, art. 1155, Jun. 2025.
[17] S. Sun et al., “Research on the Automatic Detection of Ship Targets Based on an Improved YOLO v5 Algorithm,” Mathematics, vol. 12, no. 11, art. 1714, Nov. 2024.
[18] Q. Qiu et al., “YOLOv7oSAR: A Lightweight High-Precision Ship Detection Model for SAR Images Based on the YOLOv7 Algorithm,” Remote Sens., vol. 17, no. 4, art. 835, Apr. 2025.
[19] R. D. Kwon, J. Kim, and T. Yoon, “Enhancing Ship Classification in Optical Satellite Imagery: Integrating CBAM with ResNet for Improved Performance,” arXiv preprint arXiv:2404.02135, Apr. 2024.
[20] V. S. Sanikommu et al., “Edge Computing for Maritime Ship-Port Detection Using YOLO,” Front. Artif. Intell., vol. 8, art. 1508664, Feb. 2025.
[21] Y. Zhou et al., “YOLOv7-Ship: A Lightweight Algorithm for Ship Object Detection in Complex Marine Environments,” J. Mar. Sci. Eng., vol. 13, no. 3, art. 389, Mar. 2025.
[22] N. Toprak and Y. Yalman, “Ship Detection from Optical Satellite Images Using CNN,” Turk. J. Eng., vol. 9, no. 2, pp. 88–97, Jun. 2025.
[23] B. E. Ayesha, T. Khan, and R. U. Khan, “Ship Detection in Remote Sensing Imagery for Arbitrarily Oriented Object Detection,” arXiv preprint arXiv:2503.14534, Mar. 2025.
[24] P. Lu et al., “LH-YOLO: A Lightweight and High-Precision Ship Detection Model Based on Improved YOLOv8n,” Remote Sens., vol. 16, no. 22, art. 4340, Nov. 2024.
[25] Y. Wu, Z. Chen, and W. Zhang, “Ship-YOLO: A Deep Learning Approach for Ship Detection in Remote Sensing Images,” arXiv preprint arXiv:2503.14534, Mar. 2025.
[26] S. Kang, Z. Hu, L. Liu, K. Zhang, and Z. Cao, “Object detection YOLO algorithms and their industrial applications: Overview and comparative analysis,” Electronics, vol. 14, no. 6, art. 1104, 2025.
[27] P. Zhao, “SPFFNet: Strip perception and feature fusion spatial pyramid pooling for fabric defect detection,” arXiv preprint arXiv:2502.01445, 2025.
[28] J. Jing and C. Li, “Identification of lightweight rail defects based on YOLOv11 improvement,” in Fifth International Conference on Optical Imaging and Image Processing (ICOIP 2025), 2025, vol. 13688, pp. 512–517.
[29] X. Chen, N. Jiang, Z. Yu, W. Qian, and T. Huang, “Citrus leaf disease detection based on improved YOLO11 with C3K2,” in International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2024), 2025, vol. 13560, pp. 746–751.
[30] J. Song, J. Xie, Q. Wang, and T. Shen, “An improved YOLO-based method with lightweight C3 modules for object detection in resource-constrained environments,” The Journal of Supercomputing, vol. 81, no. 5, art. 702, 2025.
[31] A. Wang, H. Chen, L. Liu, K. Chen, Z. Lin, and J. Han, “YOLOv10: Real-time end-to-end object detection,” in Advances in Neural Information Processing Systems, vol. 37, pp. 107984–108011, 2024.
[32] Y. Tian, Q. Ye, and D. Doermann, “YOLOv12: Attention-centric real-time object detectors,” arXiv preprint arXiv:2502.12524, 2025.
License
Copyright (c) 2025 Marsel Marhaen Wungow, Dayen Manoppo, Ni Made Shavitri Mustikayani , Muhammad Dwisnanto Putro

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).