Unmanned aerial vehicles (UAVs) are increasingly used in urban management, environmental monitoring, and emergency response, yet accurately detecting small objects in UAV imagery remains difficult because of scale variation, complex backgrounds, and dense target distributions. To address these issues, we propose Aero-LiteNet, an efficient framework for UAV-based small object detection. The network makes four contributions. First, a multi-scale spatial pyramid fusion (MSPF) module enriches feature representations through parallel multi-resolution branches and coordinate attention, retaining the fine-grained details of small targets. Second, a cross-dimensional aerial small-object attention module (CASAM) models interactions across the channel and spatial dimensions, making weak features more discriminable under occlusion and background clutter. Third, an adaptive spatial feature fusion (ASFF) strategy aligns multi-level features via scale normalization and learnable weight allocation, mitigating inconsistency and redundancy across feature layers. Finally, a neighborhood-aware IoU (NAIoU) loss introduces local constraints that penalize excessive overlap in dense scenes, improving bounding box regression accuracy. Experiments on the VisDrone2019 benchmark show that Aero-LiteNet achieves an mAP@50 of 48.1%, surpassing YOLOv8s and several state-of-the-art efficient detectors while maintaining low complexity. Additional validation on the RSOD and TT100K datasets further supports the generalization capability of the proposed model.
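To make concrete why bounding box regression is especially sensitive for small targets, the following minimal Python sketch computes the standard IoU and the plain `1 - IoU` loss for axis-aligned boxes. It is not the NAIoU loss: the neighborhood term is not specified in this abstract, so it is left out. The function names `iou` and `iou_loss` and the example boxes are illustrative assumptions.

```python
from typing import Tuple

# Box format assumed here: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard intersection over union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Plain IoU loss, 1 - IoU (no neighborhood-aware term)."""
    return 1.0 - iou(pred, target)


# The same 2-pixel horizontal shift applied to a 10x10 and a 40x40 box.
small = iou_loss((2, 0, 12, 10), (0, 0, 10, 10))
large = iou_loss((2, 0, 42, 40), (0, 0, 40, 40))
print(f"small-box loss: {small:.3f}")  # 0.333
print(f"large-box loss: {large:.3f}")  # 0.095
```

Under these assumptions, an identical 2-pixel localization error costs the 10x10 box roughly 3.5 times as much IoU loss as the 40x40 box, which illustrates why small-object detectors tend to benefit from loss terms designed for dense, tiny targets.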
To validate deployment feasibility, we ran embedded experiments on a Rockchip RK3588 system-on-chip (SoC). The INT8-quantized Aero-LiteNet reaches an inference throughput of 53 FPS, well above the 25 FPS real-time requirement for UAV missions, while maintaining a competitive mAP@50 of 46.8%. These results indicate that Aero-LiteNet is an effective and efficient solution for UAV-based small object detection, balancing accuracy and real-time performance on resource-constrained platforms.
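The reported throughput can be restated as a per-frame time budget. The short Python sketch below does only that arithmetic from the figures quoted above. It assumes that the 48.1% and 46.8% mAP@50 values were measured on the same benchmark, which the abstract does not state explicitly. The helper name `frame_time_ms` is illustrative.

```python
def frame_time_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


measured_fps = 53.0   # INT8-quantized model on the RK3588, as reported
required_fps = 25.0   # real-time requirement stated in the abstract

latency = frame_time_ms(measured_fps)    # about 18.9 ms per frame
budget = frame_time_ms(required_fps)     # 40 ms per frame
headroom = measured_fps / required_fps   # 2.12x the required frame rate

# Accuracy change after quantization, in mAP@50 percentage points
# (assumes both figures come from the same benchmark).
map_drop = round(48.1 - 46.8, 1)         # 1.3

print(f"{latency:.1f} ms per frame against a {budget:.0f} ms budget")
print(f"{headroom:.2f}x headroom, {map_drop} points lower mAP@50")
```

Read this way, the quantized model uses a little under half of the 40 ms frame budget, at a cost of 1.3 percentage points of mAP@50 under the stated assumption.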