HF-NeRF: UAV remote sensing image reconstruction via height-augmented representation and frequency-adaptive sampling - Scientific Reports

Neural radiance field (NeRF) methods have demonstrated significant potential for 3D scene reconstruction. However, they still face practical challenges when processing remote sensing images captured by unmanned aerial vehicles (UAVs), such as the differing geometric properties of the height and horizontal dimensions and inefficient ray sampling. To address these challenges, this paper proposes HF-NeRF, a UAV remote sensing scene reconstruction framework that incorporates a Height-Enhanced Representation (HER) module. This module applies an independent encoding and MLP mapping to the height dimension z, thereby capturing geometric features across different height layers. In parallel, a Frequency-Adaptive Sampling (FAS) approach analyzes the energy distribution in the frequency domain and adaptively modulates sampling density to improve computational efficiency while preserving fine-grained details. Experimental results on both a proprietary UAV remote sensing dataset and publicly available datasets show that the proposed approach outperforms leading NeRF variants in overall performance on remote sensing imagery.
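The abstract does not give the formulas behind HER or FAS, so the following is only an illustrative sketch of the two ideas, not the authors' implementation. It assumes a standard NeRF-style sinusoidal encoding applied to z separately from (x, y), and a simple rule that maps the high-frequency share of an image patch's spectral energy to a per-ray sample count. All function names, the cutoff of 0.25 cycles per pixel, the frequency counts, and the 32 to 128 sample range are hypothetical choices made for illustration; the separate height MLP is omitted.

```python
import numpy as np


def positional_encoding(x, num_freqs):
    """NeRF-style sinusoidal encoding of scalars: (...,) -> (..., 2 * num_freqs)."""
    x = np.asarray(x, dtype=np.float64)[..., None]
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    angles = x * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)


def encode_point(xyz, xy_freqs=10, z_freqs=4):
    """Encode height z independently of the horizontal coordinates (assumed design).

    Returns (horizontal_features, height_features) so that each could be fed
    to its own MLP branch.
    """
    xyz = np.asarray(xyz, dtype=np.float64)
    xy_feat = np.concatenate(
        [positional_encoding(xyz[..., 0], xy_freqs),
         positional_encoding(xyz[..., 1], xy_freqs)],
        axis=-1,
    )
    z_feat = positional_encoding(xyz[..., 2], z_freqs)
    return xy_feat, z_feat


def high_freq_energy_ratio(patch, cutoff=0.25):
    """Fraction of a 2-D patch's spectral energy above `cutoff` cycles per pixel."""
    patch = np.asarray(patch, dtype=np.float64)
    patch = patch - patch.mean()  # remove the DC component
    power = np.abs(np.fft.fft2(patch)) ** 2
    fy = np.fft.fftfreq(patch.shape[0])[:, None]
    fx = np.fft.fftfreq(patch.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    total = power.sum()
    if total == 0.0:
        return 0.0  # perfectly flat patch: no detail to preserve
    return float(power[radius > cutoff].sum() / total)


def samples_per_ray(ratio, n_min=32, n_max=128):
    """Linearly map an energy ratio in [0, 1] to a per-ray sample count."""
    ratio = min(max(ratio, 0.0), 1.0)
    return int(round(n_min + ratio * (n_max - n_min)))


if __name__ == "__main__":
    flat = np.ones((8, 8))
    checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
    for name, patch in [("flat", flat), ("checkerboard", checker)]:
        r = high_freq_energy_ratio(patch)
        print(name, samples_per_ray(r))
```

Under these assumptions, a uniform patch receives the minimum sample budget and a pixel-level checkerboard, whose energy sits at the Nyquist frequency, receives the maximum, which mirrors the stated goal of spending samples where fine detail is concentrated.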

The raw data supporting the findings of this study include: (1) the Sydney Opera House and High-rise scenes, obtained from the outdoor multi-modal dataset (OMMO), publicly available at https://doi.org/10.1109/ICCV51070.2023.00695; and (2) additional self-collected datasets, including the Pavilion, Playground and House scenes, which were captured and annotated by the authors and are available from the first author upon reasonable request via email: 2312391076@st.gxu.edu.cn.

Published on 4/15/2026