Learning A Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation

International Conference on Computer Vision (ICCV), 2023

¹The University of Hong Kong, ²Zhejiang University, ³Harbin Institute of Technology, ⁴DJI

Occ-SDF-Hybrid can reconstruct more detailed structures with pseudo geometry constraints.

Abstract

Implicit neural rendering, using signed distance function (SDF) representation with geometric priors like depth or surface normal, has made impressive strides in the surface reconstruction of large-scale scenes. However, applying this method to reconstruct a room-level scene from images may miss structures in low-intensity areas and/or small, thin objects. We have conducted experiments on three datasets to identify limitations of the original color rendering loss and priors-embedded SDF scene representation.

Our findings show that the color rendering loss creates an optimization bias against low-intensity areas, resulting in gradient vanishing and leaving these areas unoptimized. To address this issue, we propose a feature-based color rendering loss that utilizes non-zero feature values to bring back optimization signals. Additionally, the SDF representation can be influenced by objects along a ray path, disrupting the monotonic change of SDF values when a single object is present. Accordingly, we explore using the occupancy representation, which encodes each point separately and is unaffected by objects along a querying ray. Our experimental results demonstrate that the joint forces of the feature-based rendering loss and Occ-SDF hybrid representation scheme can provide high-quality reconstruction results, especially in challenging room-level scenarios.

Method

Feature Rendering

We find that the widely adopted color-based rendering formula in MonoSDF and NeuRIS induces an optimization bias against low-intensity areas, leaving these areas under-optimized and causing missing structures in the reconstruction. Accordingly, we propose a simple yet effective feature-based rendering formula to address the problem. More details can be found in the main paper.
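To make the optimization bias concrete, here is a toy NumPy sketch. It assumes a squared-error loss, scalar (grayscale) colors, one-dimensional features, and a hand-picked affine decoder; none of these choices come from the paper, whose actual network and loss are described in the main text. The point is only that a gradient scaled by near-zero colors carries no signal to the rendering weights, while a gradient scaled by non-zero features does.

```python
import numpy as np

# Toy setup (all values are illustrative assumptions): one dark pixel, three ray samples.
weights = np.array([0.1, 0.2, 0.0])  # rendering weights; they sum to 0.3, so the surface is under-reconstructed
target = 0.0                         # ground-truth intensity of the dark pixel

# Color rendering: blend per-sample colors directly.
colors = np.zeros(3)                 # the samples are predicted as dark
color_residual = np.sum(weights * colors) - target
color_grad = 2.0 * color_residual * colors       # d/dw_i of the squared error

# Feature rendering: blend non-zero features first, then decode to a color.
feats = np.ones(3)                   # hypothetical 1-D features
a, b = -1.0, 1.0                     # hypothetical affine decoder g(F) = a*F + b, so g(1) = 0 (dark)
feat_residual = (a * np.sum(weights * feats) + b) - target
feat_grad = 2.0 * feat_residual * a * feats      # chain rule through the decoder

print("color-loss gradient w.r.t. weights:  ", color_grad)
print("feature-loss gradient w.r.t. weights:", feat_grad)
```

In this toy case the color-based gradient is zero for every sample even though the weights are far from describing an opaque surface, whereas the feature-based gradient is negative, so a gradient-descent step would raise the weights. A real decoder would be a learned network operating on higher-dimensional features.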

Network Structure.

The trend for the gradient during optimization.


Occ-SDF Hybrid Representation

The left part of (a), where we stand in front of the yellow cylinder and observe the entire scene, depicts a common situation in room-level scenes. In the single-object scenario, the SDF value decreases monotonically from the observation position to the object. In the room-level scenario, by contrast, the SDF along a single ray has a complex distribution with multiple peaks and valleys (c). The densities obtained from the Laplace density function are shown in (d): the room-level scene has a secondary peak near small objects, whereas the single-object scene has only one peak. Because of this secondary peak, the rendering weights in the room-level scene (e) follow a multi-modal distribution, while those in the single-object case are uni-modal. As a result, the rendered depth D̂ deviates from the ground truth in the room-level scene but stays close to the ground-truth depth in the object-level scene. Panel (b) illustrates the effect of the supervision signal under three different representations. More details can be found in the main paper.
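The effect described above can be reproduced numerically with a small NumPy sketch. It assumes a VolSDF-style conversion (density equal to the Laplace CDF of the negated SDF, scaled by 1/β), a hypothetical wall at depth 3, and a hypothetical thin object that the ray passes at a clearance of 0.05 near depth 1 without hitting it. The value β = 0.05 and the sampling grid are likewise illustrative choices, not settings from the paper.

```python
import numpy as np

def laplace_density(sdf, beta):
    """Density from SDF: (1/beta) * Laplace CDF evaluated at -sdf (VolSDF-style, assumed here)."""
    e = 0.5 * np.exp(-np.abs(sdf) / beta)
    cdf = np.where(sdf >= 0, e, 1.0 - e)
    return cdf / beta

def render_weights(t, sigma):
    """Standard volume-rendering weights w_i = T_i * alpha_i along one ray."""
    delta = np.diff(t, append=t[-1] + (t[-1] - t[-2]))
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha

def count_modes(w, min_height=1e-3):
    """Count strict local maxima of the weight profile above a small threshold."""
    mid = w[1:-1]
    return int(np.sum((mid > w[:-2]) & (mid > w[2:]) & (mid > min_height)))

t = np.linspace(0.0, 4.0, 401)   # sample depths along the ray
beta = 0.05                      # illustrative sharpness parameter
wall_sdf = 3.0 - t               # single-object case: SDF decreases monotonically

# Room-level case: the SDF is the distance to the nearest surface, so a thin
# object that the ray merely passes (clearance 0.05 at depth 1) creates a valley.
thin_sdf = np.sqrt((t - 1.0) ** 2 + 0.05 ** 2)
room_sdf = np.minimum(wall_sdf, thin_sdf)

w_single = render_weights(t, laplace_density(wall_sdf, beta))
w_room = render_weights(t, laplace_density(room_sdf, beta))

depth_single = np.sum(w_single * t)
depth_room = np.sum(w_room * t)
modes_single, modes_room = count_modes(w_single), count_modes(w_room)

print("single object: modes =", modes_single, " rendered depth =", depth_single)
print("room level:    modes =", modes_room, " rendered depth =", depth_room)
```

Under these assumptions the single-object ray yields one weight peak and a rendered depth close to the wall at 3, while the room-level ray yields a second peak near the thin object and a rendered depth pulled well in front of the wall, even though the ray never intersects the thin object.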

Results

BibTeX

@inproceedings{Lyu2023occsdf,
    title={Learning A Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation},
    author={Xiaoyang Lyu and Peng Dai and Zizhang Li and Dongyu Yan and Yi Lin and Yifan Peng and Xiaojuan Qi},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
    year={2023}
}