Gen3DSR

Gen3DSR: Generalizable 3D Scene Reconstruction via
Divide and Conquer from a Single View

3DV 2025

Andreea Ardelean Dogaru, Mert Özer, Bernhard Egger

Friedrich-Alexander-Universität Erlangen-Nürnberg

Abstract

Single-view 3D reconstruction is currently approached from two dominant perspectives: reconstruction of scenes with limited diversity using 3D data supervision, or reconstruction of diverse single objects using large image priors. However, real-world scenarios are far more complex and exceed the capabilities of these methods. We therefore propose a hybrid method following a divide-and-conquer strategy. We first process the scene holistically, extracting depth and semantic information, and then leverage an object-level method for the detailed reconstruction of individual components. By splitting the problem into simpler tasks, our system is able to generalize to various types of scenes without retraining or fine-tuning. We purposely design our pipeline to be highly modular, with independent, self-contained modules, to avoid the need for end-to-end training of the whole system. This enables the pipeline to improve naturally as future methods replace the individual modules. We demonstrate the reconstruction performance of our approach on both synthetic and real-world scenes, comparing favorably against prior works.

Method

Our method takes as input a single RGB image and predicts the full 3D scene reconstruction, represented as a collection of triangle meshes. First, we parse the image of the scene by finding its constituent instances and estimating the depth and camera parameters. Then, we separate the identified entities into stuff (amorphous shapes) and things (characteristic shapes). To recover the full view of each object, we perform amodal completion on the masked crops of the instances. Each object is reconstructed individually in a normalized space and aligned to the view space using the scene layout guides from the depth map. Importantly, we address the differences in focal length, principal point, and camera-to-object distance between the two spaces through reprojection. Finally, we model the background as the surface that approximates the stuff entities collectively.
