IM2CAD takes a single photo of a real scene (left) and automatically reconstructs a 3D CAD model (right) that is similar to the real scene.
Given a single photo of a room and a large database of furniture CAD models, our goal is to reconstruct a scene that is as similar as possible to the scene depicted in the photograph and is composed of objects drawn from the database. We present a fully automatic system that addresses this IM2CAD problem and produces high-quality results on challenging imagery from interior home design and remodeling websites. Our approach iteratively optimizes the placement and scale of objects in the room to best match scene renderings to the input photo, using image comparison metrics trained via deep convolutional neural networks. By operating jointly on the full scene at once, we account for inter-object occlusions. We also show the applicability of our method on standard scene understanding benchmarks, where we obtain significant improvement.
Keywords: 3D scene reconstruction, convolutional neural network, reconstruction-recognition, scene optimization via render-and-match, single view geometry, room layout estimation, 3D CAD models.
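To make the render-and-match idea in the abstract concrete, the following is a minimal toy sketch, not the system's implementation. It assumes a 2D grid in place of a 3D scene, filled squares in place of CAD models, and a plain pixel-mismatch count in place of the learned image comparison metric. The names `render`, `mismatch`, `render_and_match`, and `MOVES`, as well as the `(center_x, center_y, half_size)` parameterization and the greedy local search, are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical object parameterization: (center_x, center_y, half_size).
Obj = Tuple[int, int, int]
Image = List[List[int]]

# Candidate unit steps for one object: shift in x, shift in y, or rescale.
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def render(objects: List[Obj], width: int, height: int) -> Image:
    """Paint each object as a filled square labeled 1, 2, ...

    Objects are painted in list order, so later objects occlude earlier
    ones. Rendering the whole scene at once is what lets the comparison
    account for inter-object occlusion.
    """
    image = [[0] * width for _ in range(height)]
    for label, (cx, cy, half) in enumerate(objects, start=1):
        for y in range(max(0, cy - half), min(height, cy + half + 1)):
            for x in range(max(0, cx - half), min(width, cx + half + 1)):
                image[y][x] = label
    return image


def mismatch(a: Image, b: Image) -> int:
    """Count differing pixels (a stand-in for a learned comparison metric)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def render_and_match(target: Image, init: List[Obj], max_iters: int = 100) -> List[Obj]:
    """Greedy local search over placement and scale of every object.

    A move is kept only if re-rendering the full scene lowers the mismatch
    against the target, so the cost never increases.
    """
    height, width = len(target), len(target[0])
    objects = list(init)
    best = mismatch(render(objects, width, height), target)
    for _ in range(max_iters):
        improved = False
        for i in range(len(objects)):
            for dx, dy, ds in MOVES:
                cx, cy, half = objects[i]  # re-read: an earlier move may have been accepted
                if half + ds < 0:
                    continue  # skip degenerate sizes
                candidate = objects[:i] + [(cx + dx, cy + dy, half + ds)] + objects[i + 1:]
                cost = mismatch(render(candidate, width, height), target)
                if cost < best:
                    objects, best, improved = candidate, cost, True
        if not improved:
            break  # local minimum: no single unit move helps
    return objects
```

A typical use would render a known arrangement as the target, start from a perturbed guess, and call `render_and_match(target, guess)`. Because the search is greedy, it can stall in a local minimum when the initial guess is far from the target, which is one reason a good initialization matters in this kind of scheme.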
In each example, the left image is the real input image and the right image is the rendered 3D CAD model produced by IM2CAD.