Robot Perception and Learning: [CMU RI Thesis] On the Multi-View Fitting and Construction of Dense Deformable Face Models

Title: On the Multi-View Fitting and Construction of Dense Deformable Face Models

Author: K. Ramnath

Abstract:
Active Appearance Models (AAMs) are generative, parametric models that have been successfully used in the past to model deformable objects such as human faces. Fitting an AAM to an image consists of minimizing the error between the input image and the closest model instance; i.e. solving a nonlinear optimization problem. In this thesis we study three important topics related to deformable face models such as AAMs: (1) multi-view 3D face model fitting, (2) multi-view 3D face model construction, and (3) automatic dense deformable face model construction.
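
For readers new to AAMs, the fitting problem mentioned above is usually written as the following nonlinear least-squares objective. The notation here (base shape s_0, shape modes s_i, base appearance A_0, appearance modes A_j, input image I, and piecewise affine warp W) is the one commonly used in the AAM literature; it is an assumption that the thesis uses the same symbols.

```latex
% Linear shape and appearance models (notation assumed, common in the AAM literature)
s(\mathbf{p}) = s_0 + \sum_{i=1}^{n} p_i \, s_i,
\qquad
A_{\boldsymbol{\lambda}}(\mathbf{x}) = A_0(\mathbf{x}) + \sum_{j=1}^{m} \lambda_j \, A_j(\mathbf{x})

% Fitting: minimize the error between the model instance and the warped input image
\min_{\mathbf{p},\, \boldsymbol{\lambda}} \;
\sum_{\mathbf{x} \in s_0}
\Bigl[ A_0(\mathbf{x}) + \sum_{j=1}^{m} \lambda_j A_j(\mathbf{x})
       - I\bigl(W(\mathbf{x}; \mathbf{p})\bigr) \Bigr]^2
```

The problem is nonlinear because the shape parameters p enter through the warp W applied to the image, not through a linear term.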

The original AAM formulation was 2D, but AAMs have recently been extended to include a 3D shape model. A variety of single-view algorithms exist for fitting and constructing 3D AAMs, but multi-view algorithms have not been studied. In the first part of this thesis we describe an algorithm for fitting a single AAM to multiple images, captured simultaneously by cameras with arbitrary locations, rotations, and response functions. This algorithm uses the scaled orthographic imaging model used by previous authors and, in the process of fitting, computes (calibrates) the scaled orthographic camera matrices. We also describe an extension of this algorithm to calibrate weak perspective (or full perspective) camera models for each of the cameras. In essence, we use the human face as a (nonrigid) calibration grid. We demonstrate that the performance of this algorithm is roughly comparable to a standard algorithm using a calibration grid. We then show how camera calibration improves the performance of AAM fitting.
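
As a rough illustration of the scaled orthographic imaging model, the sketch below projects one shared 3D shape into two views, each with its own rotation, scale, and 2D offset. The function names, the toy shape, and the camera values are all hypothetical; this shows only the projection step, not the thesis's fitting or calibration algorithm.

```python
import math

def rotation_y(theta):
    """3x3 rotation matrix about the vertical (y) axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def project_scaled_orthographic(points3d, rotation, scale, offset):
    """Project 3D points as scale * (first two rows of rotation) * X + offset."""
    projected = []
    for point in points3d:
        u = scale * sum(r * x for r, x in zip(rotation[0], point)) + offset[0]
        v = scale * sum(r * x for r, x in zip(rotation[1], point)) + offset[1]
        projected.append((u, v))
    return projected

# One shared 3D shape (toy values), seen by two hypothetical cameras.
shape = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 1.0, -0.5)]
cameras = [
    {"rotation": rotation_y(0.0), "scale": 2.0, "offset": (10.0, 20.0)},
    {"rotation": rotation_y(math.pi / 2), "scale": 1.0, "offset": (0.0, 0.0)},
]

# In a multi-view setting the 3D shape is shared; only the camera differs per view.
views = [project_scaled_orthographic(shape, **camera) for camera in cameras]
```

Because depth only enters through the rotation, each camera is described by a 2x3 matrix (the scaled top two rows of the rotation) plus a 2D offset, which is the quantity the abstract describes as being calibrated during fitting.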

A variety of non-rigid structure-from-motion algorithms, both single-view and multi-view, have been proposed that can be used to construct the corresponding 3D non-rigid shape models of a 2D AAM. In the second part of this thesis we show that constructing a 3D face model using non-rigid structure-from-motion suffers from the Bas-Relief ambiguity and may result in a "scaled" (stretched/compressed) model. We outline a robust non-rigid motion-stereo algorithm for calibrated multi-view 3D AAM construction and show how using calibrated multi-view motion-stereo can eliminate the Bas-Relief ambiguity and yield face models with higher 3D fidelity.
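
The Bas-Relief ambiguity can be illustrated numerically with a simplified setup: orthographic projection and a small head rotation about the vertical axis. In the sketch below, a shape with doubled depth and a correspondingly smaller rotation produces almost the same single-view image, while a second camera at a known relative rotation tells the two apart. The toy shape, angles, and function names are assumptions made for this illustration; it is not the thesis's motion-stereo algorithm.

```python
import math

def project_x(points, theta):
    """Orthographic image x-coordinate after rotating by theta about the y axis."""
    return [x * math.cos(theta) + z * math.sin(theta) for x, _, z in points]

def max_abs_diff(a, b):
    """Largest absolute difference between two lists of coordinates."""
    return max(abs(p - q) for p, q in zip(a, b))

# Toy 3D shape and a version with depth stretched by a factor of two.
true_shape = [(-1.0, 0.0, 0.2), (0.0, 0.5, 1.0), (1.0, -0.5, 0.4)]
stretch = 2.0
stretched_shape = [(x, y, stretch * z) for x, y, z in true_shape]

# A small rotation of the true shape, and the smaller rotation that lets
# the stretched shape explain nearly the same image.
theta = 0.05
theta_alt = math.asin(math.sin(theta) / stretch)

single_view_gap = max_abs_diff(
    project_x(true_shape, theta),
    project_x(stretched_shape, theta_alt),
)

# A second camera whose rotation relative to the first is known (calibrated).
baseline = 0.6
second_view_gap = max_abs_diff(
    project_x(true_shape, theta + baseline),
    project_x(stretched_shape, theta_alt + baseline),
)

print(f"single-view gap: {single_view_gap:.5f}")
print(f"second-view gap: {second_view_gap:.5f}")
```

With these values the single-view gap is second order in the rotation angle, whereas the second-view gap is a substantial fraction of the shape's size, which is the intuition behind using calibrated multi-view data to resolve the depth scale.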

An important step in computing dense deformable face models such as 3D Morphable Models (3DMMs) is to register the input texture maps using optical flow. However, optical flow algorithms perform poorly on images of faces because of the appearance and disappearance of structure such as teeth and wrinkles, and because of the non-Lambertian, textureless cheek regions. In the final part of this thesis we propose a different approach to building dense face models. Our algorithm iteratively builds a face model, fits the model to the input image data, and then refines the model. The refinement consists of three steps: (1) the addition of more mesh points to increase the density, (2) image-consistent re-triangulation of the mesh, and (3) refinement of the shape modes. Using a carefully collected dataset containing hidden marker ground-truth, we show that our algorithm generates dense models that are quantitatively better than those obtained using off-the-shelf optical flow algorithms. We also show how our algorithm can be used to construct dense deformable models automatically, starting with a rigid planar model of the face that is subsequently refined to model the non-planarity and the non-rigid components.

The full text can be found here.


Tuesday, January 01, 2008
