This thesis addresses monocular vision-based object recognition and 3-D pose estimation using conic features. Conic features, including circles and ellipses, appear frequently on man-made objects in the real world and offer potentially robust feature extraction in vision-based applications. Although 3-D pose estimation from conic features has been studied extensively since 1990, previous work has not provided a complete, unique solution for the full set of 3-D pose parameters (i.e., three orientations and three positions), owing to the high nonlinearity of a general conic.
This thesis therefore revisits conic features from a new perspective based on geometric invariants in both 3-D space and 2-D projective space, combining conics with other geometric features. First, as the most essential step in dealing with conics, this thesis shows that the pose parameters of a circular feature in 3-D space can be derived analytically by incorporating a coplanar point.
The pose recovery procedure is described in detail, and its performance is evaluated and discussed in terms of pose estimation error and sensitivity. Second, it is shown that the pose of an elliptic feature can be resolved when two coplanar points are incorporated, on the basis of the polarity of two points with respect to a conic in 2-D projective space. This thesis proposes a series of algorithms that determine the 3-D pose parameters uniquely, and evaluates the proposed method in terms of estimation performance and sensitivity to point locations. Third, a pair of conics is considered, which extends the incorporation scheme from point features to another conic feature.
Under the polarity concept, this thesis proves that the problem of a pair of conics can be reformulated as the problem of one ellipse with two points, so that its solution takes the same form as in the ellipse case.
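As a reminder of what polarity means here, the following minimal sketch illustrates the standard pole-polar relation in 2-D projective space: for a conic given by a symmetric 3x3 matrix C, the polar line of a point x is l = C x, and two points x and y are conjugate when x^T C y = 0. The function names and the example conic are illustrative assumptions and are not taken from the thesis; the sketch shows only the textbook relation, not the pose estimation algorithms themselves.

```python
# Illustrative sketch of the pole-polar relation for a conic in 2-D projective space.
# A conic is a symmetric 3x3 matrix C; a homogeneous point x lies on it when x^T C x = 0.
# All names below are hypothetical helpers, not the thesis's notation.

def mat_vec(C, x):
    """Multiply a 3x3 matrix by a homogeneous 3-vector."""
    return [sum(C[i][j] * x[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    """Inner product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def polar_line(C, x):
    """Polar line l = C x of the point x with respect to the conic C."""
    return mat_vec(C, x)

def are_conjugate(C, x, y, tol=1e-9):
    """Points x and y are conjugate w.r.t. C when x^T C y = 0,
    i.e. x lies on the polar line of y (and vice versa)."""
    return abs(dot(x, polar_line(C, y))) < tol

# Example conic: the unit circle x^2 + y^2 - 1 = 0.
UNIT_CIRCLE = [[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, -1.0]]

p = [2.0, 0.0, 1.0]                # the point (2, 0), outside the circle
line = polar_line(UNIT_CIRCLE, p)  # coefficients (2, 0, -1): the line x = 1/2
q = [0.5, 3.0, 1.0]                # an arbitrary point on that line
print(line, are_conjugate(UNIT_CIRCLE, p, q))
```

Because conjugacy is defined purely through the bilinear form x^T C y, it is preserved under projective transformations, which is what makes it usable as a constraint between image features and their 3-D counterparts.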
To handle two or more conic objects and to address the object recognition problem, the rest of the thesis concentrates on the theoretical foundation of multiple object recognition. First, some effective modeling approaches are described. A general object model is designed to represent multiple objects for object recognition and pose recovery in terms of spatial geometry. In particular, this thesis defines a pairwise conic model that describes the geometric relation between two conics invariantly in 2-D projective space; it consists of a pairwise conic (PC), a pairwise conic invariant (PCI), and a pairwise conic pole (PCP). Based on these two kinds of models, an object learning and recognition system is proposed as a general framework for multiple object recognition.
For simplicity and flexibility in the object learning stage, this thesis introduces a semi-automatic learning scheme that constructs the multiple object model from a model image in a single step. To exploit geometric relations among multiple objects effectively in object recognition, this thesis specifies feature functions based on the pairwise conic model and then describes an object recognition method formulated as a linear-chain conditional random field (CRF). In addition, as a post-recognition refinement step, a geometric alignment procedure is presented in algorithmic detail to improve recognition performance under noisy conditions.
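To illustrate how a relation between two coplanar conics can be invariant in 2-D projective space, the sketch below computes the classical pair-of-conics invariants trace(C1^-1 C2) and trace(C2^-1 C1) after scaling each conic matrix to unit determinant, and checks numerically that they are unchanged by a homography. This is an assumption made for illustration: the abstract does not define the PCI, so the thesis's own definition may differ, and the helper names, example conics, and homography are all hypothetical.

```python
import math

# Illustrative sketch only: classical projective invariants of a conic pair.
# This is NOT claimed to be the thesis's PCI definition.

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def inv3(M):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = det3(M)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def normalize(C):
    """Scale a conic matrix so that its determinant is 1 (real cube root)."""
    d = det3(C)
    s = math.copysign(abs(d) ** (1.0 / 3.0), d)
    return [[v / s for v in row] for row in C]

def pair_invariants(C1, C2):
    """Return (trace(N1^-1 N2), trace(N2^-1 N1)) for unit-determinant N1, N2."""
    N1, N2 = normalize(C1), normalize(C2)
    i1 = sum(matmul(inv3(N1), N2)[k][k] for k in range(3))
    i2 = sum(matmul(inv3(N2), N1)[k][k] for k in range(3))
    return i1, i2

def transform_conic(C, H):
    """Map a conic under the point homography x' = H x: C' = H^-T C H^-1."""
    Hinv = inv3(H)
    return matmul(transpose(Hinv), matmul(C, Hinv))

# Hypothetical example: a unit circle and an ellipse centred at (3, 0).
C1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
C2 = [[0.25, 0.0, -0.75], [0.0, 1.0 / 9.0, 0.0], [-0.75, 0.0, 1.25]]
# Hypothetical homography with mild perspective distortion.
H = [[1.0, 0.2, 3.0], [-0.1, 0.9, 1.0], [0.001, 0.002, 1.0]]

before = pair_invariants(C1, C2)
after = pair_invariants(transform_conic(C1, H), transform_conic(C2, H))
print(before, after)  # the two pairs should agree up to rounding error
```

The unit-determinant scaling removes the arbitrary scale of each homogeneous conic matrix, and the trace is unchanged because C1^-1 C2 is only conjugated by H under the transformation.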
Finally, the multiple object recognition method is evaluated extensively through two practical applications for service robots: place recognition and elevator button recognition. A series of experimental results supports the effectiveness of the proposed method, which maintains reliable performance under noisy conditions in the presence of perspective distortion and partial object occlusion.