1. A Study of Approaches for Object Recognition
Presented by Wyman Wong
12/9/2005

2. Outline
- Introduction
- Model-Based Object Recognition
  - AAM
  - Inverse Compositional AAM
- View-Based Object Recognition
  - Recognition based on boundary fragments
  - Recognition based on SIFT
- Proposed Research
- Conclusion and Future Work

3. Introduction
- Object Recognition
  - The task of finding 3D objects in 2D images (or even video) and classifying them into one of many known object types
  - Closely related to the success of many computer vision applications: robotics, surveillance, registration, etc.
  - A difficult problem: a general and comprehensive solution has not yet been found

4. Introduction
- Two main streams of approaches:
  - Model-Based Object Recognition
    - A 3D model of the object being recognized is available
    - Compare the 2D representation of the structure of an object with the 2D projection of the model
  - View-Based Object Recognition
    - 2D representations of the same object viewed at different angles and distances, when available
    - Extract features (as the representations of the object) and compare them to the features in the feature database

5. Introduction
- Pros and cons of each main stream:
  - Model-Based Object Recognition
    - Model features can be predicted from just a few detected features based on the geometric constraints
    - Models sacrifice generality
  - View-Based Object Recognition
    - Greater generality and more easily trainable from visual data
    - Matching is done by comparing entire objects; some methods may be sensitive to clutter and occlusion

6. Model-Based Object Recognition
- Commonly used in face recognition
- General steps:
  - Locate the object
  - Locate and label its structure
  - Adjust the model's parameters until the model generates an image similar enough to the real object
- Active Appearance Models (AAM) have been
  proven to be highly useful models for face recognition

7. Active Appearance Models
- They model the shape and appearance of objects separately
  - Shape: the vertex locations of a mesh
  - Appearance: the pixel values of a mesh
  - Both sets of parameters use PCA to generalize face recognition to generic faces
- Fitting an AAM: a non-linear optimization is applied that iteratively solves for incremental additive updates to the shape and appearance coefficients

8. Inverse Compositional AAMs
- The major difference between these models and AAMs is the fitting algorithm
  - AAM: additive incremental updates to the shape
    and appearance parameters
  - ICAAM: inverse compositional update
    - The algorithm updates the entire warp by composing the current warp with the computed incremental warp

9. View-Based Object Recognition
- Common approaches:
  - Correlation-based template matching (Li, W. et al. 95)
    - SEA, PDE, etc.
    - Not effective when any of the following happens:
      - Illumination of the environment changes
      - Posture and scale of the object change
      - Occlusion
  - Color histogram (Swain, M.J. 90)
    - Construct a histogram for an object and match it over the image
    - Robust to changes of viewpoint and to occlusion
    - But it requires good isolation and segmentation
      of objects

10. View-Based Object Recognition
- Common approaches:
  - Feature-based
    - Extract salient features from the image and match only against those features when searching all locations for matches
- Feature types: groupings of edges, SIFT, etc.
- Preferred feature properties:
  - View invariant
  - Detected frequently enough for reliable recognition
  - Distinctive
- An image descriptor is created from the detected features to improve matching performance
  - Image descriptor = key / index into the database of features
- Preferred descriptor properties:
  - Invariant to scaling, rotation, illumination, affine transformation
    and noise

11. Nelson's Approach
- Recognition based on 2D boundary fragments
- Prepare 53 clean images for each object and build a 3D recognition database
[Figure: object and camera setup]

12. Nelson's Approach
[Figure: test images used in Nelson's experiment and their features]

13. Nelson's Approach
- Nelson's experiment showed that his approach has high accuracy
  - 97.0% success rate for a 24-object database under the following conditions:
    - Large number of images
    - Clean images
    - Very different objects
    - No occlusion or clutter

14. Lowe's Approach
- Recognition based on the Scale Invariant Feature Transform (SIFT)
  - SIFT generates distinctive invariant features
  - SIFT-based image descriptors are generally the most resistant to common image deformations (Mikolajczyk 2005)
- The four steps of SIFT:
  - Scale-space extrema detection
  - Keypoint localization
  - Orientation assignment
  - Keypoint descriptor computation

15. Scale-space extrema detection
- DoG (difference of Gaussians) approximates LoG (Laplacian of Gaussian)
- Search over all sample points in all scales and find extrema that are local maxima or minima in Laplacian space
- Small keypoints: solve the occlusion problem
- Large keypoints: robust to noise and image blur

16. Keypoint localization
- Reject keypoints with the following properties:
  - Low contrast (sensitive to noise)
  - Localized along an edge (sliding effect)
- Solution:
  - Filter out points with value D below 0.03
  - Apply Hessian edge detector

17. Orientation assignment
- Pre-compute the gradient magnitude and orientation
- Use them to construct the keypoint descriptor

18. Keypoint descriptor computation
- Create orientation histograms over 4x4 sample regions around the keypoint locations
- Each histogram contains 8 orientation bins
- 4x4x8 = 128-element vector (distinctively representing a feature)

19. Object Recognition based on SIFT
- Nearest-neighbor algorithm
  - Matching: assign features to objects
- There can be many wrong matches
- Solution:
  - Identify clusters of features
    - Generalized Hough transform
  - Determine the pose of the object and then discard outliers

20. Proposed Research
- Personally, I think the model-based approach does have better performance
- Success of the model-based approach requires:
  - All models of the objects to be detected
  - Automatically constructing models
  - Automatically selecting the best model
- How does the system know which 3D model to use on a specific image of an object?
  - By a view-based approach
  - A human looks at an image of an object for a moment and then realizes which model to use on that object
  - Then the specific model is used to refine the identification of the specific object

21. Hybrid of bottom-up and top-down
- The view-based approaches just presented are bottom-up approaches
  - Features: edges, extrema (low level) -> descriptors of features -> matching -> identification of object (high level)
- Can it be like this?
  - Features -> matching (lower level) -> guessing of object (higher level) -> matching (lower level) -> guessing of object (higher level) -> identification of object

22. Hierarchy of features
- Lowe's system
  - All features have equal weight in voting during identification of the object (to be verified by examining the open source code)
- Special features do not have enough voting power to shift the result to the correct one
- Consider the following scenario:
  - Two objects have many similar features: a1 to a100 are similar to b1 to b100, and they have just one very different feature, a* for object A and b* for object B
  - Many of a1 to a100 may be poorly captured by the imaging device
    and mismatched as b1 to b100; even if we can still recognize the feature a*, the system may still think the object is B
[Figure: Object A and Object B]

23. Extension of SIFT
- Color descriptors
- Local texture measures incorporated into feature descriptors
- Scale-invariant edge groupings
- *Generic object class recognition
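
The scenario on the "Hierarchy of features" slide can be sketched with a toy vote count. This is only an illustration under stated assumptions, not Lowe's implementation: the `vote` function, the 60/40 split of mismatched features, and the weight of 50 given to a* are all hypothetical values chosen to make the effect visible.

```python
from collections import Counter

def vote(matches, weights=None):
    """Tally per-object votes from (feature_name, matched_object) pairs.

    `weights` is a hypothetical per-feature weight table; features that are
    not listed count as one vote each.
    """
    tally = Counter()
    for feature, obj in matches:
        tally[obj] += 1.0 if weights is None else weights.get(feature, 1.0)
    winner = max(tally, key=tally.get)
    return winner, dict(tally)

# Toy data (assumed): the true object is A, but 60 of its 100 shared
# features are poorly captured and mismatched to B. The distinctive
# feature a* is still matched correctly.
matches = [("a%d" % i, "B" if i <= 60 else "A") for i in range(1, 101)]
matches.append(("a*", "A"))

# Equal weights: every feature is one vote.
equal_winner, equal_tally = vote(matches)
# Hypothetical hierarchy: the distinctive feature a* carries more weight.
weighted_winner, weighted_tally = vote(matches, weights={"a*": 50.0})

print(equal_winner, equal_tally)
print(weighted_winner, weighted_tally)
```

With equal weights the mismatched common features outvote a*, while giving a* a larger weight lets it shift the result back to A. How such weights would be chosen in practice is left open here.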

24. Conclusion and Future Work
- Discussed the different approaches to object recognition
- Discussed what SIFT is and how it works
- Discussed possible extensions to SIFT
- Design a hybrid approach
- Design extensions

25. Q & A
- Thank you very much!

26. Things to be understood
- Finding extrema within a single scale seems good enough; why do we need to search across different scales?
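
One way to explore the question above is a small 1D experiment, offered as a sketch rather than as the SIFT procedure itself. It measures the difference-of-Gaussians (DoG) response at the center of a narrow bump and of a wide bump over several scales. The helper names, the box-shaped test signals, the sigma list, and the ratio `k=1.6` between the two Gaussians are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 4 sigma."""
    radius = int(math.ceil(4 * sigma))
    values = [math.exp(-(x * x) / (2 * sigma * sigma))
              for x in range(-radius, radius + 1)]
    total = sum(values)
    return [v / total for v in values]

def blur_at(signal, index, sigma):
    """Gaussian-smoothed value of `signal` at one index (zero padding)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    result = 0.0
    for offset, weight in enumerate(kernel):
        j = index + offset - radius
        if 0 <= j < len(signal):
            result += weight * signal[j]
    return result

def best_scale(signal, index, sigmas, k=1.6):
    """Return the sigma whose DoG response at `index` is largest in magnitude."""
    responses = [abs(blur_at(signal, index, k * s) - blur_at(signal, index, s))
                 for s in sigmas]
    return sigmas[responses.index(max(responses))]

def box_blob(length, center, half_width):
    """Signal that is 1 within `half_width` of `center` and 0 elsewhere."""
    return [1.0 if abs(i - center) <= half_width else 0.0
            for i in range(length)]

sigmas = [1.0, 2.0, 4.0, 8.0, 16.0]
narrow = box_blob(401, 200, 2)   # small structure
wide = box_blob(401, 200, 20)    # large structure

narrow_scale = best_scale(narrow, 200, sigmas)
wide_scale = best_scale(wide, 200, sigmas)
print(narrow_scale, wide_scale)
```

In this toy setup the narrow bump responds most strongly at a smaller sigma than the wide bump, so a search restricted to any single scale would favor structures of one size over the other. This matches the earlier slide's distinction between small and large keypoints.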