Reference number: ISO/IEC TR 15938-8:2002/Amd.2:2006(E)

Information technology - Multimedia content description interface - Part 8: Extraction and use of MPEG-7 descriptions

AMENDMENT 2: Extraction and use of MPEG-7 perceptual 3D shape descriptor

Technologies de l'information - Interface de description du contenu multimédia - Partie 8: Extraction et utilisation des descriptions MPEG-7

AMENDEMENT 2: Extraction et emploi du descripteur de forme 3D perceptuel MPEG-7

Amendment 2:2007 to National Standard of Canada CAN/CSA-ISO/IEC TR 15938-8:04

Amendment 2:2006 to International Standard ISO/IEC TR 15938-8:2002 has been adopted without modification (IDT) as Amendment 2:2007 to CAN/CSA-ISO/IEC TR 15938-8:04. This Amendment was reviewed by the CSA Technical Committee on Information Technology (TCIT) under the jurisdiction of the Strategic Steering Committee on Information Technology and deemed acceptable for use in Canada.

August 2007

© International Organization for Standardization (ISO), 2006. All rights reserved.
© International Electrotechnical Commission (IEC), 2006. All rights reserved.
NOT FOR RESALE.

© ISO/IEC 2006. All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org

Foreword

ISO (the International
Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

In exceptional circumstances, the joint technical committee may propose the publication of a Technical Report of one of the following types:

- type 1, when the required support cannot be obtained for the publication of an International Standard, despite repeated efforts;
- type 2, when the subject is still under technical development or where for any other reason there is the future but not immediate possibility of an agreement on an International Standard;
- type 3, when the joint technical committee has collected data of a different kind from that which is normally published as an International Standard (“state of the art”, for example).

Technical Reports of types 1 and 2 are subject to review within three years of publication, to decide whether they can be transformed into International Standards. Technical Reports of type 3 do not necessarily have to be reviewed until the data they provide are considered to be no longer valid or useful.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

Amendment 2 to ISO/IEC TR 15938-8:2002 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

NOTE This document preserves the sectioning of ISO/IEC TR 15938-8:2002 and its amendments. The text and figures given below are currently being considered as additions and/or modifications to those corresponding sections
in ISO/IEC TR 15938-8:2002 and its amendments.

Add after 2.2.2.49:

2.2.2.50 Attributed Relational Graph (ARG)
A graph whose nodes (vertices) and edges (links) contain unary attributes and dyadic attributes (describing the relation between the nodes), respectively. The graph is described in
the form of a vector.

2.2.2.51 Constrained Morphological Decomposition (CMD)
An algorithm, based on the mathematical concepts of morphology and convexity, to decompose a voxelized 3-D object into several parts.

2.2.2.52 Weighted Convexity (WC)
A volume-weighted sum of each part's convexity.

2.2.2.53 Weighted Convexity Difference (WCD)
The difference between the WC before and the WC after the merging of two parts.

2.2.2.54 Initial Decomposition Stage (IDS)
The procedure of applying the CMD once to a voxelized 3-D object.

2.2.2.55 Recursive Decomposition Stage (RDS)
The procedure of applying the CMD recursively
to the result of the IDS or a previous RDS.

2.2.2.56 Iterative Merging Stage (IMS)
The procedure of iteratively merging parts in the result of the RDS using the WCD.

2.2.2.57 Earth Mover's Distance (EMD)
A distance measure based on a solution [AMD2-2] to the transportation problem in graph theory.

2.2.2.58 Query by Example
A query to a content (e.g. image, 3D object, etc.) retrieval system whereby the information need is expressed visually, by providing an example of the
kind of target content desired. This can be useful when the user has difficulty forming a query using keywords or when text descriptions are not present in the database. For example, if the user wants to find images of beaches, he/she can use any available image of a beach as the query, and the retrieval system is expected to return images of beaches as results.

2.2.2.59 Query by Sketch
A query by example whereby the example content is a sketch, drawn by the user, reflecting the key visual attributes of the information need.

2.2.2.60 Query by Modified Example
A query by example whereby the example content is created by modifying an existing example (for example, using a graphical editing tool) so that it best expresses the information need.

Add after subclause 8.5:

8.6 Perceptual 3D shape

The Perceptual 3D Shape descriptor is a part-based representation of a 3D object expressed as a graph. In this context, a “node” is a vertex in the graph representation corresponding to a part of the 3D model. Such a representation facilitates object description consistent with human perception. The Perceptual 3D Shape descriptor supports Query by example. Furthermore, it provides unique functionalities, such as Query by sketch and Query by modified example, which make a content-based retrieval system more interactive and intuitive in querying and retrieving similar 3D objects.

8.6.1 Part-based representation

Part-based representation of 3D objects enables perceptual recognition that is robust to rotation, translation, deformation, deletion, and inhomogeneous scaling of a 3D object. Here, deletion means the removal of parts, and inhomogeneous scaling means the growth or shrinkage of a specific part. In the task of forming a high-level object representation from low-level object features, parts serve as an intermediate representation.

The decomposition scheme [AMD2-1] is used to generate the attributed relational graph (ARG) of a 3D object. The scheme recursively performs the constrained morphological decomposition (CMD), which is based on mathematical morphology and weighted convexity. Then, a merging criterion based on the weighted convexity difference (WCD), which determines whether connected parts should be merged, is applied to obtain a compact graph representation.

The block diagram of the scheme, in terms of its three stages, is presented in Figure AMD2-1. The recursive decomposition stage (RDS) is launched after the initial decomposition stage (IDS) and is performed until QUEUE I is empty. Then, the iterative merging stage (IMS) is applied to the parts in QUEUE II to obtain the compact graph representation.

Figure AMD2-2 shows the procedure step by step for a cow model. Figure AMD2-2 (a) and (b) show the cow represented by rendered meshes and voxels, respectively. Figure AMD2-2 (c), (d), and (e) show the results of the IDS, RDS, and IMS, respectively. Finally, the simple ARG representation is presented in
Figure AMD2-2 (f), where the ellipsoidal node and edge represent the corresponding part and connectivity between parts, respectively.

[Figure AMD2-1: The block diagram of the decomposition scheme. Stages: IDS (Initial Decomposition Stage), RDS (Recursive Decomposition Stage), IMS (Iterative Merging Stage). Blocks: Voxelized 3-D Object; Constrained morphological decomposition; Queue I; "Is this part to be split?" (Yes/No); Queue II; Merging procedure with the merging criterion; ARG representation.]

[Figure AMD2-2: The procedure of generating a part-based representation. Panels (a) to (f).]

8.6.2 Feature extraction

As described in
the previous subclause, the Perceptual 3D Shape descriptor has the form of an ARG, composed of nodes and edges. A node represents a meaningful part of the model with unary attributes, while an edge represents binary relations between nodes. To obtain all attributes, principal component analysis (PCA) is performed on every part of the 3D model to find its three principal axes, where the 1st principal axis corresponds to the direction with the largest variance and the 3rd principal axis corresponds to the direction with the smallest variance. Afterwards, 4 unary attributes and 3 binary relations are
extracted to form a Perceptual 3D Shape descriptor. In detail, a node is parameterized by volume $v$, convexity $c$, and two eccentricity values $e_1$ and $e_2$. More specifically, the convexity is defined as the ratio of the volume of a node to the volume of its convex hull, and the eccentricity is composed of two coefficients, $e_1 = \sqrt{1 - c^2/a^2}$ and $e_2 = \sqrt{1 - c^2/b^2}$, where $a$, $b$, and $c$ ($a \ge b \ge c$) are the maximum ranges along the 1st, 2nd, and 3rd principal axes, respectively. Then edge attributes, i.e. binary relations between two nodes, are extracted from the geometric relation between two nodes, in which the distance between
centers of connected nodes and two angles are adopted. The first angle is the angle between the 1st principal axes of the connected nodes, and the second is the angle between their 2nd principal axes. All the unary attributes and binary relations are normalized into the interval [0, 1].

However, to support Query by sketch in the retrieval system, the Perceptual 3D Shape descriptor is required to be represented by a set of ellipsoids. In this context, each ellipsoid has three properties, Volume, Max (i.e. the maximum range along each principal axis) and Convexity, which can easily be converted into
the 4 unary attributes. Next, the Perceptual 3D Shape descriptor contains three further properties, Center, PCA_Axis_1 and PCA_Axis_2 (i.e. the 1st and 2nd principal axes), from which the 3 binary relations can be computed. Therefore, an actual Perceptual 3D Shape descriptor is created, as shown in Binary Representation Syntax. Note that Volume, Center, Max and Convexity are in the interval [0, 1], while the components of PCA_Axis_1 and PCA_Axis_2 are in the interval [-1, 1].

8.6.3 Similarity matching

The one-to-one comparison of two Perceptual 3D Shape descriptors consists of four steps:

(1) Forming an ARG from each descriptor (the two graphs are named “query graph” and “model graph”, respectively).
(2) For each node in both graphs, taking Volume in Binary Representation Syntax as the node weight.
(3) Calculating a distance matrix, in which every element is the distance between a node pair formed by a query graph node (named Nq) and a model graph node (named Nm).
(4) Comparing the query and model graphs by employing the conventional Earth Mover's Distance (EMD) algorithm [AMD2-2], taking the node weights (from step-2) and the distance matrix (from step-3) as the input.

In step-3 of this procedure, the distance between Nq and Nm is also calculated by employing the EMD algorithm. This use is named “Inner EMD”, and the use in step-4 is named “Outer EMD”; thus the P3DS matching algorithm is named nested-EMD (nEMD).

Only step-3 needs more explanation. During this step, the distance between Nq and Nm is calculated as follows (from step-A to step-H):

(A) Their unary attributes are compared to give a “unary-distance”.
(B) A point set (named “Query Point Set”) is constructed by Nq and all its con