Amendment 3:2013 (IDT) to National Standard of Canada CAN/CSA-ISO/IEC TR 15938-8-04 (ISO/IEC TR 15938-8:2002, IDT)
January 2013

Title: Information technology - Multimedia content description interface - Part 8: Extraction and use of MPEG-7 descriptions - AMENDMENT 3: Technologies for digital photo management using MPEG-7 visual tools
Pagination: 37 pages (iii preliminary and 34 text)

Reference number: ISO/IEC TR 15938-8:2002/Amd.3:2007(E)
First edition 2002-12-15; Amendment 3, 2007-12-15
COPYRIGHT PROTECTED DOCUMENT. ISO/IEC 2007. All rights reserved.

Foreword

ISO (the International Organization
for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

In exceptional circumstances, the joint technical committee may propose the publication of a
Technical Report of one of the following types:

- type 1, when the required support cannot be obtained for the publication of an International Standard, despite repeated efforts;
- type 2, when the subject is still under technical development or where for any other reason there is the future but not immediate possibility of an agreement on an International Standard;
- type 3, when the joint technical committee has collected data of a different kind from that which is normally published as an International Standard (“state of the art”, for example).

Technical Reports of types 1 and 2 are subject to review within three years of publication, to decide whether they can be transformed into International Standards. Technical Reports of type 3 do not necessarily have to be reviewed until the data they provide are considered to be no longer valid or useful.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

Amendment 3 to ISO/IEC TR 15938-8:2002 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

Add after subclause 4.2.3.3:

4.2.3.4 Dominant Color Temperature

4.2.3.4.1 General

This subclause provides an advanced use scenario of the Dominant Color descriptor. The Dominant Color Temperature is a variation of Dominant Color that is suited to retrieval based on perceptual similarity. Images usually have one of a few dominant color temperatures that users perceive when they look at them. Dominant Color Temperatures enable users to search for images in scenarios such as query by example or query by value, and to browse images by color temperature. They are useful for users who want to find images that look similar in color temperature rather than images that have similar color regions.

4.2.3.4.2 Use scenario

Dominant Color Temperatures can be used in query by example and query by value search scenarios. Examples of such queries are depicted in Figure AMD3.1. In a query by example, a user inputs an example image or draws a colored sketch (query by sketch), and the search application returns the images most similar in color temperature. In a query by value, a user chooses a temperature value, and the system retrieves the images whose apparent color temperature is closest to the user's choice.

Figure AMD3.1 Examples of image retrieval using Dominant Color Temperatures: a) query by example; b) query by color temperature value given in kelvins

4.2.3.4.3 Feature extraction

The Dominant Color Temperature, which consists of a maximum of eight pairs of color temperature and percentage, is obtained by the following steps.

1. Get the RGB color values and percentages of the dominant colors from a Dominant Color descriptor instance.
2. Convert each dominant color value from RGB to color temperature using the relevant method specified in the feature extraction method of the Color Temperature descriptor (subclause 6.9.1.1). The number of obtained color temperatures therefore cannot exceed the number of dominant colors in the Dominant Color descriptor instance. Colors that do not have a significant color temperature (colors whose luminance values are below the luminance threshold specified in the extraction method of the Color Temperature descriptor) should be omitted.
3. Use the obtained color temperatures, with their percentages given by the Dominant Color descriptor instance, in queries: query by example, query by color temperature value, ranking of search results, and others.

4.2.3.4.4 Similarity matching

The similarity is based on a distance function defined as the integral of the absolute difference between two percentage distributions of dominant color temperature. The percentage distributions of dominant color temperature should first be obtained by the following steps:

1. Convert the color temperature values $T_i$ of the Dominant Color Temperature description to the reciprocal megakelvin scale: $RT_i\,[\mathrm{MK}^{-1}] = 1000000 / T_i\,[\mathrm{K}]$.
2. Sort the dominant color temperatures, expressed in the reciprocal scale, in ascending order.
3. Create the percentage distribution of dominant color temperature $D_i(RT_i)$ using the following equations:

D(RT) = 0 for RT [...]

4.8.2 Grouping Technologies

4.8.2.1 Situation-based clustering

4.8.2.1.1 General

A simple but very effective structure is to group images by the occasion on which they were taken. This is natural for users, since they will often remember the context of the situation much better than a date, time or explicit label attached to the picture.
It is possible to automatically cluster images into such “situations” by using MPEG-7 visual descriptions together with the time stamp of each image. Based on the assumption that each situation is contiguous in time, the organisational structure can be represented by the time sequence of images, with a flag or marker to indicate the boundaries between situations (cf. Figure AMD3.9). This provides users with a simple, intuitive and effective means to browse through their collection, without placing any additional burden on them to spend time organising it. Two methods are presented for situation-based clustering.

Figure AMD3.9 One representation of the grouped sequence of images. Boundaries (vertical bars) are inserted between adjacent images in the sequence to denote the grouping.

4.8.2.1.2 Use scenario

This kind of clustering can easily be implemented in traditional photo-browsing software applications. For the user, it is very simple to use: the extraction and matching of MPEG-7 descriptors and the detection of the boundaries are fully automatic, so the tool is essentially “one-click”. Of course, some users may choose to adjust and refine the automatic output to match their individual preferences. This process would still be far easier than organising all the photos manually.

The clustering information can be used to access and manipulate the image content
in a variety of ways:

- Browsing:
  o Display a cluster of images per page, or
  o Display a single thumbnail / icon for each cluster
- Annotation:
  o User can easily assign a single label to all the images in a cluster
- Sharing:
  o User can select images by cluster and
    - Print
    - Copy
    - Upload to website

4.8.2.1.3 Method 1: Simple Linear Clustering

4.8.2.1.3.1 General

This method achieves good clustering performance with minimal complexity. The additional computation (after extracting and matching MPEG-7 visual descriptors) consists of a simple weighted linear summation. It is therefore well suited to applications where MPEG-7 descriptors have been extracted from images but resources are not available for higher-level processing (for example, in low-complexity devices). The input
parameters to the algorithm are also simple and therefore easy to adapt, for example to different applications or user preferences.

4.8.2.1.3.2 Tools to be used

Six tools defined in ISO/IEC 15938-3 shall be instantiated in StillRegionFeatureDS:

- Dominant Color (DC)
- Scalable Color (SC)
- Color Layout (CL)
- Color Structure (CS)
- Homogeneous Texture (HT)
- Edge Histogram (EH)

Also, capturing date/time information should be included. If an image is encoded in the Exif file format (JEITA CP-3451), this information can be obtained from the Exif header:

- EXIF DateTime tag (ID36867)

Alternatively,
the same information can be captured using:

- CreationInformation/CreationCoordinates/Date (mpeg7:TimeType)

4.8.2.1.3.3 Clustering Algorithm

The images are ordered by their time stamps and each potential boundary in the sequence is evaluated in turn. To determine the presence or absence of a boundary, a number of pair-wise comparisons are made amongst images lying in a window either side of the transition. This neighbourhood and the comparisons used are illustrated in Figure AMD3.10.

Figure AMD3.10 Neighbourhood comparisons evaluated to determine if a boundary is present (images j-2, j-1, j, j+1, j+2, j+3)

Comparison of images consists of computing the descriptor distances (by the respective methods suggested in ISO/IEC TR 15938-8) and calculating the time difference. The latter is measured on a logarithmic scale, to compress the range of this feature and allow meaningful comparisons. Time distance is therefore defined as:

$D_T(i, i+1) = \ln\left(T_{i+1} - T_i + 10^{-5}\right)$

The unit of time used for $T_i$ is days. The natural logarithm is applied to normalise the range of time distances, since potential time differences will vary over several orders of magnitude. After this transformation, the variation of the time distance is comparable to that of the remaining features. The constant, $10^{-5}$, sets the minimum scale of the distance (just under one second, in this case). It also ensures that ln(0) does not occur.

The input to the algorithm includes the first-, second- and third-order distances in a short time interval around the boundary to be tested. Here “first-order” refers to the difference, for any given feature, between two images that are adjacent in the sequence, i.e. $D_F(i, i+1)$. A second-order distance is the difference between two images that are separated by one other image, i.e. $D_F(i, i+2)$. Similarly, a third-order distance is the difference between two
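
As a rough illustration of the time distance defined in this subclause, the following Python sketch computes the log-scaled distance between time-adjacent images. It is not part of the Technical Report. The function names, the use of `datetime` values for time stamps, and the reading of the constant as 10^-5 days (about 0.86 s, consistent with "just under one second") are assumptions made for the example.

```python
import math
from datetime import datetime, timedelta

# Assumption: the additive constant is 1e-5 days (about 0.86 seconds).
EPSILON_DAYS = 1e-5
SECONDS_PER_DAY = 86400.0


def time_distance(t_i: datetime, t_next: datetime) -> float:
    """Log-scaled time distance between two images adjacent in time order."""
    delta_days = (t_next - t_i).total_seconds() / SECONDS_PER_DAY
    if delta_days < 0:
        raise ValueError("images must be ordered by time stamp")
    # The constant keeps the argument positive, so ln(0) cannot occur.
    return math.log(delta_days + EPSILON_DAYS)


def first_order_time_distances(stamps):
    """Time distance for every pair of adjacent images after sorting by time."""
    ordered = sorted(stamps)
    return [time_distance(a, b) for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    start = datetime(2007, 12, 15, 10, 0, 0)
    shots = [start, start + timedelta(seconds=5), start + timedelta(days=2)]
    print(first_order_time_distances(shots))
```

Two images taken seconds apart give a strongly negative value, while images taken days apart give a value near or above zero, which is the range compression the logarithm is meant to provide.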