BRITISH STANDARD
BS ISO/IEC 14496-19:2004

Information technology - Coding of audio-visual objects - Part 19: Synthesized texture stream

ICS 35.040

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 9 July 2004.
© BSI 9 July 2004
ISBN 0 580 44029 X

National foreword

This British Standard reproduces verbatim ISO/IEC 14496-19:2004 and implements it as the UK national standard. The UK participation in its preparation was entrusted to Technical Committee IST/37, Coding of picture, audio, multimedia and hypermedia information,
which has the responsibility to: aid enquirers to understand the text; present to the responsible international/European committee any enquiries on the interpretation, or proposals for change, and keep the UK interests informed; monitor related international and European developments and promulgate them in the UK.

A list of organizations represented on this committee can be obtained on request to its secretary.

Cross-references

The British Standards which implement international or European publications referred to in this document may be found in the BSI Catalogue under the section entitled "International Standards Correspondence Index", or by using the "Search" facility of the BSI Electronic Catalogue or of British Standards Online.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

Compliance with a British Standard does not of itself confer immunity from legal obligations.

Summary of pages

This document comprises a front cover, an inside front cover, the ISO/IEC title page, pages ii to vii, a blank page, pages 1 to 86, an inside back cover and a back cover. The BSI copyright notice displayed in this document indicates when the document was last issued.

Amendments issued since publication
Amd. No.    Date    Comments

INTERNATIONAL STANDARD ISO/IEC 14496-19
First edition 2004-07-01
Reference number ISO/IEC 14496-19:2004(E)
© ISO/IEC 2004

Information technology - Coding of audio-visual objects - Part
19: Synthesized texture stream

Technologies de l'information - Codage des objets audiovisuels - Partie 19: Flux de texture synthétisé

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO/IEC 2004
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Normative References
3 Synthesized Texture Compression Technology
3.1 Functionality and Semantics
4 Coding and Bitstream
4.1 Overview
4.2 Global Input Bitstream and Decoding Context
4.3 Header Block (H) Decoding
4.4 Scene Block (S) Decoding
4.5 Object Block (C) Decoding
4.6 Texture Block (A) Decoding
4.7 Skeleton Decoding
4.8 Animation Decoding
4.9 Camera Decoding
4.10 Quantization
4.11 Sub-Streams
5 SynthesizedTexture Data Stream
5.1 Structure of the SynthesizedTexture Data Stream
5.2 Access Unit Definition

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part
in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

ISO/IEC 14496-19 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

ISO/IEC 14496 consists of the following parts, under the general title Information technology - Coding of audio-visual objects:

Part 1: Systems
Part 2: Visual
Part 3: Audio
Part 4: Conformance testing
Part 5: Reference software
Part 6: Delivery Multimedia Integration Framework (DMIF)
Part 7: Optimized reference software for coding of audio-visual objects
Part 8: Carriage of ISO/IEC 14496 contents over IP networks
Part 9: Reference hardware description
Part 10: Advanced Video Coding
Part 11: Scene description and application engine
Part 12: ISO base media file format
Part 13: Intellectual Property Management and Protection (IPMP) extensions
Part 14: MP4 file format
Part 15: Advanced Video Coding (AVC) file format
Part 16: Animation Framework eXtension (AFX)
Part 17: Streaming text format
Part 18: Font compression and streaming
Part 19: Synthesized texture stream

Introduction

ISO/IEC 14496 specifies a system for the communication of interactive audio-visual scenes. The specification includes the following elements:

1. the coded representation of natural or synthetic, two-dimensional (2D) or three-dimensional (3D) objects that can be manifested audibly and/or visually (audio-visual objects) (specified in parts 1, 2 and 3 of ISO/IEC 14496);
2. the coded representation of the spatio-temporal positioning of audio-visual objects as well as their behavior in response to interaction (scene description, specified in part 11 of ISO/IEC 14496);
3. the coded representation of information related
to the management of data streams (synchronization, identification, description and association of stream content, specified in part 11 of ISO/IEC 14496);
4. a generic interface to the data stream delivery layer functionality (specified in part 6 of ISO/IEC 14496);
5. an application engine for programmatic control of the player: format, delivery of downloadable Java byte code as well as its execution lifecycle and behavior through APIs (specified in part 11 of ISO/IEC 14496); and
6. a file format to contain the media information of an ISO/IEC 14496 presentation in a flexible, extensible format to facilitate interchange, management, editing, and presentation of the media.

The information representation, specified in ISO/IEC 14496-1 and in ISO/IEC 14496-11, describes the means to create an interactive audio-visual scene in terms of coded audio-visual information and associated scene description information. The encoded content is presented to a terminal as a collection of elementary streams. Elementary streams contain the coded representation of audio data, visual data, scene description information or user interaction data. Elementary streams may also themselves convey information to identify streams, to describe logical dependencies between streams, or to describe information related to the content of the streams. Each elementary stream contains only one type of data.

Elementary streams are decoded using their respective stream-specific decoders. The audio-visual
objects are composed according to the scene description information and presented by the terminal's presentation device(s). All these processes are synchronized according to the systems decoder model (SDM) using the synchronization information provided at the synchronization layer.

The scene description stream identifies different types of objects, such as audio, visual, 2D and 3D graphics, etc., that define a scene composition of the content. Synthesized Texture streams provide for photo-realistic animations that can be transmitted using very low bitrates. These types of animations can be used in combination with other streams to enhance any scene.

The International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) draw attention to the fact that it is claimed that compliance with this document may involve the use of patents. The ISO and IEC take
no position concerning the evidence, validity and scope of this patent right.

The holder of this patent right has assured the ISO and IEC that he is willing to negotiate licences under reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this respect, the statement of the holder of this patent right is registered with the ISO and IEC. Information may be obtained from:

Vimatix Inc.
5 Oppenheimer St.
Rehovot 76701
Israel

Attention is drawn to the possibility that some of
the elements of this document may be the subject of patent rights other than those identified above. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

INTERNATIONAL STANDARD ISO/IEC 14496-19:2004(E)

Information technology - Coding of audio-visual objects - Part 19: Synthesized texture stream

1 Scope

This part of ISO/IEC 14496 specifies functionalities for the transmission of Synthesized Texture data as part of the MPEG-4 encoded audio-visual presentation. More specifically, it defines:

1. The synthesized texture format representation that is utilized for Synthesized Texture data encoding.
2. The coded representation of Synthesized Texture data streams.

2 Normative References

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO/IEC 14496-1, Information technology - Coding of audio-visual objects - Part 1: Systems
ISO/IEC 14496-11, Information technology - Coding of audio-visual objects - Part 11: Scene description and application engine

3 Synthesized Texture Compression Technology

3.1 Functionality and Semantics

3.1.1 Overview

Synthesized Textures represent photo-realistic textures by describing color information of vectors. Synthesized Texture streams are used for creation of
very low bit rate synthetic video clips. Synthesized Texture clips are built using key-frame-based animations of skeletons that affect photo-realistic textures whose color information is modeled by equations.

A top-level texture node, the Synthesized Texture Node (STNode), can be defined for playing Synthesized Textures; see ISO/IEC 14496-11 for additional details. The STNode itself is similar to the MovieTexture, and uses the url field to reference an Object Descriptor describing the associated stream(s). The stream contains both the object textures and their animation descriptions. The STNode also exposes
control points that can be used to manipulate, via affine transforms, the objects carried in its associated stream. In this way the STNode can implement interactive SynthesizedTextures. As with any texture, the resulting texture can be mapped onto any 2D or 3D surface.

3.1.1.1 SynthesizedTexture Elements

The SynthesizedTexture is a collection of animated Objects (also called Actors) sharing a common Stage, Camera and Timeline.

Figure 1 - Synthesized Texture structure (a SynthesizedTexture containing several Objects, each with a Texture, a Skeleton and an Animation)

The Object is comprised of a Texture, a Skeleton and an Animation.

The object's Texture represents the object's skin. The texture is comprised of primitive vector-style entities, belonging to a small number of primitive types such
as Lines and Area Color Points. The pixel representation of the texture is reconstructed through the process of Texture Rendering. The texture is divided into mutually exclusive sub-textures called Layers.

The Skeleton represents the kinematic capabilities of the object relative to itself and controls the shape and appearance of the skin. The skeleton is comprised of a topology of Bones whose geometric configuration is controlled by the object's animation. The skeleton is attached to the texture's layers, and controls their position and shape within the object's plane. As the skeleton geometry changes, this ultimately affects the layout of the texture primitives on the plane. Re-rendering the texture based on a new layout of the texture primitives results in a realistic warping effect called Texture Warping.

The Animation represents the spatial behavior of a single object over time. The animation of Objects is formed by an extrinsic motion of the entire object relative to the world, and an intrinsic motion of Layers relative to the object they are part of. The intrinsic motion is controlled by the Skeleton geometry, as described above. Extrinsic motion of each