INCITS/ISO/IEC TR 14496-7:2004 [R2014]
(ISO/IEC TR 14496-7:2004, IDT)

Information technology - Coding of audio-visual objects - Part 7: Optimized reference software for coding of audio-visual objects

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

Adopted by INCITS (InterNational Committee for Information Technology Standards) as an American National Standard.
Date of ANSI Approval: 10/17/2014
Published by American National Standards Institute, 25 West 43rd Street, New York, New York 10036

Copyright 2014 by Information Technology Industry Council (ITI). All rights reserved.

These materials are subject to copyright claims of International Standardization Organization (ISO), International Electrotechnical Commission (IEC), American National Standards Institute (ANSI), and Information Technology Industry Council (ITI). Not for resale. No part of this publication may be reproduced in any form, including an electronic retrieval system, without the prior written permission of ITI. All requests pertaining to this standard should be submitted to ITI, 1101 K Street NW, Suite 610, Washington DC 20005.

Printed in the United States of America

© ITIC 2014. All rights reserved.
ISO/IEC TR 14496-7:2004(E)
© ISO/IEC 2004. All rights reserved.

Contents

Foreword
Introduction
1 Scope
2 Fast Motion Estimation
2.1 Introduction to Motion Adaptive Fast Motion Estimation
2.2 Technical Description of Core Technology MVFAST
2.2.1 Detection of stationary blocks
2.2.2 Determination of local motion activity
2.2.3 Search Center
2.2.4 Search Strategy
2.2.5 Perspectives on implementing MVFAST
2.2.6 Special Acknowledgements
2.3 Technical Description of PMVFAST
2.3.1 Introduction
2.3.2 Technical Description of PMVFAST
2.3.3 Special Acknowledgement
2.4 Conclusions
3 Fast Global Motion Estimation
3.1 Introduction to Feature-based Fast and Robust Global Motion Estimation Technique
3.2 Technical Description of FFRGMET
3.2.1 Outlier Exclusion
3.2.2 Robust Object Function
3.2.3 Feature Selection
3.2.4 Algorithm Description
3.2.5 Perspectives on implementing FFRGMET
3.2.6 Special Acknowledgements
3.3 Conclusions
4 Fast and Robust Sprite Generation
4.1 Introduction to Fast and Robust Sprite Generation
4.2 Algorithm Description
4.2.1 Outline of Algorithm
4.2.2 Image Region Division
4.2.3 Fast and Robust Motion Estimation
4.2.4 Image Segmentation
4.2.5 Image Blending
4.3 Conclusions
5 Optimised Reference Software For Simple Profile and Error Resilience Tools
5.1 Scope
5.2 Integration and Optimization of the Reference Software
5.2.1 Introduction
5.2.2 Removal of the unused procedures, parameters, and data structures
5.2.3 Revision of the code bases for saving the execution time and code sizes
5.2.4 Use of the existing fast algorithms for the computational burden modules
5.2.5 Optimised Simple Profile encoder and decoder
5.2.6 Experimental Results
5.3 Error Resilience Tools
5.3.1 Abbreviations
5.3.2 New Processing / functionalities
6 Contact Information
Bibliography

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

In exceptional circumstances, the joint technical committee may propose the publication of a Technical Report of one of the following types:
- type 1, when the required support cannot be obtained for the publication of an International Standard, despite repeated efforts;
- type 2, when the subject is still under technical development or where for any other reason there is the future but not immediate possibility of an agreement on an International Standard;
- type 3, when the joint technical committee has
collected data of a different kind from that which is normally published as an International Standard (“state of the art”, for example).

Technical Reports of types 1 and 2 are subject to review within three years of publication, to decide whether they can be transformed into International Standards. Technical Reports of type 3 do not necessarily have to be reviewed until the data they provide are considered to be no longer valid or useful.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO/IEC TR 14496-7, which is a Technical Report of type 3, was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

This second edition cancels and replaces the first edition (ISO/IEC 14496-7:2002), which has been technically revised.

ISO/IEC TR 14496 consists of the following parts, under the general title Information technology - Coding of audio-visual objects:
- Part 1: Systems
- Part 2: Visual
- Part 3: Audio
- Part 4: Conformance testing
- Part 5: Reference software
- Part 6: Delivery Multimedia Integration Framework (DMIF)
- Part 7: Optimized reference software for coding of audio-visual objects [Technical Report]
- Part 8: Carriage of ISO/IEC 14496 contents over IP networks
- Part 9: Reference hardware description [Technical Report]
- Part 10: Advanced Video Coding
- Part 11: Scene description and application engine
- Part 12: ISO base media file format
- Part 13: Intellectual Property Management and Protection (IPMP) extensions
- Part 14: MP4 file format
- Part 15: Advanced Video Coding (AVC) file format
- Part 16: Animation Framework eXtension (AFX)
- Part 17: Streaming text format
- Part 18: Font compression and streaming
- Part 19: Synthesized texture stream

Introduction

Purpose

This part of ISO/IEC 14496 was developed in response to the growing need for optimized reference software that provides both improved visual quality and faster execution while preserving compliance. The goal is to provide non-normative tools that are essential for implementations of the normative parts of the ISO/IEC 14496 specifications. For example, Part 5 of the ISO/IEC 14496 specifications uses a full-search motion estimation, which is theoretically optimal in coding efficiency but impractical for commercial implementation. In the past, the industry has had to create its own encoding tools for its target products. This part provides a well-tested set of encoding tools that can enhance performance but should not be standardized. It is up to each organization to decide whether to adopt or adapt the following recommended tools for its specific needs. This part provides a significant reduction in the time-to-market and provides a reference benchmark for commercial ISO/IEC 14496 compliant products.

TECHNICAL REPORT ISO/IEC TR 14496-7:2004(E)

Information technology - Coding of audio-visual objects - Part 7: Optimized reference software for coding of audio-visual objects

1 Scope

This part of ISO/IEC 14496 specifies encoding tools that improve both the execution speed and the quality of the coding of visual objects as defined in ISO/IEC 14496-2. The tool set is not limited to visual objects, but at this point all the recommended tools are visual encoding tools. Four tools are described in this Technical Report:
- Fast Motion Estimation
- Fast Global Motion Estimation
- Fast and Robust Sprite Generation
- Fast Variable Length Decoder Using Hierarchical Table Lookup

These tools have been demonstrated to be robust, with source code for both the MoMusys and Microsoft implementations. In the current implementations, a single piece of software includes all the tools that exist in ISO/IEC 14496-2. This is obviously inefficient in terms of code size and execution speed. To address this issue, the optimized reference software has compilation switches such that only the selected tools, as defined by the profiles and levels, are included. This level of optimization is performed in the high-level programming language. Platform-specific optimization is currently not addressed by this part.

2 Fast Motion Estimation

2.1 Introduction to Motion Adaptive Fast Motion Estimation

The optimization of fast motion estimation is essentially a multi-dimensional problem. The key dimensions of this problem are: Rate, Quality (PSNR), Speed-up (or Computational Gain), Algorithmic Complexity, and Memory (Size and Bandwidth) (see Figure 1). There always exists a
trade-off among these five key dimensions. Therefore, it is highly desirable to have an adaptive fast motion estimation core algorithm with a scalable structure, which can be adaptively optimized with respect to all or selected aspects for various coding environments and requirements. Since rate control is used to fix the bit-rate, the optimization problem is reduced by one dimension, to four dimensions.

Motion Vector Field Adaptive Search Technique (MVFAST) [1] is a generic algorithm of the family of motion-adaptive fast search techniques, originally proposed by Kai-Kuang Ma and Prabhudev Irappa Hosur from Nanyang Technological University (NTU), Singapore. MVFAST offers high performance in both quality and speed and does not require memory to store the searched points and motion vectors. MVFAST was adopted by MPEG-4 Part 7 at the Noordwijkerhout MPEG meeting (March 2000) as the core technology for fast motion estimation.

A derivative of MVFAST, called Predictive MVFAST (PMVFAST) [2], is considered an optional approach that might be beneficial in special coding situations. PMVFAST incorporates a set of thresholds into MVFAST to achieve higher speed-up at the cost of memory
size, memory bandwidth and additional algorithmic complexity. In PMVFAST, the threshold values are adjusted based on the 54 test cases specified by MPEG-4. However, the coding performance and sensitivity of PMVFAST using these thresholds for video sequences and encoding conditions outside the MPEG-4 test set have not been studied and verified.

Figure 1 - Five-dimensional optimization problem of fast motion estimation (axes: Bit-rate, Quality, Speed, Memory (Size and Bandwidth), Algorithmic complexity)

2.2 Technical Description of Core Technology MVFAST

2.2.1 Detection of stationary blocks

A large number of macroblocks (MBs) in video sequences with low-motion content (e.g., “talking head” video sequences) tend to have motion vectors equal to $(0,0)$. Such MBs, in regions of no motion activity, can be detected simply from the sum of absolute differences (SAD) at the origin. Therefore, we exploit an optional phase, called early elimination of search, as the first step in MVFAST, as follows. The search for an MB is terminated immediately if its SAD value obtained at $(0,0)$ is less than a threshold $T$, and the motion vector is assigned as $(0,0)$. Through extensive simulations, we found that about 98% of the zero-motion blocks identified have a SAD at position $(0,0)$ of less than 512. Hence, we choose $T = 512$ to enable the mechanism of early elimination of search. Since this early elimination of search phase is optional, it can be turned off or disabled by imposing $T = 0$.

2.2.2 Determination of local motion activity

The local motion vector field at a macroblock (MB) position is defined as the set of motion vectors in the region of support (ROS) of that MB. The ROS of an MB includes its $n$ neighboring MBs. In MVFAST, the
ROS with $n = 3$ is shown in Figure 2. Let $V = \{V_0, V_1, \ldots, V_n\}$, where $V_0 = (0,0)$ and $V_i$ ($i \neq 0$) is the motion vector of $MB_i$ in the ROS (see Figure 2). The cityblock length of $V_i = (x_i, y_i)$ is defined as $l_{V_i} = |x_i| + |y_i|$. Let $L = \max\{l_{V_i}\}$ over all $V_i$. The motion activity at the current MB position is defined as follows.

$$
\text{Motion Activity} =
\begin{cases}
\text{Low}, & \text{if } L \le L_1; \\
\text{Medium}, & \text{if } L_1 < L \le L_2; \\
\text{High}, & \text{if } L > L_2,
\end{cases}
\qquad (1)
$$

where $L_1$ and $L_2$ are integer constants. We choose $L_1$ and $L_2$ as the cityblock distance from the center point of the pattern to any other point on the small and large search patterns (see Figure 4), respectively. Thus, $L_1 = 1$ and $L_2 = 2$.

Figure 2 - Region of support (ROS) for the current MB consists of MB1, MB2 and MB3

Figure 3 - Example of distribution of motion vectors belonging to set $V$. In this case, $l_{V_1} = 2$, $l_{V_2} = 1$, $l_{V_3} = 6$; thus $L = \max\{l_{V_1}, l_{V_2}, l_{V_3}\} = 6$

2.2.3 Search Center

The choice of the search center depends on the local motion activity at the current MB position. If the motion activity is low or medium, the search center is the origin. Otherwise, the vector belonging to set $V$ that yields the minimum sum of absolute differences (SAD) is chosen as the search center.

Figure 4 - (a) Large Diamond Search Pattern (LDSP)
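
The three steps described in 2.2.1 to 2.2.3 (early elimination of search, classification of local motion activity, and choice of the search center) can be illustrated with a minimal Python sketch. It is not taken from the reference software: the names `mvfast_start`, `motion_activity`, `cityblock_length` and the `sad_at` callback are hypothetical, and the SAD computation itself is assumed to be supplied by the caller. Only the threshold T = 512 and the constants L1 = 1, L2 = 2 come from the text.

```python
from typing import Callable, Sequence, Tuple

MotionVector = Tuple[int, int]

T_STATIONARY = 512  # SAD threshold T from 2.2.1; T = 0 disables early elimination
L1, L2 = 1, 2       # motion activity thresholds from 2.2.2


def cityblock_length(v: MotionVector) -> int:
    """Cityblock length |x| + |y| of a motion vector."""
    return abs(v[0]) + abs(v[1])


def motion_activity(candidates: Sequence[MotionVector]) -> str:
    """Classify the local motion activity of the set V as in Equation (1)."""
    longest = max(cityblock_length(v) for v in candidates)
    if longest <= L1:
        return "low"
    if longest <= L2:
        return "medium"
    return "high"


def mvfast_start(
    sad_at: Callable[[MotionVector], int],
    ros_vectors: Sequence[MotionVector],
    threshold: int = T_STATIONARY,
) -> Tuple[bool, MotionVector]:
    """Return (search_finished, vector) for the current macroblock.

    sad_at is a hypothetical callback giving the SAD of the current MB
    for a candidate displacement; ros_vectors holds the motion vectors
    of the neighboring MBs in the region of support.
    """
    zero = (0, 0)
    # 2.2.1: stop immediately if the block looks stationary.
    if sad_at(zero) < threshold:
        return True, zero
    # 2.2.2: the set V always contains V0 = (0, 0).
    candidates = [zero, *ros_vectors]
    # 2.2.3: low or medium activity keeps the search centered on the origin.
    if motion_activity(candidates) in ("low", "medium"):
        return False, zero
    # High activity: start from the candidate with the smallest SAD.
    return False, min(candidates, key=sad_at)


# Vectors with cityblock lengths 2, 1 and 6, as in the Figure 3 example.
sad_table = {(0, 0): 900, (1, 1): 700, (0, 1): 850, (4, -2): 300}
finished, center = mvfast_start(sad_table.__getitem__, [(1, 1), (0, 1), (4, -2)])
print(finished, center)  # False (4, -2)
```

When `search_finished` is false, the returned vector is only the starting point for the subsequent search strategy, which this sketch does not cover. The SAD values in `sad_table` are made up for illustration.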