INTERNATIONAL TELECOMMUNICATION UNION

ITU-T H.262 (02/2000)
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU

SERIES H: AUDIOVISUAL AND MULTIMEDIA SYSTEMS
Infrastructure of audiovisual services - Coding of moving video

Information technology - Generic coding of moving pictures and associated audio information: Video

ITU-T Recommendation H.262
(Previously CCITT Recommendation)

COPYRIGHT International Telecommunications Union/ITU Telecommunications
Licensed by Information Handling Services

ITU-T H-SERIES RECOMMENDATIONS
AUDIOVISUAL AND MULTIMEDIA SYSTEMS

Characteristics of transmission channels used for other than telephone purposes    H.10-H.19
Use of telephone-type circuits for voice-frequency telegraphy    H.20-H.29
Telephone circuits or cables used for various types of telegraph transmission or simultaneous transmission    H.30-H.39
Telephone-type circuits used for facsimile telegraphy    H.40-H.49
Characteristics of data signals    H.50-H.99
CHARACTERISTICS OF VISUAL TELEPHONE SYSTEMS    H.100-H.199
INFRASTRUCTURE OF AUDIOVISUAL SERVICES
  General    H.200-H.219
  Transmission multiplexing and synchronization    H.220-H.229
  Systems aspects    H.230-H.239
  Communication procedures    H.240-H.259
  Coding of moving video    H.260-H.279
  Related systems aspects    H.280-H.299
  Systems and terminal equipment for audiovisual services    H.300-H.399
  Supplementary services for multimedia    H.450-H.499

For further details, please refer to ITU-T List of Recommendations.

INTERNATIONAL STANDARD 13818-2
ITU-T RECOMMENDATION H.262

INFORMATION TECHNOLOGY - GENERIC CODING OF MOVING PICTURES AND ASSOCIATED AUDIO INFORMATION: VIDEO

Summary

This Recommendation | International Standard specifies coded representation of video data and the decoding process required to reconstruct pictures. It provides a generic video coding scheme which serves a wide range of applications, bit rates, picture resolutions and qualities. Its basic coding algorithm is a hybrid of motion compensated prediction and DCT. Pictures to be coded can be either interlaced or progressive. Necessary algorithmic elements are integrated into a single syntax, and a limited number of subsets are defined in terms of Profile (functionalities) and Level (parameters) to facilitate practical use of this generic video coding standard.

This second edition of this Recommendation | International Standard consists of ITU-T Rec. H.262 (1995) | ISO/IEC 13818-2:1996, subsequently altered by two corrigenda and six amendments:

1. A first corrigendum adding a slice picture identifier, allowing an application to define default colour description parameters, removing a prohibition of field-structured DCT coding in progressive frames, clarifying ambiguity on restricted range of reconstructed motion vectors, clarifying ambiguity of VBV at boundaries of sequences, and making various minor corrections.
2. A second corrigendum altering the inverse discrete cosine transform requirements, altering temporal-reference for low delay, and making various minor corrections.
3. A first amendment providing a method of obtaining and registering copyright identifiers.
4. A second amendment defining a 4:2:2 profile.
5. A third amendment adding a camera parameters extension and a multi-view profile.
6. A fourth amendment adding an ITU-T extension.
7. A fifth amendment adding a high level to the 4:2:2 profile.
8. A sixth amendment reducing the upper
bound for the number of lines per frame in the high level of all profiles from 1152 to 1088.

Source

ITU-T Recommendation H.262 was approved on 17 February 2000. The identical text is also published as ISO/IEC International Standard 13818-2.

This edition of ITU-T H.262 consolidates H.262 (07/1995) and its Amendments 1 and 2 (11/1996), 3 and 4 (02/1998), 5 (05/1999), 6 (02/2000) and Corrigenda 1 and 2 (11/1996).

ITU-T Rec. H.262 (2000 E)

FOREWORD

ITU (International Telecommunication Union) is the United Nations Specialized Agency in the field of telecommunications. The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of the ITU. The ITU-T is responsible for studying technical, operating and
tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Conference (WTSC), which meets every four years, establishes the topics for study by the ITU-T Study Groups which, in their turn, produce Recommendations on these topics.

The approval of Recommendations by the Members of the ITU-T is covered by the procedure laid down in WTSC Resolution No. 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

INTELLECTUAL PROPERTY RIGHTS

The ITU draws attention to the possibility that the practice or implementation of
this Recommendation may involve the use of a claimed Intellectual Property Right. The ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, the ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementors are cautioned that this may not represent the latest information and are therefore strongly urged
to consult the TSB patent database.

© ITU 2000

All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the ITU.

CONTENTS

Intro. 1  Purpose
Intro. 2  Application
Intro. 3  Profiles and levels
Intro. 4  The scalable and the non-scalable syntax
1  Scope
2  Normative references
3  Definitions
4  Abbreviations and symbols
   4.1  Arithmetic operators
   4.2  Logical operators
   4.3  Relational operators
   4.4  Bitwise operators
   4.5  Assignment
   4.6  Mnemonics
   4.7  Constants
5  Conventions
   5.1  Method of describing bitstream syntax
   5.2  Definition of functions
   5.3  Reserved, forbidden and marker-bit
   5.4  Arithmetic precision
6  Video bitstream syntax and semantics
   6.1  Structure of coded video data
   6.2  Video bitstream syntax
   6.3  Video bitstream semantics
7  The video decoding process
   7.1  Higher syntactic structures
   7.2  Variable length decoding
   7.3  Inverse scan
   7.4  Inverse quantisation
   7.5  Inverse DCT
   7.6  Motion compensation
   7.7  Spatial scalability
   7.8  SNR scalability
   7.9  Temporal scalability
   7.10  Data partitioning
   7.11  Hybrid scalability
   7.12  Output of the decoding process
8  Profiles and levels
   8.1  ISO/IEC 11172-2 compatibility
   8.2  Relationship between defined profiles
   8.3  Relationship between defined levels
   8.4  Scalable layers
   8.5  Parameter values for defined profiles, levels and layers
   8.6  Compatibility requirements on decoders
9  Registration of Copyright Identifiers
   9.1  General
   9.2  Implementation of a Registration Authority (RA)
Annex A - Inverse discrete transform
Annex B - Variable length code tables
   B.1  Macroblock addressing
   B.2  Macroblock type
   B.3  Macroblock pattern
   B.4  Motion vectors
   B.5  DCT coefficients
Annex C - Video buffering verifier
Annex D - Features supported by the algorithm
   D.1  Overview
   D.2  Video formats
   D.3  Picture quality
   D.4  Data rate control
   D.5  Low delay mode
   D.6  Random access/channel hopping
   D.7  Scalability
   D.8  Compatibility
   D.9  Differences between this Specification and ISO/IEC 11172-2
   D.10  Complexity
   D.11  Editing encoded bitstreams
   D.12  Trick modes
   D.13  Error resilience
   D.14  Concatenated sequences
Annex E - Profile and level restrictions
   E.1  Syntax element restrictions in profiles
   E.2  Permissible layer combinations
Annex F - Bibliography
Annex G - Registration Procedure
   G.1  Procedure for the request of a Registered Identifier (RID)
   G.2  Responsibilities of the Registration Authority
   G.3  Responsibilities of parties requesting an RID
   G.4  Appeal procedure for denied applications
Annex H - Registration Application Form
   H.1  Contact information of organization requesting a Registered Identifier (RID)
   H.2  Statement of an intention to apply the assigned RID
   H.3  Date of intended implementation of the RID
   H.4  Authorized representative
   H.5  For official use only of the Registration Authority
Annex J - 4:2:2 Profile test results
   J.1  Introduction

Introduction

Intro. 1 Purpose

This Part of this Recommendation | International Standard was developed in response to the growing need for a generic coding method of moving pictures and of associated sound for various applications such as digital storage media, television broadcasting and communication. The use of this Specification means that motion
video can be manipulated as a form of computer data and can be stored on various storage media, transmitted and received over existing and future networks and distributed on existing and future broadcasting channels.

Intro. 2 Application

The applications of this Specification cover, but are not limited to, such areas as listed below:

BSS    Broadcasting Satellite Service (to the home)
CATV   Cable TV Distribution on optical networks, copper, etc.
CDAD   Cable Digital Audio Distribution
DSB    Digital Sound Broadcasting (terrestrial and satellite broadcasting)
DTTB   Digital Terrestrial Television Broadcasting
EC     Electronic Cinema
ENG    Electronic News Gathering (including SNG, Satellite News Gathering)
FSS    Fixed Satellite Service (e.g. to head ends)
HTT    Home Television Theatre
IPC    Interpersonal Communications (videoconferencing, videophone, etc.)
ISM    Interactive Storage Media (optical disks, etc.)
MMM    Multimedia Mailing
NCA    News and Current Affairs
NDB    Networked Database Services (via ATM, etc.)
RVS    Remote Video Surveillance
SSM    Serial Storage Media (digital VTR, etc.)

Intro. 3 Profiles and levels

This Specification is intended to be generic in the sense that it serves a wide range of applications, bitrates, resolutions, qualities and services. Applications should cover, among other things, digital storage media, television broadcasting and communications. In the course of creating this Specification, various requirements from typical applications have been considered, necessary algorithmic elements have been developed, and they have been integrated into a single syntax. Hence, this Specification will facilitate the bitstream interchange among different applications.

Considering the practicality of implementing the full syntax of this Specification, however, a limited number of subsets of the syntax are also stipulated by means of “profile” and “level”. These and other related terms are formally defined in clause 3.

A “profile” is a defined subset of the entire bitstream syntax that is defined by this Specification. Within the bounds imposed by the syntax of a given profile it
is still possible to require a very large variation in the performance of encoders and decoders depending upon the values taken by parameters in the bitstream. For instance, it is possible to specify frame sizes as large as (approximately) 2^14 samples wide by 2^14 lines high. It is currently neither
practical nor economic to implement a decoder capable of dealing with all possible frame sizes.

In order to deal with this problem, “levels” are defined within each profile. A level is a defined set of constraints imposed on parameters in the bitstream. These constraints may be simple limits on numbers. Alternatively they may take the form of constraints on arithmetic combinations of the parameters (e.g. frame width multiplied by frame height multiplied by frame rate).

Bitstreams complying with this Specification use a common syntax. In order to achieve a subset of the complete syntax, flags
and parameters are included in the bitstream that signal the presence or otherwise of syntactic elements that occur later in the bitstream. In order to specify constraints on the syntax (and hence define a profile), it is thus only necessary to constrain the values of these flags and parameters that
specify the presence of later syntactic elements.

Intro. 4 The scalable and the non-scalable syntax

The full syntax can be divided into two major categories: One is the non-scalable syntax, which is structured as a super set of the syntax defined in ISO/IEC 11172-2. The main feature of the non-scalable syntax is the extra compression tools for interlaced video signals. The second is the scalable syntax,
the key property of which is to enable the reconstruction of useful video from pieces of a total bitstream. This is achieved by structuring the total bitstream in two or more layers, starting from a standalone base layer and adding a number of enhancement layers. The base layer can use the non-scalable syntax, or in some situations conform to the ISO/IEC 11172-2 syntax.

Intro. 4.1 Overview of the non-scalable syntax

The coded representation defined in the non-scalable syntax achieves a high compression ratio while preserving good image quality. The algorithm is not lossless as the exact sample values are not preserved during coding. Obtaining good image quality at the bitrates of interest demands very high compression, which is not achievable with intra picture coding alone. The need for random access, however, is best satisfied with pure intra picture cod