BSI Standards Publication

Information technology
Coding of audio-visual objects
Part 29: Web video coding

BS ISO/IEC 14496-29:2015

National foreword

This British Standard is the UK implementation of ISO/IEC 14496-29:2015.

The UK participation in its preparation was entrusted to Technical Committee IST/37, Coding of picture, audio, multimedia and hypermedia information.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© The British Standards Institution 2015. Published by BSI Standards Limited 2015

ISBN 978 0 580 79219 9
ICS 35.040

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 31 May 2015.

Amendments/corrigenda issued since publication
Date    Text affected

INTERNATIONAL STANDARD ISO/IEC 14496-29
First edition 2015-04-01
Reference number ISO/IEC 14496-29:2015(E)
© ISO/IEC 2015

Information technology
Coding of audio-visual objects
Part 29: Web video coding

Technologies de l'information
Codage des objets audiovisuels
Partie 29: Codage vidéo Web

COPYRIGHT PROTECTED DOCUMENT

© ISO/IEC 2014. All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO's member body in the country of the requester.
ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

1 Scope
2 Normative references
3 Definitions
4 Abbreviations
5 Conventions
5.1 Arithmetic operators
5.2 Logical operators
5.3 Relational operators
5.4 Bit-wise operators
5.5 Assignment operators
5.6 Range notation
5.7 Mathematical functions
5.8 Order of operation precedence
5.9 Variables, syntax elements, and tables
5.10 Text description of logical operations
5.11 Processes
6 Source, coded, decoded and output data formats, scanning processes, and neighbouring relationships
6.1 Bitstream formats
6.2 Source, decoded, and output picture formats
6.3 Spatial subdivision of pictures and slices
6.4 Inverse scanning processes and derivation processes for neighbours
7 Syntax and semantics
7.1 Normative Syntax and Semantics
7.2 Specification of syntax functions, categories, and descriptors
7.3 Syntax in tabular form
7.4 Semantics
8 Decoding process
8.1 NAL unit decoding process
8.2 Slice decoding process
8.3 Intra prediction process
8.4 Inter prediction process
8.5 Transform coefficient decoding process and picture construction process prior to deblocking filter process
8.6 (void)
8.7 Deblocking filter process
9 Parsing process
9.1 Parsing process for Exp-Golomb codes
9.2 CAVLC parsing process for transform coefficient levels
Annex A (normative) Profiles and levels
A.1 Requirements on video decoder capability
A.2 Profiles
A.3 Levels
Annex B (normative) Byte stream format
B.1 Byte stream NAL unit syntax and semantics
B.2 Byte stream NAL unit decoding process
B.3 Decoder byte-alignment recovery (informative)
Annex C (normative) Hypothetical reference decoder
C.1 Operation of coded picture buffer (CPB)
C.2 Operation of the decoded picture buffer (DPB)
C.3 Bitstream conformance
C.4 Decoder conformance
Annex D (normative) Supplemental enhancement information
Annex E (normative) Video usability information
E.1 VUI syntax
E.2 VUI semantics

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

The procedures used to develop this document
and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the different types of document should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation on the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the WTO principles in the Technical Barriers to Trade (TBT) see the following URL: Foreword - Supplementary information

The committee responsible for this document is ISO/IEC JTC 1, Information technology, SC 29, Coding of audio, picture, multimedia and hypermedia information.

ISO/IEC 14496 consists of the following parts, under the general title Information technology Coding of audio-visual objects:

Part 1: Systems
Part 2: Visual
Part 3: Audio
Part 4: Conformance testing
Part 5: Reference software
Part 6: Delivery Multimedia Integration Framework (DMIF)
Part 7: Optimized reference software for coding of audio-visual objects
Part 8: Carriage of ISO/IEC 14496 contents over IP networks
Part 9: Reference hardware description
Part 10: Advanced Video Coding
Part 11: Scene description and application engine
Part 12: ISO base media file format
Part 13: Intellectual Property Management and Protection (IPMP) extensions
Part 14: MP4 file format
Part 15: Advanced Video Coding (AVC) file format
Part 16: Animation Framework eXtension (AFX)
Part 17: Streaming text format
Part 18: Font compression and streaming
Part 19: Synthesized texture stream
Part 20: Lightweight Application Scene Representation (LASeR) and Simple Aggregation Format (SAF)
Part 21: MPEG-J Graphics Framework eXtensions (GFX)
Part 22: Open Font Format
Part 23: Symbolic Music Representation
Part 24: Audio and systems interaction
Part 25: 3D Graphics Compression Model
Part 26: Audio conformance
Part 27: 3D Graphics conformance
Part 28: Composite font representation
Part 29: Web video coding
Introduction

This International Standard specifies Web Video Coding, a technology that is compatible with the Constrained Baseline Profile of ISO/IEC 14496-10. Only the subset that is specified in Annex A for the Constrained Baseline Profile is a normative specification, while all remaining aspects are informative. This text is derived from ISO/IEC 14496-10, with which the section numbers in this specification are aligned, and that specification may additionally be consulted if desired, as an aid to understanding this Specification.
1 Scope

This Part of ISO/IEC 14496 specifies Web Video Coding for coding of audio-visual objects.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 11664-1, Colorimetry Part 1: CIE standard colorimetric observers.

ISO/IEC 14496-10, Information technology Coding of audio-visual objects Part 10: Advanced Video Coding

3 Definitions

For the purposes of this document, the following definitions apply:

3.1 access unit: A set of NAL units that are consecutive in decoding order and contain exactly one primary coded picture. In addition to the primary coded picture, an access unit may also contain one auxiliary coded picture, or other NAL units not containing slices of a coded picture. The decoding of an access unit always results in a decoded picture.

3.2 AC transform coefficient: Any transform coefficient for which the frequency index in one or both dimensions is non-zero.

3.3 bitstream: A sequence of bits that forms the representation of coded pictures and associated data forming one or more coded video sequences. Bitstream is a collective term used to refer
either to a NAL unit stream or a byte stream.

3.4 block: An MxN (M-column by N-row) array of samples, or an MxN array of transform coefficients.

3.5 void

3.6 broken link: A location in a bitstream at which it is indicated that some subsequent pictures in decoding order may contain serious visual artefacts due to unspecified operations performed in the generation of the bitstream.

3.7 byte: A sequence of 8 bits, written and read with the most significant bit on the left and the least significant bit on the right. When represented in a sequence of data bits, the most significant bit of a byte is first.

3.8 byte-aligned: A position in a bitstream is byte-aligned when the position is an integer multiple of 8 bits from the position of the first bit in the bitstream. A bit or byte or syntax element is said to be byte-aligned when the position at which it appears in a bitstream is byte-aligned.
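
NOTE (informative, not part of the standard) Definitions 3.7 and 3.8 can be illustrated with a minimal Python sketch. The function names bit_at and is_byte_aligned and the sample data are assumptions made for this illustration only; they do not appear in this International Standard. Bit positions are counted from 0 at the first bit of the bitstream.

```python
def bit_at(data: bytes, bit_pos: int) -> int:
    """Return the bit at bit_pos (hypothetical helper, for illustration).

    Position 0 is the first bit of the stream. Within each byte the most
    significant bit comes first, as described in definition 3.7.
    """
    byte_index, offset = divmod(bit_pos, 8)
    return (data[byte_index] >> (7 - offset)) & 1


def is_byte_aligned(bit_pos: int) -> bool:
    """True when bit_pos is an integer multiple of 8 bits from the first bit
    of the stream, following definition 3.8 (hypothetical helper)."""
    return bit_pos % 8 == 0


# Two example bytes: 0xB0 (binary 10110000) followed by 0x01 (binary 00000001).
stream = bytes([0b10110000, 0b00000001])

# Bits of the first byte in stream order, most significant bit first.
first_byte_bits = [bit_at(stream, i) for i in range(8)]

# The last bit of the second byte is its least significant bit.
last_bit = bit_at(stream, 15)
```

In this sketch, positions 0 and 8 are byte-aligned (the starts of the two bytes), while position 3 is not.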
3.9 byte stream: An encapsulation of a NAL unit stream containing start code prefixes and NAL units as specified in Annex B.

3.10 can: A term used to refer to behaviour that is allowed, but not necessarily required.

3.11 void

3.12 chroma: An adjective specifying that a sample array or single sample is representing one of the two colour difference signals related to the primary colours. The symbols used for a chroma array or sample are Cb and Cr.
NOTE The term chroma is used rather than the term chrominance in order to avoid the implication of the use of linear light transfer characteristics that is often associated with the term chrominance.

3.13 coded frame: A coded representation of a frame.

3.14 coded picture: A coded representation of a picture.

3.15 coded picture buffer (CPB): A first-in first-out buffer containing access units in decoding order specified in the hypothetical reference decoder in Annex C.

3.16 coded representation: A data element as represented in its coded form.

3.17 void

3.18 coded slice NAL unit: A NAL unit containing a slice that is not a slice of an auxiliary coded picture.

3.19 coded video sequence: A sequence of access units that consists, in decoding order, of an IDR access unit followed by zero or more non-IDR access units including all subsequent access units up to but not including any subsequent IDR access unit.

3.20 component: An array or single sample from one of the three arrays (luma and two chroma) that make up a frame in 4:2:0 colour format.

3.21 DC transform coefficient: A transform coefficient for which the frequency index is zero in all dimensions.

3.22 decoded picture: A decoded picture is derived by decoding a coded picture. A decoded picture is a decoded frame.

3.23 decoded picture buffer (DPB): A buffer holding decoded pictures for reference, output reordering, or output delay specified for the hypothetical reference decoder in Annex C.

3.24 decoder: An embodiment of a decoding process.

3.25 decoder under test (DUT): A decoder that is tested for conformance to this International Standard by operating the hypothetical stream scheduler to deliver a conforming bitstream to the decoder and to the hypothetical reference decoder and comparing the values and timing of the output of the two decoders.

3.26 decoding order: The order
in which syntax elements are processed by the decoding process.

3.27 decoding process: The process specified in this International Standard that reads a bitstream and derives decoded pictures from it.

3.28 void

3.29 display process: A process not specified in this International Standard having, as its input, the cropped decoded pictures that are the output of the decoding process.

3.30 emulation prevention byte: A byte equal to 0x03 that may be present within a NAL unit. The presence of emulation prevention bytes ensures that no sequence of consecutive byte-aligned bytes in the NAL unit contains a start code prefix.

3.31 encoder: An embodiment of an encoding process.

3.32 encoding process: A process, not specified in this International Standard, that produces a bitstream conforming to this International Standard.

3.33 flag: A
variable that can take one of the two possible values 0 and 1.

3.34 frame: A frame contains an array of luma samples and two corresponding arrays of chroma samples in 4:2:0 format.

3.35 frame macroblock: A macroblock representing samples of a coded frame. All macroblocks of a coded frame are frame macroblocks.

3.36 void

3.37 frequency index: A one-dimensional or two-dimensional index associated with a transform coefficient prior to an inverse transform part of the decoding process.

3.38 hypothetical reference decoder (HRD): A hypothetical decoder model that specifies constraints on the variability of conforming NAL unit streams or conforming byte streams that an encoding process may produce.

3.39 hypothetical stream scheduler (HSS): A hypoth