SMPTE RDD 36-2015 Apple ProRes Bitstream Syntax and Decoding Process.pdf

Resource description

Copyright 2015 by THE SOCIETY OF MOTION PICTURE AND TELEVISION ENGINEERS, 3 Barker Avenue, White Plains, NY 10601, (914) 761-1100. Approved October 26, 2015.

The attached document is a Registered Disclosure Document prepared by the sponsor identified below. It has been examined by the appropriate SMPTE Technology Committee and is believed to contain adequate information to satisfy the objectives defined in the Scope, and to be technically consistent. This document is NOT a Standard, Recommended Practice or Engineering Guideline, and does NOT imply a finding or representation of the Society. Every attempt has been made to ensure that the information contained in this document is accurate. Errors in this document should be reported to the proponent identified below, with a copy to engsmpte.org. All other inquiries in respect of this document, including inquiries as to intellectual property requirements that may be attached to use of the disclosed technology, should be addressed to the proponent identified below.

Proponent contact information: ProRes Program Office, Apple Inc., 1 Infinite Loop, MS: 77-2YAK, Cupertino, CA 95014 USA. Email: ProR

SMPTE REGISTERED DISCLOSURE DOCUMENT
Apple ProRes Bitstream

Syntax and Decoding Process

SMPTE RDD 36:2015, Page 1 of 39 pages

Table of Contents

Introduction
1 Scope
2 References
3 Notation
3.1 Arithmetic Operators
3.2 Logical Operators
3.3 Relational Operators
3.4 Bitwise Operators
3.5 Assignment Operators
3.6 Mathematical Functions
3.7 Constants
4 ProRes Frame Structure
5 Bitstream Syntax
5.1 Frame Syntax
5.1.1 Frame Header Syntax
5.1.2 Stuffing Syntax
5.2 Picture Syntax
5.2.1 Picture Header Syntax
5.2.2 Slice Table Syntax
5.3 Slice Syntax
5.3.1 Slice Header Syntax
5.3.2 Scanned Coefficients Syntax
5.3.3 Scanned Alpha Syntax
6 Bitstream Semantics
6.1 Frame Semantics
6.1.1 Frame Header Semantics
6.1.2 Stuffing Semantics
6.2 Picture Semantics
6.2.1 Picture Header Semantics
6.2.2 Slice Table Semantics
6.3 Slice Semantics
6.3.1 Slice Header Semantics
6.3.2 Scanned Coefficients Semantics
6.3.3 Scanned Alpha Semantics
6.4 Bitstream Versions, Version Variants, and Compatibility
7 Decoding Process
7.1 Entropy Decoding
7.1.1 Scanned Coefficients
7.1.1.1 Golomb Combination Codes
7.1.1.2 Signed Golomb Combination Codes
7.1.1.3 DC Coefficients
7.1.1.4 AC Coefficients
7.1.2 Scanned Alpha
7.2 Inverse Scanning
7.2.1 Slice Scanning
7.2.2 Block Scanning
7.3 Inverse Quantization
7.4 Inverse Transform
7.5 Pixel Component Sample Generation and Pixel Output
7.5.1 Color Component Samples
7.5.2 Alpha Component Samples
7.5.3 Pixel Arrangement
Annex A IDCT Implementation Accuracy Qualification

Introduction

Apple ProRes is a video compression scheme developed by Apple Inc. for use in workflows that require high quality and efficient performance. It is an intra-frame codec that can encode progressive or interlaced frames with arbitrary dimensions and either 4:2:2 or 4:4:4 chroma sampling. It operates on YCbCr video data; the pixel component samples can have bit depths of 12

or even more bits per sample, which enables ProRes to be used for RGB video data (via conversion to YCbCr) with high-quality results. Frames can also include an alpha channel, with up to 16 bits per alpha sample, which ProRes encodes losslessly.

1 Scope

This SMPTE Registered Disclosure Document (RDD) includes specifications for the Apple ProRes bitstream syntax, the bitstream element semantics, and the decoding process used to produce decompressed images. A reference implementation that reads ProRes bitstreams from a file and decompresses the bitstreams is part of the contribution. Sample bitstreams

and the resulting decompressed images have also been contributed for exercising the reference implementation. This RDD does not describe the Apple QuickTime file format or the details of storing ProRes bitstreams in QuickTime files.

2 References

IEEE Std 1180-1990, IEEE Standard Specifications for the Implementations of 8x8 Inverse Discrete Cosine Transform.
ISO/IEC 13818-2:2000, Information technology - Generic coding of moving pictures and associated audio information: Video.
Recommendation ITU-R BT.601-7, Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios.
Recommendation ITU-R BT.709-5, Parameter values for the HDTV standards for production and international programme exchange.
Recommendation ITU-R BT.2020-1, Parameter values for ultra-high definition television systems for production and international programme exchange.
Recommendation ITU-T H.264 (02/2014), Advanced video coding for generic audiovisual services.
SMPTE ST 2084:2014, High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays.

3 Notation

3.1 Arithmetic Operators

+ : Addition
- : Subtraction (as a binary operator) or negation (as a unary prefix operator)
* : Multiplication
÷ : Division (used in mathematical equations where no truncation or rounding is intended)
$\frac{x}{y}$ : Division, $x \div y$
/ : Integer division with truncation of the result toward negative infinity: $x / y = \mathrm{floor}(x \div y)$
n mod m : Modulo operator with modulus m. Defined only for integers n and m with $m > 0$. Result is the remainder r after integer division of n by m, $r = n - \mathrm{floor}(n \div m) * m$; $0 \le r \le m - 1$.
$x^y$ : Exponentiation, x raised to the power y
$\sum_{i=n_1}^{n_2} f(i)$ : Summation of f(i), where i, $n_1$, and $n_2$ are integers and $n_1 \le i \le n_2$

3.2 Logical Operators

…

… >= sliceSize)
slice_size_in_mb[j++] = sliceSize
numMbsRemainingInRow -= sliceSize
sliceSize /= 2
while (numMbsRemainingInRow > 0)
number_of_slices_per_mb_row = j

number_of_slices_per_mb_row: The number of slices in a single macroblock row of the encoded picture. This corresponds to the number of entries in the slice_size_in_mb array; like that array, it is

the same for every macroblock row. It is calculated as part of the procedure to determine the slice_size_in_mb array entries, as shown above.

6.2.1 Picture Header Semantics

picture_header_size: The total size of the picture header in bytes (including the picture_header_size element itself). Decoders shall use this value to determine the start of the slice table following the picture header in the bitstream.

picture_size: The total size of the compressed picture in bytes (including the picture header). If the compressed frame contains two compressed pictures, decoders shall use this value from the first compressed picture to determine the start of the second compressed picture in the bitstream.

deprecated_number_of_slices: The product of the picture height in macroblocks and the number of slices per macroblock row when that product is 65535 or less, otherwise 0. Decoders shall ignore this element.

log2_desired_slice_size_in_mb: The base-2 logarithm of the desired number of macroblocks constituting a slice. Permissible values for this element are 0, 1, 2, and 3, which correspond respectively to 1, 2, 4, and 8 macroblocks per slice.

6.2.2 Slice Table Semantics

coded_size_of_slice: Two-dimensional array of compressed slice sizes in bytes. These follow the same ordering as the slices themselves: coded_size_of_slice[i][j] gives the size of slice(i, j) in the bitstream.

6.3 Slice Semantics

6.3.1 Slice Header Semantics

slice_header_size: The total size of the slice header in bytes (including the slice_header_size element itself). Decoders shall use this value to determine the start of the compressed luma component data following the slice header in the bitstream.

quantization_index: A code that specifies the quantization scale factor, qScale. Permissible values are 1, ..., 224; all other values are reserved.

coded_size_of_y_data: The size of the compressed luma (Y) component data in bytes.

coded_size_of_cb_data: The size of the compressed blue chroma (Cb) component data in bytes.

coded_size_of_cr_data: The size of the compressed red chroma (Cr) component data in bytes. Note: This element is present only if the value of the alpha_channel_type syntax element is non-zero.

6.3.2 Scanned Coefficients Semantics

first_dc_coeff: The first quantized DC coefficient in the scanned coefficient array.

dc_coeff_difference: The difference between the current quantized DC coefficient and the previous one.

run: The number of consecutive zero-valued quantized AC coefficients in the scanned coefficient array preceding one that is non-zero.

abs_level_minus_1: One less than the absolute value of the non-zero quantized AC coefficient that terminates the preceding run of zero-valued coefficients.

sign: A code indicating the sign of the non-zero quantized AC coefficient that terminates the preceding run of zero-valued coefficients. A value of 0 means the coefficient is positive; a value of 1 means it is negative.

zero_bit: A single bit with value 0. Used to ensure that the compressed color component data comprise an integral number of bytes.

zero_byte: An eight-bit number with value zero (0x00). This syntax element serves no useful purpose but will occasionally appear in ProRes bitstreams produced by older encoders.

6.3.3 Scanned Alpha Semantics

alpha_difference: The difference (exact or modulo) between the current alpha value and the previous one.

run: The number of consecutive occurrences of the current alpha value in the scanned alpha value array.

zero_bit: A single bit with value 0. Used to ensure that the compressed alpha component data comprise an integral number of bytes.

6.4 Bitstream Versions, Version Variants, and Compatibility

ProRes bitstreams can incorporate future changes to syntax or semantics. There are two approaches to accommodating this: bitstream versioning and version variants. Different bitstream versions denote intrinsic changes to the decoding process, while different version variants correspond to non-essential distinctions. A new bitstream version is required if a desired change in bitstream syntax or semantics breaks compatibility with existing decoders, i.e., if existing decoders cannot

properly decode such bitstreams. The bitstream version will be incremented for each such change. A decoder that can decode a ProRes bitstream with a particular bitstream version shall be able to decode a bitstream with any earlier (lower) bitstream version. A decoder shall refuse to decode a ProRes bitstream with an unsupported bitstream version. To maximize decoder compatibility, encoders should use the lowest bitstream version appropriate for the frame being encoded and the encoding parameters in effect.

Version variants correspond to the addition of informative data to ProRes bitstreams. Such additional data will not include information that is required for correct decoding, and furthermore will be added in a manner that does not prevent correct decoding by existing decoders that would otherwise be capable of decoding the bitstream. As a consequence, all version variants of a ProRes bitstream version can be decoded by any decoder compatible with that bitstream version. ProRes bitstreams contain no explicit identification of version variant. Because unrecognized version variant data can be present in a ProRes bitstream, for syntax structures with size specified in the bitstream, decoders shall use the specified size, rather than inference from the syntax itself, to determine the start of the immediately following syntax structure.

This specification describes bitstream versions 0 and 1. Version 0 bitstreams will have a value of 2 (4:2:2 sampling) for the chroma_format syntax element and a

value of 0 (no encoded alpha) for the alpha_channel_type element; version 1 bitstreams can have any permissible value for those elements. No version variants have been defined for either bitstream version.

7 Decoding Process

This section describes the process that a decoder shall follow to reconstruct a

frame from a ProRes bitstream. The process is carried out for each compressed slice in the bitstream and consists of these steps:

- Entropy decoding is applied to each of the compressed video components of the slice to produce arrays of scanned color component quantized discrete cosine transform (DCT) coefficients and, if the ProRes bitstream includes an encoded alpha channel, an array of raster-scanned alpha values;
- Inverse scanning is applied to each of the scanned color component quantized DCT coefficient arrays to produce blocks of color component quantized DCT coefficients;
- Inverse quantization is applied to each of the color component quantized DCT coefficient blocks to produce blocks of color component DCT coefficients;
- An inverse discrete cosine transform (IDCT) is applied to each of the color component DCT coefficient blocks to produce blocks of reconstructed color component values;
- Each of the reconstructed color component values is converted to an integral sample of desired bit depth and is written to the appropriate location in the decoded frame buffer (as are the decoded alpha values, if any).

7.1 Entropy Decoding

7.1.1 Scanned Coefficients

A ProRes compressed slice contains an entropy-coded array of scanned quantized DCT coefficients for each color component (Y, Cb, and Cr). Quantized DC coefficients are encoded differentially, while quantized AC coefficients are run-length encoded. Variable-length coding is applied to the results using codebooks based on

the Golomb-Rice and exponential-Golomb coding schemes.

7.1.1.1 Golomb Combination Codes

Golomb-Rice and exponential-Golomb codes are families of codebooks parameterized by a non-negative integer order. All members of both families have as their symbol alphabets the non-negative integers. Their codewords consist of three parts: a unary prefix consisting solely of 0 bits (the length of which is referred to as the code level); a separator consisting of a single 1 bit; and a binary suffix. Decoding is accomplished by counting the number of prefix bits and then appropriately combining that count with the suffix to recover the encoded symbol.

The Golomb-Rice code of order k encodes a non-negative integer symbol n by first calculating the quotient and remainder of n with respect to $2^k$, $q = \mathrm{floor}(n \div 2^k)$ and $r = n \bmod 2^k$. Then the codeword for n consists of a prefix of q 0 bits, a single 1 (separator) bit, and a k-bit suffix containing the binary representation of r; the length of the codeword is q + 1 + k. To decode an order-k Golomb-Rice codeword, the quotient/code level q is determined by counting the number of 0 bits preceding the first 1 bit; that 1 bit is then ignored, the last k bits are taken as the remainder r, and finally the encoded symbol n is reconstructed as $q * 2^k + r$.

The exponential-Golomb codes have a slightly more complex structure. For these the number of 0 bits in the codeword prefix, the cod
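The order-k Golomb-Rice encoding and decoding steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function names `golomb_rice_encode` and `golomb_rice_decode` are hypothetical, and the bitstream is modeled as a string of '0' and '1' characters for readability rather than as packed bytes.

```python
from typing import Tuple


def golomb_rice_encode(n: int, k: int) -> str:
    """Return the order-k Golomb-Rice codeword for n as a '0'/'1' string."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    q = n >> k                # quotient: floor(n / 2^k)
    r = n & ((1 << k) - 1)    # remainder: n mod 2^k
    # k-bit binary suffix; empty when k == 0
    suffix = format(r, "b").zfill(k) if k > 0 else ""
    # q zero bits, one separator bit, then the suffix: q + 1 + k bits total
    return "0" * q + "1" + suffix


def golomb_rice_decode(bits: str, k: int, pos: int = 0) -> Tuple[int, int]:
    """Decode one order-k codeword starting at bits[pos].

    Returns (symbol, position of the first bit after the codeword).
    """
    q = 0
    # Count the 0 bits preceding the first 1 bit (the code level).
    while pos < len(bits) and bits[pos] == "0":
        q += 1
        pos += 1
    if pos >= len(bits):
        raise ValueError("no separator bit found")
    pos += 1                  # skip the separator 1 bit
    if pos + k > len(bits):
        raise ValueError("truncated suffix")
    # The k bits after the separator hold the remainder.
    r = int(bits[pos:pos + k], 2) if k > 0 else 0
    return q * (1 << k) + r, pos + k


if __name__ == "__main__":
    # Concatenate three order-2 codewords and decode them back in sequence.
    stream = "".join(golomb_rice_encode(n, 2) for n in (0, 5, 11))
    pos, decoded = 0, []
    while pos < len(stream):
        symbol, pos = golomb_rice_decode(stream, 2, pos)
        decoded.append(symbol)
    print(stream, decoded)
```

For example, with k = 2 the symbol 11 has q = 2 and r = 3, so its codeword is two 0 bits, the separator, and the suffix 11, for a length of q + 1 + k = 5 bits.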
