SMPTE STANDARD for Television
Type D-11 Picture Compression and Data Stream Format
SMPTE 367M-2002
Page 1 of 48 pages
Copyright 2002 by THE SOCIETY OF MOTION PICTURE AND TELEVISION ENGINEERS, 595 W. Hartsdale Ave., White Plains, NY 10607, (914) 761-1100
Approved March 28, 2002

Table of contents

1 Scope
2 Normative references
3 Introduction
4 Encoding
5 Decoding
Annex A Subsampling filter
Annex B Channel shuffling
Annex C Discrete cosine transform and zigzag scan
Annex D VLC tables
Annex E System overview
Annex F Bibliography

1 Scope

This standard specifies the compression of a high-definition source format to a dual-channel packetized data stream format which is suitable for recording on disc and tape storage devices, including the type D-11 tape recorder. The specification includes a number of basic packetizing operations, including the shuffling of the source data prior to compression, both to aid compression performance and to allow error concealment processing in the decoder. The standard also includes the processes required to decode the compressed type D-11 packetized data format into a high-definition output signal.

This standard supports high-definition source formats using 1920*1080 pixels and the sampling structures specified in SMPTE 274M and SMPTE RP 211 at the following picture rates: 24/1.001/PsF, 24/PsF, 25/PsF, 30/1.001/PsF, 50/I and 60/1.001/I (where PsF indicates progressive segmented frame and I indicates interlaced).

The data packet format specified by this standard is used as the source data stream for the associated document which maps this type D-11 packetized data stream format together with AES3 data over SDTI.

2 Normative references

The following standards contain provisions which, through reference in this text, constitute provisions of this standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this standard are encouraged to investigate the possibility of applying the most recent edition of the standards indicated below.

SMPTE 12M-1999, Television, Audio and Film Time and Control Code
SMPTE 274M-1998, Television 1920 x 1080 Scanning and Analog and Parallel Digital Interfaces for Multiple Picture Rates
SMPTE 292M-1998, Television Bit-Serial Digital Interface for High-Definition Television Systems
SMPTE RP 188-1999, Transmission of Time Code and Control Code in the Ancillary Data Space of a Digital Television Data Stream
SMPTE RP 211-2000, Implementation of 24P, 25P and 30P Segmented Frames for 1920 x 1080 Production Format

3 Introduction

This standard specifies the encoding and decoding of high-definition source formats via compression into a bit rate in the range 112 to 140 Mb/s for recording on a type D-11 digital tape recorder. The recorded bit rate is related to the source picture rate as given in table 1.

Table 1 Data rates associated with source picture rates

Picture rate     Base data rate (Mb/s)
24/1.001/PsF     111.863
24/PsF           111.975
25/PsF           116.640
30/1.001/PsF     139.828
50/I             116.640
60/1.001/I       139.828

Annex E gives the system overview of the documents which comprise the full type D-11 specification. This document specifies the parts identified by the number 1. The other documents identified as 2 and 3 specify, respectively, the following parts:
- The SDTI definition for direct data input and output from the type D-11 recorder.
- The mapping of the compressed data format from either this document or the data interface document onto the type D-11 helical tracks as the VTR format.

In common with other compression systems, the type D-11 encoding process uses intraframe coding (i.e., the coding is bound by the frame period) using the discrete cosine transform (DCT) to provide the data decorrelation required for efficient compression. The coefficients are quantized and variable length coded (VLC) to produce the basic output data format.

The source pictures are subsampled prior to compression coding. This reduces the number of coded pixels and allows the number of bits-per-pixel value to be raised in proportion. The luminance source sampling grid of 1920*1080 pixels is reduced to 1440*1080 pixels.
For each chrominance channel, the source sampling grid of 960*1080 pixels is reduced to 480*1080 pixels. In the decoder, the output pixel sample grid is restored back to the source format of 1920*1080 pixels by interpolation following the compression decoding process.

The compressed data format specified by the output of the compression encoder is of a form which allows direct mapping into the basic block structure as defined in the type D-11 digital recorder document.

4 Encoding

4.1 Overview

The type D-11 source data for compression shall comprise only the production aperture area as defined by SMPTE 274M.

NOTE The DCT coding uses a data block size which allows exactly 1080 lines to be coded.

The source formats comprise luminance (Y) and chrominance (CB, CR) component signals as defined by SMPTE 274M and SMPTE RP 211.

The type D-11 source picture rates for compression shall be constrained to the following values:
- 24/1.001 frames per second in the segmented format as defined by SMPTE RP 211.
- 24 frames per second in the segmented format as defined by SMPTE RP 211.
- 25 frames per second in the segmented format as defined by SMPTE RP 211.
- 30/1.001 frames per second in the segmented format as defined by SMPTE RP 211.
- 50 fields per second in the interlaced format (a.k.a. 50/I) as defined by SMPTE 274M.
- 60/1.001 fields per second in the interlaced format (a.k.a. 60/I) as defined by SMPTE 274M.

The active picture data for compression shall be prefiltered and then subsampled from a source representation to a subsampled representation. The reduced active data shall then be split into two identical channels for processing as shown in figure 1 and table 2.

The total picture data in each channel shall be divided into 20,250 8*8 blocks, each formed from 8 samples of 8 consecutive lines in a frame. The 8*8 blocks for each channel shall then be shuffled within the frame boundary to produce 270 code blocks each comprising 45 luminance (Y) 8*8 blocks and 30 chrominance 8*8 blocks (15 CB and 15 CR).

The picture data in each code block shall be compressed by the application of the discrete cosine transform, quantization and VLC encoding. Each code block shall be separately encoded and there shall be no data sharing between code blocks. The data from the compression output shall be packed into the code block space of 1080 bytes.

Each code block shall be segmented into five basic blocks each comprising 216 compressed data bytes. Each basic block nominally contains the compressed data for 9 luminance 8*8 blocks and 6 chrominance 8*8 blocks (3 CB and 3 CR). Data overflow from one basic block can be shared with other basic blocks in the same code block.

NOTE The 8*8 blocks may be coded by a single 8*8 DCT block, by two 8*4 DCT blocks, or by two 4*8 DCT blocks depending on the mode of operation (see 4.4).

The 270 code blocks for each channel shall be divided into six equal segments of 45 code blocks per segment. Each segment shall contain one auxiliary basic block prior to the compressed data basic blocks. All auxiliary basic blocks in one channel shall be identical with the exception of the segment identification number. The auxiliary basic block shall contain utility data for the segment. The distribution of a channel into code blocks and basic blocks is illustrated in figure 2.

All basic blocks shall have a total length of 219 bytes. The data for the basic blocks in a code block shall be 216 bytes in length, allowing 3 bytes for the basic block header. The data for the auxiliary basic block in
each segment shall be 217 bytes in length, allowing 2 bytes for the basic block header.

Figure 1 Encoding block diagram

Table 2 Definition of signal sampling parameters

Parameter                               Source sampling    Subsampling       Channel division
Number of samples per line, Y           1920               1440              720
Number of samples per line, CB, CR      960                480               240
Number of active lines per frame        1080               1080              1080
Quantization                            10-bit (0 to 1023) 8-bit (0 to 255)  8-bit (0 to 255)
Sample levels, peak range               4 to 1019          1 to 254          1 to 254
Sample levels, Y: peak white level      940                235               235
Sample levels, Y: black level           64                 16                16
Sample levels, Y: total levels          877                220               220
Sample levels, CB, CR: signal level     512 ± 448          128 ± 112         128 ± 112
Sample levels, CB, CR: total levels     897                225               225

Figure 2 Code blocks and basic blocks in channel

4.2 Preprocessing

The source picture shall be the production aperture as defined in SMPTE 274M, having a luminance structure of 1920*1080 pixels and a multiplexed chrominance structure of 960*1080 pixels for each chrominance component. The source interface has a sample resolution of 10 bits, which shall be reduced to 8 bits after the horizontal subsampling process.

4.2.1 Vertical sampling process

For 1080/I systems, 540 lines for Y, CB, CR signals from each interlaced field shall be processed. The coding lines for each interlaced field are illustrated in figure 2.

For 1080/PsF systems, 1080 lines for Y, CB, CR signals from each whole frame shall be processed. The coding lines for the segmented frame are illustrated in figure 3.

4.2.2 Horizontal subsampling process

For the luminance component, all 1920 active samples per line shall be subsampled to 1440 samples per line after a bandwidth limitation filtering process. For each of the two chrominance components, all 960 active samples per line shall be subsampled to 480 samples per line after a bandwidth limitation filtering process.

The basic sample parameters for luminance (Y) and the two chrominance signals (CB, CR) of the source and subsampled component signals are described in table 2. Figure 3 depicts the resampled spatial positions of the subsampled components for 1080/I and 1080/PsF line scanning systems.

NOTE T is the period of the luminance horizontal sampling.

Figure 3 Sampling relationships for 1080/I and 1080/PsF source and subsampled systems

The subsampled data in each frame shall be divided into two identical channels: an even sample channel and an odd sample channel, as illustrated in figure 4.

Figure 4 Channel division of subsampled 1080/I and 1080/PsF signals

Let r be the horizontal sample position number in the subsampled
Y, CB, CR source.

For Y samples, r = 0, 1, 2, 3, ..., 1439
For CB, CR samples, r = 0, 1, 2, 3, ..., 479

Those samples that have r as an even number, depicted as a white circle in figure 4, shall be distributed to channel 0. Those samples that have r as an odd number, depicted as a gray circle in figure 4, shall be distributed to channel 1.

Each luminance (Y) sample channel has a rectangular area of 720 samples by 1080 lines. Each chrominance (CB, CR) sample channel has a rectangular area of 240 samples by 1080 lines, as illustrated in figure 5. Figure 4 shows the overall structure of the subsampling process.

To avoid alias artifacts, the source format shall be prefiltered with a filter operating in the horizontal dimension only. The templates for the overall filtering characteristics of the subsampling process are defined in annex A.

NOTE The filtering and subsampling processes are implemented as one combined operation.

Figure 5 Channel distribution

4.3 Shuffling

Each subsampled input picture shall be split into two channels each comprising 12,150 luminance (Y) and 8,100 chrominance (CB and CR) 8*8 blocks, as shown in figure 5. The 12,150 luminance blocks are taken from the array of 135*90 8*8 blocks. The 8,100 chrominance blocks are taken from the array of 135*30*2 8*8 blocks. The input format prior to shuffling for both channels shall be as shown in figure 5.

The shuffling rearranges the 8*8 blocks
according to the algorithm defined in annex B. After shuffling, the blocks for each channel shall be allocated to six segments each containing 45 code blocks. Each code block shall be subdivided into five shuffle blocks as shown in figure 2.

NOTE The contents of the 5 shuffle blocks are uncompressed signal data. The data in the 5 shuffle blocks which form a code block are then compressed and packed into 5 corresponding basic blocks as described in 4.9.

Each shuffle block, defined at the output of the shuffle algorithm, comprises 3 header bytes, 9 luminance 8*8 blocks, and 6 chrominance 8*8 blocks, as shown in figure 6.

[Figure 6 layout: 963 bytes of 8 bits each, in the order BID0 (1 byte), BID1 (1 byte), HD (1 byte), LUMINANCE (576 bytes, 9 Y 8*8 blocks), CHROMINANCE (384 bytes, 3 CB + 3 CR 8*8 blocks). The figure also marks a 961-byte span.]

Figure 6 Shuffle block format

The first header byte, BID0, shall define the shuffle block number from figure 2 as an 8-bit unsigned integer in the range 0 to 224. Figure 7a defines the bit allocation for the shuffle block number.

The second byte (BID1) defines the shuffle block mode information as shown in figure 7b. Bit 7 (SPF) defines the shuffle pattern flag which identifies the two states specified in annex B. Bit 6 shall be 0. Bit 5 defines the field-frame mode flag as described in 4.4. Bits 4 to 2 define the 3-bit segment number (values 0 to 5) with SG2 as the MSB. Bit 1 defines even channel (value 0) or odd channel (value 1). Bit 0 shall be 0.

The third byte (HD) defines encoding information as shown in figure 7c. Bit 7 shall have a default value of 0. Bit 6 defines the overflow flag described in 4.9. Bits 5 to 0 define the 6-bit quantizer base described in 4.6 with QB5 as the MSB.

a) BID0 byte
Bit    0 (LSB)  1      2      3      4      5      6      7 (MSB)
Name   SB0      SB1    SB2    SB3    SB4    SB5    SB6    SB7
Use    Shuffle block number

b) BID1 byte
Bit    0 (LSB)  1      2      3      4      5      6      7 (MSB)
Name   0        CH     SG0    SG1    SG2    FRM    0      SPF
Use    Fixed    Chan   Segment number       Mode   Fixed  Pattern

c) HD byte
Bit    0 (LSB)  1      2      3      4      5      6      7 (MSB)
Name   QB0      QB1    QB2    QB3    QB4    QB5    OVF    0
Use    Quantizer base                              Over   Fixed

Figure 7 Shuffle block header byte descriptions

4.4 Field-frame decision

4.4.1 Overview

The picture data in each channel shall be processed to select field or frame mode encoding, indicated by bit 5 of the BID1 byte. Every shuffle block of any one channel comprising six segments shall be formatted as either field mode or frame mode as specified in 4.4.2 and 4.4.3.

4.4.2 Frame mode reformat

In frame mode encoding, bit 5 of BID1 shall be set to the value 1. The nine luminance 8*8 blocks in each basic block shall not be reformatted and shall remain as nine 8H x 8V DCT blocks. The six chrominance 8*8 blocks shall be reformatted into three pairs of 4H x 8V CB DCT blocks and three pairs of 4H x 8V CR DCT blocks. The splitting of 8*8 blocks into 4H x 8V block pairs is shown in figure 8.

[Figure 8 content: an input 8*8 CB or CR block, a 1st 4x8 DCT block and a 2nd 4x8 DCT block, with column labels "Samples 0 to 3" and "Samples 4 to 7".]
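
The shuffle block header bit allocations described in 4.3, including the frame mode flag that 4.4.2 sets in bit 5 of BID1, can be illustrated with a minimal Python sketch. It is not part of the standard: the function names, parameter names and dictionary keys are hypothetical, chosen only for this illustration, and the range checks simply restate the ranges given in the text (shuffle block number 0 to 224, segment number 0 to 5, 6-bit quantizer base).

```python
# Sketch only: function and field names are illustrative, not from the standard.

def pack_shuffle_header(block_number, spf, frame_mode, segment, channel,
                        overflow, quantizer_base):
    """Build the 3 shuffle block header bytes (BID0, BID1, HD)."""
    if not 0 <= block_number <= 224:
        raise ValueError("shuffle block number must be in 0..224")
    if not 0 <= segment <= 5:
        raise ValueError("segment number must be in 0..5")
    if channel not in (0, 1):
        raise ValueError("channel must be 0 (even) or 1 (odd)")
    if not 0 <= quantizer_base <= 63:
        raise ValueError("quantizer base must fit in 6 bits")

    bid0 = block_number                   # bits 7..0: SB7..SB0
    bid1 = (
        (int(bool(spf)) << 7)             # bit 7: shuffle pattern flag (SPF)
        | (int(bool(frame_mode)) << 5)    # bit 5: set to 1 in frame mode
        | (segment << 2)                  # bits 4..2: SG2..SG0
        | (channel << 1)                  # bit 1: channel; bits 6 and 0 stay 0
    )
    hd = (int(bool(overflow)) << 6) | quantizer_base  # bit 6: OVF; bit 7 stays 0
    return bytes([bid0, bid1, hd])


def unpack_shuffle_header(header):
    """Split 3 header bytes back into their named fields."""
    bid0, bid1, hd = header[0], header[1], header[2]
    return {
        "block_number": bid0,
        "spf": (bid1 >> 7) & 1,
        "frame_mode": (bid1 >> 5) & 1,
        "segment": (bid1 >> 2) & 0b111,
        "channel": (bid1 >> 1) & 1,
        "overflow": (hd >> 6) & 1,
        "quantizer_base": hd & 0b111111,
    }


if __name__ == "__main__":
    header = pack_shuffle_header(224, spf=1, frame_mode=1, segment=5,
                                 channel=1, overflow=0, quantizer_base=63)
    print(header.hex())                   # prints "e0b63f"
    print(unpack_shuffle_header(header))
```

Keeping the fixed bits (bits 6 and 0 of BID1, bit 7 of HD) out of the function arguments means a caller cannot set them by accident; a stricter parser might additionally reject headers in which those bits are non-zero.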