SMPTE STANDARD
SMPTE 314M-2005
Revision of SMPTE 314M-1999
for Television
Data Structure for DV-Based Audio, Data and Compressed Video 25 and 50 Mb/s
Approved September 5, 2005
Copyright 2005 by THE SOCIETY OF MOTION PICTURE AND TELEVISION ENGINEERS, 3 Barker Avenue, White Plains, NY 10601, (914) 761-1100
Page 1 of 52 pages

Table of contents
1 Scope
2 Normative references
3 Acronyms
4 Interface
5 Video compression
Annex A Differences between IEC 61834 and SMPTE 314M
Annex B Bibliography

1 Scope

This standard defines the DV-based data structure for the interface of digital audio, subcode data, and compressed video with the following parameters:
  525/60 system, 4:1:1 image sampling structure, 25 Mb/s data rate
  525/60 system, 4:2:2 image sampling structure, 50 Mb/s data rate
  625/50 system, 4:1:1 image sampling structure, 25 Mb/s data rate
  625/50 system, 4:2:2 image sampling structure, 50 Mb/s data rate

The standard does not define the DV-compliant data structure for the interface of digital audio, subcode data, and compressed video with the following parameters:
  625/50 system, 4:2:0 image sampling structure, 25 Mb/s data rate

The compression algorithm and the DIF structure conform to the DV data structure as defined in IEC 61834. The differences between the DV-based data structure defined in this standard and IEC 61834 are shown in annex A.

2 Normative references

The following standards, through reference in this text, constitute provisions of this standard. All standards are subject to revision, and parties to agreements based on this standard are encouraged to investigate the possibility of applying the most recent edition of the standards indicated below.

IEC 61834-1 (1997), Recording - Helical-Scan Digital Video Cassette Recording System Using 6,35 mm Magnetic Tape for Consumer Use (525-60, 625-50, 1125-60, and 1250-50 Systems) - Part 1: General Specifications
IEC 61834-2 (1997), Recording - Helical-Scan Digital Video Cassette Recording System Using 6,35 mm Magnetic Tape for Consumer Use (525-60, 625-50, 1125-60, and 1250-50 Systems) - Part 2: SD Format for 525-60 and 625-50 Systems
SMPTE 12M-1999, Television, Audio and Film - Time and Control Code
ITU-R BT.470-6 (11/98), Conventional Television Systems
ITU-R BT.601-5 (10/95), Studio Encoding Parameters of Digital Television for Standard 4:3 and Wide-Screen 16:9 Aspect Ratios

3 Acronyms

AAUX   Audio auxiliary data
AP1    Audio application ID
AP2    Video application ID
AP3    Subcode application ID
APT    Track application ID
Arb    Arbitrary
AS     AAUX source pack
ASC    AAUX source control pack
B/W    Black-and-white flag
CGMS   Copy generation management system
CM     Compressed macro block
DBN    DIF block number
DCT    Discrete cosine transform
DIF    Digital interface
DRF    Direction flag
Dseq   DIF sequence number
DSF    DIF sequence flag
DV     Identification of a compression family
EFC    Emphasis audio channel flag
EOB    End of block
FR     Identification for the first or second half of each channel
FSC    Identification of a DIF block in each channel
LF     Locked mode flag
QNO    Quantization number
QU     Quantization
Res    Reserved for future use
SCT    Section type
SMP    Sampling frequency
SSYB   Subcode sync block
STA    Status of the compressed macro block
STYPE  Signal type (see note)
Syb    Subcode sync block number
TF     Transmitting flag
VAUX   Video auxiliary data
VLC    Variable length coding
VS     VAUX source pack
VSC    VAUX source control pack

NOTE: STYPE as used in this standard is different from that in ANSI/IEEE 1394.

4 Interface

4.1 Introduction

As shown in figure 1, processed audio, video, and subcode data are output for different applications through a digital interface port.

4.2 Data structure

The data structure of the compressed stream at the digital interface is shown in figures 2 and 3. Figure 2 shows the data structure for a 50 Mb/s structure, and figure 3 shows the data structure for a 25 Mb/s structure.

In the 50 Mb/s structure, the data of one video frame are divided into two channels. Each channel is divided into 10 DIF sequences for the 525/60 system and 12 DIF sequences for the 625/50 system.

In the 25 Mb/s structure, the data of one video frame are divided into 10 DIF sequences for the 525/60 system and 12 DIF sequences for the 625/50 system.

Each DIF sequence consists of a header section, subcode section, VAUX section, audio section, and video section with the following DIF blocks respectively:
  Header section: 1 DIF block
  Subcode section: 2 DIF blocks
  VAUX section: 3 DIF blocks
  Audio section: 9 DIF blocks
  Video section: 135 DIF blocks

As shown in figures 2 and 3, each DIF block consists of a 3-byte ID and 77 bytes of data. DIF data bytes are numbered 0 to 79. Figure 4 shows the data structure of a DIF sequence for a 50 or 25 Mb/s structure.

Figure 1 Block diagram on digital interface
Figure 2 Data structure of one video frame for 50 Mb/s structure
Figure 3 Data structure of one video frame for 25 Mb/s structure

... and 54 DIF blocks (9 DIF blocks × 6 DIF sequences) for the 625/50 system.

4.6.2.1.2 Emphasis

Audio encoding is carried out with the first order preemphasis of 50/15 µs. For analog input recording, emphasis shall be off in the default state.

4.6.2.1.3 Audio error code

In the encoded audio data, 8000h shall be assigned as an audio error code to indicate an invalid audio sample. This code corresponds to the negative full-scale value in ordinary twos complement representation. When the encoded data includes 8000h, it shall be converted to 8001h.

4.6.2.1.4 Relative audio-video timing

The audio frame duration equals a video frame period. An audio frame begins with an audio sample acquired within the duration of minus 50 samples relative to zero samples from the first pre-equalizing pulse of the vertical blanking period of the input video signal. The first pre-equalizing pulse means the start of line number 1 for the 525/60 system, and the middle of line number 623 for the 625/50 system.

4.6.2.1.5 Audio frame processing

This standard provides audio frame processing in the locked mode. The sampling frequency of the audio signal is synchronous with the video frame frequency. Audio data are processed in frames. For an audio channel, each frame contains 1602 or 1600 audio samples for the 525/60 system or 1920 audio samples for the 625/50 system. For the 525/60 system, the number of audio samples per frame shall follow the five-frame sequence as shown below:
  1600, 1602, 1602, 1602, 1602 samples.

The audio sample capacity shall be 1620 samples per frame for the 525/60 system or 1944 samples per frame for the 625/50 system. The unused space at the end of each frame is filled with arbitrary values.

4.6.2.2 Audio shuffling

The 16-bit audio data word is divided into two bytes: the upper byte, which contains the MSB, and the lower byte, which contains the LSB, as shown in figure 9. Audio data
shall be shuffled over DIF sequences and DIF blocks within a frame. The data bytes are defined as Dn (n = 0, 1, 2, ...), which is sampled at the nth order within a frame and shuffled by each Dn unit. The data shall be shuffled through a process expressed by the following equations:

525/60 system:
  DIF sequence number:
    (INT(n/3) + 2 × (n mod 3)) mod 5        for CH1, CH3
    (INT(n/3) + 2 × (n mod 3)) mod 5 + 5    for CH2, CH4
  Audio DIF block number:
    3 × (n mod 3) + INT((n mod 45) / 15)
    where FSC = 0: CH1, CH2
          FSC = 1: CH3, CH4
  Byte position number:
    8 + 2 × INT(n/45)    for the most significant byte
    9 + 2 × INT(n/45)    for the least significant byte
  where n = 0 to 1619

625/50 system:
  DIF sequence number:
    (INT(n/3) + 2 × (n mod 3)) mod 6        for CH1, CH3
    (INT(n/3) + 2 × (n mod 3)) mod 6 + 6    for CH2, CH4
  Audio DIF block number:
    3 × (n mod 3) + INT((n mod 54) / 18)
    where FSC = 0: CH1, CH2
          FSC = 1: CH3, CH4
  Byte position number:
    8 + 2 × INT(n/54)    for the most significant byte
    9 + 2 × INT(n/54)    for the least significant byte
  where n = 0 to 1943

Figure 9 Conversion of audio sample to audio data bytes (16-bit word, bits 15 to 0: upper byte = bits 15 to 8, containing the MSB; lower byte = bits 7 to 0, containing the LSB)

4.6.2.3 Audio auxiliary data (AAUX)

AAUX shall be added to the shuffled audio data as shown in figures 8 and 10. The AAUX pack shall include an AAUX pack header and data (AAUX payload). The length of the AAUX pack shall be 5 bytes as shown in figure 10, which depicts the AAUX pack arrangement. Packs are numbered from 0 to 8 as shown in figure 10. This number is called an audio pack number.

Figure 10 Arrangement of AAUX packs in audio auxiliary data (byte position numbers 0 to 2: ID; 3 to 7: audio auxiliary data, 5 bytes, consisting of pack header PC0 and pack data PC1 to PC4; 8 to 79: audio data; audio pack numbers 0 to 8)

Table 15 shows the mapping of an AAUX pack.
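
As a non-normative illustration, the shuffling equations of 4.6.2.2, the error-code rule of 4.6.2.1.3, and the byte layout of figure 10 can be sketched in Python. The function names `sample_to_bytes` and `shuffle_position`, their parameters, and the 0-based `channel` index (0 = CH1, 1 = CH2, 2 = CH3, 3 = CH4) are assumptions made for this sketch and do not appear in the standard.

```python
def sample_to_bytes(sample):
    """Split a 16-bit twos complement sample into (upper, lower) bytes.

    Hypothetical helper for this sketch. 8000h is reserved as the audio
    error code, so an encoded 8000h is converted to 8001h.
    """
    word = sample & 0xFFFF
    if word == 0x8000:
        word = 0x8001
    return word >> 8, word & 0xFF


def shuffle_position(n, channel, system="525/60"):
    """Map sample index n to (DIF sequence, audio DIF block, MSB byte, LSB byte).

    Hypothetical helper: the name, signature, and 0-based channel index
    (0 = CH1, 1 = CH2, 2 = CH3, 3 = CH4) are assumptions of this sketch.
    """
    if system == "525/60":
        seqs, capacity = 5, 1620   # DIF sequences per audio block, samples per frame
    elif system == "625/50":
        seqs, capacity = 6, 1944
    else:
        raise ValueError("system must be '525/60' or '625/50'")
    if not 0 <= n < capacity:
        raise ValueError("sample index out of range for this system")
    if channel not in (0, 1, 2, 3):
        raise ValueError("channel must be 0, 1, 2, or 3")

    blocks = 9 * seqs              # 45 or 54 audio DIF blocks per audio block
    dif_sequence = (n // 3 + 2 * (n % 3)) % seqs
    if channel in (1, 3):          # CH2 and CH4 add an offset of 5 (or 6)
        dif_sequence += seqs
    dif_block = 3 * (n % 3) + (n % blocks) // (blocks // 3)
    msb_pos = 8 + 2 * (n // blocks)  # audio data starts at byte position 8
    return dif_sequence, dif_block, msb_pos, msb_pos + 1
```

Under these assumptions, `shuffle_position(0, 0)` returns `(0, 0, 8, 9)`: the first sample of CH1 lands in DIF sequence 0, audio DIF block 0, at byte positions 8 and 9. CH1 and CH3 share the same result and are distinguished only by FSC, which this sketch does not return.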
An AAUX source pack (AS) and an AAUX source control pack (ASC) must be included in the compressed stream.

Table 15 Mapping of AAUX pack in a DIF sequence
  Audio pack number
  Even DIF sequence   Odd DIF sequence   Pack data
  3                   0                  AS
  4                   1                  ASC
where
  Even DIF sequence: DIF sequence number 0, 2, 4, 6, 8 for 525/60 system
                     DIF sequence number 0, 2, 4, 6, 8, 10 for 625/50 system
  Odd DIF sequence:  DIF sequence number 1, 3, 5, 7, 9 for 525/60 system
                     DIF sequence number 1, 3, 5, 7, 9, 11 for 625/50 system

4.6.2.3.1 AAUX source pack (AS)

The AAUX source pack is configured as shown in table 16.

Table 16 Mapping of AAUX source pack
        MSB                                LSB
  PC0   0    1    0    1    0    0    0    0
  PC1   LF   Res  AF SIZE (6 bits)
  PC2   0    CHN (2 bits)   Res  AUDIO MODE (4 bits)
  PC3   Res  Res  50/60  STYPE (5 bits)
  PC4   Res  Res  SMP (3 bits)   QU (3 bits)

LF: Locked mode flag
  Locking condition of audio sampling frequency with video signal
  0 = Locked mode
  1 = Reserved
AF SIZE: The number of audio samples per frame
  010100b = 1600 samples/frame (525/60 system)
  010110b = 1602 samples/frame (525/60 system)
  011000b = 1920 samples/frame (625/50 system)
  Others = Reserved
CHN: The number of audio channels within an audio block
  00b = One audio channel per audio block
  Others = Reserved
  The audio block is composed of 45 DIF
  blocks of the audio section in five consecutive DIF sequences for the 525/60 system, and 54 DIF blocks of the audio section in six consecutive DIF sequences for the 625/50 system.
AUDIO MODE: The contents of the audio signal on each audio channel
  0000b = CH1 (CH3)
  0001b = CH2 (CH4)
  1111b = Invalid audio data
  Others = Reserved
50/60:
  0 = 60-field system
  1 = 50-field system
STYPE: STYPE defines audio blocks per video frame
  00000b = 2 audio blocks
  00010b = 4 audio blocks
  Others = Reserved
SMP: Sampling frequency
  000b = 48 kHz
  Others = Reserved
QU: Quantization
  000b = 16 bits linear
  Others = Reserved
Res: Reserved bit for future use
  Default value shall be set to 1

4.6.2.3.2 AAUX source control pack (ASC)

The AAUX source control pack is configured as shown in table 17.

Table 17 Mapping of AAUX source control pack
        MSB                                                        LSB
  PC0   0       1        0        1         0    0    0    1
  PC1   CGMS (2 bits)    Res      Res       Res  Res  EFC (2 bits)
  PC2   REC ST  REC END  FADE ST  FADE END  Res  Res  Res  Res
  PC3   DRF     SPEED (7 bits)
  PC4   Res     Res      Res      Res       Res  Res  Res  Res

CGMS: Copy generation management system
  CGMS   Copy possible generation
  0 0    Copy free
  0 1    Reserved
  1 0    Reserved
  1 1    Reserved
EFC: Emphasis audio channel flag
  00b = emphasis off
  01b = emphasis on
  Others = reserved
  EFC shall be set for each audio block.
REC ST: Recording start point
  0 = recording start point
  1 = not recording start point
  At a recording start frame, REC ST = 0 lasts for a duration of one audio block, which is equal to 5 or 6 DIF sequences for each audio channel.
REC END: Recording end point
  0 = recording end point
  1 = not recording end point
  At a recording end frame, REC END = 0 lasts for a duration of one audio block, which is equal to 5 or 6 DIF sequences for each audio channel.
FADE ST: Fading of recording start point
  0 = fading off
  1 = fading on
  The information of FADE ST shall be effective only at the recording start frame (REC ST = 0). If FADE ST is 1 at the recording start frame, the output audio signal should be faded in from the first sampling signal of the frame. If FADE ST is 0 at the recording start frame, the output audio signal should not be faded.
FADE END: Fading of recording end point
  0 = fading off
  1 = fading on
  The information of FADE END shall be effective only at the recording end frame (REC END = 0). If FADE END is 1 at the recording end frame, the output audio signal should be faded out to the last sampling signal of
  the frame. If FADE END is 0 at the recording end frame, the output audio signal should not be faded.
DRF: Direction flag
  0 = reverse direction
  1 = forward direction
SPEED: Shuttle speed of VTR
  SPEED      Shuttle speed of VTR
             525/60 system   625/50 system
  0000000    0/120 (=0)      0/100 (=0)
  0000001    1/120           1/100
  :          :               :
  1100100    100/120         100/100 (=1)
  :          :               Reserved
  1111000    120/120 (=1)    Reserved
  :          Reserved        Reserved
  1111110    Reserved        Reserved
  1111111    Data invalid    Data invalid
Res: Reserved bit for future use
  Default value shall be set to 1.

4.7 Video section

4.7.1 ID

The ID part of each DIF block in the video section is
described in 4.3.1. The section type shall be 100.

4.7.2 Data

The data part (payload) of each DIF block in the video section consists of 77 bytes of video data, which shall be sampled, shuffled, and encoded. Video data of every video frame are processed as described in clause 5.

4.7.2.1 DIF block and compressed macro block

Correspondence between video DIF blocks and video compressed macro blocks is shown in tables 18 and 19. Table 18 shows correspondence between video DIF blocks for 50 Mb/s structure and video compressed macro blocks of 4:2:2 compression. Table 19 shows correspondence between the video DIF blocks for 25 Mb/s structure and video compressed macro blocks of 4:1:1 compression.

The rule defining the correspondence between video DIF blocks and compressed macro blocks is shown below:

50 Mb/s structure, 4:2:2 compression
  if (525/60 system) n = 10 else n = 12;
  for (i = 0; i < n; i++)
    a = i;
    b = (i - 6) mod n;
    c = (i - 2) mod n;
    d = (i - 8) mod n;
    e = (i - 4) mod n;
    p = a;
    q = 3;
    for (j = 0; j < 5; j++)
      for (k = 0; k < 27; k++)
        V (5 × k + q),0 of DSN p = CM 2i,j,k;
        V (5 × k + q),1 of DSN p = CM 2i + 1,j,k;
      if (q == 3) p = b; q = 1;
      else if (q == 1) p = c; q = 0;
      else if (q == 0) p = d; q = 2;
      else i