INTERNATIONAL STANDARD ISO/IEC 11172-1
First edition 1993-08-01

Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 1: Systems

Technologies de l'information - Codage de l'image animée et du son associé pour les supports de stockage numérique jusqu'à environ 1,5 Mbit/s - Partie 1: Syst [...]

[...] by contrast, the semantic rules apply to the combined stream in its entirety.

The systems specification does not specify the architecture or implementation of encoders or decoders. However, bitstream properties do impose functional and performance requirements on encoders and decoders. For instance, encoders must meet minimum clock tolerance requirements. Notwithstanding this and other requirements, a considerable degree of freedom exists in the design and implementation of encoders and decoders. A prototypical audio/video decoder
system is depicted in figure 1 to illustrate the function of an ISO/IEC 11172 decoder. The architecture is not unique (System Decoder functions, including decoder timing control, might equally well be distributed among elementary stream decoders and the Medium Specific Decoder), but this figure is useful for discussion. The prototypical decoder design does not imply [...]

[...] |x| = 0 when x == 0; |x| = -x when x < 0 [...]

>     Greater than.
>=    Greater than or equal to.
[...]
>>    Shift right with sign extension.
<<    Shift left with zero fill.

2.2.5 Assignment

=     Assignment operator.

2.2.6 Mnemonics

The following mnemonics are defined
to describe the different data types used in the coded bit-stream.

bslbf [...]

[...] equal to 1 for single-channel mode, 2 in other modes. (Audio)

Granule of 3 * 32 subband samples in audio Layer II, 18 * 32 subband samples in audio Layer III. (Audio)

The main_data portion of the bitstream contains the scalefactors, Huffman encoded data, and ancillary information. (Audio)

The location in the bitstream of the beginning of the main_data for the frame. The location is equal to the ending location of the previous frame's main_data plus one bit. It is calculated from the main [...]

for ( expr1; expr2; expr3) {
    data_element
    . . .
}

expr1 is an expression specifying the initialization of the loop. Normally it specifies the initial state of the counter. expr2 is a condition specifying a test made before each iteration of the loop. The loop terminates when the condition is not true. expr3 is an expression that is performed
at the end of each iteration of the loop; normally it increments a counter.

Note that the most common usage of this construct is as follows:

for ( i = 0; i < n; i++) {
    data_element
    . . .
}

The group of data elements occurs n times. Conditional constructs within the group of data elements may depend on the value of the loop control variable i, which is set to zero for the first occurrence, incremented to one for the second occurrence, and so forth.

As noted, the group of data elements may contain nested conditional constructs. For compactness, the {} may be omitted when only one data element follows.

data_element []           data_element [] is an array of data. The number of data elements is indicated by the context.
data_element [n]          data_element [n] is the n+1th element of an array of data.
data_element [m][n]       data_element [m][n] is the m+1,n+1th element of a two-dimensional array of data.
data_element [l][m][n]    data_element [l][m][n] is the l+1,m+1,n+1th element of a three-dimensional array of data.
data_element [m..n]       data_element [m..n] is the inclusive range of bits between bit m and bit n in the data_element.

While the syntax is expressed in procedural terms, it should not be assumed that 2.4.3 implements a satisfactory decoding procedure. In
particular, it defines a correct and error-free input bitstream. Actual decoders must include a means to look for start codes in order to begin decoding correctly, and to identify errors, erasures or insertions while decoding. The methods to identify these situations, and the actions to be taken, are not standardized.

Definition of bytealigned function

The function bytealigned() returns 1 if the current position is on a byte boundary, that is, the next bit in the bit stream is the first bit in a byte. Otherwise it returns 0.

Definition of nextbits function

The function nextbits() permits comparison of a bit string with the next bits to be decoded in the bit stream.

Definition of next_start_code function

The next_start_code function removes any zero bit and zero byte stuffing and locates the next start code.

Syntax                                                        No. of bits    Mnemonic
next_start_code() {
    while ( !bytealigned() )
        zero_bit                                              1              “0”
    while ( nextbits() != '0000 0000 0000 0000 0000 0001' )
        zero_byte                                             8              “00000000”
}

This function checks whether the current position is byte aligned. If it is not, zero stuffing bits are present. After that, any number of zero bytes may be present before the start-code. Therefore start-codes are always byte aligned and may be preceded by any number of zero stuffing bits.

Copyright American National Standards Institute. Provided by IHS under license with ANSI. Not for Resale. No reproduction or networking permitted without license from IHS.
© ISO/IEC    ISO/IEC 11172-1: 1993 (E)

2.4 Requirements

2.4.1 Coding structure and parameters

The system coding layer allows one or more elementary streams to be combined into a single stream. Data from each elementary stream are multiplexed and encoded together with information that allows elementary streams to be replayed in synchronism.

ISO/IEC 11172 multiplexed stream

An ISO/IEC 11172 stream consists of one or more elementary streams multiplexed together. Each elementary stream consists of access units, which are the coded representation of presentation units. The presentation unit for a video elementary stream is a picture. The corresponding access
unit includes all the coded data for the picture. The access unit containing the first coded picture of a group of pictures also includes any preceding data from that group of pictures, as defined in 2.4.2.4 in ISO/IEC 11172-2, starting with the group_start_code. The access unit containing the first coded picture after a sequence header, as defined in 2.4.2.3 in part 2, also includes that sequence header. The sequence_end_code is included in the access unit containing the last coded picture of a sequence. (See 2.4.2.2 in ISO/IEC 11172-2 for the definition of the sequence_end_code.)

The presentation unit for an audio elementary stream is the set of samples that corresponds to samples from an audio frame (see 2.4.3.1, 2.4.2.1, and 2.4.2.2 in ISO/IEC 11172-3 for the definition of an audio frame).

Data from elementary streams is stored in packets. A packet consists of a packet header followed by packet data. The packet header begins with a 32-bit start-code that also identifies the stream to which the packet data belongs. The packet header may contain decoding and/or presentation time-stamps (DTS and PTS) that refer to the first access unit that commences in the packet. The packet data
contains a variable number of contiguous bytes from one elementary stream.

Packets are organised in packs. A pack commences with a pack header and is followed by zero or more packets. The pack header begins with a 32-bit start-code. The pack header is used to store timing and bitrate information.

The stream begins with a system header that optionally may be repeated. The system header carries a summary of the system parameters defined in the stream.

2.4.2 System target decoder

The semantics of the multiplexed stream specified in 2.4.4 and the constraints on these semantics specified in 2.4.5 require exact definitions of decoding events and the times at which these events occur. The definitions needed are set out in this International Standard using a hypothetical decoder known as the system target decoder (STD).

The STD is a conceptual model used to define these terms precisely and to model the decoding process during the construction of ISO/IEC 11172 streams. The STD is defined only for this purpose. Neither the architecture of the STD nor the timing described precludes uninterrupted, synchronized play-back of ISO/IEC 11172 multiplexed streams from a variety of decoders with different architectures or timing schedules.

Figure 2 - Diagram of system target decoder

Notation

The following notation is used to describe the system target decoder and is partially illustrated in figure 2.

i, i'        are indices to bytes in the ISO/IEC 11172 multiplexed stream. The first byte has index 0.
j            is an index to access units in the elementary streams.
k, k', k''   are indices to presentation units in the elementary streams.
n            is an index to the elementary streams.
M(i)         is the i-th byte in the ISO/IEC 11172 multiplexed stream.
tm(i)        indicates the time in seconds at which the i-th byte of the ISO/IEC 11172 multiplexed stream enters the system target decoder. The value
tm(0) is an arbitrary constant.
SCR(i)       is the time encoded in the SCR field measured in units of the 90 kHz system clock [...]

[...]
    for ( i = 0; i < N; i++) {
        packet_data_byte
    }
[...]
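
The bytealigned(), nextbits() and next_start_code() functions that the syntax relies on can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the standard: the BitReader class, its read() helper and the explicit bit-count argument to nextbits() are hypothetical names introduced here, and the choice to raise an error on non-zero stuffing is one possible policy, since the actions to be taken on errors are not standardized.

```python
class BitReader:
    """Hypothetical MSB-first bit reader over a bytes object (illustration only)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits from the start

    def bytealigned(self) -> int:
        # Returns 1 if the next bit is the first bit in a byte, otherwise 0.
        return 1 if self.pos % 8 == 0 else 0

    def nextbits(self, n: int) -> int:
        # Peek at the next n bits as an unsigned integer without consuming them.
        # The bit count n is an assumption of this sketch.
        if self.pos + n > 8 * len(self.data):
            raise EOFError("not enough bits left in the stream")
        value = 0
        for k in range(n):
            byte = self.data[(self.pos + k) // 8]
            bit = (byte >> (7 - (self.pos + k) % 8)) & 1
            value = (value << 1) | bit
        return value

    def read(self, n: int) -> int:
        # Consume n bits and return them as an unsigned integer.
        value = self.nextbits(n)
        self.pos += n
        return value

    def next_start_code(self) -> None:
        # Skip zero_bit stuffing until the position is byte aligned.
        while not self.bytealigned():
            if self.read(1) != 0:
                raise ValueError("zero_bit stuffing must be '0'")
        # Skip zero_byte stuffing until the 24-bit start code prefix is next.
        while self.nextbits(24) != 0x000001:
            if self.read(8) != 0:
                raise ValueError("zero_byte stuffing must be '00000000'")


# Usage: one zero stuffing byte precedes a 32-bit start code.
reader = BitReader(bytes([0x00, 0x00, 0x00, 0x01, 0xBA]))
reader.next_start_code()
start_code = reader.read(32)
```

After next_start_code() returns, the reader is byte aligned and positioned on the first byte of the start code prefix, so the following 32-bit read yields the complete start code.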
2.4.4 Semantic definition of fields in syntax

2.4.4.1 ISO/IEC 11172 Layer

iso_11172_end_code - The iso_11172_end_code is the bit string “0000 0000 0000 0000 0000 0001 1011 1001” (000001B9 in hexadecimal). It terminates the ISO/IEC 11172 multiplexed stream.

2.4.4.2 Pack Layer

Pack

pack_start_code - The pack_start_code is the bit string “0000 0000 0000 0000 0000 0001 1011 1010” (000001BA in hexadecimal). It identifies the beginning of a pack.

system_clock_reference - The system_clock_reference (SCR) is a 33-bit number coded in three separate fields. It indicates the intended time of arrival of the last byte of the system_clock_reference field at the input of the system target decoder. The value of the SCR is measured in the number of periods of a 90 kHz system clock with a tolerance specified in 2.4.2. Using the notation of 2.4.2, the value encoded in the system_clock_reference is:

    SCR(i) = NINT (system_clock_frequency * (tm(i))) % 2^33

for i such that M(i) is the last byte of the coded system_clock_reference field.

marker_bit - A marker_bit is a one-bit field that has the value “1”.

mux_rate - This is a positive integer specifying the rate at which the system target decoder receives the ISO/IEC 11172 multiplexed stream during the pack in which it is included. The value of mux_rate is measured in units of 50 bytes/s, rounded upwards. The value zero is forbidden. The value represented in mux_rate is used to define the time of arrival of bytes at the input to the system target decoder in 2.4.2. The value encoded in the mux_rate field may vary from pack to pack in an ISO/IEC 11172 multiplexed stream.

System Header

system_header_start_code - The system_header_start_code is the bit string “0000 0000 0000 0000 0000 0001 1011 1011” (000001BB in hexadecimal). It identifies the beginning of a system header.

header_length - The header_length shall be equal to the number of bytes in the system header following the header_length field. Note that future extensions of this part of ISO/IEC 11172 may extend the system header.

rate_bound - The rate_bound is an
integer value greater than or equal to the maximum value of the mux_rate field coded in any pack of the ISO/IEC 11172 multiplexed stream. It may be used by a decoder to assess whether it is capable of decoding the entire stream.

audio_bound - The audio_bound is an integer, in the inclusive range from 0 to 32, greater than or equal to the maximum number of ISO/IEC 11172 audio streams in the ISO/IEC 11172 multiplexed stream for which the decoding processes are simultaneously active. For the purpose of this clause, the decoding process of an MPEG audio stream is active if the STD buffer is not empty, or if the decoded access unit is being presented in the STD model.

fixed_flag - The fixed_flag is a one-bit flag. If its value is set to “1”, fixed bitrate operation is indicated. If its value is set to “0”, variable bitrate operation is indicated. During fixed bitrate operation, the value encoded in all system_clock_reference fields in the multiplexed ISO/IEC 11172 stream shall adhere to the following linear equation:

    SCR(i) = NINT (c1 * i + c2) % 2^33

where
c1    is a real-valued constant valid for all i;
c2    is a real-valued constant valid for all i;
i     is the index in the ISO/IEC 11172 multiplexed stream of the final byte of any system_clock_reference field in the stream.

CSPS_flag - The CSPS_flag is a one-bit flag. If its value is set to “1”, the ISO/IEC 11172 multiplexed stream meets the constraints defined in 2.4.6.

system_audio_lock_flag - The system_audio_lock_flag is a one-bit flag indicating that there is a specified, constant rational relationship between the audio sampling
rate and the system clock frequency in the system target decoder. Subclause 2.4.2 defines system_clock_frequency, and the audio sampling rate is specified in ISO/IEC 11172-3. The system_audio_lock_flag may only be set to “1” if, for all presentation units in all audio elementary streams in the ISO/IEC 11172 multiplexed stream, the ratio of system_clock_frequency to the actual audio sampling rate, SCASR, is constant and equal to the value indicated in the following table at the nominal sampling rate indicated in the audio stream.

    SCASR = system_clock [...]

[...]
    else BSn = STD_buffer_size_bound * 1024;

2.4.4.3 Packet Layer

packet_start_code_prefix - The packet_start_code_prefix is a 24-bit code. Together with the stream_id that follows, it constitutes a packet start code that identifies the beginning of a packet. The packet_start_code_prefix is the bit string “0000 0000 0000 0000 0000 0001” (000001 in
hexadecimal).

stream_id - The stream_id specifies the type and number of the elementary stream as defined by the stream_id table, table 1 in 2.4.4.2. Each elementary stream in an ISO/IEC 11172 multiplexed stream shall have a unique stream_id.

packet_length - The packet_length specifies the number of bytes remaining in the packet after the packet_length field.

stuffing_byte - This is a fixed 8-bit value equal to “1111 1111” that can be inserted by the