Copyright 2012 by THE SOCIETY OF MOTION PICTURE AND TELEVISION ENGINEERS
3 Barker Avenue, White Plains, NY 10601
(914) 761-1100

Approved January 12, 2012

Table of Contents

Foreword
Intellectual Property
Introduction
1 Scope
2 Conformance Notation
3 Normative References
4 Definitions and Acronyms
  4.1 Definitions
  4.2 Acronyms from MPEG-2 Systems (ISO/IEC 13818-1)
5 MPEG-2 Transport Stream Constraints
  5.1 Eye Identification Descriptor
6 Video Coding Constraints
  6.1 PES Alignment
  6.2 PCR PID
  6.3 Encoder Constraints
  6.4 Visual Fidelity (Informative)
7 Decoder Constraints
Annex A Bibliography (Informative)

Page 1 of 8 pages

SMPTE ST 2063:2012
SMPTE STANDARD
Stereoscopic 3D Full Resolution Contribution Link Based on MPEG-2 TS

Foreword

SMPTE (the Society of Motion Picture and Television Engineers) is an internationally-recognized standards developing organization. Headquartered and incorporated in the United States of America, SMPTE has members in over 80 countries on six continents. SMPTE's Engineering Documents, including Standards, Recommended Practices, and Engineering Guidelines, are prepared by SMPTE's Technology Committees. Participation in these Committees is open to all with a bona fide interest in their work. SMPTE cooperates closely with other standards-developing organizations, including ISO, IEC and ITU.

SMPTE Engineering Documents are drafted in accordance with the rules given in Part XIII of its Administrative Practices.

SMPTE ST 2063 was prepared by Technology Committee 32NF.

Intellectual Property

At the time of publication no notice had been received by SMPTE claiming patent rights essential to the implementation of this Standard. However, attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. SMPTE shall not be held responsible for identifying any or all such patent rights.

Introduction

This section is entirely informative and does not form an integral part of this Engineering Document.

This document supplies
the necessary constraints on coding and transport for full resolution dual image stereoscopic 3D contribution systems relying upon the MPEG-2 Transport Stream (TS). Dual image stereoscopic 3D imaging systems deliver two images (left eye and right eye) that are arranged to be seen simultaneously, or near simultaneously, by the left and right eyes. Viewers then perceive increased depth in the picture, which becomes more like the natural binocular viewing experience.

Users of this document (who are expected to be either producers of content, producers of the equipment used in the stereoscopic 3D “ecosystem,” or producers of the compression equipment constrained by this document) must understand a number of system design concerns which are (and must be) out of scope of this document. Among these are the types and rigging of the cameras used, the system timing necessary to ensure both eyes' images are kept in time alignment, and many other items. Some of the key expectations are stated in Notes within the body of the document.

The reader should also be aware that this document is “codec agnostic,” in that any video codec for which there is a defined mapping into the MPEG-2 TS may be used. These codecs might support 10-bit video or 8-bit video compression or support 4:2:2 video versus 4:2:0. Such choices are left to the implementers.

Since the importance of having precisely matched pairs of images is well documented, camera systems (camera bodies and lenses) should also be the same
make and model (if separate units). Any signal processing required to invert images must be accounted for by the system design to maintain proper system timing. Regardless of how the pair of images is created, it is assumed that they are properly aligned in space and time at the input of the compression chain. This standard therefore describes means by which the compression chain can be as transparent as possible to the time and space alignment of the image pair.

1 Scope

This document specifies how a stereoscopic 3D High Definition (“HDTV”) video contribution system based on the MPEG-2 Transport Stream (TS) performs coding, multiplexing, and decoding. It defines constraints for the input image pair, the bitstream, the multiplexing, timing synchronization, use of a single PCR PID, and signaling, as well as for the video coding and decoder behavior. This document is codec agnostic (i.e., any codec for which there are defined methods for transport via MPEG-2 TS is permitted). The input image pair needs to have the same image structure (horizontal and vertical pixel count, scanning system, colorimetry, and frame rate) and be coincident in time.

2 Conformance Notation

Normative text is text that describes elements of the design that are indispensable or contains the conformance language keywords: “shall”, “should”, or “may”. Informative text is text that is potentially helpful to the user, but not indispensable, and can be removed, changed, or
added editorially without affecting interoperability. Informative text does not contain any conformance keywords.

All text in this document is, by default, normative, except: the Introduction, any section explicitly labeled as “Informative”, or individual paragraphs that start with “Note:”.

The keywords “shall” and “shall not” indicate requirements strictly to be followed in order to conform to the document and from which no deviation is permitted.

The keywords “should” and “should not” indicate that, among several possibilities, one is recommended as particularly suitable, without mentioning or excluding others; or that a certain course of action is preferred but not necessarily required; or that (in the negative form) a certain possibility or course of action is deprecated but not prohibited.

The keywords “may” and “need not” indicate courses of action permissible within the limits of the document.

The keyword “reserved” indicates a provision that is not defined at this time, shall not be used, and may be defined in the future. The keyword “forbidden” indicates “reserved” and in addition indicates that the provision will never be defined in the future.

A conformant implementation according to this document is one that includes all mandatory provisions (“shall”) and, if implemented, all recommended provisions (“should”) as described. A conformant implementation need not implement optional provisions (“may”) and need not implement them as described.

Unless otherwise specified, the order of precedence of the types of normative information in this document shall be as follows: Normative prose shall be the authoritative definition; Tables shall be next; followed by formal languages; then figures; and then any other language forms.

3 Normative References

The following standards contain provisions which, through reference in this text, constitute provisions of this standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this standard are encouraged to investigate the possibility
of applying the most recent edition of the standards indicated below.

ISO/IEC 13818-1:2007 | ITU-T H.222.0-2006, Information Technology - Generic Coding of Moving Pictures and Associated Audio Information: Systems

SMPTE ST 292-2:2011, Dual 1.5 Gb/s Serial Digital Interface for Stereoscopic Image Transport

4 Definitions and Acronyms

4.1 Definitions

Coincident in Time: With respect to dual video signals for stereoscopic television, this means that not only are the two video signals “genlocked”, but that they represent the same moments in time for the image displayed.

Contribution: The term “contribution” in the context of this document means a very high quality link, typically between venue and studio, where the material is expected to undergo post-production and eventual emission to viewers by other links.

EAV: An abbreviation for End of Active Video.

Genlock: Abbreviation of “sync Generator Lock.” Genlock is a technique for locking a device's internal sync structures (and thus video structures) to a common external reference (a “sync generator”). This is especially important for stereoscopic 3D imaging systems where both eye images need to be aligned precisely both vertically and horizontally with respect to each other.

MPTS: An abbreviation for Multi-Program Transport Stream.

payload_id: Acronym for Payload Identifier. This is a structure defined by SMPTE 352 and used by SMPTE ST 292-2 to signal payload characteristics. Fields in the eye_identification_descriptor( ) are derived from the payload_id of the incoming video.

Program: The collection of all elements within the Transport Stream that have the same program number.

SAV: An abbreviation for Start of Active Video.

4.2 Acronyms from MPEG-2 Systems (ISO/IEC 13818-1)

DTS: An abbreviation for Decode Time Stamp.

PCR: An abbreviation for Program Clock Reference.

PES: An abbreviation for Packetized Elementary Stream.

PID: An abbreviation for packet identifier.

PMT: An abbreviation for Program Map Table.

PTS: An abbreviation for Presentation Time Stamp.

TS: An abbreviation for (MPEG-2) Transport Stream.

uimsbf: An abbreviation for “unsigned integer, most significant bit first.”

5 MPEG-2 Transport Stream Constraints

The coded images for both eyes, along with associated audio and data, shall be carried in an MPEG-2 multi-program Transport Stream (MPTS), which shall have only two programs, one for the left eye images and the other for the right eye images. Within the TS, PCR, PTS, and DTS clock sample timestamps shall be present and used by decoders to maintain the output images as coincident in time. There shall be a single PCR PID for both programs.

The coded video images for each eye shall be identified, as to which are left eye images and which are right eye images, by a descriptor as specified in Section 5.1, which shall be included in the descriptor loop immediately following the ES_info_length field in the PMT describing that video
elementary stream.

Note: This descriptor is constructed per the guidance provided by Annex C.8.6 in ISO/IEC 13818-1.

An implementation conformant to this standard shall be identified by the presence in the PMT of the pair of descriptors specified in Section 5.1.

5.1 Eye Identification Descriptor

The PMT for a given eye's video stream shall be identified by the use of the eye_identification_descriptor( ) defined in this section.

Table 1 - eye_identification_descriptor( ) syntax

Syntax                                No. of bits   Format
eye_identification_descriptor( )
    descriptor_tag                    8             uimsbf
    descriptor_length                 8             uimsbf
    eye_identifier                    4             uimsbf
    audio_status                      4             uimsbf

descriptor_tag: The value for the Eye Identification Descriptor tag is 0xCB.

descriptor_length: This is an 8-bit field specifying the number of bytes of the descriptor immediately following the descriptor_length field and shall be 1.

eye_identifier: This is a 4-bit field specifying the value 0x0 for the left eye program or 0x1 for the right eye program. All other values shall be reserved. The value should be derived from information in the payload identifier of the input video to the encoder.

audio_status: This is a 4-bit field specifying the value 0x0 to indicate that the right eye program status is unknown or don't care, 0x1 to indicate that the right eye program carries a copy of left eye audio, or 0x2 to indicate that the right eye program carries additional channels. All other values shall be reserved. The value should be derived from the payload identifier of
the incoming video.

6 Video Coding Constraints

6.1 PES Alignment

Each video PES packet shall contain the start of only one access unit, as defined in Section 2.1.1 of ISO/IEC 13818-1. In the PES header, data_alignment_indicator shall be 1. The first byte of a PES packet payload shall be the first byte of a video access unit. Each PES header shall contain a PTS, and a DTS if the DTS differs from the PTS. The values for the PTS clock samples for a corresponding left eye and right eye image shall be identical within 45 microseconds (which corresponds to 4 clock periods of the 90 kHz PTS/DTS clock).

Note: For assistance in conversion of MPEG-2 PTS/DTS to absolute time and/or frame time, see SMPTE EG 40.

6.2 PCR PID

As required by Section 5, a single PCR PID shall be referenced by the video streams for both eyes, as well as for associated audio and data services. The PCR PID shall be a unique PID value or the value of either video PID.

6.3 Encoder Constraints

It is recommended that the input video signal sent to the left eye and right eye encoders be compliant with SMPTE ST 292-2. In accordance with SMPTE ST 292-2, the timing difference between the serial digital clocks and EAV/SAV of the left eye stream and the right eye stream may be up to 400 ns at the source. Any such timing difference shall be removed prior to encoding. When two encoders are used it is recommended that a common synchronizing signal (“genlock”) be used. Coding decisions should be shared
between left eye and right eye encoders, including such decisions as repeat-field removal and picture coding-type decisions. If repeated frames of video are to be removed before compression (e.g., “3/2 pulldown” removal), the frame removal process should be the same for both left eye and right eye images. Pairs of coded pictures should have the same H and V structure (which permits the decoder to recover the input video image structure).

There are no special constraints for the placement of audio and ancillary data embedded in the video input stream beyond those given in SMPTE ST 292-2. The encoder/decoder system shall ensure that the placement of audio and ancillary data on output matches that of the input digital interface. Audio data is normally carried in one or more packetized elementary streams, using a uniquely assigned PID value or values per other standards (see Annex A). Ancillary data may be carried in Transport Stream packets identified by a uniquely assigned PID value per other SMPTE standards.

Note 1: System designers need to assume that both sources will be genlocked, and the images will be coincident in time. This is especially important for stereoscopic 3D where both eye images need to be aligned precisely.

Note 2: Both video encoders also need to be the same make and model.

6.4 Visual Fidelity (Informative)

The visual fidelity of both image streams (both eyes) should match. To that end, the picture structure and picture type should match between eyes. Bitrate allocation for corresponding left eye and right eye coded pictures should be similar. The coded video quality of the two streams should match between both eyes (with a minor difference between any measured values permissible).

7 Decoder Constraints
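
Note: As an informative illustration only, the eye_identification_descriptor( ) layout of Table 1 (Section 5.1) can be sketched in Python as follows. The function names build_eye_descriptor and parse_eye_descriptor are hypothetical and do not come from this standard. The sketch assumes the usual MPEG-2 Systems packing, in which fields are written most significant bit first in table order, so that eye_identifier occupies the upper four bits of the final byte and audio_status the lower four.

```python
# Illustrative sketch only; names and packing assumption are not normative.
EYE_DESCRIPTOR_TAG = 0xCB  # descriptor_tag value given in Section 5.1

EYE_NAMES = {0x0: "left", 0x1: "right"}
AUDIO_STATUS_NAMES = {
    0x0: "unknown or don't care",
    0x1: "copy of left eye audio",
    0x2: "additional channels",
}


def build_eye_descriptor(eye_identifier: int, audio_status: int) -> bytes:
    """Serialize the 3-byte descriptor: tag, length (always 1), packed nibbles."""
    if eye_identifier not in EYE_NAMES:
        raise ValueError("eye_identifier value is reserved")
    if audio_status not in AUDIO_STATUS_NAMES:
        raise ValueError("audio_status value is reserved")
    # Assumption: eye_identifier is the high nibble, audio_status the low nibble.
    packed = (eye_identifier << 4) | audio_status
    return bytes([EYE_DESCRIPTOR_TAG, 1, packed])


def parse_eye_descriptor(data: bytes) -> dict:
    """Decode a descriptor; reserved field values are reported as 'reserved'."""
    if len(data) < 3:
        raise ValueError("descriptor is shorter than 3 bytes")
    tag, length, packed = data[0], data[1], data[2]
    if tag != EYE_DESCRIPTOR_TAG:
        raise ValueError("not an eye_identification_descriptor")
    if length != 1:
        raise ValueError("descriptor_length shall be 1")
    eye_identifier = packed >> 4
    audio_status = packed & 0x0F
    return {
        "eye_identifier": eye_identifier,
        "eye": EYE_NAMES.get(eye_identifier, "reserved"),
        "audio_status": audio_status,
        "audio": AUDIO_STATUS_NAMES.get(audio_status, "reserved"),
    }
```

Under these assumptions, a receiver would read one such descriptor from the PMT of each of the two programs and expect exactly one to report the left eye and one the right eye.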
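
Note: As an informative illustration only, the PTS matching rule of Section 6.1 and the single PCR PID rule of Sections 5 and 6.2 can be checked with a short Python sketch. It relies on two facts from ISO/IEC 13818-1: the PTS/DTS clock runs at 90 kHz, and PTS values are 33-bit counters that wrap around. The function names are hypothetical. Because PTS values are integers, a difference of at most 4 ticks (about 44.4 microseconds) is equivalent to the 45 microsecond limit.

```python
# Illustrative sketch only; function names are not taken from the standard.
PTS_CLOCK_HZ = 90_000      # PTS/DTS clock frequency (ISO/IEC 13818-1)
PTS_MODULUS = 1 << 33      # PTS and DTS are 33-bit counters that wrap
MAX_PTS_DELTA_TICKS = 4    # 4 clock periods, per Section 6.1


def pts_delta_ticks(left_pts: int, right_pts: int) -> int:
    """Smallest distance between two PTS values, allowing for 33-bit wrap."""
    diff = (left_pts - right_pts) % PTS_MODULUS
    return min(diff, PTS_MODULUS - diff)


def pts_pair_ok(left_pts: int, right_pts: int) -> bool:
    """True if a left/right picture pair meets the Section 6.1 tolerance."""
    return pts_delta_ticks(left_pts, right_pts) <= MAX_PTS_DELTA_TICKS


def ticks_to_microseconds(ticks: int) -> float:
    """Convert 90 kHz clock ticks to microseconds."""
    return ticks * 1_000_000 / PTS_CLOCK_HZ


def pcr_pid_shared(left_pcr_pid: int, right_pcr_pid: int) -> bool:
    """True if both programs' PMTs reference the same PCR PID."""
    return left_pcr_pid == right_pcr_pid
```

The wrap-aware distance matters near the counter rollover: without it, a pair straddling the wrap point would appear to be almost 2^33 ticks apart and be rejected in error.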