Rec. ITU-R BT.1687-1

RECOMMENDATION ITU-R BT.1687-1

Video bit-rate reduction for real-time distribution* of large-screen digital imagery applications for presentation in a theatrical environment

(Question ITU-R 15/6)

(2004-2006)

Scope

This Recommendation covers video bit-rate reduction for real-time distribution of large-screen digital imagery (LSDI) applications for presentation in a theatrical environment and includes examples of bit rates for MPEG-2 and MPEG-4/AVC compressed signals used in LSDI applications. Baseband LSDI digital image systems are defined in Recommendation ITU-R BT.1680.

The ITU Radiocommunication Assembly,

considering

a) that it is desirable that large-screen digital imagery (LSDI) programme distribution and presentation should preserve the creative intent and the picture quality available on the distribution master¹, as far as the envisaged programme presentation conditions allow;

b) that Recommendation ITU-R BT.1680 recommends digital image systems that meet the requirements of LSDI applications for presentation in a theatrical environment and should be used for distribution of those applications;

c) that, in the distribution of LSDI applications, it is necessary
to apply bit-rate reduction to programme signals in order to conserve bit rate, and thus reduce transmission time and cost, without adversely affecting programme quality as perceived by the viewer;

d) that the human visual system has a much higher resolution for luminance details than for colour details;

e) that ISO/IEC JTC 1/SC 29/WG 11 has performed and documented extensive studies on the bit rate required to provide virtually transparent video compression² with various compression algorithms and various image systems³;

* ITU defines “distribution” as follows: “To carry television programmes when no further post-production processing is expected”.

¹ The term “distribution master” is used here to denote the programme master that is obtained from the original finished master after it has been tailored (in content and in quality) to the distribution medium.

² ITU defines “transparent bit-rate reduction” as “A BRR process that does not affect the subjective quality of sound or picture sequences”.

³ See for instance Document ISO/IEC JTC 1/SC 29/WG 11 MPEG2003/N6231, December 2003, Waikoloa, Report of the Formal Verification Tests on AVC (ISO/IEC 14496-10 | ITU-T Rec. H.264).

f) that those tests have confirmed that MPEG-2 encoding at MP@HL with an optimized encoder⁴ at 20 Mbit/s is virtually transparent to source quality for the image systems indicated in Recommendation ITU-R BT.1680 at a picture rate of 25 or 30 Hz, progressive or interlaced;

g) that MPEG-2 video compression is widely available and used today, and that it complies with the ITU patent policy;

h) that quite recently, a new video compression system derived from MPEG-2 and called MPEG-4/AVC (ITU-T Recommendation H.264) was developed and that, even in initial implementations of its codecs, it provides a compression efficiency about double that of MPEG-2;

j) that very recently some extensions of the original MPEG-4/AVC source-coding system, known as the Fidelity Range Extensions (FRExt), were developed, which can provide an even higher coding efficiency, supporting the coding of the 1080/1920p image system at 50 and 60 frames/s;

k) that the use of the MPEG-4/AVC source-coding system may prove to be an attractive option for LSDI applications, in view of the much lower net bit rate required to deliver LSDI programmes, although MPEG-4/AVC codecs are more complex and can thus be expected to be somewhat more expensive;

l) that MPEG-4/AVC offers great flexibility in the selection of image coding parameters such as 4:4:4 sampling, 10 or 12 bits/sample and frame rates beyond 72 Hz;

m) that several countries are currently implementing services for the presentation of LSDI programmes in a theatrical environment or plan to implement them in the near future,

recommends

1 that early implementations of LSDI applications intended for presentation in a theatrical environment should use a digital baseband video signal that conforms to Recommendation ITU-R BT.1680 at the input and output interfaces of the LSDI distribution chain;

2 that, in view of the higher resolution of the human visual system to luminance than to colour details, 4:2:0 video encoding or preferably 4:2:2 video encoding should be used for the distribution of LSDI programmes for their theatrical presentation;

3 that, since the LSDI programme signal will not normally undergo further creative image post-processing after its distribution, the use of interframe bit-rate reduction should be preferred for the distribution of LSDI programmes for their theatrical presentation, since it provides a higher compression efficiency than intraframe coding; it is recognized that, when local programme insertion (EDL or play list) is required, intraframe coding is preferable, and that in this case the minimum video bit-rate value may be higher;

4 that, since the bit-rate reduction used in the distribution channel should be virtually transparent to the quality on the distribution master under the envisaged presentation conditions, the MPEG-2 interframe bit-rate reduction method at MP@HL (HiQ) and at a minimum video net bit rate of the order of 20 Mbit/s should be considered in the short term for real-time distribution of LSDI programmes for their theatrical presentation (the use of encoding at 4:2:2@HL will require a slightly higher bit rate);

⁴ The term “optimized encoder” is used here to denote an MPEG-2-compliant encoder implemented in hardware and incorporating advanced solutions, that is optimized with respect to the MPEG-2 TM5 “reference model” encoder, which is implemented in software and was adopted by the ISO/IEC JTC 1/SC 29/WG 11 as a performance benchmark.

5 that the use of the MPEG-4/AVC bit-rate reduction method should also be considered as a valuable alternative, particularly when it is desired to use a much lower net bit rate for the delivery of LSDI programmes, and it is acceptable to use somewhat more complex codecs;

NOTE 1 - ITU-T Recommendation H.264 is available in electronic version at the following address: http://www.itu.int/md/R03-SG06-C-0211/en.

6 that the possibility of using even more efficient compression codecs that may emerge in the future should be analysed in due time, once they comply with the ITU patent policy and have been thoroughly tested, are widely available and are preferably interoperable with MPEG-2 and MPEG-4/AVC codecs.

Appendix 1
(Informative)

Example parameters and minimal tools to source-code various members of the LSDI family of image systems in Recommendation ITU-R BT.709 using MPEG-4/AVC (ITU-T Recommendation H.264)

This Appendix suggests examples of parameters and minimal tools of the MPEG-4/AVC source-coding method (ITU-T Recommendation H.264) that would be used to compress various members of the LSDI family of image systems specified in Recommendation ITU-R BT.709-5 (Part 2). It also provides an indication of the bit rates required for the virtually transparent transport of those signals when so source-coded.

Rec. ITU-R BT.709 family member: 1 920 × 1 080, 24/25p
MPEG-4/AVC parameters and minimal tools: Level 4

    Coding tools                              High 10
    Main profile tools                           X
    4:2:0 chroma format                          X
    8-bit sample bit depth                       X
    8 × 8 vs. 4 × 4 transform adaptivity         X
    Quantization scaling matrices                X
    Separate CB and CR QP control                X
    Monochrome video format                      X
    9- and 10-bit sample bit depth               X
    4:2:2 chroma format
    11- and 12-bit sample bit depth
    4:4:4 chroma format
    Residual colour transform
    Predictive lossless coding

Bit rate for virtually transparent transport:
    9-11 Mbit/s    Contribution quality
    7-10 Mbit/s    Distribution quality
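The bit rates quoted in the table above can be put in perspective by comparing them with the uncompressed active-picture bit rate of the same image system. The following Python sketch is illustrative only and is not part of the Recommendation: the function names, the choice of 24 frames/s and 8 bits/sample, and the exclusion of blanking, audio and ancillary data are assumptions made for this example.

```python
# Illustrative sketch only; names and parameter choices are assumptions,
# not values taken from the Recommendation.

# Total samples (luma + chroma) carried by two pixels, per chroma format.
SAMPLES_PER_TWO_PIXELS = {"4:2:0": 3, "4:2:2": 4, "4:4:4": 6}


def uncompressed_bit_rate(width, height, frames_per_second, bit_depth, chroma_format):
    """Return the active-picture bit rate in bit/s (no blanking, audio or ancillary data)."""
    samples_x2 = SAMPLES_PER_TWO_PIXELS[chroma_format]
    return width * height * frames_per_second * bit_depth * samples_x2 // 2


def compression_ratio(source_bit_rate, coded_mbit_per_s):
    """Return the ratio between the source bit rate and a coded bit rate given in Mbit/s."""
    return source_bit_rate / (coded_mbit_per_s * 1_000_000)


if __name__ == "__main__":
    source = uncompressed_bit_rate(1920, 1080, 24, 8, "4:2:0")
    print(f"Uncompressed: {source / 1e6:.1f} Mbit/s")
    for coded in (7, 9, 10, 11):
        print(f"{coded} Mbit/s -> about {compression_ratio(source, coded):.0f}:1")
```

Under these assumptions the source signal is about 597 Mbit/s, so the 7-11 Mbit/s range corresponds to compression ratios of roughly 54:1 to 85:1. Using 25 frames/s or 10 bits/sample raises the uncompressed figure, and therefore the ratio, proportionally.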
Rec. ITU-R BT.709 family member: 1 920 × 1 080, 60/50i
MPEG-4/AVC parameters and minimal tools: Level 4

    Coding tools                              High 4:2:2
    Main profile tools                           X
    4:2:0 chroma format                          X
    8-bit sample bit depth                       X
    8 × 8 vs. 4 × 4 transform adaptivity         X
    Quantization scaling matrices                X
    Separate CB and CR QP control                X
    Monochrome video format                      X
    9- and 10-bit sample bit depth               X
    4:2:2 chroma format                          X
    11- and 12-bit sample bit depth
    4:4:4 chroma format
    Residual colour transform
    Predictive lossless coding

Bit rate for virtually transparent transport:
    10-15 Mbit/s   Contribution quality
    8-12 Mbit/s    Distribution quality

Rec. ITU-R BT.709 family member: 1 920 × 1 080, 60/50p
MPEG-4/AVC parameters and minimal tools: Level 4.2

    Coding tools                              High 4:2:2
    Main profile tools                           X
    4:2:0 chroma format                          X
    8-bit sample bit depth                       X
    8 × 8 vs. 4 × 4 transform adaptivity         X
    Quantization scaling matrices                X
    Separate CB and CR QP control                X
    Monochrome video format                      X
    9- and 10-bit sample bit depth               X
    4:2:2 chroma format                          X
    11- and 12-bit sample bit depth
    4:4:4 chroma format
    Residual colour transform
    Predictive lossless coding

Bit rate for virtually transparent transport:
    18-20 Mbit/s   Emission format

NOTE - Use of 10 bits/sample may not increase the bit rate over the one needed for 8 bits.

Rec. ITU-R BT.709 family member: 1 920 × 1 080, 24/25p
MPEG-4/AVC parameters and minimal tools: Level 4

    Coding tools                              High 4:2:2
    Main profile tools                           X
    4:2:0 chroma format                          X
    8-bit sample bit depth                       X
    8 × 8 vs. 4 × 4 transform adaptivity         X
    Quantization scaling matrices                X
    Separate CB and CR QP control                X
    Monochrome video format                      X
    9- and 10-bit sample bit depth               X
    4:2:2 chroma format                          X
    11- and 12-bit sample bit depth
    4:4:4 chroma format
    Residual colour transform
    Predictive lossless coding

Bit rate for virtually transparent transport:
    8-10 Mbit/s    Emission, high quality “Movies” mode

Attachment 1

Information on the MPEG-4/AVC source-coding method [Sullivan et al., 2004]

ITU-T Recommendation H.264/MPEG-4 (Part 10) Advanced Video Coding (commonly referred to as H.264/AVC) is the newest entry in the series of international video coding standards. It is currently the most powerful and state-of-the-art standard, and was developed by a Joint Video Team (JVT) consisting of experts from ITU-T's
Video Coding Experts Group (VCEG) and ISO/IEC's Moving Picture Experts Group (MPEG). As has been the case with past standards, its design provides the most current balance between coding efficiency, implementation complexity and cost, based on the current state of VLSI design technology (CPUs, DSPs, ASICs, FPGAs, etc.). In the process, a standard was created that improved coding efficiency by a factor of at least two (on average) over MPEG-2 while keeping the cost within an acceptable range. In July 2004, a new amendment was added to this standard, called the Fidelity Range Extensions (FRExt, Amendment 1), which demonstrates an even higher coding efficiency against MPEG-2, potentially attaining as much as 3:1 for some key applications.

While having a broad range of applications, the initial H.264/AVC standard (as it was completed in May 2003) was primarily focused on “entertainment-quality” video, based on 8 bits/sample and 4:2:0 chroma sampling. Given its time constraints, it did not include support for use in the most demanding professional environments, and the design had not been focused on the highest video resolutions. For applications such as programme contribution, programme distribution, and studio editing and post-processing, it may be necessary to:

- Use more than 8 bits per sample of source video accuracy.
- Use a higher resolution for colour representation than is typical in consumer applications (i.e. 4:2:2 or 4:4:4 sampling as opposed to the 4:2:0 chroma sampling format).
- Perform source editing functions such as alpha blending (a process for blending multiple video scenes, best known for use in weather reporting, where it is used to key video of a newscaster over video of a map or weather-radar scene).
- Use very high bit rates.
- Use very high resolution.
- Achieve very high fidelity, even representing some parts of the video losslessly.
- Avoid colour-space transformation rounding error.
- Use RGB colour representation.

The FRExt project produced a suite of four new profiles collectively called the High Profiles:

- The High Profile (HP), supporting 8-bit video with 4:2:0 sampling, addressing high-end consumer use and other applications using high-resolution video without a need for extended chroma formats or extended sample accuracy.
- The High 10 Profile (Hi10P), supporting 4:2:0 video with up to 10 bits of representation accuracy per sample.
- The High 4:2:2 Profile (H422P), supporting up to 4:2:2 chroma sampling and up to 10 bits per sample.
- The High 4:4:4 Profile (H444P), supporting up to 4:4:4 chroma sampling, up to 12 bits per sample, and additionally supporting efficient lossless coding and an integer residual colour transform for coding RGB video while avoiding colour-space transformation errors.

As FRExt is still rather new, and as some of the benefits of FRExt are perceptual rather than objective, it is somewhat more difficult to measure its capability. One relevant data point is the result of a
subjective quality evaluation done by the Blu-ray Disc Association (BDA). The summary results are reproduced in Fig. 1 from the test report [Wedi and Kashiwagi, 2004]. This test, conducted on a 24 frame/s film programme with 1 920 × 1 080 progressive scanning, shows the following nominal results (which should not be considered rigorously statistically proven):

- The High Profile of FRExt produced nominally better video quality than MPEG-2 when using only one third as many bits (8 Mbit/s versus 24 Mbit/s).
- The High Profile of FRExt produced nominally transparent video quality (i.e. difficult to distinguish from the original video without compression) at only 16 Mbit/s.

The quality bar (3.0), considered adequate for use on high-definition packaged media in this organization, was significantly surpassed using only 8 Mbit/s. Moreover, there were sub-optimalities in the H.264/AVC coding method used in these tests. Thus, the bit rate can likely be reduced significantly below 8 Mbit/s while remaining above the 3.0 quality bar, which establishes a quality sufficient to be called “acceptable HD” in that demanding application.

FIGURE 1
Comparison of MPEG-2 to H.264

The result of an
example objective (peak signal-to-noise ratio, PSNR) comparison test performed by FastVDO⁵ is shown in Fig. 2. These objective results confirm the strong performance of the High Profile. (Again, sub-optimal uses of B frames make the plotted performance conservative for FRExt.)

FIGURE 2
PSNR comparison

⁵ FastVDO is a company specializing in technology for media communications and infrastructure software. It is located in Columbia, MD, USA.

References

SULLIVAN, G.J., TOPIWALA, P. and LUTHRA, A. [2004] The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions. Presented at the SPIE Conference on Applications of Digital Image Processing XXVII. Special Session on Advances in the New Emerging Standard: H.264/AVC.

WEDI, T. and