Rep. ITU-R BS.2054-1

REPORT ITU-R BS.2054-1

Audio levels and loudness

(2005-2008)

CONTENTS

1 Introduction
2 Simulcasting of analogue and MPEG-1 layer II digital audio
3 Loudness considerations
4 Operating in the analogue domain
5 Operating in the digital domain
6 Use of audio meters
6.1 VU meters
6.2 Peak meters
6.3 Loudness meters
7 Harmonization of audio alignment levels for digital programme exchange - Adoption of SMPTE RP 155
8 Peak audio level
9 The studio environment
10 Application of volume compression in post-production following the final mix of a television commercial soundtrack
11 Ingest of soundtracks into the television broadcasting chain
12 Tests performed at CBS
13 Summary

1 Introduction

This Report describes advice for the broadcasting of television programmes which include pre-recorded television advertisements (commercials), with particular reference to audio levels and loudness. It considers the several processes of studio production, recording onto storage media, transport of the media, and broadcast via a television presentation and transmission system.

This descriptive material is provided as guidance. It describes one administration's approach to dealing with the ingest and transmission of television soundtracks, in particular the factors which contribute to loudness.

Question ITU-R 2/6 decides that the following Questions should be studied:
a) What audio metering characteristics should be used to provide an accurate indication of signal level in order to assist the operator to avoid overload of digital media?
b) What audio metering characteristics should be used to provide an accurate indication of subjective programme loudness?

While the work of the Rapporteur on Level Metering within
Working Party 6P is progressing, some administrations are providing interim measures to address audio levels and loudness. For instance, Australian television broadcasters have established a common alignment level, -20 dBFS, in accordance with SMPTE RP 155. Guidelines have been introduced specifying that volume compression, where used after the final mix of a television commercial soundtrack, be restricted to a slope of 2:1 with an onset point of -12 dBFS.

2 Simulcasting of analogue and MPEG-1 layer II digital audio

Many administrations are currently progressing through or planning a transition to
digital broadcasting. During the transition a requirement exists for simultaneous broadcasting in analogue and digital form. Analogue and digital broadcasting systems have different parameters which in turn influence the operational characteristics of audio processors and, consequently, affect the perceived loudness of the analogue and digitally transmitted sound.

3 Loudness considerations

Within most broadcasting systems, television broadcasters transmit material of varying programme genres contiguously and interspersed with inserted material, including advertisements. There is a potential for variations in the perceived loudness of adjacent audio segments. The factors contributing to perceived loudness are complex, but the correct alignment of audio levels through the various stages of production and transmission and the careful management of dynamic range and spectral content all contribute to preventing extreme variations in perceived loudness.

4 Operating in the analogue domain

In the analogue domain, the amount of headroom available in recording devices is limited by the amount of distortion that may occur at high levels of audio. Magnetic recordings will be limited by the integrity of the magnetic image at high levels of coercion. The analogue television transmission system is limited by the allowable deviation of the FM sound carrier.

Figure 1 describes the parameters and limits of the analogue television transmission audio system as adopted in Australia.

The lower limit of the analogue studio audio system is determined by the level of the system noise. Typically, this is maintained at least 50 dB below the alignment level, providing some 70 dB of dynamic range.

Although a studio analogue system and recording devices may provide adequate headroom for dynamic production, the limiting factor in an analogue broadcasting chain is the allowable deviation of the FM transmitter. Broadcasters have to apply some compression and limiting of the audio level at the transmitter input to avoid over-deviation, particularly at high frequencies, where pre-emphasis can reduce the clipping level significantly compared to that at 400 Hz.

It is necessary to define the nature of peak level. Peak level in this context, and as used throughout this document, means what is commonly called “quasi peak” level: the level as measured by a PPM type meter having (typically) an integration time of 10 ms. True peak level of programme content is usually not measured in the context of analogue audio systems.

5 Operating in the digital domain

Where production, post-production, switching and mixing of television programmes and advertisements are carried out in the digital domain, the digital audio system is aligned using a 1 kHz sine wave reference signal 20 dB below digital full scale, “0 dBFS” (i.e. 20 dB below the level at which digital clipping of the sine wave signal commences). The alignment level of -20 dBFS is usually equated to zero on the station's VU meters or “4” or “TEST” on PPM meters.

Figure 2 describes the parameters and limits of the digital audio system as adopted by Australia.

In a 16-bit digital audio system, where there is a theoretical dynamic range of some 96 dB, if the average audio level is zero VU and the quasi peak level is some 11 dB above alignment level, then some 9 dB of range will remain before digital clipping occurs.

6 Use of audio meters

6.1 VU meters

Standard VU (Volume Unit) meters are commonly used in television facilities in many administrations to adjust audio levels during the recording, playback and transmission of audio material. A VU meter will indicate an “average” audio level, but cannot indicate the instantaneous level of audio peaks. The standard VU meter has an integration time of 300 ms and thus its “average” reading relates to that particular integration time.
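
As a rough illustration of why a 300 ms integration time hides short peaks, the Python sketch below compares a steady 1 kHz tone with a 10 ms burst of the same tone. It is not a model of real VU meter ballistics: the rectangular 300 ms averaging window, the 48 kHz sample rate and all function names are assumptions made for this example only.

```python
import math

SAMPLE_RATE = 48000  # Hz, assumed for this sketch
WINDOW_S = 0.3       # rectangular window standing in for the 300 ms integration time


def tone(freq_hz, duration_s, amplitude=1.0):
    """Return a sine tone as a list of samples."""
    n = int(round(duration_s * SAMPLE_RATE))
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def pad_to_window(samples):
    """Pad with silence so the signal fills exactly one averaging window."""
    n_window = int(round(WINDOW_S * SAMPLE_RATE))
    return samples + [0.0] * (n_window - len(samples))


def mean_rectified(samples):
    """Average of the rectified signal over the whole window."""
    return sum(abs(x) for x in samples) / len(samples)


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(x) for x in samples)


def db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)


steady = tone(1000, WINDOW_S)             # tone lasting the full window
burst = pad_to_window(tone(1000, 0.010))  # 10 ms burst, then silence

average_drop_db = db(mean_rectified(burst) / mean_rectified(steady))
peak_difference_db = db(peak(burst) / peak(steady))

print(f"average reading, burst relative to steady tone: {average_drop_db:.1f} dB")
print(f"peak level, burst relative to steady tone: {peak_difference_db:.1f} dB")
```

Under these assumptions the averaged reading of the burst is about 29.5 dB below that of the steady tone, because the burst occupies one thirtieth of the window, while the peak level of the two signals is the same.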

A VU meter scale is not capable of directly representing the full dynamic range of audio signals in analogue or digital systems, but VU meters provide a convenient means of ascertaining that the audio level is within normal parameters relative to the alignment level, and they provide some limited indication of loudness.

It should be noted that in production and broadcasting establishments there is a wide range of audio level indicators, including LED and on-screen devices, commonly referred to as “VU meters”. Although most audio level indicators will return consistent measurements under steady state conditions, such as those produced by a sinusoidal alignment tone, there may be inconsistencies between instruments where they are used to measure programme levels in dynamic material.

6.2 Peak meters

Where meters such as Peak Programme Meters (PPMs) are used, they indicate quasi
peak levels more accurately than the VU meter because their internal time constants are optimized for such measurement. Since Peak Programme Meters are somewhat better indicators of quasi peak audio level, they are more useful in the management of system headroom but are of limited use in assessing loudness.

True peak meters indicate the absolute peak level of a signal to a limit of single digital sample values. As such they have no value or use in assessing loudness. Their value lies in controlling digital overload. A suitable true peak meter algorithm is described in Recommendation ITU-R BS.1770.

6.3 Loudness meters

The development of new meters capable of measuring the transmitted electrical signal in a way that will correlate to the human perception of loudness when that electrical signal is reproduced on loudspeakers in a typical “domestic” listening environment has led to the development of Recommendation ITU-R BS.1770. It is envisaged that such meters will eventually provide producers and broadcasters with an objective means of comparing the perceived loudness of adjacent programme segments or commercial/programme junctions or differences between services solely
by measuring the electrical signal level.

7 Harmonization of audio alignment levels for digital programme exchange - Adoption of SMPTE RP 155

Australian television broadcasters have adopted SMPTE RP 155 audio levels for the digital audio interface. For television recordings, a sinusoidal steady state tone at 1 kHz representing the alignment level of -20 dBFS should precede programme material presented for broadcast. This level is usually equated to zero (zero VU) on the station's VU type audio level meters and is used to align the broadcaster's recording and transmission equipment to the same reference level as the originating equipment.

When measured with a VU type meter, the normal audio level of the programme material that follows the alignment signal should be approximately zero VU. When measured with a PPM or digital equivalent type meter, the normal quasi peak audio level of programme material will, depending on the level of processing used, vary typically in the range of +2 to +9 dB above alignment level (heavily compressed commercials and pop music may not peak above 4 or 4.5 on a PPM; in contrast, wide range classical music might read PPM 6 to 6.5).

8 Peak audio level

In the digital domain, quasi peak audio levels should not exceed -9 dBFS, i.e. quasi peak excursions should not be more than 11 dB above the alignment level. It must be understood that +11 dB in this context is not a deliberate aim point for production levels, but is a technical limit to be observed. This limit will help ensure that short-duration true peaks do not reach 0 dBFS (full scale).

In an analogue FM transmission system, quasi peak audio levels should not exceed the alignment level by more than 8 dB.

These levels are recommended for optimum use of the available headroom in the analogue and digital systems.

9 The studio environment

Studio analogue sound systems are capable of mixing, recording and reproducing material with dynamic ranges extending from the level of the audible system noise to the level at which distortion is unacceptable. For practical purposes, this represents a dynamic range of some 70 dB.

Studio digital sound systems typically operate with dynamic ranges of more than 90 dB. The lower limit in a digital audio system is determined by the theoretical digital noise floor where there is no meaningful data. This lower limit is principally determined by the audio
word length (16, 18, 20, 24 bits). The upper limit in a digital audio system is defined as the full-scale digital level, 0 dBFS. At that point, digital clipping occurs because the audio signal cannot be adequately represented by the finite number of data bits available.

Using a VU metering system, programme audio material should be recorded such that the normal programme level is around zero VU, with occasional louder passages allowed to exceed this level by 2 or 3 dB (+3 dB being the limit on most VU meters). In a normal broadcast audio mix of speech and music and/or sound effects (not a proprietary multichannel surround mix), the dialogue level will typically fall at around 2 to 3 dB below the alignment level. For significantly processed material such as commercials or pop music, the VU meter reading should not be permitted to exceed zero VU.

In both the production and transmission phases of audio, it is common to employ various forms of audio processing. In the production phase, processing is a normal part of the creative process. However, both the production and emission processing will have a common aim, i.e. to provide material to the viewer so that the loudest and softest passages of the material can be enjoyed without the need to adjust the receiver volume control. An extreme example of this is the emission processing of cinema style audio mixes, which require compression for comfortable listening in the home environment.

The consequence of compression is reduction of the
ratio between the peak and “average” level of the content. Increasing the “average” level will increase the apparent loudness. The human ear tends to be more sensitive to frequencies in the mid range, and if these frequencies are artificially boosted, then again the apparent loudness will increase.
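
The effect on the peak-to-“average” ratio can be sketched numerically. The Python example below applies the 2:1 slope and -12 dBFS onset point quoted in the Introduction as a static, sample-by-sample curve to a test signal and compares the peak-to-RMS ratio (crest factor) before and after. The absence of attack and release behaviour, the 48 kHz sample rate, the test signal and all function names are assumptions made for illustration; this is not a description of any broadcaster's processor.

```python
import math

THRESHOLD_DBFS = -12.0  # onset point quoted in the guidelines
RATIO = 2.0             # 2:1 slope quoted in the guidelines
SAMPLE_RATE = 48000     # Hz, assumed for this sketch


def compress_level_db(level_dbfs):
    """Static curve: 1:1 below the onset point, 2:1 above it."""
    if level_dbfs <= THRESHOLD_DBFS:
        return level_dbfs
    return THRESHOLD_DBFS + (level_dbfs - THRESHOLD_DBFS) / RATIO


def compress_sample(x):
    """Apply the static curve to one sample (no attack or release: an assumption)."""
    if x == 0.0:
        return 0.0
    out_db = compress_level_db(20 * math.log10(abs(x)))
    return math.copysign(10 ** (out_db / 20), x)


def tone(level_dbfs, duration_s, freq_hz=1000):
    """Sine tone whose peak sits at level_dbfs."""
    amplitude = 10 ** (level_dbfs / 20)
    n = int(round(duration_s * SAMPLE_RATE))
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def peak_db(samples):
    return 20 * math.log10(max(abs(x) for x in samples))


def rms_db(samples):
    return 10 * math.log10(sum(x * x for x in samples) / len(samples))


def crest_factor_db(samples):
    """Peak level minus RMS level, in dB."""
    return peak_db(samples) - rms_db(samples)


# Hypothetical test signal: a quiet passage followed by a short loud passage.
signal = tone(-20.0, 0.9) + tone(-6.0, 0.1)
compressed = [compress_sample(x) for x in signal]

print(f"input  -6 dBFS maps to {compress_level_db(-6.0):.1f} dBFS")
print(f"crest factor before: {crest_factor_db(signal):.2f} dB")
print(f"crest factor after:  {crest_factor_db(compressed):.2f} dB")
```

With this curve an input peak of -6 dBFS is mapped to -9 dBFS, and the crest factor of the compressed signal is smaller than that of the original. If gain were then added to restore the original peak level, the “average” level would be higher than before, which is the mechanism by which apparent loudness increases.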

The use of audio processing must be judicious so that the compression of the dynamic range of the soundtrack plus any other processing employed does not produce excessively loud or strident material. Soundtrack production studios often employ gates to attenuate or eliminate the sounds below a lower
threshold, and peak limiters to prevent audio exceeding the level that causes distortion or digital clipping. These devices should not be used for the purpose of increasing the relative loudness of the material.

10 Application of volume compression in post-production following the final mix of a television commercial soundtrack

Australian free-to-air commercial television broadcasters have introduced guidelines specifying that volume compression should, where used after the final mix, be restricted to a slope of 2:1 with an onset point of -12 dBFS. Figure 3 provides a diagrammatic representation of this simple profile.

In this profile, an onset of compression at -12 dBFS allows for gentle compression of the upper 3 dB of the signal before reaching the maximum permissible peak level. If any further peak limiting were to be necessary, it would be provided automatically by the broadcaster's transmission processor.

The elements of a soundtrack, namely dialogue, music and effects, are subject to various processes during production. Where these elements sit in the final soundtrack, with respect to audio levels and loudness, is the result of a final mix and effectively
it is here that the loudness of the soundtrack will be principally influenced.

[FIGURE 3: compression profile, output level plotted against input level. Labels: alignment level (-20 dBFS), onset of compression (-12 dBFS), 1:1 slope (no compression), 2:1 slope (compression), maximum permitted level using a PPM (-9 dBFS).]

Material that
has been compressed may sound louder, even though there is no increase in peak level. This is because compression of a soundtrack may raise the energy content of the sound by reducing the dynamic range (i.e. the difference between the loudest and softest levels of the sound), thereby making it denser.

Many modern processors are not calibrated in dB, have constantly varying compression ratios and are likely to be multiband devices which apply different amounts of compression in different frequency bands. This makes it difficult for soundtrack