Rec. ITU-R BR.1352-3

RECOMMENDATION ITU-R BR.1352-3

File format for the exchange of audio programme materials with metadata on information technology media

(Question ITU-R 58/6)

(1998-2001-2002-2007)

Scope

This Recommendation contains the specification of the broadcast audio extension chunk¹ and its use with PCM-coded, and MPEG-1 or MPEG-2 audio data. Basic information on the RIFF format and how it can be extended to other types of audio data is also included.

¹ A chunk is the basic building block of a file in the Microsoft Resource Interchange File Format (RIFF).

The ITU Radiocommunication Assembly,

considering

a) that storage media based on Information Technology, including data disks and tapes, have penetrated all areas of audio production for radio broadcasting, namely non-linear editing, on-air play-out and archives;
b) that this technology offers significant advantages in terms of operating flexibility, production flow and station automation and it is therefore attractive for the up-grading of existing studios and the design of new studio installations;
c) that the adoption of a single file format for signal interchange would greatly simplify the interoperability of individual equipment and remote studios, it would facilitate the desirable integration of editing, on-air play-out and archiving;
d) that a minimum set of broadcast related information must be included in the file to document the metadata related to the audio signal;
e) that, to ensure the compatibility between applications with different complexity, a minimum set of functions, common to all the applications able to handle the recommended file format must be agreed;
f) that Recommendation ITU-R BS.646 defines the digital audio format used in audio production for radio and television broadcasting;
g) that the need for exchanging audio materials also arises when ISO/IEC 11172-3 and ISO/IEC 13818-3 coding systems are used to compress the signal;
h) that the compatibility with currently available commercial file formats could minimize the industry efforts required to implement this format in the equipment;
j) that a standard format for the coding history information would simplify the use of the information after programme exchange;
k) that the quality of an audio signal is influenced by signal processing experienced by the signal, particularly by the use of non-linear coding and decoding during bit-rate reduction processes,

recommends

1 that, for the exchange of audio programmes on Information Technology media, the audio signal parameters, sampling frequency, coding resolution and pre-emphasis should be set in agreement with the relevant parts of Recommendation ITU-R BS.646;
2 that the file format specified in Annex 1 should be used for the interchange of audio programmes in linear pulse code modulation (PCM) format on Information Technology media;
3 that, when the audio signals are coded using ISO/IEC 11172-3 or ISO/IEC 13818-3 coding systems, the file format specified in Annex 1 and complemented with Annex 2 should be used for the interchange of audio programmes on Information Technology media²;
4 that, when the file format specified in Annexes 1 and/or 2 is used to carry information on the audio material gathered and computed by a capturing workstation (Digital Audio Workstation (DAW)), the metadata should conform to the specifications detailed in Annex 3.

Annex 1

Specification of the broadcast wave format

A format for audio data files in broadcasting

1 Introduction

The Broadcast Wave Format (BWF) is based on the Microsoft WAVE audio file format
which is a type of file specified in the Microsoft “Resource Interchange File Format”, RIFF. WAVE files specifically contain audio data. The basic building block of the RIFF file format, called a chunk, contains a group of tightly related pieces of information. It consists of a chunk identifier, an integer value representing the length in bytes and the information carried. A RIFF file is made up of a collection of chunks.

For the BWF, some restrictions are applied to the original WAVE format. In addition the BWF file includes a Broadcast Audio Extension chunk. This is illustrated in Fig. 1.

² It is recognized that a recommendation in that sense could penalize developers using some computer platforms.

This Annex contains the specification of the broadcast audio extension chunk that is used in all BWF files. In addition, information on the basic RIFF format and how it can be extended to other types of audio data is given in Appendix 1. Details of the PCM wave format are also given in Appendix 1. Detailed specifications of the extension to other types of audio data, and metadata are included in Annexes 2 and 3.

1.1 Normative provisions

Compliance with this Recommendation is voluntary. However,
the Recommendation may contain certain mandatory provisions (to ensure, e.g. interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words “shall” or some other obligatory language such as “must” and the negative equivalents are used to express those mandatory provisions. The use of such words does not suggest that compliance with the Recommendation is required of any party.

2 Broadcast wave format (BWF) file

2.1 Contents of a broadcast wave format file

A broadcast wave format file shall start with the mandatory Microsoft RIFF “WAVE” header and at least the following chunks:

RIFF('WAVE'
    <fmt-ck>                               /* Format of the audio signal: PCM/MPEG */
    <broadcast_audio_extension>            /* information on the audio sequence */
    <universal broadcast audio extension>  /* ubxt is required for multi-byte language support only */
    <fact-ck>                              /* Fact chunk is required for MPEG formats only */
    <mpeg_audio_extension>                 /* MPEG Audio Extension chunk is required for MPEG formats only */
    <wave-data> )                          /* sound data */
    <quality-chunk>                        /* only required when information concerning relevant events impacting quality is needed */

NOTE 1 Additional chunks may be present in the file. Some of these may be outside the scope of this Recommendation. Applications may or may
not interpret or make use of these chunks, so the integrity of the data contained in such unknown chunks cannot be guaranteed. However, compliant applications shall pass on unknown chunks transparently.

2.2 Existing chunks defined as part of the RIFF standard

The RIFF standard is defined in documents issued by the Microsoft³ Corporation. This application uses a number of chunks that are already defined. These are:

    <fmt-ck>
    <fact-ck>

The current descriptions of these chunks are given for information in Appendix 1 to Annex 1.

³ Microsoft Resource Interchange File Format, RIFF, available (2005-12) at http:/

2.3 Broadcast audio extension chunk⁴

Extra parameters needed for exchange of material between broadcasters are added in a specific “Broadcast Audio Extension” chunk defined as follows:

broadcast_audio_extension
typedef struct {
    DWORD ckID;                     /* (broadcastextension) ckID=bext. */
    DWORD ckSize;                   /* size of extension chunk */
    BYTE  ckData[ckSize];           /* data of the chunk */
}

typedef struct broadcast_audio_extension {
    CHAR  Description[256];         /* ASCII : "Description of the sound sequence" */
    CHAR  Originator[32];           /* ASCII : "Name of the originator" */
    CHAR  OriginatorReference[32];  /* ASCII : "Reference of the originator" */
    CHAR  OriginationDate[10];      /* ASCII : "yyyy:mm:dd" */
    CHAR  OriginationTime[8];       /* ASCII : "hh:mm:ss" */
    DWORD TimeReferenceLow;         /* First sample count since midnight, low word */
    DWORD TimeReferenceHigh;        /* First sample count since midnight, high word */
    WORD  Version;                  /* Version of the BWF; unsigned binary number */
    BYTE  UMID_0;                   /* Binary byte 0 of SMPTE UMID */
    ....
    BYTE  UMID_63;                  /* Binary byte 63 of SMPTE UMID */
    CHAR  Reserved[190];            /* 190 bytes, reserved for future use, set to "NULL" */
    CHAR  CodingHistory[];          /* ASCII : "History coding" */
} BROADCAST_EXT;

⁴ See 2.4 for ubxt chunk definition, to express the human-readable information of the bext chunk in a multi-byte character set.

Field / Description

Description
ASCII string (maximum 256 characters) containing a free description of the sequence. To help applications which only display a short description it is recommended that a
summary of the description is contained in the first 64 characters and the last 192 characters are used for details. If the length of the string is less than 256 characters the last one is followed by a null character (0x00).

Originator
ASCII string (maximum 32 characters) containing the name of the originator/producer of the audio file. If the length of the string is less than 32 characters the field is ended by a null character (0x00).

OriginatorReference
ASCII string (maximum 32 characters) containing a non-ambiguous reference allocated by the originating organization. If the length of the string is less than 32 characters the field is ended by a null character (0x00). A standard format for the “Unique” Source Identifier (USID) information for use in the OriginatorReference field is given in Appendix 3 to Annex 1.

OriginationDate
10 ASCII characters containing the date of creation of the audio sequence. The format is “year-month-day”, with 4 characters for the year and 2 characters per other item.
Year is defined from 0000 to 9999.
Month is defined from 1 to 12.
Day is defined from 1 to 31.
The separator between the items should be a hyphen in compliance with ISO 8601. Some legacy implementations may use “_” (underscore), “:” (colon), “ ” (space) or “.” (stop); reproducing equipment should recognize these separator characters.

OriginationTime
8 ASCII characters containing the time of creation of the audio sequence. The format is “hour-minute-second”, with 2 characters per item.
Hour is defined from 0 to 23.
Minute and second are defined from 0 to 59.
The separator between the items should be a hyphen in compliance with ISO 8601. Some legacy implementations may use “_” (underscore), “:” (colon), “ ” (space) or “.” (stop); reproducing equipment should recognize these separator characters.
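The separator rule for OriginationDate and OriginationTime can be illustrated with a short C sketch. It is only a sketch under stated assumptions: the type and function names (bwf_date, bwf_time, bwf_parse_date, bwf_parse_time) are hypothetical and are not defined by this Recommendation. The code reads the fixed-width fields, accepts the recommended hyphen as well as the legacy separators listed above, and applies the value ranges given in the field descriptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper types; the Recommendation only defines the raw fields. */
typedef struct { int year, month, day; } bwf_date;
typedef struct { int hour, minute, second; } bwf_time;

/* Separators a reader should accept: the recommended hyphen plus the
   legacy underscore, colon, space and stop. */
static int is_separator(char c)
{
    return c == '-' || c == '_' || c == ':' || c == ' ' || c == '.';
}

/* Convert exactly n ASCII digits to an int; return -1 on any non-digit. */
static int read_digits(const char *p, size_t n)
{
    int value = 0;
    for (size_t i = 0; i < n; i++) {
        if (!isdigit((unsigned char)p[i]))
            return -1;
        value = value * 10 + (p[i] - '0');
    }
    return value;
}

/* Parse the 10-byte OriginationDate field (no terminating null is needed).
   Returns 1 on success, 0 if the field is malformed or out of range. */
int bwf_parse_date(const char field[10], bwf_date *out)
{
    assert(field != NULL && out != NULL);
    if (!is_separator(field[4]) || !is_separator(field[7]))
        return 0;
    int y = read_digits(field, 4);
    int m = read_digits(field + 5, 2);
    int d = read_digits(field + 8, 2);
    if (y < 0 || m < 1 || m > 12 || d < 1 || d > 31)
        return 0;
    out->year = y;
    out->month = m;
    out->day = d;
    return 1;
}

/* Parse the 8-byte OriginationTime field in the same way. */
int bwf_parse_time(const char field[8], bwf_time *out)
{
    assert(field != NULL && out != NULL);
    if (!is_separator(field[2]) || !is_separator(field[5]))
        return 0;
    int h = read_digits(field, 2);
    int mi = read_digits(field + 3, 2);
    int s = read_digits(field + 6, 2);
    if (h < 0 || h > 23 || mi < 0 || mi > 59 || s < 0 || s > 59)
        return 0;
    out->hour = h;
    out->minute = mi;
    out->second = s;
    return 1;
}
```

A reader would typically call bwf_parse_date on the 10 bytes of OriginationDate and bwf_parse_time on the following 8 bytes. A writer, by contrast, should emit only the hyphen separator. The sketch does not check the day against the length of the month, since the field description only bounds it to 1 to 31.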

TimeReference
This field contains the time-code of the sequence. It is a 64-bit value which contains the first sample count since midnight. The number of samples per second depends on the sample frequency that is defined in the field <nSamplesPerSec> from the <format chunk>.

Version
An unsigned binary number giving the version of the BWF. For Version 1, this is set to 0x0001.

UMID
64 bytes containing an extended UMID defined by SMPTE 330M. If a 32-byte basic UMID is used, the last 32 bytes should be filled with zeros. If no UMID is available, the 64 bytes should be filled with zeros.
NOTE The length of the UMID is coded at the head of the UMID itself.

Reserved
190 bytes reserved for extension. These 190 bytes should be set to zero.

Coding History
A variable-size block of ASCII characters comprising 0 or more strings each terminated by <CR><LF>. The first unused character should be a null character (0x00). Each string should contain a description of a coding process applied to the audio data. Each new coding application should add a new string with the appropriate information. A standard format for the coding history information is given in Appendix 2 to Annex 1. This information must contain the type of sound (PCM or MPEG) with
its specific parameters:
- PCM: mode (mono, stereo), size of the sample (8, 16 bits) and sample frequency;
- MPEG: sampling frequency, bit rate, Layer (I or II) and the mode (mono, stereo, joint stereo or dual channel).
It is recommended that the manufacturers of the coders provide an ASCII string for use in the coding history.

2.4 Universal broadcast audio extension chunk

The information contained in the Broadcast Audio Extension (bext) chunk defined in 2.3 may additionally be carried by a dedicated chunk called “Universal Broadcast Audio Extension”, or “ubxt” chunk, to express the human-readable information of the bext chunk in multi-byte languages. The basic structure of this metadata chunk is the same as that of the bext chunk. Four human-readable items, uDescription, uOriginator, uOriginatorReference and uCodingHistory, are described in UTF-8 (8-bit UCS Transformation Format) instead of
ASCII. The first three items have 8 times the data size of the corresponding items in the bext chunk. The structure of the ubxt chunk is defined as follows:

typedef struct chunk_header {
    DWORD ckID;                       /* (universal broadcast extension) ckID=ubxt */
    DWORD ckSize;                     /* size of extension chunk */
    BYTE  ckData[ckSize];             /* data of the chunk */
} CHUNK_HEADER;

typedef struct universal_broadcast_audio_extension {
    BYTE  uDescription[256*8];        /* UTF-8 : "Description of the sound sequence" */
    BYTE  uOriginator[32*8];          /* UTF-8 : "Name of the originator" */
    BYTE  uOriginatorReference[32*8]; /* UTF-8 : "Reference of the originator" */
    CHAR  OriginationDate[10];        /* ASCII : "yyyy:mm:dd" */
    CHAR  OriginationTime[8];         /* ASCII : "hh:mm:ss" */
    DWORD TimeReferenceLow;           /* First sample count since midnight, low word */
    DWORD TimeReferenceHigh;          /* First sample count since midnight, high word */
    WORD  Version;                    /* Version of the BWF; unsigned binary number */
    BYTE  UMID_0;                     /* Binary byte 0 of SMPTE UMID */
    ....
    BYTE  UMID_63;                    /* Binary byte 63 of SMPTE UMID */
    CHAR  Reserved[190];              /* 190 bytes, reserved for future use, set to "NULL" */
    BYTE  uCodingHistory[];           /* UTF-8 : "Coding history" */
} UNIV_BROADCAST_EXT;

Field / Description

uDescription
UTF-8 string, 2 048 bytes or less, containing a description of the sequence. If data is not available or if the length of the string is less than 2 048 bytes, the first unused byte shall be a null character (0x00).

uOriginator
UTF-8 string, 256 bytes or less, containing the name of the originator of the audio file. If data is not available or if the length of the string is less than 256 bytes, the first unused byte shall be a null character (0x00).

uOriginatorReference
UTF-8 string, 256 bytes or less, containing a reference allocated by the originating organization. If data is not available or if the length of the string is less than 256 bytes, the first unused byte shall be a null character (0x00).

OriginationDate
10 ASCII characters containing the date of creation of the audio sequence. The format is “year-month-day”, with 4 characters for the year and 2 characters per other item.
Year is defined from 0000 to 9999.
Month is defined from 1 to 12.
Day is defined from 1 to 31.
The separator between the items should be a hyphen in compliance with ISO 8601. Some legacy implementations may use “_” (underscore), “:” (colon), “ ” (space) or “.” (stop); reproducing equipment should recognize these separator characters.

OriginationTime
8 ASCII characters containing the time of creation of the audio sequence. The format is “hour-minute-second”, with 2 characters per item.
Hour is defined from 0 to 23.
Minute and second are defined from 0 to 59.
The separator between the items should be a hyphen in compliance with ISO 8