SMPTE RDD 29-2014 Dolby Atmos Bitstream Specification.pdf


Copyright 2014 by THE SOCIETY OF MOTION PICTURE AND TELEVISION ENGINEERS
3 Barker Avenue, White Plains, NY 10601
(914) 761-1100

Approved June 17, 2014

The attached document is a Registered Disclosure Document prepared by the proponent identified below. It has been examined by the appropriate SMPTE Technology Committee and is believed to contain adequate information to satisfy the objectives defined in the Scope, and to be technically consistent. This document is NOT a Standard, Recommended Practice or Engineering Guideline, and does NOT imply a finding or representation of the Society. Errors in this document should be reported to the proponent identified below, with a copy to eng@smpte.org.

This document is intended to allow the interpretation of Dolby Atmos Bitstream files. It is not intended to support the development of hardware or software applications that create or process these files. Creation and processing of such files is reserved to individuals and organizations that have entered into agreements with the proponent identified below for this purpose. Use of this document to produce or process Dolby Atmos files using non-Dolby tools would potentially cause user confusion,
diminished sound quality as experienced by content consumers, and damage to the reputation of the Dolby Atmos brand and to Dolby Laboratories itself. All other inquiries in respect of this document, including inquiries as to intellectual property requirements that may be attached to use of the disclosed technology, should be addressed to the proponent identified below.

Proponent contact information:
Dean Bullock
Dolby Laboratories Inc.
100 Potrero Ave.
San Francisco, CA 94103
Email: [...]

Page 1 of 18 pages

SMPTE REGISTERED DISCLOSURE DOCUMENT
Dolby Atmos Bitstream Specification
SMPTE RDD 29:2014

Table of Contents

1 Scope
2 Bitstream Organization
    2.1 ATMOSFrame Element
    2.2 BedDefinition Element
    2.3 ObjectDefinition Element
    2.4 AudioDataDLC Element
3 Bitstream Conventions
    3.1 Position
    3.2 Relative distance coding
    3.3 Amplitude Gain
    3.4 Plex Coding
4 Bit Stream Syntax
    4.1 Syntax of ReadElement()
    4.2 Syntax of ATMOSFrame()
    4.3 Syntax of BedDefinition1()
    4.4 Syntax of ObjectDefinition1()
    4.5 Syntax of AudioDataDLC()
5 Bit Stream Field Description
    5.1 ReadElement() Data Fields
    5.2 ATMOSFrame() data fields
    5.3 BedDefinition1() Data Fields
    5.4 ObjectDefinition1() Data Fields
    5.5 AudioDataDLC() Data Fields

Introduction

Dolby Atmos is an advanced cinema sound format comprising an audio essence and metadata stream played through specialized renderers in the cinema.

1 Scope

This document defines the syntax of a frame-based Dolby Atmos bit stream. The bit stream carries audio essence and metadata necessary to reproduce a complete audio program.

2 Bitstream Organization

The audio program is segmented into Frames, with Frames transmitted 24, 25, 30, 48, 50, 60, 96, 100, or 120 times per second. The audio Frames are aligned with the program edit units. In most cases the edit units and the picture and audio frames are of the same duration and are time aligned. Support for frame rates above 120 Hz is not defined.

All audio data is encapsulated into "elements," similar in concept to "chunks" in the RIFF (Resource Interchange File Format) format. Each element begins with a unique identifier called ElementID. The second field, ElementSize, indicates the size in bytes of the entire element, not including the ElementID and ElementSize. Elements can contain sub-elements. The ElementSize includes the size of all sub-elements. Sub-elements contain additional description data related to the parent element.

At the top level, the entire audio frame is contained in a single ATMOSFrame element. All audio essence and metadata elements for a given frame are contained as sub-elements of the ATMOSFrame element as shown below.

[Figure: Frame Element containing a Bed Definition Element, an Audio Element, and an Object Definition Element]

Currently there are 4 element types specified in the Dolby Atmos bit stream: ATMOSFrame, BedDefinition, ObjectDefinition, and AudioDataDLC. The purpose of each element is described in the following sections.

2.1 ATMOSFrame Element

The ATMOSFrame element contains all the information that is common to the entire Dolby Atmos frame. Specifically, the ATMOSFrame contains the Dolby Atmos version,
audio sample rate, the audio bit depth, the audio frame rate, and the maximum number of rendered audio assets. All raw audio assets and metadata must be sub-elements of the ATMOSFrame element.

2.2 BedDefinition Element

A Dolby Atmos bed is a collection of audio channels. An audio channel is an audio stream that is intended to be played back with a nominal location (e.g. "Left" channel) or function (e.g. LFE). The BedDefinition element contains a list of the audio assets and the associated channel names.

2.3 ObjectDefinition Element

The Dolby Atmos system allows audio assets to be panned to any location independent of the physical or nominal loudspeaker configuration; these panned audio assets are called objects. The ObjectDefinition element provides all the information needed to pan an audio object. Each ObjectDefinition element updates the position of a single audio object at approximately 20-ms time intervals.

The Dolby Atmos presentation can have a large number of audio objects that will be independently rendered to the appropriate locations. To achieve this, the Dolby Atmos bit stream will contain multiple ObjectDefinition elements, each of which must be a direct sub-element of the ATMOSFrame element and must have a unique MetaID.

2.4 AudioDataDLC Element

Each AudioDataDLC element contains the audio assets for one track of audio, channel or object. Every audio track is losslessly compressed and exists for the duration of the program. The AudioDataDLC element supports sample rates of 48 kHz or 96 kHz with 24-bit resolution. Each AudioDataDLC element must be a direct sub-element of the ATMOSFrame element and must have a unique AudioDataID.

Note: audio object tracks are typically sparse; most audio events conveyed by objects have limited time extent, with digital zero signal between events. The AudioDataDLC element can efficiently indicate periods of silence to dramatically decrease the audio payload.

3 Bitstream Conventions

3.1 Position

Axes and Origin

Many of the metadata elements contained in the bitstream specify a relative position or size. In most cases position is described
relative to the playback environment using a unit cube to describe the room boundaries. The origin is taken to be the front left corner of the room. Position is then described using Euclidean (x, y, z) coordinates, assigned as follows:

x: lateral, or left/right position
   x=0 corresponds to left wall; x=1 corresponds to right wall.
y: longitude, or front/back position
   y=0 corresponds to front wall; y=1 corresponds to back wall.
z: elevation, or up/down position
   z=0 corresponds to a plane aligned with the screen, side and rear loudspeakers; z=1 corresponds to the ceiling.

For example, (0, 0, 0) is the front left corner at 0 elevation (left screen speaker), (1, 0, 0) is the front right corner at 0 elevation (right screen speaker), and (0.5, 0.5, 1) is the middle of the ceiling.

Metadata that describes position relative to the room uses the unit axes and origin described above; the location along each axis is coded using the distance coding method described below.

3.2 Relative distance coding

12 Bit Distance

Throughout the bitstream, distance metadata on or within the unit cube is coded as a 12-bit distance mantissa (D12) that maps linearly into the range [0, 1]; for example, 0x000 maps to 0.0 and 0xfff maps to 1.0. If D12 is interpreted as an unsigned 12-bit integer, D12 is mapped to a distance value as follows:

Distance = D12 / (2^12 - 1), 0 [...]

[...] 1)
    ObjectDecorCoef[sb]                                  8

ObjectDefinition1 Syntax                                 Word Size
    /* end if(PanInfoExists) */
    /* reads extra bits
       to get to byte alignment relative to the start of the frame */
    AlignBits                                            VARIABLE
    AudioDescription                                     8
    if(AudioDescription [...]

AudioDataDLC Syntax                                      Word Size
    [...] n < NumPredRegions; n++)
        RegionLength[n]                                  4
        FIROrder[n]                                      5
        IIROrder[n]                                      5
        for(m = 1; m <= FIROrder; m++)
            FIRPredictor[n][m]                           10
        for(m = 1; m <= IIROrder; m++)
            IIRPredictor[n][m]                           10
    /* Coded residual */
    for(n = 0; n < NumSubBlocks; n++)
        CodeType                                         1
        if(CodeType == 0)  /* PCM Residual */
            BitDepth                                     5
            for(l = 0; l < SubBlockSize; l++)
                Residual[n * SubBlockSize + l]           BitDepth
        else  /* Rice/Golomb Residual */
            RiceCode                                     5
            for(l = 0; l < SubBlockSize; l++)
                Residual[n * SubBlockSize + l]           VARIABLE
    /* 96kHz Residual Data */
    if(SampleRate == 0x1)
        /* Predictor information */
        NumPredRegions                                   2
        for(n = 0; n < NumPredRegions; n++)
            RegionLength[n]                              4
            FIROrder[n]                                  5
            IIROrder[n]                                  5
            for(m = 1; m <= FIROrder; m++)
                FIRPredictor[n][m]                       10
            for(m = 1; m <= IIROrder; m++)
                IIRPredictor[n][m]                       10
        /* Coded residual */
        for(n = 0; n < NumSubBlocks; n++)
            CodeType                                     1
            if(CodeType == 0)  /* PCM Residual */
                BitDepth                                 5
                for(l = 0; l < SubBlockSize; l++)
                    Residual[n * SubBlockSize + l]       BitDepth
            else  /* Rice/Golomb Residual */
                RiceCode                                 5
                for(l = 0; l < SubBlockSize; l++)
                    Residual[n * SubBlockSize + l]       VARIABLE
    /* Each Element must keep track of the number of bits read */
    AlignBits                                            VARIABLE
/* end of AudioDataDLC */

5 Bit Stream Field Description

5.1 ReadElement() Data Fields

5.1.1 ElementID    Plex(8)

Each Element block starts with an ElementID. The ElementID defines the type of element and the contents of the element. Depending on the ElementID the decoder will perform different tasks. Table 1 provides a list of ElementIDs. If the ElementID is not defined in the system, then the decoder shall skip the element.

Table 1 Dolby Atmos Element IDs

ElementID Name        Value    Meaning
ATMOS_FRAME           0x08     Frame Header
BED_DEFINITION1       0x10     Bed Definition Type 1
RESERVED              0x20     Reserved
OBJECT_DEFINITION1    0x40     Object Definition Type 1
RESERVED              0x80     Reserved
RESERVED              0x100    Reserved
AUDIO_DATA_DLC        0x200    Audio Data (DLC encoded)

5.1.2 ElementSize    Plex(8)

ElementSize indicates the size in bytes of the entire element, not including the ElementID and ElementSize. For the Frame Element, the ElementSize is the size of the entire audio frame (not including the ATMOSFrame ElementID and ElementSize), as all other elements are contained as sub-elements.

5.2 ATMOSFrame() data fields

5.2.1 ATMOSVersion    8 bits

The ATMOSVersion specifies the version of the Dolby Atmos Bit Stream. This field currently has the value of 0x1. This document describes the protocol with ATMOSVersion = 1.

5.2.2 SampleRate    2 bits

The SampleRate code specifies the sampling rate of the audio data. All
audio tracks, channels and objects, must have the same sampling rate. The SampleRate code is defined in Table 2.

Table 2 Sample Rate code

SampleRate Code    Meaning
0x0                48000 samples per second
0x1                96000 samples per second
0x2                RESERVED
0x3                RESERVED

5.2.3 BitDepth    2 bits

The BitDepth code specifies the bit depth of the object audio data. All audio tracks must have the same bit depth. The BitDepth code has the meanings specified in Table 3. Only 24 bits per audio sample are currently supported.

Table 3 Bit Depth Code

BitDepth Code    Meaning
0x0              RESERVED
0x1              24 bits per audio sample
0x2              RESERVED
0x3              RESERVED

5.2.4 FrameRate    4 bits

The FrameRate code specifies the audio frame rate. The FrameRate code has the meanings specified in Table 4.

Table 4 Frame Rate Code

FrameRate Code    Meaning
0x0               24 frames per second
0x1               25 frames per second
0x2               30 frames per second
0x3               48 frames per second
0x4               50 frames per second
0x5               60 frames per second
0x6               96 frames per second
0x7               100 frames per second
0x8               120 frames per second
0x9-0xF           RESERVED

The FrameRate code also controls the sample count (SampleCount) contained in each audio asset as specified by Table 5.

Table 5 Sample Count versus Frame Rate Code and Sample Rate

FrameRate Code    Sample Count 48 kHz    Sample Count 96 kHz
0x0               2000                   4000
0x1               1920                   3840
0x2               1600                   3200
0x3               1000                   2000
0x4               960                    1920
0x5               800                    1600
0x6               500                    1000
0x7               480                    960
0x8               400                    800
0x9-0xF           RESERVED               RESERVED

5.2.5 MaxRendered    Plex(8)

The MaxRendered code specifies the maximum number of audio assets that will be rendered during playback of the Dolby Atmos frame for theaters with the optimal target playback. For example, for a stream with 9.1 channel beds and 118 objects, the MaxRendered count would be set to 128.

5.2.6 SubElementCount    Plex(8)

The SubElementCount code is the number of elements contained in the current element.

5.3 BedDefinition1() Data Fields

5.3.1 MetaID    Plex(8)

MetaID is the unique ID that helps the system track metadata information between audio frames.

5.3.2 ChannelCount    Plex(4)

The channel count is the number of channels that make up the bed.

5.3.3 ChannelID    Plex(4)

The ChannelID code specifies the known channel locations. Table 6 provides a list of ChannelIDs and the associated loudspeaker names.

Table 6 Channel IDs

ChannelID Code    Meaning
0x0               Left Screen Speaker
0x1               Right Screen Speaker
0x2               Center Screen Speaker
0x3               LFE
0x4               Reserved
0x5               Reserved
0x6               Left Side Surround (7.1)
0x7               Right Side Surround (7.1)
0x8               Left Rear Surround (7.1)
0x9               Right Rear Surround (7.1)
0xA               Left Top Surround (9.1)
0xB               Right Top Surround (9.1)
otherwise         Reserved

5.3.4 AudioDataID    Plex(8)

The AudioDataID code is a unique identifier for each of the raw mono audio assets carried in the bit stream. An AudioDataID of NULL (0) indicates no audio asset.

5.4 ObjectDefinition1() Data Fields

5.4.1 NumPanSubBlocks    Informative

The NumPanSubBlocks specifies the division of the
frame into sub frames of approximately 5 ms, as specified by Table 7.

Table 7 Number of Pan Sub Blocks and Sub Block Size versus Sample Rate and Frame Rate

Sample Rate    Frame Rate (sec-1)    NumPanSubBlocks    PanSubBlockSize    Duration (ms)
48 kHz         24                    8                  250                5.2
48 kHz         25                    8                  240                5.0
48 kHz         30                    8                  200                4.2
48 kHz         48                    4                  250                5.2
48 kHz         50                    4                  240                5.0
48 kHz         60                    4                  200                4.2
48 kHz         96                    2                  250                5.2
48 kHz         100                   2                  240                5.0
48 kHz         120                   2                  200                4.2
96 kHz         24                    8                  500                5.2
96 kHz         25                    8                  480                5.0
96 kHz         30                    8                  400                4.2
96 kHz         48                    4                  500                5.2
96 kHz         50                    4                  480                5.0
96 kHz         60                    4                  400                4.2
96 kHz         96                    2                  500                5.2
96 kHz         100                   2                  480                5.0
96 kHz         120                   2                  400                4.2

5.4.2 PanInfoExists    1 bit

The PanInfoExists bit specifies whether the panning information is updated at each sub block boundary. The panning information always exists for the first sub block of a frame. If the PanInfoExists bit is set to zero, the decoder should assume that the panning information is repeated from the previous sub block.

5.4.3 ObjectPosX[sb], ObjectPosY[sb], ObjectPosZ[sb]    1[...]
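
As a worked illustration of the 12-bit distance coding in Section 3.2, the following Python sketch maps a D12 code onto the unit-cube range and back. The names `d12_to_distance` and `distance_to_d12` are hypothetical, and the round-to-nearest step in the encoding direction is an assumption; the text only gives the decoding formula Distance = D12 / (2^12 - 1).

```python
D12_MAX = (1 << 12) - 1  # 0xfff, the largest 12-bit code


def d12_to_distance(d12: int) -> float:
    """Map a 12-bit code linearly onto [0, 1]: Distance = D12 / (2^12 - 1)."""
    if not 0 <= d12 <= D12_MAX:
        raise ValueError(f"D12 code out of range: {d12}")
    return d12 / D12_MAX


def distance_to_d12(distance: float) -> int:
    """Inverse mapping. Rounding to the nearest code is an assumption,
    not something the specification text states."""
    if not 0.0 <= distance <= 1.0:
        raise ValueError(f"distance outside the unit cube: {distance}")
    return round(distance * D12_MAX)


# The two anchor points given in Section 3.2.
left_wall = d12_to_distance(0x000)   # 0.0
right_wall = d12_to_distance(0xFFF)  # 1.0
```

Because 4095 is odd, no code lands exactly on 0.5; under the rounding assumed here, a centered coordinate such as the x or y of (0.5, 0.5, 1) would be carried as the nearest code rather than an exact midpoint.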
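
The element dispatch rule of Section 5.1.1 (act on known ElementIDs, skip anything else) can be sketched in Python as follows. This is a minimal illustration that works on already-decoded (ElementID, ElementSize) pairs, because the Plex coding needed to read those fields from raw bytes is not reproduced in this excerpt. The names `ELEMENT_NAMES` and `handled_elements` are hypothetical, and treating the RESERVED IDs of Table 1 the same way as undefined IDs is an assumption.

```python
# Defined (non-reserved) ElementIDs from Table 1.
ELEMENT_NAMES = {
    0x08: "ATMOS_FRAME",
    0x10: "BED_DEFINITION1",
    0x40: "OBJECT_DEFINITION1",
    0x200: "AUDIO_DATA_DLC",
}


def handled_elements(headers):
    """Yield (name, element_size) for each element a decoder would process.

    `headers` is an iterable of already-decoded (ElementID, ElementSize)
    pairs. Elements whose ID is not in Table 1 are skipped; reserved IDs
    are skipped too, which is an assumption of this sketch.
    """
    for element_id, element_size in headers:
        name = ELEMENT_NAMES.get(element_id)
        if name is None:
            # ElementSize says how many payload bytes a real reader would
            # jump over to reach the next element.
            continue
        yield name, element_size


# Hypothetical header sequence for illustration only.
example = [(0x10, 12), (0x20, 4), (0x200, 900), (0x999, 1)]
processed = list(handled_elements(example))
```

Since ElementSize excludes the ElementID and ElementSize fields themselves, a byte-level reader would skip exactly ElementSize bytes after the header to reach the next sibling element.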
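
The relationships among Tables 2, 4, 5 and 7 can be checked with a short Python sketch. It derives SampleCount as sample rate divided by frame rate and PanSubBlockSize as SampleCount divided by NumPanSubBlocks; these divisions are an observation that happens to reproduce every tabulated value, not a rule stated in the text. The function name `frame_timing` and the dictionary names are hypothetical.

```python
SAMPLE_RATES = {0x0: 48000, 0x1: 96000}                      # Table 2
FRAME_RATES = {0x0: 24, 0x1: 25, 0x2: 30, 0x3: 48, 0x4: 50,  # Table 4
               0x5: 60, 0x6: 96, 0x7: 100, 0x8: 120}
NUM_PAN_SUB_BLOCKS = {24: 8, 25: 8, 30: 8, 48: 4, 50: 4,     # Table 7
                      60: 4, 96: 2, 100: 2, 120: 2}


def frame_timing(sample_rate_code: int, frame_rate_code: int):
    """Return (SampleCount, NumPanSubBlocks, PanSubBlockSize, duration_ms).

    Reserved codes are absent from the lookup tables and raise KeyError.
    """
    sample_rate = SAMPLE_RATES[sample_rate_code]
    frame_rate = FRAME_RATES[frame_rate_code]
    sample_count = sample_rate // frame_rate          # matches Table 5
    num_sub_blocks = NUM_PAN_SUB_BLOCKS[frame_rate]
    sub_block_size = sample_count // num_sub_blocks   # matches Table 7
    duration_ms = round(1000.0 * sub_block_size / sample_rate, 1)
    return sample_count, num_sub_blocks, sub_block_size, duration_ms


# 48 kHz at 24 frames per second: 2000 samples per frame, 8 sub blocks of 250.
timing_48k_24 = frame_timing(0x0, 0x0)
```

Every supported combination divides evenly, so the integer divisions lose nothing, and the resulting sub block durations stay between about 4.2 ms and 5.2 ms as Table 7 lists.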
