ETSI TS 103 448 V1.1.1 (2016-09)

AC-4 Object Audio Renderer for Consumer Use

TECHNICAL SPECIFICATION

Reference
DTS/JTC-038

Keywords
audio, broadcasting, digital

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
The present document can be downloaded from: http://www.etsi.org/standards-search
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the
print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx
If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification
No part may be reproduced or utilized in any form or by any
means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2016.
© European Broadcasting Union 2016.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members.
3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members
and of the 3GPP Organizational Partners.
GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Contents
Intellectual Property Rights
Foreword
Modal verbs terminology
Introduction
1 Scope
2 References
2.1 Normative References
2.2 Informative References
3 Definitions, symbols, abbreviations and conventions
3.1 Definitions
3.2 Symbols
3.3 Abbreviations
3.4 Conventions
4 System overview
4.1 Architecture
4.1.1 Introduction
4.1.2 Requirements
4.2 Input
4.2.1 Audio
4.2.2 Metadata
4.2.3 Playback parameters
4.3 Output
5 Functional overview
5.1 Metadata pre-processing
5.1.1 Processing order
5.1.1.1 Introduction
5.1.1.2 Requirements
5.1.2 Screen-anchored coordinates
5.1.2.1 Introduction
5.1.2.2 Recommendations
5.1.2.3 Algorithmic details
5.1.3 Zone constraints
5.1.3.1 Introduction
5.1.3.2 Requirements
5.1.4 Snap
5.1.4.1 Introduction
5.1.4.2 Requirements
5.1.4.3 Algorithmic details
5.1.5 Metadata for future use cases
5.2 Source panner
5.2.1 Source panner architecture
5.2.1.1 Introduction
5.2.1.2 Requirements
5.2.2 Rendering point sources
5.2.2.1 Introduction
5.2.2.2 Requirements
5.2.2.3 Algorithmic details
5.2.3 Rendering objects with size
5.2.3.1 Introduction
5.2.3.2 Requirements
5.2.3.3 Algorithmic details
5.2.4 Rendering objects with divergence
5.2.4.1 Introduction
5.2.4.2 Requirements
5.2.4.3 Algorithmic details
5.2.5 Rendering objects with speaker-anchored coordinates
5.2.5.1 Introduction
5.2.5.2 Requirements
5.3 Trim
5.3.1 Introduction
5.3.2 Requirements
5.3.3 Algorithmic details
5.4 Gain mixer
5.4.1 Introduction
5.4.2 Requirements
5.4.3 Algorithmic details
5.5 Ramp mixer
5.5.1 Introduction
5.5.2 Requirements
5.5.3 Algorithmic details
Annex A (normative): Loudspeaker configurations and source panner coordinates
A.1 Introduction
A.2 Requirements
A.3 Tables
History

Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI.
The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword
This Technical Specification (TS) has been produced by Joint Technical Committee (JTC) Broadcast of the European Broadcasting Union (EBU), Comité Européen de Normalisation ELECtrotechnique (CENELEC) and the European Telecommunications Standards Institute (ETSI).
NOTE: The EBU/ETSI JTC Broadcast was established in 1990 to co-ordinate the drafting of standards in the specific field of broadcasting and related fields. Since 1995 the JTC Broadcast became a tripartite body by including in the Memorandum of Understanding also CENELEC, which is responsible for the standardization of radio and television receivers. The EBU is a professional association of broadcasting organizations whose work includes the co-ordination of its members' activities in the technical, legal, programme-making and programme-exchange domains. The EBU has active members in about 60 countries in the European broadcasting area; its headquarters is in Geneva.
European Broadcasting Union
CH-1218 GRAND SACONNEX (Geneva)
Switzerland
Tel: +41 22 717 21 11
Fax: +41 22 717 24 81

Modal verbs terminology
In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

Introduction
Motivation
Current industry trends for authoring and reproduction of audio content include immersive audio and support for personalization of the audio, as well as many different speaker setups and layouts. Different means of immersive audio
and personalization are provided in ETSI TS 103 190-2 [1]. Object-based audio is one of the means for supporting these trends.
Objects can be thought of as the input tracks to a mixing console, the mixing console being the renderer. But objects are more than audio tracks. They carry metadata that is authored with the tracks. Contemporary mixing consoles have automated gains. For a renderer accepting object-based audio, those gains are driven by the objects' own metadata. Metadata is also used to define object location and size, as well as many other ancillary parameters that control the object presentation.
The final mix output by the contemporary mixing console is targeted at a specific playback system. Other channel configurations can be derived from the mix, but they are not necessarily what is monitored. A renderer, located in a playback device in
a consumer's home, acts as the mixing console for that device, with the advantage that the speaker setup is known to the renderer. The renderer can use the location and size metadata defined for each object to produce the playback experience that best matches the content creator's intention, within the possibilities and constraints of the available speaker setup.
The present document specifies an object audio renderer for use with ETSI TS 103 190-2 [1], using the metadata as specified therein.

Structure of the document
The present document is structured as follows. Clause 4 specifies the input and output interfaces, and the architecture of the renderer. Clause 5 specifies the processing blocks of the renderer. These are:
- Metadata preprocessing, specified in clause 5.1
- Source panners, specified in clause 5.2
- Trim processing, specified in clause 5.3
- Gain mixing, specified in clause 5.4
- Ramp mixing, specified in clause 5.5
Annex A lists the supported loudspeaker configurations and associated parameters that are utilized by the processing blocks of the renderer.
An overview of the incoming metadata and the result of the rendering process is presented in clause 4, which makes it
a proper starting point when reading the document.

1 Scope
The present document defines an extension to the AC-4 codec. The present document specifies a consumer object-based audio renderer for use with the AC-4 codec as specified in ETSI TS 103 190-2 [1], and the object-based audio metadata specified therein. The renderer takes the object audio essence and the corresponding metadata defined in ETSI TS 103 190-2 [1] as inputs, and produces loudspeaker feeds for consumer loudspeaker layouts.

2 References
2.1 Normative References
References are either specific
(identified by date of publication and/or edition number or version number) or non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the reference document (including any amendments) applies.
Referenced documents that are not found to be publicly available in the expected location might be found at http://docbox.etsi.org/Reference.
NOTE: Although any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee their long term validity.
The following referenced documents are necessary for the application of the present document.
[1] ETSI TS 103 190-2: "Digital Audio Compression (AC-4) Standard; Part 2: Immersive and personalized audio".
[2] Recommendation ITU-R BS.2051-0: "Advanced sound system for programme production".

2.2 Informative References
References are either specific (identified by date of publication and/or edition number or version number) or non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the reference document (including any amendments) applies.
NOTE: While any hyperlinks included in this clause were
valid at the time of publication, ETSI cannot guarantee their long term validity.
The following referenced documents are not necessary for the application of the present document, but they assist the user with regard to a particular subject area.
Not applicable.

3 Definitions, symbols, abbreviations and conventions
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
bitstream: sequence of bits
channel: audio signal intended for playback by one of a set of dedicated loudspeakers with predetermined locations, e.g. Left, Right, and Centre channels
codec: system that consists of an encoder and a decoder
divergence: panning mechanism including a control to balance between rendering the object as a point source and panning the object across a specified horizontal distance
gain: multiplicative factor applied to a signal
immersive audio: multi-channel audio for playback with loudspeaker layouts in more than one plane
EXAMPLE: 3/4/4 or 3/2/2
low-frequency effects: band-limited channel specifically intended for deep, low-pitched sounds
loudspeaker feed: audio signal that the renderer has determined to be
played back by a certain loudspeaker
metadata: data about data
object: object audio essence plus associated object-based audio metadata
object audio essence: part of the object that is PCM coded
object-based audio: audio content composed of objects
panner: device or an algorithm that performs panning
panning: distribution of a sound signal into a stereo or multi-channel loudspeaker layout
point source: single localized source of audio with negligible size
(object-based audio) rendering: processing of audio content to adapt it to a specific loudspeaker layout
screen-anchored coordinates: coordinates that specify the position of an object in relation to the size and location of the screen
snap: relocation of an object to minimize the audible result of panning
speaker-anchored coordinates: coordinates that specify the position of an object by associating it to a loudspeaker
surround sound:
multi-channel audio content for playback with loudspeaker layouts in a single plane
EXAMPLE: 3/2/0 or 3/4/0
trim: process of signal attenuation to adapt the audio to play back on a loudspeaker layout with fewer loudspeakers than the mastering loudspeaker layout
zone: sub-volume of the listening room

3.2 Symbols
For the purposes of the present document, the following symbols apply:
{a; b}  a list of individual values a and b
[a; b]  a closed interval between a and b values
(x; y; z)  a three-dimensional vector, used for specifying a position inside the
room
|a|  the absolute value of a

3.3 Abbreviations
For the purposes of the present document, the following abbreviations apply:
AC  Audio Codec
LFE  Low-Frequency Effects
PCM  Pulse Code Modulation

3.4 Conventions
Unless otherwise stated, the following
conventions are used in the present document.
Typographic convention: Italic font denotes variables and metadata items (n is a variable or a metadata item).
Function prototypes can take scalars, vectors, or matrices as arguments and operate element-wise. The return type is either scalar or vector of the same format as the argument.
abs(x)  The absolute value of the elements of x
clamp(x)  The clamp function is defined as follows:
\[ \mathrm{clamp}(x) = \begin{cases} 0, & \text{when } x < 0 \\ x, & \text{when } x \in [0, 1] \\ 1, & \text{when } x > 1 \end{cases} \]
db_to_linear(x)  The conversion of values of the elements of x from logarithmic to linear scale, defined as follows:
\[ \mathrm{db\_to\_linear}(x) = 10^{x/20} \]
floor(x)  The largest integer(s) less than or equal to the elements of x
isempty(x)  Returns true if the vector x is empty, false otherwise
max(x)  The maximum value of the elements of x
min(x)  The minimum value of the elements of x
mod(x, y)  The mod function denotes the remainder of x after division by y
pow(x, y)  The pow function denotes the power function, where x is a base and y is an exponent
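As an informal illustration of the element-wise convention, the sketch below shows one possible Python reading of clamp (limit each value to the closed interval from 0 to 1) and db_to_linear (10 raised to the power x/20). It is not part of the specification: the helper name _elementwise and the choice of Python lists to stand in for vectors are assumptions made only for this example.

```python
# Illustrative sketch only; not normative text.
# Assumption: vectors are modelled as Python lists or tuples of numbers.

def _elementwise(fn, x):
    """Hypothetical helper: apply fn to a scalar or to each element of a vector."""
    if isinstance(x, (list, tuple)):
        return [fn(v) for v in x]
    return fn(x)


def clamp(x):
    """Limit each element to [0, 1]: 0 below the interval, 1 above it."""
    return _elementwise(lambda v: min(max(v, 0.0), 1.0), x)


def db_to_linear(x):
    """Convert each element from a decibel value to a linear gain, 10^(x/20)."""
    return _elementwise(lambda v: 10.0 ** (v / 20.0), x)


if __name__ == "__main__":
    print(clamp([-1, 0.5, 2]))
    print(db_to_linear([0, 20]))
```

Because both functions accept either a scalar or a vector and return a result of the same shape, a gain given in decibels can be converted and limited in a single pass over a list of values.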