ETSI TR 126 950 V15.0.0 (2018-07)

TECHNICAL REPORT

Universal Mobile Telecommunications System (UMTS);
LTE;
Study on Surround Sound codec extension for Packet Switched Streaming (PSS) and Multimedia Broadcast/Multicast Service (MBMS)
(3GPP TR 26.950 version 15.0.0 Release 15)

Reference
RTR/TSGS-0426950vf00

Keywords
LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from:
http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI
Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2018.
All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members. 3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. oneM2M logo is protected for the benefit of its Members. GSM® and the GSM logo are trademarks registered and owned by the GSM Association.

Intellectual Property Rights

Essential patents

IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly
available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Trademarks

The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners. ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no right to use or
reproduce any trademark and/or tradename. Mention of those trademarks in the present document does not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.

Foreword

This Technical Report (TR) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression
of provisions).

"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Use cases
4.1 Surround sound over headphones
4.2 Surround sound over loudspeakers
4.2.1 Decoding and rendering on a UE
4.2.2 Decoding and rendering on a non-3GPP device connected to a UE
4.2.3 Decoding on a UE and rendering on a non-3GPP device connected to a UE
4.2.4 PSS/MBMS delivery methods
5 Design constraints
5.1 Mono/Stereo Backwards compatibility
5.2 Number of audio channels
5.2.1 Number of audio input channels
5.2.2 Number of audio output channels
5.3 Sampling frequency
5.4 Bit rates
5.5 Computational complexity
5.6 Other design constraints
6 Test item selection criteria
7 Performance requirements
7.1 General requirements
7.2 Loudspeaker requirements
7.3 Binaural test
7.4 Backward compatibility test
7.5 Error test
7.6 Listening test on HRTF
8 Validation of the user benefits and feasibility through evaluation of at least one example of surround sound
8.1 Listening test over loudspeakers
8.2 Listening test over headphones
8.3 Backward compatibility
8.4 Test under errors conditions
8.4.1 Results with interleaver
8.4.2 Results without interleaver
8.5 Test on HRTFs
9 Conclusion
Annex A: Test plans and global analysis reports
Annex B: Change history
History

Foreword

This Technical Report has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 or greater indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the document.

1 Scope

The present document investigates the potential user experience benefits of surround audio in 3GPP services. The investigation will be performed as follows:

- Identify and document relevant use cases for surround sound in 3GPP.
- Define design constraints that would need to be met by a
surround audio codec extension method for adoption by 3GPP.
- Identify suitable testing methodology for surround sound in relevant use cases of the PSS and MBMS services.
- Define subjective minimum performance criteria that would need to be met in order to motivate the consideration of a surround audio coding extension for adoption by 3GPP.
- Validate the user benefits and the feasibility of the deployment of surround sound for the PSS and MBMS services according to the defined minimum performance criteria, bit rate and design constraints for all the use cases (such as surround sound speaker
set-up and headphone decoding mode) through evaluation of at least one example of surround sound coding methods which may be MPS.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] 3GPP TR 21.905: "Vocabulary for 3GPP Specifications".
[2] 3GPP TS 26.346: "Multimedia Broadcast/Multicast Service (MBMS); Protocols and codecs".
[3] 3GPP TS 26.234: "Transparent end-to-end Packet-switched Streaming Service (PSS); Protocols and codecs".
[4] ITU-R Recommendation BS.775-2: "Multichannel stereophonic sound system with and without accompanying picture", Jul. 2006.
[5] ITU-R Recommendation BS.1534-1: "Method for the subjective assessment of intermediate quality level of
coding systems", Geneva, 2003.
[6] ISO/IEC JTC1/SC29/WG11 N2006: "Report on the MPEG-2 AAC Stereo Verification Tests", Feb. 1998, http://www.chiariglione.org/mpeg/working_documents/mpeg-02/audio/AAC_results.zip.
[7] 3GPP TR 26.936: "Performance characterization of 3GPP audio codecs".

3 Definitions and abbreviations

3.1 Definitions

For the purposes of the present document, the terms and definitions given in TR 21.905 [1] and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905 [1].

HRTF: A Head-Related Transfer Function (HRTF) represents a pair of filters that are obtained by measurement or modelling. It represents the acoustic transmission from a point in space to the entrance of a listener's ear canal. It depends on the relative positions of the source and the listener's head.

3.2 Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 [1] and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905 [1].

5.1 ch    Loudspeaker set-up with 2 front channels, 2 rear channels, 1 center channel and 1 subwoofer
HRTF      Head-Related Transfer Function
MPS       MPEG Surround
MUSHRA    MUlti Stimulus test with Hidden Reference and Anchor

4 Use cases

The relevant use cases considered in this study are applications in the context
of MBMS and/or PSS services. In the home entertainment industry the de facto standard for surround sound content is the 5.1 channel format. Such a surround signal can be reproduced in various ways, using a number of channels that is not necessarily equal to that of the content at the service provider side, resulting in different listening modes. The general characteristics of MBMS and PSS services apply and will be considered to derive design constraints and performance requirements for the study item. We have identified the following use cases for consideration.

Table 1: List of use cases considered in the study

Use case #   Reproduction   Description
1 a          Headphones     Surround decoding with binaural post-processing
1 b          Headphones     Surround decoding with binaural processing being part of the decoding process
2.1 a        Loudspeakers   Surround decoding followed by rendering on the UE
2.1 b        Loudspeakers   Surround decoding with rendering being part of the decoding process on the UE
2.2          Loudspeakers   Surround bit-stream is transported via the UE. Decoding and rendering are performed in a non-3GPP device connected to the UE.
2.3          Loudspeakers   Surround decoding on the UE. Decoded surround audio data are transported to a non-3GPP device connected to the UE for rendering.

NOTE:
- In the following use cases it is assumed that the surround sound content provided to the server comprises multiple channels, typically 6 channels in the 5.1 format.
- Alternatively, the surround sound content may be presented to the server as a binauralized stereo signal. In this case, the server would encode the surround sound as an artistic downmix (which is also referred to as Binaural Virtual Surround effect). No additional processing would be required when listening over headphones. However, this alternative format would have several implications:
  - When playing over stereo or multichannel loudspeakers, the decoder would have to remove the binauralization effect. Some signalling would be needed to indicate that the downmix is a binauralized stereo signal.
  - This alternative format would not offer mono/stereo
backward compatibility to existing 3GPP audio codecs, especially when listening over loudspeakers.
- In the following use cases it is assumed that the surround bit-stream contains spatial information to control the behaviour of the surround decoder. The surround decoder produces surround sound based on this side information. However, a possible additional function of the surround capable UE is that the surround decoder may be able to upmix stereo signals encoded by legacy 3GPP audio codecs, which can then be binauralized for listening over headphones.

4.1 Surround sound over headphones

Binaural/Stereo post-processing may or may not be part of the surround sound decoder (see Figures 1 and 2).

Figure 1 illustrates a block diagram where the binaural or stereo post-processing is not part of the surround decoder. A server transmits surround sound bit-streams via PSS or MBMS protocols/services. The UE first decodes the received surround bit-stream to a surround signal. The resulting surround signal is processed by binaural or stereo downmix post-processing to produce a stereo signal. The resulting signal can
be reproduced on headphones.

NOTE: The surround bit-stream is decoded inside the UE to a surround signal. This surround signal is input to a binaural or stereo downmix post-processor that produces a representation of the surround signal for headphone reproduction.

Figure 1: Signal flow for use case 1 a where binaural and stereo downmix post-processing is not part of the surround sound decoder

Figure 2 provides a block diagram where binaural post-processing is part of, i.e. integrated into, the surround decoder. The only difference with regard to Figure 1 is that the surround bit-stream is not first decoded to a full surround signal prior to binaural post-processing. Instead the steps of surround decoding and binaural decoding are integrated into a single binaural surround decoder.

NOTE: The surround bit-stream is decoded inside the UE directly t