ETSI TS 126 448 V15.0.0 (2018-07)

Universal Mobile Telecommunications System (UMTS); LTE;
Codec for Enhanced Voice Services (EVS);
Jitter Buffer Management
(3GPP TS 26.448 version 15.0.0 Release 15)

TECHNICAL SPECIFICATION

Reference
RTS/TSGS-0426448vf00

Keywords
LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be
aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2018.
All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members.
3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
oneM2M logo is protected for the benefit of its Members.
GSM® and the GSM logo are trademarks registered and owned by the GSM Association.

Intellectual Property Rights

Essential patents

IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can
be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Trademarks

The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners. ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer
to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document “shall”, “shall not”, “should”, “should not”, “may”, “need not”, “will”, “will not”, “can” and “cannot” are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

“must” and “must not” are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions, symbols and abbreviations
  3.1 Definitions
  3.2 Symbols
  3.3 Abbreviations
  3.4 Mathematical Expressions
4 General
  4.1 Introduction
  4.2 Packet-based communications
  4.3 EVS Receiver architecture overview
5 Jitter Buffer Management
  5.1 Overview
  5.2 Depacketization of RTP packets (informative)
  5.3 Network Jitter Analysis and Delay Estimation
    5.3.1 General
    5.3.2 Long-term Jitter
    5.3.3 Short-term jitter
    5.3.4 Target Playout Delay
    5.3.5 Playout Delay Estimation
  5.4 Adaptation Control Logic
    5.4.1 Control Logic
    5.4.2 Frame-based adaptation
      5.4.2.1 General
      5.4.2.2 Insertion of Concealed Frames
      5.4.2.3 Frame Dropping
      5.4.2.4 Comfort Noise Insertion in DTX
      5.4.2.5 Comfort Noise Deletion in DTX
    5.4.3 Signal-based adaptation
      5.4.3.1 General
      5.4.3.2 Time-shrinking
      5.4.3.3 Time-stretching
      5.4.3.4 Energy Estimation
      5.4.3.5 Similarity Measurement
      5.4.3.6 Quality Control
      5.4.3.7 Overlap-add
  5.5 Receiver Output Buffer
  5.6 De-Jitter Buffer
6 Decoder interaction
  6.1 General
  6.2 Decoder Requirements
  6.3 Partial Redundancy
    6.3.1 Computation of the Partial Redundancy Offset
    6.3.2 Computation of a frame erasure rate indicator to control the frequency of the Partial Redundancy transmission
Annex A (informative): Change history
History

Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 or greater indicates TSG approved document under change control.

y the
second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.

z the third digit is incremented when editorial only changes have been incorporated in the document.

1 Scope

The present document defines the Jitter Buffer Management solution for the Codec for Enhanced Voice Services (EVS).

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] 3GPP TR 21.905: “Vocabulary for 3GPP Specifications”.
[2] 3GPP TS 26.445: “Codec for Enhanced Voice Services (EVS); Detailed Algorithmic Description”.
[3] 3GPP TS 26.114: “IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction”.
[4] 3GPP TS 26.071: “Mandatory speech CODEC speech processing functions; AMR speech Codec; General description”.
[5] 3GPP TS 26.171: “Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; General description”.
[6] 3GPP TS 26.442: “Codec for Enhanced Voice Services (EVS); ANSI C code (fixed-point)”.
[7] 3GPP TS 26.443: “Codec for Enhanced Voice Services (EVS); ANSI C code (floating-point)”.
[8] 3GPP TS 26.131: “Terminal acoustic characteristics for telephony; Requirements”.
[9] IETF RFC 4867 (2007): “RTP Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs”, J. Sjoberg, M. Westerlund, A. Lakaniemi and Q. Xie.

3 Definitions, symbols and abbreviations

3.1 Definitions

For the purposes of the present document, the terms and definitions given in TR 21.905 [1] and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905 [1].

3.2 Symbols

For the purposes of the present document, the following symbols apply:

$s_x(n)$   Time signal and time index n in context x, e.g. x can be inp, out, HP, pre, etc.
$L_x$      Frame length / size of module x
$E_x$      Energy values in context of x
$C_x$      Correlation function in context x

3.3 Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 [1] and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905 [1].

AMR      Adaptive Multi Rate (codec)
AMR-WB   Adaptive Multi Rate Wideband (codec)
CNG      Comfort Noise Generator
DTX      Discontinuous Transmission
EVS      Enhanced Voice Services
FB       Fullband
FIFO     First In, First Out
IP       Internet Protocol
JBM      Jitter Buffer Management
MTSI     Multimedia Telephony Service for IMS
NB       Narrowband
PCM      Pulse Code Modulation
PLC      Packet Loss Concealment
RTP      Real Time Transport Protocol
SID      Silence Insertion Descriptor
SOLA     Synchronized overlap-add
SWB      Super Wideband
TSM      Time Scale Modification
VAD      Voice Activity Detection
WB       Wideband

3.4 Mathematical Expressions

For the purposes of the present document, the following conventions apply to mathematical expressions:

$\lceil x \rceil$ indicates the smallest integer greater than or equal to x: $\lceil 1.1 \rceil = 2$, $\lceil 2.0 \rceil = 2$ and $\lceil -1.1 \rceil = -1$
$\lfloor x \rfloor$ indicates the largest integer less than or equal to x: $\lfloor 1.1 \rfloor = 1$, $\lfloor 1.0 \rfloor = 1$ and $\lfloor -1.1 \rfloor = -2$
$\min(x_0, \ldots, x_{N-1})$ indicates the minimum of $x_0, \ldots, x_{N-1}$, N being the number of components
$\max(x_0, \ldots, x_{N-1})$ indicates the maximum of $x_0, \ldots, x_{N-1}$
$\sum$ indicates summation

4 General

4.1 Introduction

The present document defines the Jitter Buffer Management solution for the Codec for Enhanced Voice Services (EVS) [2]. Jitter Buffers are required in packet-based communications, such as 3GPP MTSI [2], to smooth the inter-arrival jitter of incoming media packets for uninterrupted playout. The solution is used in conjunction with the EVS decoder
and can also be used for AMR [4] and AMR-WB [5]. It is optimized for the Multimedia Telephony Service for IMS (MTSI) and fulfils the requirements for delay and jitter-induced concealment operations set in [2].

The present document is recommended for implementation in all network entities and UEs supporting the EVS codec.

In the case of discrepancy between the EVS Jitter Buffer Management described in the present document and its ANSI-C code specification contained in [6], the procedure defined by [6] prevails. In the case of discrepancy between the procedure described in the present document and its ANSI-C code specification contained in [7], the procedure defined by [7] prevails.

4.2 Packet-based communications

In packet-based communications, packets arrive at the terminal with random jitter in their arrival time. Packets may also arrive out of order. Since the decoder expects to be fed a speech packet every 20 milliseconds to output speech samples in periodic blocks, a de-jitter buffer is required to absorb the jitter in the packet arrival time. The larger the size of the
de-jitter buffer, the better its ability to absorb the jitter in the arrival time, and consequently the fewer late-arriving packets are discarded. Voice communication is also delay-critical, so it is essential to keep the end-to-end delay as low as possible so that a two-way conversation can be sustained.

The defined adaptive Jitter Buffer Management (JBM) solution reflects the above-mentioned trade-offs. While attempting to minimize packet losses, the JBM algorithm in the receiver also keeps track of the delay in packet delivery that results from the buffering. The JBM solution adjusts the depth of the de-jitter buffer to balance delay against late losses.

4.3 EVS Receiver architecture overview

An EVS receiver for MTSI-based communication is built on top of the EVS Jitter Buffer Management solution. In the EVS Jitter Buffer Management solution the received EVS frames, contained in RTP packets, are depacketized and fed to the Jitter Buffer Management (JBM). The JBM smooths the inter-arrival jitter of incoming packets for uninterrupted playout of the decoded EVS frames at the Acoustic Frontend of the terminal.

Figure 1: Receiver architecture for the EVS Jitter Buffer Management Solution

Figure 1 illustrates the architecture and data flow of the receiver side of an EVS terminal. Note that the architecture serves only as an example to outline the integration of the JBM in a terminal. This specification defines the JBM module
and its interfaces to the RTP Depacker, the EVS Decoder [2], and the Acoustic Frontend [8]. The modules for Modem and Acoustic Frontend are outside the scope of the present document. The actual implementation of the RTP Depacker is outlined in a basic form; more complex depacketization scenarios depend on the usage of RTP.

Real-time implementations of this architecture typically use independent processing threads for reacting to RTP packets arriving from the modem and for requesting PCM data for the Acoustic Frontend. Arriving packets are typically handled by listening for packets received on the network socket related to the RTP session. Incoming packets are pushed into the RTP Depacker module, which extracts the frames contained in an RTP packet. These frames are then pushed into the JBM, where the statistics are updated and the frames are stored for later decoding and playout. The Acoustic Frontend contains the audio interface which, concurrently with the push operation of EVS frames, pulls PCM buffers from the JBM. The JBM is therefore required to provide PCM buffers, which are normally gener