ITU-T G.711 Appendix I (1999): Pulse Code Modulation (PCM) of Voice Frequencies - Appendix I: A High Quality Low-Complexity Algorithm for Packet Loss Concealment with G.711

INTERNATIONAL TELECOMMUNICATION UNION

ITU-T
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU

G.711 Appendix I (09/99)

SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS
Digital transmission systems - Terminal equipments - Coding of analogue signals by pulse code modulation

Pulse code modulation (PCM) of voice frequencies
Appendix I: A high quality low-complexity algorithm for packet loss concealment with G.711

ITU-T Recommendation G.711 - Appendix I
(Previously CCITT Recommendation)

ITU-T G-SERIES RECOMMENDATIONS
TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS

INTERNATIONAL TELEPHONE CONNECTIONS AND CIRCUITS  G.100-G.199
INTERNATIONAL ANALOGUE CARRIER SYSTEM
GENERAL CHARACTERISTICS COMMON TO ALL ANALOGUE CARRIER-TRANSMISSION SYSTEMS  G.200-G.299
INDIVIDUAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON METALLIC LINES  G.300-G.399
GENERAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON RADIO-RELAY OR SATELLITE LINKS AND INTERCONNECTION WITH METALLIC LINES  G.400-G.449
COORDINATION OF RADIOTELEPHONY AND LINE TELEPHONY  G.450-G.499
TESTING EQUIPMENTS
TRANSMISSION MEDIA CHARACTERISTICS
DIGITAL TRANSMISSION SYSTEMS
TERMINAL EQUIPMENTS  G.700-G.799
  General  G.700-G.709
  Coding of analogue signals by pulse code modulation  G.710-G.719
  Coding of analogue signals by methods other than PCM  G.720-G.729
  Principal characteristics of primary multiplex equipment  G.730-G.739
  Principal characteristics of second order multiplex equipment  G.740-G.749
  Principal characteristics of higher order multiplex equipment  G.750-G.759
  Principal characteristics of transcoder and digital multiplication equipment  G.760-G.769
  Operations, administration and maintenance features of transmission equipment  G.770-G.779
  Principal characteristics of multiplexing equipment for the synchronous digital hierarchy  G.780-G.789
  Other terminal equipment  G.790-G.799
DIGITAL NETWORKS  G.800-G.899
DIGITAL SECTIONS AND DIGITAL LINE SYSTEM  G.900-G.999

For further details, please refer to ITU-T List of Recommendations.

ITU-T RECOMMENDATION G.711

PULSE CODE MODULATION (PCM) OF VOICE FREQUENCIES

APPENDIX I
A high quality low-complexity algorithm for packet loss concealment with G.711

Summary

Packet Loss Concealment (PLC) algorithms, also known as frame erasure concealment algorithms, hide transmission losses in an audio system where the input signal is encoded and packetized at a transmitter, sent over a network, and received at a receiver that decodes the packet and plays out the output. Many of the standard CELP-based speech coders have PLC algorithms built into their standards. The algorithm described here provides a method for Recommendation G.711.

Source

Appendix I to ITU-T Recommendation G.711 was prepared by ITU-T Study Group 16 (1997-2000) and was approved under the WTSC Resolution No. 1 procedure on 30 September 1999.

FOREWORD

ITU (International Telecommunication Union) is the United Nations Specialized Agency in the field of telecommunications. The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of the ITU. The ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Conference (WTSC), which meets every four years, establishes the topics for study by the ITU-T Study Groups which, in their turn, produce Recommendations on these topics.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE
In this Recommendation the term recognized operating agency (ROA) includes any individual, company, corporation or governmental organization that operates a public correspondence service. The terms Administration, ROA and
public correspondence are defined in the Constitution of the ITU (Geneva, 1992).

INTELLECTUAL PROPERTY RIGHTS

The ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. The ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, the ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementors are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database.

© ITU 2000

All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the ITU.

CONTENTS

Appendix I - A high quality low-complexity algorithm for packet loss concealment with G.711
I.1 Introduction
I.2 Algorithm description
  I.2.1 Good frames
  I.2.2 First bad frame
  I.2.3 Pitch detection
  I.2.4 Synthetic signal generation for first 10 ms
  I.2.5 Synthetic signal generation after 10 ms
  I.2.6 Attenuation
  I.2.7 First good frame after an erasure
  I.2.8 Example
I.3 Algorithm description with annotated C++ code
  I.3.1 Typedefs and constants
  I.3.2 Class declaration
  I.3.3 Main loop
  I.3.4 Utility member functions
  I.3.5 Constructor
  I.3.6 Addtohistory and savespeech
  I.3.7 Dofe
  I.3.8 Pitch detection
  I.3.9 Synthetic signal generation and attenuation
  I.3.10 Overlap add operators
I.4 Complexity and delay

Recommendation G.711

PULSE CODE MODULATION (PCM) OF VOICE FREQUENCIES

APPENDIX I

A high quality low-complexity algorithm for packet loss concealment with G.711
(Geneva, 1999)

I.1 Introduction

Packet Loss Concealment (PLC) algorithms, also known as frame erasure concealment algorithms, hide transmission losses in an audio system where the input signal is encoded and packetized at a transmitter, sent over a network, and received at a receiver that decodes the packet and plays out the output. Many of the standard CELP-based speech coders, such as Recommendations G.723.1 [1], G.728 [2] and G.729 [3], have PLC algorithms built into their standards. The algorithm described here provides a method for Recommendation G.711.

The objective of PLC is to generate a synthetic speech signal to cover missing data (erasures) in a received bit stream. Ideally, the synthesized signal will have the same timbre and spectral characteristics as the missing signal, and will not create unnatural artifacts. Since speech signals are often locally stationary, it is possible to use the signal's past history to generate a reasonable approximation to the missing segment. If the erasures are not too long, and the erasure does not land in a region where the signal is rapidly changing, the erasures may be inaudible after concealment.

I.2 Algorithm description

To add PLC to a G.711 system that currently does not conceal losses, changes are only required in the receiver. The G.711-encoded audio data is sampled at 8 kHz. In this appendix it is assumed to be partitioned into 10 ms frames (80 samples). By adjusting a few parameters, other packet sizes or sampling rates can be accommodated.

I.2.1 Good frames

During normal operation (good packets or frames), the receiver decodes the received packet and sends its output to the audio port. Two minor changes are made to the receiver when it processes good frames to support PLC.

1) A copy of the decoded output is saved in a circular history buffer that is 48.75 ms (390 samples) long. The history buffer is used to calculate the current pitch period and extract waveforms during an erasure. This buffering does not introduce any delay into the output signal.
2) The output is delayed by 3.75 ms (30 samples) before it is sent to the audio port. This algorithm delay, used for an Overlap Add (OLA) at the start of an erasure, allows the PLC code to make a smooth transition between the real and synthesized signal.

I.2.2 First bad frame

At the start of the erasure, the circular history buffer is copied to a non-circular buffer, called the pitch buffer, that is easier to work with.
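
The good-frame bookkeeping and the copy into the pitch buffer might be sketched in C++ as follows. This is only a minimal illustration, assuming 16-bit linear PCM samples after G.711 decoding; the class name GoodFrameHistory and its member functions are hypothetical and are not taken from the appendix's own annotated code. Only the three sizes (80, 390 and 30 samples) come from the text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sizes quoted in the text, for 8 kHz audio in 10 ms frames.
constexpr std::size_t kFrameSize   = 80;   // 10 ms
constexpr std::size_t kHistoryLen  = 390;  // 48.75 ms
constexpr std::size_t kOutputDelay = 30;   // 3.75 ms

using Frame = std::array<std::int16_t, kFrameSize>;

// Hypothetical helper (an assumption, not the appendix's class): keeps the
// circular history buffer and produces the delayed output for good frames.
class GoodFrameHistory {
public:
    // Save one decoded good frame and fill `out` with the delayed output.
    void addGoodFrame(const Frame& decoded, Frame& out) {
        for (std::size_t i = 0; i < kFrameSize; ++i) {
            history_[writePos_] = decoded[i];
            // Play the sample that was stored kOutputDelay samples earlier.
            out[i] = history_[(writePos_ + kHistoryLen - kOutputDelay) % kHistoryLen];
            writePos_ = (writePos_ + 1) % kHistoryLen;
        }
    }

    // At the first bad frame: unroll the circular history into a linear
    // pitch buffer, oldest sample first, newest sample last.
    std::vector<std::int16_t> makePitchBuffer() const {
        std::vector<std::int16_t> pitch(kHistoryLen);
        for (std::size_t k = 0; k < kHistoryLen; ++k) {
            pitch[k] = history_[(writePos_ + k) % kHistoryLen];
        }
        return pitch;
    }

private:
    std::array<std::int16_t, kHistoryLen> history_{};  // zero-initialized
    std::size_t writePos_ = 0;                         // next slot to overwrite
};
```

With a freshly constructed GoodFrameHistory, the first 30 output samples of the first frame are zero because the delay line is still empty, and the first decoded sample appears at output index 30. Storing every sample in the history before reading the delayed one keeps the history current without adding any delay of its own; the 30-sample lag applies only to the playout path.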

The contents of the pitch buffer are used for the duration of the erasure. An additional copy of the most recent 1/4 pitch period, called the lastq buffer, is made in case the erasure lasts longer than 10 ms.

I.2.3 Pitch detection

First, the pitch period is estimated by finding the peak of the normalized cross-correlation of the most recent 20 ms of speech in the history buffer with the previous speech at taps from 5 ms (40 samples) to 15 ms (120 samples). At 8 kHz this corresponds to pitch frequencies from 200 Hz down to about 66.7 Hz. The pitch range was chosen based on a range used in the G.728 post-filter. While G.728 uses a lower bound of 2.5 ms (20 samples), here it is increased to 40 samples so the same pitch period is not repeated more than twice in a single 10 ms erased frame.

To lower complexity, the pitch estimation is calculated in two phases. First, a coarse search is performed on a 2:1 decimated signal, and then a finer search is performed in the vicinity of the peak of the coarse search. The complexity can be lowered with a slight degradation in quality by skipping the fine search.

In the following, the term wavelength is also used to refer to the output value of this calculation, since the missing signal may be either voiced or unvoiced speech.

From Waveform Shift Overlap Add (WSOLA), it is known that the normalized cross-correlation function can be replaced with either a non-normalized cross-correlation or a cross-Average Magnitude Difference Function (AMDF), and similar overall performance results
will be obtained.

I.2.4 Synthetic signal generation for first 10 ms

For the first 10 ms of the erasure, the best results are obtained by generating the synthesized signal from the last pitch period with no attenuation. Only the most recent 1.25 pitch periods of the pitch buffer are used during the first 10 ms.

To ensure a smooth transition between the real and synthetic signal, and a smooth transition if the pitch period is repeated multiple times, an Overlap Add (OLA) is performed using a triangular window on 1/4 of the pitch period between the last and next-to-last pitch period. For 1/4 wavelength, the signal starting at 1.25 pitch periods from the end of the pitch buffer is multiplied by an up-sloping ramp and is added to the last 0.25 pitch period in the lastq buffer multiplied by a down-sloping ramp. If complexity is not an issue, the triangular windows may be replaced with Hanning windows in all the OLA operations.

The result of the OLA replaces both the tail of the pitch buffer and the tail of the history buffer. It is also output by the receiver during the tail of the last good frame, replacing the original signal. This introduces the algorithm delay: the tail of the last frame cannot be output until it is known whether the next frame is erased. If an erasure occurs, the signal in the tail of the last good frame is modified by the OLA to ensure a smooth transition to the synthesized signal.

The synthesized signal for the 10 ms during the erasure is generated by placing a pointer one pitch period back from the end of the pitch buffer, and copying the samples to the output. If the pitch period is shorter than 10 ms, when the pointer rolls off the end of the pitch buffer the pointer is set back exactly one pitch period before continuing. If the pitch period is short (the frequency is high), the last pitch period in the pitch buffer is repeated multiple times during the 10 ms erasure.

While the erasure progresses, the history buffer is updated with the synthesized output. This way, the history buffer always has a smooth, continuous signal in it. This continuity is important if a "bad frame, good frame, bad frame" sequence occurs.

I.2.5 Synthetic signal generation after 10 ms

If the next frame is also erased, the erasure will be at least 20 ms long and further action is required. While repeating a single pitch period works well for short erasures (e.g.
10 ms), on long erasures it introduces unnatural harmonic artifacts (beeps). This is especially noticeable if the erasure lands in an unvoiced region of speech, or in a region of rapid transition such as a stop. It was discovered by experimentation that these artifacts are significantly reduced by increasing the number of pitch periods used to synthesize the signal as the erasure progresses. Playing more pitch periods increases the variation in the signal. Although the pitch periods are not played in the order they occurred in the original signal, the resulting output still sounds natural.

At 10 ms into the erasure the number of pitch periods used to synthesize the speech is increased to two, and at 20 ms a third pitch period is added. For erasures longer than 20 ms no additional modifications to the pitch buffer are made.

When the number of pitch periods used in the pitch buffer increases, it is important that the transition in the synthesized signal be smooth. This is accomplished by continuing the output of the existing pitch buffer for 1/4 of a pitch period at the start of the second and third erased frame, updating the pitch buffer, keeping the buffer pointer synchronized with the correct phase, and then doing an OLA with the output from the new pitch buffer. The pitch buffer is updated exactly as during the first erased frame, except that the number of pitch periods is increased. For example, at the start of the second erased frame, for 1/4 wavelength the signal starting at 2.25 pitch periods from the end of the pitch buffer is multiplied by an up-sloping ramp and is added to the 1/4 wavelength in the lastq buffer multiplied by a down-sloping ramp. The result of the OLA replaces the last 1/4 wavelength in the pitch buffer. To maintain the phase of the current output pointer, pitch periods are subtracted from the pointer until it is in the first pitch period used.

I.2.6 Attenuation

As with other PLC algorithms, such as G.729 and G.728 Annex I, with long erasures it is necessary to attenuate the signal as the erasure progresses. As the erasure gets longer, the synthesized signal is more likely to diverge from the real signal. Without attenuation, strange artifacts are created by holding certain types of sounds too long, even if the synthesized signal segment sounds natural in isolation.

For the first 10 ms of an erasure the signal is not attenuated. At the start of the second 10 ms, the synthesized signal is linearly attenuated with a ramp at the rate of 20% per 10 ms. After 60 ms, the synthesized signal is zero.

I.2.7 First good frame after an erasure

At the first good frame after an erasure, a smooth transition
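
The attenuation schedule described in the text (no attenuation for the first 10 ms, then a linear ramp of 20% per 10 ms that reaches zero at 60 ms) can be written as a per-sample gain. The following C++ sketch is only an illustration: it assumes the ramp is applied sample by sample at 8 kHz with 80-sample frames, and the function names erasureGain and attenuate are hypothetical rather than taken from the appendix's own code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kFrameSize = 80;   // 10 ms at 8 kHz
constexpr double kAttenPerFrame = 0.2;   // 20% per 10 ms, from the text

// Gain for the synthesized sample at offset n (in samples) from the start
// of the erasure. Assumption: the ramp decreases linearly on every sample.
double erasureGain(std::size_t n) {
    if (n < kFrameSize) {
        return 1.0;  // first 10 ms: no attenuation
    }
    const double framesIntoRamp =
        static_cast<double>(n - kFrameSize) / static_cast<double>(kFrameSize);
    const double gain = 1.0 - kAttenPerFrame * framesIntoRamp;
    return gain > 0.0 ? gain : 0.0;  // zero from 60 ms (480 samples) onward
}

// Scale a block of synthesized samples that begins `offset` samples into
// the erasure, rounding each result to the nearest integer.
void attenuate(std::vector<std::int16_t>& samples, std::size_t offset) {
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const double scaled = samples[i] * erasureGain(offset + i);
        samples[i] = static_cast<std::int16_t>(std::lround(scaled));
    }
}
```

Under these assumptions the gain is 1.0 throughout the first frame, 0.9 at 15 ms (sample 120), 0.8 at 20 ms (sample 160), and 0 from sample 480 onward, which matches the 60 ms cutoff stated in the text.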
