ATIS T1 TR 45-1995 Speech Packetization.pdf


Report No. 45
December 1995

A Technical Report on Speech Packetization

Prepared by T1A1.7
Working Group on Specialized Signal Processing

Committee T1 is sponsored by the Alliance for Telecommunications Industry Solutions (formerly the Exchange Carriers Standards Association)
Accredited by American National Standards Institute

Copyright 1995 by Alliance for Telecommunications Industry Solutions. All rights reserved.
No part of this publication may be reproduced in any form, in an electronic retrieval system or otherwise, without the prior written permission of the publisher.

A Technical Report on
SPEECH PACKETIZATION

ABSTRACT

This technical report summarizes Committee T1's current views on speech packetization. It explains the various issues that affect the packetization of speech, provides an overview of transport techniques and considerations, and presents information on topics of concern to designers, implementors, and service providers.

Document T1A1.7/94-016r2

Prepared by T1A1.7
Working Group on Specialized Signal Processing

CONTENTS

1. Scope, purpose, and application
1.1 Scope
1.2 Purpose
1.3 Application
2. Definitions
2.1 Relationship between ANSI Standards and CCITT Recommendations
2.2 Glossary
3. Historical background
4. Summary of design issues
5. Reconstitution of speech signals
6. Delay equalization
6.1 Blind delay
6.2 Absolute time stamp
6.3 Relative time stamp
7. Robustness to errors
7.1 Sensitivity of speech to bit errors
7.2 CRC calculation over part of the frame
7.3 CRC calculation over whole frames
8. Congestion control
8.1 Global control
8.2 Local control
9. Packet loss
9.1 Case of speech
9.2 Case of voice-band data
9.3 Fill-in strategies (speech)
10. Choice of packet size
10.1 Speech considerations
10.2 Bit error considerations
10.3 Integrated traffic
10.4 ANSI T1.312 (CCITT Recommendation G.764)
11. Compression issues
11.1 Speech coding algorithms
11.2 Digital speech interpolation
12. Channel-oriented signaling
13. Extensions
13.1 Facsimile
13.2 Speech/video synchronization
13.3 Interface between the PSTN and LANs
13.4 Extension to new algorithms
14. Summary
References

Foreword

This technical report was written as part of the activities of T1 Working Group T1A1.7 under project T1Y1 23, entitled "Speech Packetization." The report is based primarily on technical contributions provided by Mostafa Hashem Sherif (AT [...] 2. quality of service for different traffic classes.

In March 1993, the International Telecommunication Union (ITU) was reorganized and the CCITT was renamed the Telecommunication Standardization Sector (ITU-T). Documents prior to March 1993 will be referred to as CCITT documents while those after March 1993 are ITU-T documents.

1.2 Purpose

The purpose of this technical report is to:

- explain the various issues that affect the packetization of speech;
- provide an overview of the techniques and considerations in the transport of packetized speech, as explained in the existing packetized speech standard protocol in ANSI T1.312-1991 (CCITT Recommendation G.764);
- disseminate information on the various topics of concern to designers and implementors of packetized speech equipment and to the service providers that use it.

The report draws heavily on
the protocols of ANSI T1.312-1991.

1.3 Application

The intended use of this document is to provide background information on speech packetization. The purpose of this report was not to develop a standard, but to make the technical information widely available within the telecommunications industry.

TECHNICAL REPORT NO. 45

2. Definitions

2.1 Relationship between ANSI Standards and CCITT Recommendations

CCITT Recommendation G.764 specifies a speech packetization protocol for the packetization of speech and voice-band data using ADPCM algorithms. The protocol in ANSI T1.312-1991 is identical to that of G.764, but T1.312-1991 includes a formal description of the packetized voice protocol. The full reference to the ANSI standard is: ANSI T1.312-1991, Speech Packetization - Packetized Voice Protocol.

2.2 Glossary

The report contains the following acronyms:

ADPCM     adaptive differential pulse code modulation
ATM       asynchronous transfer mode
BER       bit error ratio
CRC       cyclic redundancy check
DCME      digital circuit multiplication equipment
FEC       forward error correction
HDLC      high-level data link control
IDR       intermediate data rate
IRS       intermediate reference system
ISDN      integrated services digital network
LAN       local area network
LD-CELP   low delay - code excited linear prediction
MOS       mean opinion score
PCM       pulse code modulation
PCME      packet circuit multiplication equipment
PSTN      public switched telecommunications network
QPSK      quadrature phase shift keying
TASI      time assignment speech interpolation
TCP/IP    transmission control protocol/internet protocol

3. Historical background

Traditionally, voice services have been implemented in the Public Switched Telecommunications Network (PSTN) (also denoted as Wide Area Network or WAN) using a circuit-oriented approach. The growth of packet transport techniques (for example, X.25/X.75, Internet, Wideband Packet Technology, Frame Relay and the Asynchronous Transfer Mode (ATM)) has stimulated research in new techniques for the transport of speech.

Packetized systems can exploit the bursty nature of traffic to multiplex different types of traffic (for example, voice, data, video) of many users so
that they can share transmission bandwidth and switching resources dynamically. Packetization facilitates the integration of the different types of traffic to allow more efficient utilization of the available bandwidth and switching resources. Packetization offers more flexibility than circuit-oriented approaches because the packet header contains the necessary control information that identifies, for example, the type of traffic and, where appropriate, the coding scheme.

Work in speech packetization began in the CCITT during the middle of the 1984-1988 study period in Working Party XVIII/8 and continued in Working Party XV/2 during the 1989-1992 study period. It is now continuing in the ITU-T for wideband packet and ATM networks. The goal is to provide a uniform basis for speech packetization, with or without speech compression and speech interpolation, to facilitate the interworking of equipment from various vendors in telecommunications applications.

The work in the CCITT and Committee T1 has resulted in the Voice Packetization Protocol of ANSI T1.312-1991/CCITT G.764 and its extensions in ANSI T1.509/G.765. These protocols are compatible at the link layer with the ISDN protocols LAPD and LAPF specified in CCITT Recommendations Q.921 and Q.922 respectively.

4. Summary of design issues

Using the OSI protocol stack as a reference, the major design considerations in devising a speech packetization protocol are:

1. Layer 1 (physical layer): The issue is whether the physical interface will conform to that of public telephone networks (for example, CCITT Recommendations G.703 and G.704) or to other local area networks, such as IEEE 802.2, 802.3 or 802.9, etc.
2. Layer 2 (link layer): Some of the issues are: 1) whether the logical layer will be compatible with ISDN (LAPD/LAPF) protocols or will have the same structure as those for LANs, 2) how to deal with the loss of frames, and 3) robustness to errors.
3. Layer 3 (procedures to deal with digitized voice and voice-band data traffic): Issues are: 1) the delay variability for speech packets, and 2) the transport of channel-associated signaling.
4. Higher-layer issues involve the speech coder and the type of compression used.

A speech packetization protocol has the following requirements:

1. The speech must be reconstructed at the receiving end from packets arriving at irregular intervals (or, in some architectures, out of
order).
2. The protocol must be robust against line errors.
3. It must offer an easy method for congestion control in the network.
4. It must specify procedures at the terminating end to recover from packet loss or excessive delay.
5. It must carry channel-associated signaling.
6. If digital speech interpolation is used to eliminate silence intervals, it shall specify the level at which noise is re-injected at the terminating end.

5. Reconstitution of speech signals

To achieve good speech quality, the terminating end must reconstitute a continuous speech stream and play it out at regular intervals despite varying packet arrival times. This involves two aspects: 1) preserving the relative timing of information within one speech burst and 2) considerations of delay equalization.

In ANSI T1.312-1991 (CCITT Recommendation G.764), a packet sequence number is used to encode the relative timing of the information within one speech burst. The first packet of a speech burst always has the sequence number of 0; subsequent packets in the same burst have the numbers from 1 to 15, rolling back to 1. The terminating endpoints use the packet sequence number to: 1) determine the first packet of a speech burst and 2) detect packet loss. The determination of the first packet is useful for delay equalization and may be needed for some speech coding algorithms, such as those described in CCITT Recommendation G.728. Delay equalization is discussed in the next section.

6. Delay equalization

Delays in packet communication consist of two components: a fixed delay and a variable delay [1]. The fixed delay arises from signal propagation on the transmission links, and from fixed processing delays within the network and at the originating and terminating endpoints. The effect of variations in the
propagation delay for a given path is assumed to be negligible.

For speech packetization, the fixed processing delays consist of the following components:

1. packetization delay during which the speech samples are buffered for further processing;
2. the hang-over time of the speech detector if digital speech interpolation is used to remove silent intervals [2];
3. the end-to-end algorithmic delay due to the encoding and decoding of speech; this delay depends on the coding scheme; for example, it is 125 µs for PCM, 250 µs for the adaptive differential pulse code modulation (ADPCM) algorithms of CCITT Recommendations G.726 and G.727, while it is less than 2 ms for the low-delay code excited linear prediction (LD-CELP) algorithm of CCITT Recommendation G.728; and
4. any added delay at the terminating end to mask the timing jitter resulting from the variability in the delay; this added delay is denoted as build-out.

Variable delays result primarily from the queueing and processing of packets. They depend on the characteristics of the route of each packet: the number of hops (nodes), the type and speed of each link and the traffic intensity.

Speech traffic requires low and uniform delay. ITU-T Recommendation G.114 (1993) discusses the effect of end-to-end delays on the quality of a conversation. Delay variations that may be acceptable for digital data transmission usually affect the gaps between words and syllables and are troublesome for conversational speech. Available data suggest that variation in the interval between speech bursts should be less than 200 ms to avoid subjective degradation of speech quality [7, 8]. In applications such as video telephony where both audio and video information are transmitted and must remain synchronized, the effect of the variable delay on this synchronization must also be taken into account.

There are several methods to mask the variability of the delay in the network. These techniques include: 1) blind delay, 2) absolute time stamp, and 3) relative time stamp. The effect of all these methods is to increase the effective end-to-end buffering
delay and, therefore, the total end-to-end delay.

6.1 Blind delay

In the blind delay method, a fixed buffer delay is always added at the terminating end at the first packet of a speech burst. This delay corresponds to the maximum variable delay expected. The advantage of the blind-delay scheme is its simplicity, which makes it a good candidate when the transmission speed is such that the variable delays are on the order of a fraction of a millisecond (for example, local area networks or broadband networks at 150 Mbit/s). In these cases, a fixed build-out delay on the order of 10 ms will be adequate to eliminate end-to-end delay jitter [3].

Over long haul connections in the PSTN, the scheme may require such a large delay that the total end-to-end delay would exceed the performance limits for network delays specified in ITU-T Recommendation G.114. For example, if the first packet has already experienced the worst case delay variation, the total variable delay to be added will be twice the worst case value [1]. Because this approach does not mask the delay variability completely, the gaps between the words may vary, which may degrade the subjective quality of speech. Voiceband traffic, including demodulated facsimile, may be perturbed; total delays larger than 500 ms cause premature disconnections of facsimile calls, especially when echo is present [4].

6.2 Absolute time stamp

This is the method used in datagram networks. The packet header includes a field for a time stamp that represents real time. The
time stamp has a resolution sufficient to allow accurate detection of packet jitter, and to cover the worst case transit time of a packet across the network. Thus, packets that arrive out of sequence may be correctly ordered and buffered at the receiver using time stamp information [5]. This datagram scheme also requires clock synchronization between the transmitter and receiver so that the delay of each incoming packet can be compared to the previous ones, assuming that the network delay is fixed.

6.3 Relative time stamp

In the relative time stamp method, an estimate of the play-out time is obtained for the first packet of a speech burst, for all signaling packets, and for the first packet after a missing packet. This time is then used to adjust the delay of all remaining packets of this burst conveyed on that virtual circuit.

The accumulated variable delay experienced by a packet is recorded in the time stamp field of the packet header [1]. Each network node adds to the time stamp the amount of time it took to serve a packet before sending it, using its local clock as reference.

The maximum allowable variable delay for a virtual circuit is specified as the build-out. The build-out is defined for a given virtual circuit. Once an estimate of the play-out time has been made, subsequent packets are placed in the order of their sequence number in the play-out buffer and then held for the following duration:

time before play-out = build-out delay - time stamp value.

The terminating endpoint must, therefore, store the speech packets that arrive before their scheduled play-out time and then play them at regular intervals. Packets whose time stamp field exceeds the build-out delay are considered late and are dropped.

The relative time stamp method is less compli
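
The relative time stamp bookkeeping of Section 6.3 can be sketched as below. This is only an illustration under stated assumptions: times are floating-point milliseconds (the unit and resolution of the time stamp field are not given in this text), and the function names `accumulate_time_stamp`, `hold_time_ms` and `next_sequence_number` are hypothetical, not taken from ANSI T1.312-1991. The last helper illustrates the Section 5 numbering, reading "rolling back to 1" as 15 being followed by 1.

```python
from typing import Iterable, Optional


def accumulate_time_stamp(time_stamp_ms: float,
                          service_times_ms: Iterable[float]) -> float:
    """Add each node's service time to the time stamp, as each node would.

    Assumption: every node measures its own service time with its local
    clock and reports it in milliseconds.
    """
    for service_time in service_times_ms:
        if service_time < 0:
            raise ValueError("service time cannot be negative")
        time_stamp_ms += service_time
    return time_stamp_ms


def hold_time_ms(time_stamp_ms: float, build_out_ms: float) -> Optional[float]:
    """Return build-out minus time stamp, or None for a late packet.

    A packet whose time stamp exceeds the build-out is considered late
    and is dropped, which is signalled here by returning None.
    """
    if time_stamp_ms > build_out_ms:
        return None
    return build_out_ms - time_stamp_ms


def next_sequence_number(seq: int) -> int:
    """Expected successor within one burst: 0 -> 1, ..., 14 -> 15, 15 -> 1."""
    if not 0 <= seq <= 15:
        raise ValueError("sequence number must be in 0..15")
    return 1 if seq == 15 else seq + 1


if __name__ == "__main__":
    build_out = 20.0  # hypothetical build-out for one virtual circuit
    stamp = accumulate_time_stamp(0.0, [3.0, 4.5])  # two nodes on the route
    print(hold_time_ms(stamp, build_out))  # 12.5
    print(hold_time_ms(25.0, build_out))   # None: late packet, dropped
    print(next_sequence_number(15))        # 1
```

With a 20 ms build-out, a packet that accumulated 7.5 ms of variable delay is held for 12.5 ms, while a packet stamped 25 ms is discarded as late. A receiver could compare each arriving sequence number with `next_sequence_number` of the previous one to flag a missing packet.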
