International Telecommunication Union

ITU-T G.1021
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(07/2012)

SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS
Multimedia Quality of Service and performance - Generic and user-related aspects

Buffer models for development of client performance metrics

Recommendation ITU-T G.1021

ITU-T G-SERIES RECOMMENDATIONS
TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS

INTERNATIONAL TELEPHONE CONNECTIONS AND CIRCUITS  G.100-G.199
GENERAL CHARACTERISTICS COMMON TO ALL ANALOGUE CARRIER-TRANSMISSION SYSTEMS  G.200-G.299
INDIVIDUAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON METALLIC LINES  G.300-G.399
GENERAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON RADIO-RELAY OR SATELLITE LINKS AND INTERCONNECTION WITH METALLIC LINES  G.400-G.449
COORDINATION OF RADIOTELEPHONY AND LINE TELEPHONY  G.450-G.499
TRANSMISSION MEDIA AND OPTICAL SYSTEMS CHARACTERISTICS  G.600-G.699
DIGITAL TERMINAL EQUIPMENTS  G.700-G.799
DIGITAL NETWORKS  G.800-G.899
DIGITAL SECTIONS AND DIGITAL LINE SYSTEM  G.900-G.999
MULTIMEDIA QUALITY OF SERVICE AND PERFORMANCE - GENERIC AND USER-RELATED ASPECTS  G.1000-G.1999
TRANSMISSION MEDIA CHARACTERISTICS  G.6000-G.6999
DATA OVER TRANSPORT - GENERIC ASPECTS  G.7000-G.7999
PACKET OVER TRANSPORT ASPECTS  G.8000-G.8999
ACCESS NETWORKS  G.9000-G.9999

For further details, please refer to the list of ITU-T Recommendations.

Rec. ITU-T G.1021 (07/2012)

Recommendation ITU-T G.1021
Buffer models for development of client performance metrics

Summary
Recommendation ITU-T G.1021 defines buffer models that describe the behaviour of the client buffers used at the receiver side of audio/video applications. Buffer models receive as input IP packets associated with meta information on their content and provide
as output client states associated with relevant timing information. The buffer models are intended to be used for the development of client performance metrics in order to complement the IP-layer metrics defined in Recommendation ITU-T Y.1540.

History
Edition  Recommendation  Approval    Study Group
1.0      ITU-T G.1021    2012-07-14  12

FOREWORD
The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T)
is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.
The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.
The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.
In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE
In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.
Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words “shall” or some other obligatory language such as “must” and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS
ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a
claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.
As of the date of approval of this Recommendation, ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2013
All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents
1 Scope
2 References
3 Definitions
    3.1 Terms defined elsewhere
4 Abbreviations and acronyms
5 Conventions
6 De-jitter buffers
    6.1 Network jitter and de-jitter buffers
    6.2 De-jitter buffers for audio applications
    6.3 De-jitter buffers for “low bitrate mode” and “high bitrate mode” video applications
    6.4 De-jitter buffers for over-the-top video applications
    6.5 De-jitter buffer performance outcomes
7 De-jitter buffer models
    7.1 Maximum buffer size
    7.2 Service strategy
    7.3 Initial buffering strategy
    7.4 Re-buffering strategy
Annex A Pseudocode of the state-machine description of the de-jitter buffer model
    A.1 Algorithm overview
    A.2 Configuration parameters
    A.3 De-jitter buffer variables
    A.4 De-jitter buffer input
    A.5 De-jitter buffer high level function pseudocode
Bibliography

Recommendation ITU-T G.1021
Buffer models for development of client performance metrics

1 Scope
Audio and video applications currently used in packet networks implement client buffers (e.g., de-jitter buffers and play-out buffers) at the receiver side. The behaviour of these buffers is essential to the end-to-end performance of these applications.
This Recommendation defines models for these client buffers for audio/video applications currently used in actual networks, in order to estimate the client state based on the dynamic network behaviour.
In this Recommendation, different buffer models are derived depending on the applications taken into account. However, these buffer models may be based on common intrinsic characteristics, i.e., buffer size, service rate, initial delay and re-buffering delay. Example buffer performance metrics include peak, average and instantaneous buffer occupation.
In this Recommendation, buffer models may accept as input live traffic arriving at measurement points, packet capture (pcap) files, or post-processed files. Buffer models are expected to provide client states (e.g., codec play-out begin, buffer overflow, buffer dry out, play-out resume) associated with relevant timing information.
As part of future work, these buffer models could be used to derive new performance parameters and associated statistics in order to complement the IP-layer metrics defined in Recommendation ITU-T Y.1540.
It is out of scope for this Recommendation to define specific de-jitter buffers for use in actual client media stream delivery.
In this Recommendation the following applications, rate and buffer characteristics, and associated protocol stacks are considered:
a) Audio applications:
    i) Ideally constant fill, constant play-out rate and relatively low bandwidth.
b) Video “low bitrate mode”: quarter common intermediate format to quarter video graphics array (QCIF-QVGA) resolutions, mostly for mobile TV and streaming, with the following sub-application areas:
    i) Linear mobile TV over real-time transport protocol (RTP) (including mobile TV over a 3G mobile network with multimedia broadcast/multicast services (MBMS) and with unicast transport over RTP/UDP/IP).
    ii) Multimedia streaming (including 3GPP packet-switched streaming service (PSS) with transport over RTP/UDP/IP).
    iii) IETF RFC 3984 variable bit rate, up to 2 Mbps, with separate buffers for audio and video, and separate play-out rates for each buffer.
c) Video “high bitrate mode”: standard-definition (SD) and high-definition (HD) television, mostly for Internet Protocol television (IPTV), with the following sub-application areas:
    i) Linear broadcast TV (including transmission over MPEG2-TS/RTP/UDP/IP, MPEG2-TS/UDP/IP and RTP/UDP/IP transport).
    ii) Video on-demand (including transmission
over MPEG2-TS/RTP/UDP/IP, MPEG2-TS/UDP/IP and RTP/UDP/IP transport).
    iii) IETF RFC 2250, constant bit rate (CBR) or variable bit rate (VBR) up to 50 Mbps, with separate buffers for audio and video, and separate play-out rates for each buffer.
d) Over-the-top video applications:
    i) Non-adaptive streaming (MPEG4_fileformat/HTTP/TCP, HTTP progressive download, HTTP partial download, chunks). Progressive download: TCP transport, TCP flow control (65 Mbps super-peak rates; the peak rate depends on network conditions and path), classic single buffer, leaky bucket at constant frame rate.
    ii) Adaptive streaming: dynamic adaptive streaming over HTTP (DASH/TCP). Chunk delivery (e.g., DASH): bursts of packets, like sequential FTP file transfers.

2 References
The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the
Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a document within this Recommendation does not give it, as a stand-alone document, the status of a Recommendation.

ITU-T G.1020    Recommendation ITU-T G.1020 (2006), Performance parameter definitions for quality of speech and other voiceband applications utilizing IP networks.
ITU-T H.222.0   Recommendation ITU-T H.222.0 (2006) | ISO/IEC 13818-1:2007, Information technology - Generic coding of moving pictures and associated audio information: Systems.
ITU-T Y.1540    Recommendation ITU-T Y.1540 (2010), Internet protocol data communication service - IP packet transfer and availability performance parameters.
IETF RFC 2250   IETF RFC 2250 (1998), RTP Payload Format for MPEG1/MPEG2 Video.
IETF RFC 3984   IETF RFC 3984 (2005), RTP Payload Format for H.264 Video.

3 Definitions
3.1 Terms defined elsewhere
This Recommendation uses the following terms defined elsewhere:
3.1.1 decoding time-stamp (DTS) (ITU-T H.222.0): A field that may be present in a PES packet header that indicates the time that an access unit is decoded in the system target decoder.

4 Abbreviations and acronyms
This Recommendation uses the following abbreviations and acronyms:
ARQ      Automatic Retransmission on Request
CBR      Constant Bit Rate
DASH     Dynamic Adaptive Streaming over HTTP
DSL      Digital Subscriber Line
DSP      Digital Signal Processor
DTS      Decoding Time Stamp
EWMA     Exponentially Weighted Moving Average
FEC      Forward Error Correction
FTP      File Transfer Protocol
HTTP     Hypertext Transfer Protocol
IP       Internet Protocol
IPTV     Internet Protocol Television
MBMS     Multimedia Broadcast/Multicast Services
MPEG     Moving Picture Experts Group
MPEG-TS  MPEG Transport Stream
NTP      Network Time Protocol
OTT      Over The Top
PCAP     Packet Capture
PCR      Program Clock Reference
PDV      Packet Delay Variation
PES      Packetized Elementary Stream
PLC      Packet Loss Concealment
PLL      Phase-Locked Loop
PSS      Packet-switched Streaming Service
QCIF     Quarter Common Intermediate Format
QVGA     Quarter Video Graphics Array
RTP      Real-time Transport Protocol
STB      Set-Top Box
VBR      Variable Bit Rate

5 Conventions
The buffer model behaviour is described by the state-machine pseudocode provided in Annex A of this Recommendation. Pseudocode is an informal, high-level description of the operating principle of an algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading. Pseudocode typically omits details that are not essential for human understanding of the algorithm, such as variable declarations, system-specific code and some subroutines.
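To illustrate what a state-machine description of a client buffer can look like, the following is a minimal sketch written in Python rather than in the pseudocode of Annex A. It is not the Annex A algorithm: the class name, parameter names, thresholds and default values are all hypothetical, chosen only to show how packet arrivals and decoder requests could be turned into the time-stamped client states named in clause 1 (play-out begin, buffer overflow, buffer dry out, play-out resume).

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    INITIAL_BUFFERING = "initial buffering"
    PLAYING = "playing"
    REBUFFERING = "re-buffering"


@dataclass
class BufferModel:
    # All names and default values below are hypothetical illustrations.
    max_size: int = 10           # maximum buffer size, in packets
    initial_threshold: int = 3   # occupancy needed before play-out begins
    rebuffer_threshold: int = 2  # occupancy needed before play-out resumes
    occupancy: int = 0
    state: State = State.INITIAL_BUFFERING
    events: list = field(default_factory=list)  # (time, client state) pairs

    def on_arrival(self, t: float) -> None:
        """A packet arrives at time t."""
        if self.occupancy >= self.max_size:
            self.events.append((t, "buffer overflow"))  # packet is dropped
            return
        self.occupancy += 1
        if (self.state is State.INITIAL_BUFFERING
                and self.occupancy >= self.initial_threshold):
            self.state = State.PLAYING
            self.events.append((t, "play-out begin"))
        elif (self.state is State.REBUFFERING
                and self.occupancy >= self.rebuffer_threshold):
            self.state = State.PLAYING
            self.events.append((t, "play-out resume"))

    def on_service(self, t: float) -> None:
        """The decoder requests one packet at time t (one service tick)."""
        if self.state is not State.PLAYING:
            return  # nothing is played while buffering
        if self.occupancy == 0:
            self.state = State.REBUFFERING
            self.events.append((t, "buffer dry out"))
            return
        self.occupancy -= 1


# Usage: three arrivals start play-out, four requests drain the buffer,
# and two further arrivals resume play-out.
model = BufferModel()
for t in (0.0, 0.1, 0.2):
    model.on_arrival(t)
for t in (0.3, 0.4, 0.5, 0.6):
    model.on_service(t)
for t in (0.7, 0.8):
    model.on_arrival(t)
print(model.events)
```

The sketch counts whole packets and uses fixed occupancy thresholds for simplicity; a model following this Recommendation would instead be configured by the buffer size, service rate, initial delay and re-buffering delay of the application being modelled.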
Note that no standard for pseudocode syntax exists. In this Recommendation, pseudocode similar to the C programming language has been used.

6 De-jitter buffers
6.1 Network jitter and de-jitter buffers
In the context of computer networks, the term “jitter” is often used as a measure of the variability over time of packet latency across a network. A network with constant latency has no variation and therefore no jitter. However, the word is imprecise and, in the context of packet-switched networks, the term “packet delay variation” is often preferred over “jitter” (see ITU-T Y.1540).
In order to smooth out the natural packet delay variation or jitter effects of asynchronous networks and to synchronize play-out between sender and receiver, most techniques introduce an additional delay, implemented by having the packets arriving on the reception side equipment
being temporarily stored in a buffer. This buffer is designed to counteract the jitter introduced by the network until the moment the audio signal or the image is delivered to the decoding scheme. This buffer is called the “de-jitter buffer”.
In order to ensure continuous play-out of streaming audio or video, the de-jitter buffer parameters have to be tuned such that, at the moment audio or a picture is to be played out, its entire content resides in the buffer. There is a trade-off between the end-to-end delay caused by the de-jitter buffer and the packet loss. Packets that arrive before timeout are played out in the same order as they were created at the sender side. Packets that arrive after timeout are discarded. An ideal algorithm should neither delay the play-out time excessively nor discard too many packets.
As shown in Figure 6-1, the end-to-end delay experienced by an IP packet (carrying speech or voiceband applications) can be decomposed into several parts: packetization delay, network transfer delay, de-jitter buffer delay (accommodating delay variation from source and network) and play-out buffer delay.

[Figure 6-1: Delay of packet networks and network elements. Labels in the figure: Packetization; Network transfer delay; Source delays; Other destination delays; De-jitter buffer; 99.9%-tile transfer delay; Minimum transfer delay; Minimum additional source delay; Accommodates delay variation from network and source terminal; Minimum/playout buffer; DSP and other queuing.]
NOTE: Figure 6-1 is an exact replica of Figure 5 of ITU-T G.1020.

This type of decomposition can be generalized to most audio/video applications.

6.1.1 Fixed size de-jitter buffers
The size of the play-out b