International Telecommunication Union

ITU-T G.1022
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(07/2016)

SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS
Multimedia Quality of Service and performance - Generic and user-related aspects

Buffer models for media streams on TCP transport

Recommendation ITU-T G.1022

ITU-T G-SERIES RECOMMENDATIONS
TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS

INTERNATIONAL TELEPHONE CONNECTIONS AND CIRCUITS  G.100-G.199
GENERAL CHARACTERISTICS COMMON TO ALL ANALOGUE CARRIER-TRANSMISSION SYSTEMS  G.200-G.299
INDIVIDUAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON METALLIC LINES  G.300-G.399
GENERAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON RADIO-RELAY OR SATELLITE LINKS AND INTERCONNECTION WITH METALLIC LINES  G.400-G.449
COORDINATION OF RADIOTELEPHONY AND LINE TELEPHONY  G.450-G.499
TRANSMISSION MEDIA AND OPTICAL SYSTEMS CHARACTERISTICS  G.600-G.699
DIGITAL TERMINAL EQUIPMENTS  G.700-G.799
DIGITAL NETWORKS  G.800-G.899
DIGITAL SECTIONS AND DIGITAL LINE SYSTEM  G.900-G.999
MULTIMEDIA QUALITY OF SERVICE AND PERFORMANCE - GENERIC AND USER-RELATED ASPECTS  G.1000-G.1999
TRANSMISSION MEDIA CHARACTERISTICS  G.6000-G.6999
DATA OVER TRANSPORT - GENERIC ASPECTS  G.7000-G.7999
PACKET OVER TRANSPORT ASPECTS  G.8000-G.8999
ACCESS NETWORKS  G.9000-G.9999

For further details, please refer to the list of ITU-T Recommendations.

Rec. ITU-T G.1022 (07/2016)

Recommendation ITU-T G.1022

Buffer models for media streams on TCP transport

Summary

Recommendation ITU-T G.1022 defines buffer models that predict the behaviour of client side buffers for audio/video streams. Buffer models are client independent and receive A/V data, meta information and network state information as input; as output, encoded frames and playback event information are given. The buffer models are intended to be used for the development of client performance metrics.

History

Edition  Recommendation  Approval    Study Group  Unique ID*
1.0      ITU-T G.1022    2016-07-29  12           11.1002/1000/12967

Keywords

Buffer model, HTTP, progressive download video, pseudo streaming, TCP streaming, video streaming.

* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web browser, followed by the Recommendation's unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2016

All rights reserved.
No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1 Scope
2 References
3 Definitions
    3.1 Terms defined elsewhere
    3.2 Terms defined in this Recommendation
4 Abbreviations and acronyms
5 Conventions
6 Description of possibilities for TCP connection
7 Container layer processing and frame count estimation
    7.1 Buffer modelling modes
8 Inference engine
    8.1 Overview of inference engine
    8.2 Seeking events during media play-out
9 Event detector/data selection/queue
    9.1 External event notification
10 Buffer configuration
11 Play-out buffer
Annex A - Video download as one single chunk within a single TCP connection
    A.1 Estimating the arrival time of the video frame data at the client device
    A.2 Higher performance based on not analysing each data packet
Annex B - Considerations for network layer and container layer processing
Annex C - Blackbox player measurement and buffer model parameterization
    C.1 Introduction
    C.2 Blackbox test set-up
    C.3 Investigative testing
    C.4 Verification/validation testing
    C.5 Basis of comparison between predicted and actual events
    C.6 Study questions
Appendix I - Frameless mode Annex A performance evaluation by TU Chemnitz
Appendix II - Relationship to Recommendation ITU-T P.1203
    II.1 Introduction
    II.2 ITU-T P.1203 (track 1)
    II.3 ITU-T G.1022 buffer mode 0
Bibliography

Introduction

This Recommendation is organized as follows: Beyond the Scope and usual introductory material, clauses 6 to 12 describe the modelling approach and different aspects of the buffer models. It is possible to use
the model with minimal measured and supplied information, and clause 7 describes this in detail. Annexes outline specific media flow and connection conditions, such as single media flow on a single reliable connection. Each annex gives the specific requirements for model operation and defines default values for input parameters.

Recommendation ITU-T G.1022

Buffer models for media streams on TCP transport

1 Scope

This Recommendation defines models to estimate the buffer occupancy and operational state used in client media players employing TCP transport and other
forms of reliable transport. The measurement point(s) provide an observation point on the TCP connection(s), primarily within the end-to-end path, where input data are collected.

Exact procedures for processing network layers and subsequent container processing are beyond the scope of this Recommendation, as they are highly variable and must be matched to the specific circumstances. However, general guidance for readers is provided.

The scope of work can be described in terms of inputs and outputs of the model(s):

Inputs: Frames or partial frames (audio/visual (A/V) frames) pertaining to the
reliable transport connection(s) in use and the timestamp when each packet appears at the measurement point. Normal operation considers both directions of transmission. Alternatively, meta-data from the container or streaming media application layer and network layers may be used whenever present.
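As an informal illustration of the input described above, the following Python sketch records per-packet observations at the measurement point and derives a frame arrival time from them. It is not part of this Recommendation: the names `PacketObservation` and `frame_arrival_times` are hypothetical, and the sketch assumes in-order delivery without retransmissions, a payload that carries frame bytes only (no container overhead), and frame sizes that are already known from meta-data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PacketObservation:
    """One packet seen at the measurement point (hypothetical record)."""
    timestamp_ms: int    # time the packet appeared at the measurement point
    downstream: bool     # True: towards the client; False: towards the server
    payload_bytes: int   # TCP payload length carried by the packet


def frame_arrival_times(packets: List[PacketObservation],
                        frame_sizes: List[int]) -> List[int]:
    """Return, for each frame, the timestamp of the packet that completed it.

    Simplifying assumptions (not from the Recommendation): in-order delivery,
    no retransmissions, and payload that consists of frame bytes only.
    """
    arrivals: List[int] = []
    received = 0                                   # downstream bytes seen so far
    idx = 0                                        # index of the next incomplete frame
    needed = frame_sizes[0] if frame_sizes else 0  # bytes needed to complete it
    for pkt in sorted(packets, key=lambda p: p.timestamp_ms):
        if not pkt.downstream:
            continue                               # upstream packets carry no frame data here
        received += pkt.payload_bytes
        # One packet may complete several small frames at once.
        while idx < len(frame_sizes) and received >= needed:
            arrivals.append(pkt.timestamp_ms)
            idx += 1
            if idx < len(frame_sizes):
                needed += frame_sizes[idx]
    return arrivals
```

A frame that is never completed by the observed packets gets no arrival time, so the returned list may be shorter than the list of frame sizes.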
Outputs: For each model state change, it is required to report the new state name and the timestamp of the change, with 1 millisecond resolution. Optionally, the current buffer fill level, in units of milliseconds, is reported for each change. There is optional periodic reporting of
the buffer fill level with timestamp and current state name. Also, the type and size information of the frames that are played out may be reported and, for some models, the complete media stream information. Furthermore, if there are multiple video streams (e.g., in the adaptive case), a possible stream change could be reported. When available, there shall be a descriptive tuple for each frame containing a synthesized or decoded status, the frame type, the frame size in bytes, the duration in time (ms) represented by the frame, and the payload bytes. The frame type should be registered (somewhere) and referenced in
the type value.

Operation: There are scenarios where it may not be possible to access the media frame boundaries within the A/V stream, such as when the media is encrypted or an undecodable media format is encountered. In such cases, the set of default frame tuples within a container may be synthesized, and the distinction between decoded frame boundaries and synthesized frames shall be indicated. When a sufficiently long pause is detected in the TCP stream (or media flow) and a waiting time for resumption of the TCP stream expires, the monitoring system declares an End of stream if the buffer is depleted at the same time, and normal operation should cease. All stored state, including buffer state and queuing diagnostic information, may be made available for on-demand retrieval.

Some network analysis is allowed to assist the mid-path monitoring to determine frame arrival time and other signals that are inputs to the model (the details of complex network interactions are out of the scope of this Recommendation, but some general guidance is provided).

Advertisements interjected in user sessions are extremely important to the business of video delivery, and evaluation of their delivery
may need to be separate to meet the needs of performance management.

NOTE - Non-reliable and UDP streams are outside of the scope of this Recommendation; they are covered in [b-ITU-T G.1021].

Figure 1-1 - Block diagram of functions

Adaptive streaming is in scope, and as the table below shows, all forms known at this time use TCP and are possible topics of coverage for this Recommendation.

Table 1-1 - Forms of streaming using TCP

Type                                          Source                                                                     TCP
Adaptive HTTP Streaming (AHS)                 Defined in 3GPP Release 9                                                  X
HTTP Adaptive Streaming (HAS)                 Defined in Open IPTV Forum Release 2                                       X
Dynamic Adaptive Streaming over HTTP (DASH)   Developed by MPEG, and developing beyond AHS and HAS. Adopted by 3GPP R10. X
HTTP Dynamic Streaming                        Developed by Adobe Systems                                                 X
HTTP Live Streaming (HLS)                     Developed by Apple                                                         X
Microsoft Smooth Streaming                    Developed by Microsoft                                                     X

2 References

The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a document within this Recommendation does not give it,
as a stand-alone document, the status of a Recommendation.

[ITU-T G.9960]  Recommendation ITU-T G.9960 (2011), Unified high-speed wireline-based home networking transceivers - System architecture and physical layer specification.
[ITU-T H.262]  Recommendation ITU-T H.262 (2012) | ISO/IEC 13818-2:2013, Information technology - Generic coding of moving pictures and associated audio information: Video.
[ITU-T I.113]  Recommendation ITU-T I.113 (1997), Vocabulary of terms for broadband aspects of ISDN.
[ITU-T J.123]  Recommendation ITU-T J.123 (2002), Multiplexing format for webcasting on the TCP/IP network.
[ITU-T J.124]  Recommendation ITU-T J.124 (2004), Multiplexing format for multimedia webcasting over TCP/IP networks.
[ITU-T P.1202]  Recommendation ITU-T P.1202 (2012), Parametric non-intrusive bitstream assessment of video media streaming quality.
[ITU-T Y.1540]  Recommendation ITU-T Y.1540 (2016), Internet protocol data communication service - IP packet transfer and availability performance parameters.
[ITU-T Y.2770]  Recommendation ITU-T Y.2770 (2012), Requirements for deep packet inspection in next generation networks.
[IETF RFC 793]  IETF RFC 793 (1981), Transmission Control Protocol.
[IETF RFC 2460]  IETF RFC 2460 (1998), Internet Protocol, Version 6 (IPv6).
[IETF RFC 2616]  IETF RFC 2616 (1999), Hypertext Transfer Protocol - HTTP/1.1.
[IETF RFC 2818]  IETF RFC 2818 (2000), HTTP Over TLS.
[IETF RFC 3550]  IETF RFC 3550 (2003), RTP: A Transport Protocol for Real-Time Applications.

3 Definitions

3.1 Terms defined elsewhere

This Recommendation uses the following terms defined elsewhere:

3.1.1 rebuffering artefacts [ITU-T P.1202]
3.1.2 payload [ITU-T Y.2770]
3.1.3 flow [ITU-T G.9960]
3.1.4 constant bitrate coded video [ITU-T H.262]
3.1.5 variable bitrate [ITU-T H.262]
3.1.6 chunk [ITU-T J.123], [ITU-T J.124]
3.1.7 block [ITU-T I.113]: A unit of information consisting of a header and an information field.

3.2 Terms defined in this Recommendation

This Recommendation defines the following terms:

Figure 3-1 - Terms defined in this Recommendation

3.2.1 audio and video frames:
Mandatory: arrival time at client or measurement point, frame duration or frame rate (from meta-data).
Optional: type, frame size, PTS, and/or DTS.

3.2.2 overall meta-data: The meta-data that can be found in various places, such as the container meta-data and the manifest files.
Mandatory: frame rate, or mean frame rate calculated from session duration for encrypted streams.
Optional: the total number of media frames, encoder information, encoder profile and resolution, number of audio channels, encoder bit stream rate.

3.2.3 manifest files: Files delivered separately from the media container and elementary stream that optionally describe operating points and provide a list where chunks or segments of a media file can be downloaded or streamed.

3.2.4 container layer: Optionally (raw data may be provided), the layer which encapsulates the elementary streams. Examples include MP4, MPEG2-TS and WebM.

3.2.5 streaming media application layer: Aggregates data from network connections and provides raw data or da