International Telecommunication Union

ITU-T P.1203.3
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(10/2017)

SERIES P: TELEPHONE TRANSMISSION QUALITY, TELEPHONE INSTALLATIONS, LOCAL LINE NETWORKS
Models and tools for quality assessment of streamed media

Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport
Quality integration module

Recommendation ITU-T P.1203.3

ITU-T P-SERIES RECOMMENDATIONS
TELEPHONE TRANSMISSION QUALITY, TELEPHONE INSTALLATIONS, LOCAL LINE NETWORKS

Vocabulary and effects of transmission parameters on customer opinion of transmission quality: Series P.10
Voice terminal characteristics: Series P.30, P.300
Reference systems: Series P.40
Objective measuring apparatus: Series P.50, P.500
Objective electro-acoustical measurements: Series P.60
Measurements related to speech loudness: Series P.70
Methods for objective and subjective assessment of speech quality: Series P.80
Methods for objective and subjective assessment of speech and video quality: Series P.800
Audiovisual quality in multimedia services: Series P.900
Transmission performance and QoS aspects of IP end-points: Series P.1000
Communications involving vehicles: Series P.1100
Models and tools for quality assessment of streamed media: Series P.1200
Telemeeting assessment: Series P.1300
Statistical analysis, evaluation and reporting guidelines of quality measurements: Series P.1400
Methods for objective and subjective assessment of quality of services other than speech and video: Series P.1500

For further details, please refer to the list of ITU-T Recommendations.

Recommendation ITU-T P.1203.3

Parametric bitstream-based quality assessment of progressive download and
adaptive audiovisual streaming services over reliable transport
Quality integration module

Summary

Recommendation ITU-T P.1203.3 specifies the quality integration module for Recommendation ITU-T P.1203. The ITU-T P.1203 series of Recommendations specifies modules for monitoring the audio, video and audiovisual quality of video services such as adaptive bit-rate video streaming. The respective ITU-T work item was formerly referred to as P.NATS (parametric non-intrusive assessment of TCP-based multimedia streaming quality).

The ITU-T P.1203.3 part of Recommendation ITU-T P.1203 can be applied to the monitoring of performance and quality of experience (QoE) of video services such as adaptive bit-rate video streaming. Besides stream-based input information, the ITU-T P.1203.3 quality integration module takes as input the per-one-second video- and audio-quality scores calculated according to ITU-T P.1203.1 and ITU-T P.1203.2, respectively. A single quality integration module is recommended for all four modes (0 to 3) of the ITU-T P.1203 model series.

This Recommendation includes an electronic attachment containing the 20 trees described in
clause 8.4.

History

Edition  Recommendation  Approval    Study Group  Unique ID*
1.0      ITU-T P.1203.3  2016-12-22  12           11.1002/1000/13161
2.0      ITU-T P.1203.3  2017-10-29  12           11.1002/1000/13402

Keywords

Adaptive streaming, audio, audiovisual, IPTV, mean opinion score (MOS), mobile video, mobile TV, monitoring, multimedia, progressive download, QoE, TV, video.

* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web browser, followed by the Recommendation's unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which,
in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2017

All rights reserved. No part of this publication
may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1 Scope
2 References
3 Definitions
    3.1 Terms defined elsewhere
    3.2 Terms defined in this Recommendation
4 Abbreviations and acronyms
5 Conventions
6 Building blocks
7 Model input
    7.1 I.14 input specification
8 Model output
    8.1 Media parameter extraction
    8.2 Model output O.34
    8.3 Model output O.35
    8.4 Model output O.46
    8.5 Model output O.23

Electronic attachment: Decision trees to calculate the model output described in clause 8.4.

1 Scope

This Recommendation¹ describes an
objective parametric quality assessment model as part of the ITU-T P.1203 series of Recommendations. The model specified herein predicts the impact of audiovisual quality variations and stalling events on the quality experienced by the end user in mobile and fixed-network multimedia streaming applications that use adaptive bit-rate streaming. The prediction is based on a prior estimation of audio and video quality and on information about stalling events during the media session.

The model predicts mean opinion scores (MOS) on a 5-point absolute category rating (ACR) scale (see ITU-T P.910) as a final audiovisual quality MOS score (as defined in ITU-T P.911, for instance). The model also outputs a perceptual stalling quality indicator, a final audiovisual compression quality score and a list of integrated audiovisual quality scores for diagnostic purposes.

This model cannot provide a comprehensive evaluation of transmission quality as
perceived by a particular end user because its scores reflect only the impairments on the IP network being measured, which may be only part of the end-to-end connection. Furthermore, the scores predicted by a parametric model necessarily reflect an average perceptual impairment.

Note also that the model was developed with a specific encoder and decoder pair. If a different encoder and decoder pair is used in a monitoring situation, the scores will not reflect that difference. User interactions (such as pausing, seeking, user-initiated quality change, user-initiated play or user-initiated end) are NOT considered in this model either.

The effects of audio level, noise, delay (and corresponding similar video factors) and other impairments related to the payload are not reflected in the scores computed by this model. Therefore, it is possible to obtain high scores from this model for a stream whose overall quality is poor. Moreover, the scores predicted by a parametric model (i.e., without access to payload information) necessarily reflect a somewhat simplified representation of the perceptual impairment of the considered stream.

The application ranges of the ITU-T P.1203.3 model are summarized in
Table 1.

Table 1 - Factors and application ranges of the ITU-T P.1203.3 model algorithms

Factor                                       Application range
Video sequence duration                      60 seconds to 5 minutes
Initial loading delay and stalling           0-10 seconds
Maximum number of stalling events            5
Maximum length of a single stalling event    15 seconds
Total stalling duration                      30 seconds
Other details                                No stalling within 5 seconds of the start of the video playing.

¹ This Recommendation includes an electronic attachment containing the full 20 decision trees to calculate the model output described in clause 8.4.

2 References

The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a document within this Recommendation does not give it, as a stand-alone document, the status of a Recommendation.

ITU-T P.800.1   Recommendation ITU-T P.800.1 (2016), Mean opinion score (MOS) terminology.
ITU-T P.910     Recommendation ITU-T P.910 (2008), Subjective video quality assessment methods for multimedia applications.
ITU-T P.911     Recommendation ITU-T P.911
(1998), Subjective audiovisual quality assessment methods for multimedia applications.
ITU-T P.1201    Recommendation ITU-T P.1201 (2012), Parametric non-intrusive assessment of audiovisual media streaming quality.
ITU-T P.1203    Recommendation ITU-T P.1203 (2016), Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport.
ITU-T P.1203.1  Recommendation ITU-T P.1203.1 (2016), Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport - Video quality estimation module.
ITU-T P.1203.2  Recommendation ITU-T P.1203.2 (2016), Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport - Audio quality estimation module.
ITU-T P.1401    Recommendation ITU-T P.1401 (2012), Methods, metrics and procedures for statistical evaluation, qualification and comparison of objective quality prediction models.

3 Definitions

3.1 Terms defined elsewhere

This Recommendation uses the following term defined elsewhere:

3.1.1 mean opinion score (MOS): ITU-T P.800.1

Further terms are defined in Recommendation ITU-T P.1203.

3.2 Terms defined in this Recommendation

This Recommendation defines the following term:

3.2.1 T (media length): The length (in seconds) of the input media signal under consideration (O.21 or O.22). If both audio and video are considered, and they are not of equal
length, the shorter duration of the two inputs shall be used. The longer input shall be truncated at the end to match the duration of the shorter one:

$$T = \min\bigl(\mathrm{length}(\mathrm{O.21}),\ \mathrm{length}(\mathrm{O.22})\bigr)$$

4 Abbreviations and acronyms

This Recommendation uses the following abbreviations and acronyms:

MOS    Mean Opinion Score

5 Conventions

None.

6 Building blocks

The overall ITU-T P.1203 architecture is shown in Figure 6-1. The model described in this Recommendation is highlighted as the "Pq: Quality integration module". The model takes as input:

1) audio coding quality per output sampling interval (O.21),
2) video coding quality per output sampling interval (O.22),
3) stalling positions and their length (I.14).

An internal media parameter extraction module extracts audio- and video-related parameters. The model outputs:

1) final media session quality score (O.46),
2) audiovisual segment coding quality per output sampling interval (O.34),
3) final audiovisual coding quality score (O.35),
4) perceptual stalling indication (O.23).

The model details are described in clauses 7 and 8.

Figure 6-1 - Pq module in context of building blocks of the ITU-T P.1203 model

7 Model input

The model must receive information about the estimated quality of the audio and video streams and the occurrence of stalling events during playback. In various modes of operation, the following inputs may be extracted or estimated from network transmissions in different ways, which are outside
the scope of this Recommendation.

The model must receive the following input signals regardless of the mode of operation:

- O.21: audio quality per output sampling interval, as specified in ITU-T P.1203.2.
- O.22: video quality per output sampling interval, as specified in ITU-T P.1203.1.
- I.14: stalling events, as described in clause 7.1.
- I.GEN: device type (either "PC" or "Mobile"), as specified in ITU-T P.1203.

The audio and video coding quality estimations (O.21 and O.22) may be generated using the ITU-T P.1203.2 and ITU-T P.1203.1 modules, respectively. In principle, they may also be generated using other audiovisual quality prediction models, provided that these use 1) the same output score range and 2) the same output sampling interval as defined in ITU-T P.1203. Note that if short-term video and/or audio quality modules other than those specified in ITU-T P.1203.1 and ITU-T P.1203.2, respectively, are used, the performance figures will differ from those described in ITU-T P.1203. Also, to be fully compliant with ITU-T P.1203, the implementation of a complete set of modules Pv, Pa and Pq according to ITU-T P.1203.1, ITU-T P.1203.2 and this Recommendation is required.

7.1 I.14 input specification

I.14 consists of a vector of stalling events observed in the media session. Each event consists of a tuple of start time and duration
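
As an informal illustration only (not part of the normative text), the inputs described in clauses 3.2.1 and 7.1 and the limits of Table 1 could be sketched in Python as follows. The names StallingEvent, align_media and range_violations are hypothetical. The sketch assumes one O.21/O.22 score per second, so that the number of scores equals the media length T in seconds, and assumes that start time and duration are expressed in seconds. It also assumes that the Table 1 limits apply to every event in the I.14 vector as given, and it does not check the rows on initial loading delay or on stalling near the start of playback.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class StallingEvent:
    """One I.14 entry: a (start time, duration) tuple, assumed to be in seconds."""
    start: float
    duration: float


def align_media(o21: Sequence[float], o22: Sequence[float]) -> Tuple[int, List[float], List[float]]:
    """Return T and both score vectors, truncated at the end to the shorter length.

    Assumes one score per second, so len() gives the length in seconds.
    """
    t = min(len(o21), len(o22))
    return t, list(o21[:t]), list(o22[:t])


def range_violations(t: int, events: Sequence[StallingEvent]) -> List[str]:
    """List the Table 1 limits that a session exceeds (empty list: within range)."""
    problems = []
    if not 60 <= t <= 300:
        problems.append("media length outside 60 s to 5 min")
    if len(events) > 5:
        problems.append("more than 5 stalling events")
    if any(e.duration > 15 for e in events):
        problems.append("single stalling event longer than 15 s")
    if sum(e.duration for e in events) > 30:
        problems.append("total stalling duration above 30 s")
    return problems


# Hypothetical session: 62 audio scores, 60 video scores, two stalling events.
o21 = [4.5] * 62
o22 = [3.8] * 60
i14 = [StallingEvent(start=12.0, duration=2.5), StallingEvent(start=40.0, duration=16.0)]

T, o21, o22 = align_media(o21, o22)
print(T, range_violations(T, i14))
```

In this made-up session the audio vector is truncated from 62 to 60 scores, and the second stalling event is reported because it is longer than 15 seconds. A check of this kind only flags sessions that fall outside the ranges for which the model was developed; it does not compute any of the model outputs.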