International Telecommunication Union

ITU-T J.343
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(11/2014)

SERIES J: CABLE NETWORKS AND TRANSMISSION OF TELEVISION, SOUND PROGRAMME AND OTHER MULTIMEDIA SIGNALS
Measurement of the quality of service - Part 3

Hybrid perceptual bitstream models for objective video quality measurements

Recommendation ITU-T J.343

Rec. ITU-T J.343 (11/2014)

Summary

Recommendation ITU-T J.343 specifies objective video quality measurement methods which use bitstream data in addition to processed video sequences. From bitstream data, the models can obtain additional information on the codec type, bit rate, frame rate, some transmission errors and spatial/temporal shifts. Consequently, such models may provide improved performance
compared to objective video quality models that use only processed video sequences.

History

Edition  Recommendation  Approval    Study Group  Unique ID*
1.0      ITU-T J.343     2014-11-29  9            11.1002/1000/12315

* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web
browser, followed by the Recommendation's unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In
some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that
the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2015

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1 Scope
  1.1 Applications
2 References
3 Definitions
  3.1 Terms defined elsewhere
  3.2 Terms defined in this Recommendation
4 Abbreviations and acronyms
5 Conventions
6 Description of hybrid perceptual bitstream model types
Annex A Summary of VQEG validation of hybrid models
  A.1 Subjective datasets
  A.2 Model performance summary
Bibliography

Introduction

Generally, video quality estimation models are categorized, depending on the required input signals, as no reference (NR), reduced reference (RR) or full reference (FR) models. NR models
are provided with the processed video sequences only. RR models require that features extracted from the reference video sequences and the processed video sequences are provided. For FR models, the unimpaired reference and the processed video sequences must be provided. In addition, the models described in this Recommendation need access to the received bitstream data, from which the model can obtain information on transmission errors (e.g., delay, packet loss), codec (e.g., type, bit rates, frame rates, codec parameters), etc. Consequently, the models described here are categorized as hybrid models (i.e., Hybrid-NR, Hybrid-RR and Hybrid-FR).

Recommendation ITU-T J.343

Hybrid perceptual bitstream models for objective video quality measurements

1 Scope

This Recommendation describes recommended objective models for non-intrusive monitoring of the video quality of IP-based video services based on the decoded video frames and packet headers. Some types of models also utilize the reference video or bitstream information.

This Recommendation addresses six application areas:
- ITU-T J.343.1 specifies Hybrid-NRe models
- ITU-T J.343.2 specifies Hybrid-NR models
- ITU-T J.343.3 specifies Hybrid-RRe models
- ITU-T J.343.4 specifies Hybrid-RR models
- ITU-T J.343.5 specifies Hybrid-FRe models
- ITU-T J.343.6 specifies Hybrid-FR models

ITU-T J.343.1 includes two models, ITU-T J.343.2 includes one model, ITU-T J.343.3 includes one model that operates at multiple side channel bandwidths to transmit the reduced reference information, ITU-T J.343.4 includes one model that operates at multiple side channel bandwidths to transmit the reduced reference information, ITU-T J.343.5 includes two models and ITU-T J.343.6 includes two models. All of these models predict video quality in terms of mean opinion score (MOS), for example on a five-level absolute category rating (ACR) scale (see [ITU-T P.910] or [ITU-T P.913]).

1.1 Applications

This Recommendation describes models that estimate perceptual video quality. The applications for the estimation models described in this Recommendation include, but are not limited to:
- real-time, in-service quality monitoring at the source;
- remote destination quality monitoring;
- quality measurement of transmission systems that utilize video compression and decompression techniques, including concatenations of such techniques.

More information about applications can be found in the individual Recommendations that address these six application areas.

2 References

The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time
of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a document within this Recommendation does not give it, as a stand-alone document, the status of a Recommendation.

[ITU-T J.340] Recommendation ITU-T J.340 (2010), Reference algorithm for computing peak signal to noise ratio of a processed video sequence with compensation for constant spatial shifts, constant temporal shift, and constant luminance gain and offset.

[ITU-T J.343.1] Recommendation ITU-T J.343.1 (2014), Hybrid-NRe objective perceptual video quality measurement for HDTV and multimedia IP-based video services in the presence of encrypted bitstream data.

[ITU-T J.343.2] Recommendation ITU-T J.343.2 (2014), Hybrid-NR objective perceptual video quality measurement for HDTV and multimedia IP-based video services in the presence of non-encrypted bitstream data.

[ITU-T J.343.3] Recommendation ITU-T J.343.3 (2014), Hybrid-RRe objective perceptual video quality measurement for HDTV and multimedia IP-based video services in the presence of a reduced reference signal and encrypted bitstream data.

[ITU-T J.343.4] Recommendation ITU-T J.343.4 (2014), Hybrid-RR objective perceptual video quality measurement for HDTV and multimedia IP-based video services in the presence of a reduced reference signal and non-encrypted bitstream data.

[ITU-T J.343.5] Recommendation ITU-T J.343.5 (2014), Hybrid-FRe objective perceptual video quality measurement for HDTV and multimedia IP-based video services in the presence of a full reference signal and encrypted bitstream data.

[ITU-T J.343.6] Recommendation ITU-T J.343.6 (2014), Hybrid-FR objective perceptual video quality measurement for HDTV and multimedia IP-based video services in the presence of a full reference signal and non-encrypted bitstream data.

[ITU-T P.910] Recommendation ITU-T P.910 (2008), Subjective video quality assessment methods for multimedia applications.

[ITU-T P.913] Recommendation ITU-T P.913 (2014), Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment.

3 Definitions

3.1 Terms defined elsewhere

This Recommendation uses the following terms defined elsewhere:

3.1.1 processed [ITU-T P.913]: The reference stimuli presented through a system under test.

3.1.2 processed video sequence [ITU-T P.913]: The processed video sequence (PVS) is the impaired version of a video sequence.

3.1.3 reference [ITU-T P.913]: The original version of each source stimulus. This is the highest quality version available of the audio sample, video clip or audiovisual sequence.

3.2 Terms defined in this Recommendation

This Recommendation defines the following terms:

3.2.1 hybrid full reference model: An objective video quality model that predicts subjective quality using the reference video, the decoded video frames, packet headers, and the video payload. Such models cannot analyse encrypted video.

3.2.2 hybrid full
reference encrypted model: An objective video quality model that predicts subjective quality using the reference video, the decoded video frames, and packet headers. Such models are suitable for use with encrypted video.

3.2.3 hybrid no reference model: An objective video quality model that predicts subjective quality using the decoded video frames, packet headers, and video payload. Such models can be deployed in-service but cannot analyse encrypted video.

3.2.4 hybrid no reference encrypted model: An objective video quality model that predicts subjective quality
using the decoded video frames and packet headers. Such models can be deployed in-service and are suitable for use with encrypted video.

3.2.5 hybrid reduced reference model: An objective video quality model that predicts subjective quality using the decoded video frames, packet headers, video payload and features extracted from the reference video. Such models can be deployed in-service but cannot analyse encrypted video.

3.2.6 hybrid reduced reference encrypted model: An objective video quality model that predicts subjective quality using the decoded video frames, packet headers, and features extracted from the reference video. These models can be deployed in-service and are suitable for use with encrypted video.

4 Abbreviations and acronyms

This Recommendation uses the following abbreviations and acronyms:

ACR         Absolute Category Rating
CODEC       Coder-Decoder
ES          Elementary bitStream
FR          Full Reference
Hybrid-FR   Hybrid Full Reference
Hybrid-FRe  Hybrid Full Reference encrypted
Hybrid-NR   Hybrid No Reference
Hybrid-NRe  Hybrid No Reference encrypted
Hybrid-RR   Hybrid Reduced Reference
Hybrid-RRe  Hybrid Reduced Reference encrypted
MOS         Mean Opinion Score
MPEG        Moving Picture Experts Group
NR          No (or zero) Reference
PES         Packetized Elementary bitStream
PSNR        Peak Signal to Noise Ratio
PVS         Processed Video Sequence
RMSE        Root-Mean Square Error
RR          Reduced Reference
SRC         Source Reference Channel or Circuit
VQEG        Video Quality Experts Group

5 Conventions

None.

6 Description of hybrid perceptual bitstream model types

This Recommendation specifies objective video quality measurement methods which use both processed video sequences and bitstream data. The bitstream data may be provided in the form of an elementary bitstream (ES), a packetized elementary bitstream (PES) or packet video (Figure 1). Table 1 shows the required inputs for each model.

Table 1 - Required inputs

Model type  Model name            Required inputs
Hybrid-NRe  RST-V model, YHyNRe   Processed video sequence (PVS); encrypted bitstream
Hybrid-NR   YHyNR                 PVS; non-encrypted bitstream
Hybrid-RRe  YHyRRe                PVS; features extracted from source reference channel (SRC); encrypted bitstream
Hybrid-RR   YHyRR                 PVS; features extracted from SRC; non-encrypted bitstream
Hybrid-FRe  PEVQ-S (e), YHyFRe    PVS; SRC; encrypted bitstream
Hybrid-FR   PEVQ-S, YHyFR         PVS; SRC; non-encrypted bitstream

Hybrid-NR and Hybrid-NRe models use only PVS and bitstream
data, as shown in Figure 1 and Figure 2. Whereas Hybrid-NR models have access to all of this data, Hybrid-NRe models do not have access to the video payload. Therefore, Hybrid-NRe models can be used with encrypted bitstreams.

Figure 1 - Block-diagram depicts the core concept of hybrid perceptual bitstream models

MOSp: MOS predicted by the model
Figure 2 - Block-diagram of the Hybrid-NR model

In addition to the data available to a Hybrid-NR model, Hybrid-RR and Hybrid-RRe models also use features extracted from source video sequences. Figure 3 shows a Hybrid-RR model. In addition to the bitstream data, the Hybrid-RR model uses the features extracted from the SRC. Whereas Hybrid-RR models have access to all of this data, Hybrid-RRe models do not have access to the video payload. Therefore, Hybrid-RRe models can be used with encrypted bitstreams.

DMOSp: DMOS predicted by the model
Figure 3 - Block-diagram depicts the Hybrid-RR model

In addition to the data available to a Hybrid-NR model, the Hybrid-FR and Hybrid-FRe models also use reference video sequences. Figure 4 shows a Hybrid-FR model. The Hybrid-FR and Hybrid-FRe models need the SRC. Whereas Hybrid-FR models have access to all of this data, Hybrid-FRe models do not have access to the video payload. Therefore, Hybrid-FRe models can be used with encrypted bitstreams.

Figure 4 - Block-diagram de