International Telecommunication Union

ITU-T P.800.2
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(07/2016)

SERIES P: TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS
Methods for objective and subjective assessment of speech and video quality

Mean opinion score interpretation and reporting

Recommendation ITU-T P.800.2

ITU-T P-SERIES RECOMMENDATIONS
TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS

Vocabulary and effects of transmission parameters on customer opinion of transmission quality (Series P.10)
Voice terminal characteristics (Series P.30, P.300)
Reference systems (Series P.40)
Objective measuring apparatus (Series P.50, P.500)
Objective electro-acoustical measurements (Series P.60)
Measurements related to speech loudness (Series P.70)
Methods for objective and subjective assessment of speech quality (Series P.80)
Methods for objective and subjective assessment of speech and video quality (Series P.800)
Audiovisual quality in multimedia services (Series P.900)
Transmission performance and QoS aspects of IP end-points (Series P.1000)
Communications involving vehicles (Series P.1100)
Models and tools for quality assessment of streamed media (Series P.1200)
Telemeeting assessment (Series P.1300)
Statistical analysis, evaluation and reporting guidelines of quality measurements (Series P.1400)
Methods for objective and subjective assessment of quality of services other than speech and video (Series P.1500)

For further details, please refer to the list of ITU-T Recommendations.

Recommendation ITU-T P.800.2
Mean opinion score interpretation and reporting

Summary
Recommendation ITU-T P.800.2 introduces some of the more common types of mean opinion score (MOS) and describes the minimum information that should accompany MOS values to enable them to be correctly interpreted.

History
Edition  Recommendation   Approval     Study Group  Unique ID*
1.0      ITU-T P.800.2    2013-05-14   12           11.1002/1000/11934
2.0      ITU-T P.800.2    2016-07-29   12           11.1002/1000/12973

Keywords
Absolute category rating, ACR, mean opinion score, MOS, objective model, reporting, subjective experiment.

* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web browser, followed by the Recommendation's unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE
In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2016
All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1   Scope
2   References
3   Definitions
    3.1  Terms defined elsewhere
    3.2  Terms defined in this Recommendation
4   Abbreviations and acronyms
5   Conventions
6   Introductory information
7   Subjective MOS values
8   Interpreting MOS values
9   Video considerations
10  Statistical analysis of MOS
11  Objective MOS values
12  Reporting subjective MOS values
13  Reporting objective MOS values
14  Notation
Bibliography

Recommendation ITU-T P.800.2

Mean opinion score interpretation and reporting

1 Scope

This Recommendation introduces some of the more common types of mean opinion score (MOS) and describes the minimum information that should accompany MOS values to enable them to be correctly interpreted. It should be noted that this text does not aim to provide a definitive guide to subjective or objective testing. The bibliography at the end of this Recommendation provides information on more detailed material.

2 References

The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a document within this Recommendation does not give it, as a stand-alone document, the status of a Recommendation.

[ITU-T P.800.1]  Recommendation ITU-T P.800.1 (2006), Mean Opinion Score (MOS) terminology.

3 Definitions

3.1 Terms defined elsewhere

None.

3.2 Terms defined in this Recommendation

This Recommendation defines the following terms:

3.2.1 condition: One of a set of use cases being evaluated in a subjective experiment; often referred to as a hypothetical reference circuit (HRC) in video experiments.

3.2.2 sub-condition: A subset of a condition defined by a specific characteristic of the use case, e.g., speech material from a particular talker.

3.2.3 subject: A participant in a subjective experiment.

3.2.4 vote: A subject's response to a question in a rating scale for an individual test sample or interaction.

4 Abbreviations and acronyms

This Recommendation uses the following abbreviations and acronyms:

ACR     Absolute Category Rating
DCR     Degradation Category Rating
DMOS    Degradation Mean Opinion Score
HRC     Hypothetical Reference Circuit
MOS     Mean Opinion Score
MUSHRA  Multi-stimulus test with Hidden Reference and Anchor
QCIF    Quarter Common Intermediate Format
SSCQE   Single Stimulus Continuous Quality Evaluation
VGA     Video Graphics Array

5 Conventions

None.

6 Introductory information

Audio and video quality are inherently subjective quantities. This means that the baseline for audio and video quality is the opinion of the user. However, one person's opinion of what is good may be quite different to another person's opinion; neither person is correct, neither person is incorrect.

Before a new audio or video transmission technology is deployed, it is good practice to assess the transmission quality using one or more subjective experiments. The purpose of a subjective experiment is to collect the opinions of multiple people ("subjects") about the performance of the system for a number of well-defined use cases ("conditions")1. The mean opinion score (MOS) for a given condition is simply the average of the opinions ("votes") collected for that use case.

1  In video experiments, conditions are often referred to as hypothetical reference circuits (HRCs).

Objective quality measurement algorithms aim to predict the MOS value that a given input signal would produce in a subjective experiment. Hence, when interpreting an objectively derived MOS value, it is essential to understand the basic design of the experiment being predicted.

There are several different types of MOS value and many different test methodologies for producing them. The purpose of this Recommendation is to give the reader an appreciation of the main points to consider when interpreting MOS values and the minimum information that should accompany MOS values when they are reported.
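As an informative illustration of the averaging described above, the following Python sketch computes a MOS for each condition from a hypothetical set of votes. The condition names, vote values and helper function are illustrative only and do not form part of this Recommendation.

```python
# Illustrative only: hypothetical votes on a 5-point scale,
# keyed by condition name (one list entry per subject vote).
votes_by_condition = {
    "codec_A_clean_network": [5, 4, 4, 5, 4, 3],
    "codec_A_packet_loss":   [2, 3, 2, 1, 3, 2],
}

def mean_opinion_score(votes):
    """MOS for one condition: the arithmetic mean of all votes collected for it."""
    return sum(votes) / len(votes)

for condition, votes in votes_by_condition.items():
    print(f"{condition}: MOS = {mean_opinion_score(votes):.2f}")
```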

7 Subjective MOS values

Types of MOS

There is a common misconception that MOS values only pertain to voice services, but the process of asking subjects to provide their assessment of quality can be just as easily applied to video and general audio services as it can to voice services. It is also possible to ask subjects to rate the overall audiovisual quality of a service. The ITU has produced various standards describing different aspects of subjective testing for video and general audio applications in addition to voice applications, and these are listed in the bibliography.

Subjective experiments may be broadly divided into two types: passive and interactive. In a passive subjective experiment, subjects are presented with pre-recorded test samples representing the conditions of interest. The subjects are asked to passively listen to and/or watch the test material and provide their opinion using the rating scale provided. In an interactive experiment, two or more subjects actively engage in conversation using equipment designed to emulate the use cases of interest. The subjects are often given tasks in order to stimulate conversation and interaction. Most experiments tend to be passive in nature. However, there are some aspects of user experience, for example, the effects of delay and echo, that only become apparent in conversational scenarios.

Test methodology and rating scale

In a subjective experiment, subjects are asked to provide their opinions using a "rating scale". The purpose of the scale is to translate a subject's quality assessment into a numerical value that can be averaged across subjects and other experimental factors. There are several rating scales in common use, and the relative benefits of different scales are outside the scope of this Recommendation. The most commonly used scale is the 5-point absolute category rating (ACR) scale:

Excellent  5
Good       4
Fair       3
Poor       2
Bad        1

The ACR scale is a discrete scale, meaning that the subject's response is limited to one of the five values listed above. However, the averaging process used to combine results from different subjects means that MOS values are not confined to integer values. Some rating scales have more than five discrete labels, while others allow the subject to provide intermediate responses at points between the labels.

The "absolute" part of ACR relates to the fact that subjects are asked to independently rate each sample. Some rating scales, such as the degradation category rating (DCR) scale, ask for a subject's opinion about the difference between a sample processed through the condition of interest and an unprocessed version of the same sample. The MOS value produced in such an experiment is often called a degradation MOS or DMOS.
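As an informative sketch of the two scales discussed above, the Python fragment below maps ACR labels to their scores, shows that averaging discrete votes yields a non-integer MOS, and applies the same averaging to DCR-style degradation votes to give a DMOS. The vote values are hypothetical; the DCR label wording is taken to follow the degradation scale of ITU-T P.800.

```python
# Hypothetical example: ACR labels map to scores 5..1; DCR votes describing
# the degradation relative to a reference are averaged the same way to give a DMOS.
ACR_SCALE = {"Excellent": 5, "Good": 4, "Fair": 3, "Poor": 2, "Bad": 1}

acr_votes = ["Good", "Fair", "Good", "Excellent", "Fair"]   # absolute ratings
acr_scores = [ACR_SCALE[label] for label in acr_votes]
mos = sum(acr_scores) / len(acr_scores)
print(f"MOS  = {mos:.2f}")   # discrete votes, non-integer mean (3.80)

# DCR-style votes rate the degradation of a processed sample against an
# unprocessed version of the same sample (label wording as in ITU-T P.800).
DCR_SCALE = {
    "Degradation is inaudible": 5,
    "Degradation is audible but not annoying": 4,
    "Degradation is slightly annoying": 3,
    "Degradation is annoying": 2,
    "Degradation is very annoying": 1,
}
dcr_votes = ["Degradation is audible but not annoying",
             "Degradation is slightly annoying",
             "Degradation is inaudible"]
dmos = sum(DCR_SCALE[v] for v in dcr_votes) / len(dcr_votes)
print(f"DMOS = {dmos:.2f}")
```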

In most experimental designs, subjects are asked to rate the quality of short audio or video samples. The duration of such samples is usually in the range of 6 to 10 seconds, as this provides enough time for the subject to form an opinion without introducing any bias towards the end of the sample. It is difficult for a single sample of this duration to represent a whole condition, and hence subjects are typically asked to rate multiple test samples derived from the same use case. For example, in a voice experiment, each network condition under test might be represented with speech samples from three male and three female talkers. This means that MOS values can be produced for the entire condition, by averaging across both subjects and talkers, or for a sub-condition, such as a particular talker or gender of talker.
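The condition-level and sub-condition averaging described in the previous paragraph can be sketched as follows; the condition names, talker identifiers and vote values are hypothetical and serve only to show the two groupings.

```python
from collections import defaultdict
from statistics import mean

# Illustrative only: each vote is (condition, talker, score on the 5-point ACR scale).
votes = [
    ("codec_A_3pct_loss", "male_1", 4), ("codec_A_3pct_loss", "male_1", 3),
    ("codec_A_3pct_loss", "female_1", 5), ("codec_A_3pct_loss", "female_1", 4),
    ("codec_B_clean", "male_1", 5), ("codec_B_clean", "female_1", 4),
]

by_condition = defaultdict(list)
by_sub_condition = defaultdict(list)
for condition, talker, score in votes:
    by_condition[condition].append(score)                 # averaged across subjects and talkers
    by_sub_condition[(condition, talker)].append(score)   # per-talker sub-condition

for condition, scores in by_condition.items():
    print(f"{condition}: condition MOS = {mean(scores):.2f}")
for (condition, talker), scores in by_sub_condition.items():
    print(f"{condition} / {talker}: sub-condition MOS = {mean(scores):.2f}")
```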

Test methods, such as single stimulus continuous quality evaluation (SSCQE), use much longer test samples, and require the subject to continuously update their opinion of quality as the test sample is being played. This results in a time sequence of quality ratings from each subject, rather than a single opinion value.

Some test methodologies require the subject to answer multiple questions. Not only does this yield more information about the conditions under test, it can be a necessary part of the test design. For example, the ITU-T P.835 test method requires the subject to provide separate opinions about the speech quality and the noise quality of a sample before providing an overall quality score. This process has been found to yield more stable results with noise suppression systems than the single-question ACR test method. It should be noted that some questions may not relate directly to quality, but may address a different aspect of communications; for example, [b-ITU-T P.800] defines a listening effort scale for voice experiments. Similarly, some conversational experiments ask the subject about their experience when talking, rather than when listening.

8 Interpreting MOS values

The following discussion initially focuses on voice MOS values; however, many of the points made in the subsections apply equally to video, audio and audio-video MOS values. The main differences for video are described in the following clause.

The idea that a particular voice codec has a particular MOS score is another common misconception. One source of this misconception is the widespread use of objective quality assessment models, which produce very repeatable results. Such models are designed to predict or estimate the output of subjective experiments; however, for any given codec at a given bit rate, the MOS value obtained in a subjective experiment can vary substantially from experiment to experiment. There are a number of reasons for this. Firstly, the exact MOS val
