International Telecommunication Union

ITU-T P.863
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(03/2018)

SERIES P: TELEPHONE TRANSMISSION QUALITY, TELEPHONE INSTALLATIONS, LOCAL LINE NETWORKS
Methods for objective and subjective assessment of speech and video quality

Perceptual objective listening quality prediction

Recommendation ITU-T P.863

ITU-T P-SERIES RECOMMENDATIONS
TELEPHONE TRANSMISSION QUALITY, TELEPHONE INSTALLATIONS, LOCAL LINE NETWORKS

  Vocabulary and effects of transmission parameters on customer opinion of transmission quality (Series P.10)
  Voice terminal characteristics (Series P.30, P.300)
  Reference systems (Series P.40)
  Objective measuring apparatus (Series P.50, P.500)
  Objective electro-acoustical measurements (Series P.60)
  Measurements related to speech loudness (Series P.70)
  Methods for objective and subjective assessment of speech quality (Series P.80)
  Methods for objective and subjective assessment of speech and video quality (Series P.800)
  Audiovisual quality in multimedia services (Series P.900)
  Transmission performance and QoS aspects of IP end-points (Series P.1000)
  Communications involving vehicles (Series P.1100)
  Models and tools for quality assessment of streamed media (Series P.1200)
  Telemeeting assessment (Series P.1300)
  Statistical analysis, evaluation and reporting guidelines of quality measurements (Series P.1400)
  Methods for objective and subjective assessment of quality of services other than speech and video (Series P.1500)

For further details, please refer to the list of ITU-T Recommendations.

Rec. ITU-T P.863 (03/2018)

Recommendation ITU-T P.863
Perceptual objective listening quality prediction

Summary

Recommendation ITU-T P.863 describes an objective method for predicting overall listening speech quality from narrowband (NB)
(300 to 3 400 Hz) to fullband (FB) (20 to 20 000 Hz) telecommunication scenarios as perceived by a user in an ITU-T P.800 or ITU-T P.830 absolute category rating (ACR) listening-only test. Recommendation ITU-T P.863 supports two operational modes, one for narrowband and one for fullband. Super-wideband (SWB) (50 to 14 000 Hz) experiments can be simulated by band-limiting the reference signal and, accordingly, the degraded signal.

This Recommendation presents a high-level description of the method and advice on how to use it. All essential parts of the model are described in detail and are provided in separate pdf files (see Annex B). These files form an integral part of this Recommendation and shall take precedence in case of conflicts between the high-level descriptions included in the main body of this Recommendation and the corresponding detailed description parts. A conformance testing procedure is also specified in Annex A to allow a user to validate that an alternative implementation of the model is correct.

This Recommendation includes an electronic attachment containing detailed descriptions in pdf format (see Annex B) and conformance testing data (see Annex A).

Edition 3 of ITU-T P.863 no longer applies an initial 14 kHz low-pass filter; it is therefore now able to consider spectral components above 14 kHz in its analysis. This expands the scope of application to fullband speech codecs (e.g., OPUS, enhanced voice service (EVS)). Additionally, the shift-jitter that could be observed in repeated measurements with slightly differing delay is decreased. The gain variation introduced by automatic gain control, as well as slowly time-varying linear frequency distortions, is now adequately considered. Furthermore, the problems of ITU-T P.863 version 2 documented in the following implementers' guides have been addressed and resolved:

  former implementers' guide on assessment of EVS coded speech with Recommendation ITU-T P.863 (2016);
  former implementers' guide on non-validated test conditions with inserted gaps in speech by Recommendation ITU-T P.863 (2016);
  former implementers' guide on discrimination of wideband (WB) and super-wideband speech by Recommendation ITU-T P.863 (2016);
  former implementers' guide on correction of Recommendation ITU-T P.863 regarding reverb (2016).

History

  Edition  Recommendation              Approval    Study Group  Unique ID*
  1.0      ITU-T P.863                 2011-01-13  12           11.1002/1000/11009
  1.1      ITU-T P.863 (2011) Amd. 1   2011-11-09  12           11.1002/1000/11463
  2.0      ITU-T P.863                 2014-09-11  12           11.1002/1000/12174
  3.0      ITU-T P.863                 2018-03-16  12           11.1002/1000/13570

Keywords

Listening quality, objective quality, perceptual model, voice quality.

____
* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web browser, followed by the Recommendation's unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression “Administration” is used for conciseness
to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words “shall” or some other obligatory language such as “must” and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2018

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1 Scope
2 References
3 Definitions
  3.1 Terms defined elsewhere
4 Abbreviations and acronyms
5 Conventions
6 Overview of the ITU-T P.863 algorithm
7 Comparison between objective and subjective scores
8 Speech material
  8.1 Recommendations on source speech material
  8.2 Insertion of source speech material into the system under test
  8.3 Recommendations on processed and degraded speech material
  8.4 Special requirements for acoustical captured speech material
  8.5 Acoustical insertion/capture for loudspeaker phones
  8.6 Technical requirements on signals to be processed by ITU-T P.863
  8.7 Scores predicted by the model
9 Description of the ITU-T P.863 algorithm
  9.1 Overview
  9.2 Temporal alignment
  9.3 Joining sections with constant delay
  9.4 Sample rate ratio detection
  9.5 Resampling
  9.6 Level, frequency response and time alignment pre-processing
  9.7 Perceptual model
10 Conclusions
Annex A Conformance data and conformance tests
  A.1 List of files provided for conformance validation
  A.2 Conformance tests
  A.3 Conversion of sampling rates
  A.4 Digital attachments
Annex B Detailed Descriptions of the ITU-T P.863 algorithm in pdf-format
Appendix I Reporting of the performance results for the ITU-T P.863 algorithm based on the rmse* metric
  I.1 Purpose of this appendix
  I.2 Overview
  I.3 Performance results for the ITU-T P.863 algorithm
  I.4 Scatter plots
Appendix II Description of the “full-scale” subjective tests in a super-wideband context conducted for the ITU-T P.863 algorithm training and validation
  II.1 Database structure and subjects requirement
  II.2 Anchor conditions
  II.3 Design rules of test conditions for full-scale mandatory tests
  II.4 Reference and degraded speech material
  II.5 Transmission and capture of speech material interlaced with background noises
  II.6 Transmission and capture of speech material under time warping conditions
  II.7 Subjective test set up for assessing super-wideband speech quality
  II.8 Limitations in subjective test results
Appendix III Prediction of acoustically recorded narrowband speech
  III.1 Background
  III.2 Requirements for acoustically recorded speech data to be assessed by ITU-T P.863
  III.3 Pre-processing of speech and use of ITU-T P.863
  III.4 Interpretation of results
  III.5 Example results
Bibliography

Electronic attachment: Detailed descriptions in pdf format, and conformance testing data.

Introduction

Recommendation ITU-T P.863 defines a single algorithm for assessing the speech quality of current and near-future telephony systems that utilize a broad variety of coding, transport and enhancement technologies. The measurement algorithm is a full reference model which operates by performing a comparison between a known reference signal and a captured degraded signal. This is consistent with the algorithms described in Recommendations ITU-T P.861 and ITU-T P.862.

Recommendation ITU-T P.861, published in 1996, was primarily focused on identifying the quality impact of codecs. Subsequent to its release, work on a successor was started to create an algorithm suitable for assessing the additional impact of network impairments. The work resulted in the publishing of Recommendation ITU-T P.862 in 2001.

Recommendation ITU-T P.863 incorporates current industry requirements and in particular allows the assessment of fullband speech as well as networks and codecs that introduce time warping.

Recommendation ITU-T P.863
Perceptual objective listening quality prediction

1 Scope

This Recommendation¹ defines a single algorithm for assessing the speech quality of current and near-future telephony systems that utilize a broad variety of coding, transport and speech enhancement technologies.

Based on the benchmark results presented within the studies of ITU-T, an overview of the test factors, coding technologies and applications to which this Recommendation applies is given in Tables 1 to 4. Table 1 presents factors and applications included in the requirement specification and which were used in the selection phase of the ITU-T P.863 algorithm. The performance of the ITU-T P.863 algorithm under each individual condition in Table 1 is not reflected in the table. Table 2 presents a list of conditions for which this Recommendation is not intended to be used. Table 3 presents test variables for which further investigation is needed, or for which ITU-T P.863 is subject to claims of providing inaccurate predictions when used in conjunction with these. Finally, Table 4 lists factors, technologies and applications for which the ITU-T P.863 algorithm has not currently been validated.

The ITU-T P.863 algorithm cannot be used to replace subjective testing.

The ITU-T P.863 algorithm does not provide a comprehensive evaluation of transmission quality. It only measures the effects of one-way speech distortion and noise on speech quality. The effects of delay, sidetone, echo, and other impairments related to two-way interaction (e.g., centre clipper) are not reflected in the ITU-T P.863 scores. Therefore, it is possible to have high ITU-T P.863 scores, yet poor overall conversational quality.

____
¹ This Recommendation includes an electronic attachment containing detailed descriptions in pdf format (see Annex B) and conformance testing data (see Annex A).

Table 1 – Factors and applications included in the requirement specification and used in the selection phase of the ITU-T P.863 algorithm

Test factors
  Speech input levels to a codec
  Transmission channel errors
  Packet loss and packet loss concealment
  Bit rates if a codec has more than one bit-rate mode
  Transcodings
  Acoustic noise in sending environment
  Effect of varying delay in listening-only tests
  Short-term time warping of audio signal
  Long-term time warping of audio signal
  Listening levels between 53 and 78 dB(A) sound pressure level (SPL) in fullband mode
  Packet loss and packet loss concealment with pulse code modulation (PCM) type codecs
  Temporal and amplitude clipping of speech
  Linear distortions, including bandwidth limitations and spectral shaping (non-flat frequency responses)
  Frequency response

Coding technologies
  ITU-T G.711, ITU-T G.711 PLC, ITU-T G.711.1
  ITU-T G.718, ITU-T G.719, ITU-T G.722, ITU-T G.722.1, ITU-T G.723.1, ITU-T G.726, ITU-T G.728, ITU-T G.729
  GSM-FR, GSM-HR, global system for mobile communications (GSM) enhanced full rate codec (EFR)
  AMR-NB, AMR-WB (ITU-T G.722.2), AMR-WB+
  PDC-FR, PDC-HR
  Enhanced variable rate codec (EVRC) (ANSI/TIA-127-A), EVRC-B (TIA-718-B)
  Skype (SILK V3, iLBC, iSAC and ITU-T G.729)
  Speex, QCELP (TIA-EIA-IS-733), iLBC, CVSD (64 kbit/s, “Bluetooth”)
  MPEG-1 audio layer 3 (MP3), advanced audio coding (AAC), AAC-LD
  EVS
  OPUS

Applications
  Codec evaluation
  Terminal testing, influence of the acoustical path and the transducer in sending and receiving direction. (NOTE – Acoustical path in receiving direction only for fullband mode.)
  Bandwidth extensions
  Live network testing using digital or analogue connection to the network
  Testing of emulated and prototype networks
  Universal mobile telecommunications system (UMTS), code division multiple access (CDMA), GSM, terrestrial trunked radio (TETRA), WB-DECT, voice over IP (VoIP), plain
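
As an illustration only, the coding technologies and the fullband listening-level range listed in Table 1 can be held in a small lookup, for example to flag test plans that fall outside what the table names. The sketch below is not part of ITU-T P.863: the names `TABLE1_CODECS`, `codec_in_table1` and `listening_level_in_table1` are hypothetical, "GSM-EFR" is used as shorthand for the GSM enhanced full rate codec, and treating the 53 to 78 dB(A) range as inclusive is an assumption. Being listed in Table 1 says nothing about prediction accuracy for that item, and Tables 2 to 4 are not modelled here.

```python
"""Lookup sketch for the coding technologies and listening levels named in Table 1."""

# Codec names transcribed from the "Coding technologies" rows of Table 1.
# The set and the helper functions below are hypothetical, not part of ITU-T P.863.
TABLE1_CODECS = frozenset({
    "G.711", "G.711 PLC", "G.711.1", "G.718", "G.719", "G.722", "G.722.1",
    "G.722.2", "G.723.1", "G.726", "G.728", "G.729",
    "GSM-FR", "GSM-HR", "GSM-EFR",
    "AMR-NB", "AMR-WB", "AMR-WB+",
    "PDC-FR", "PDC-HR", "EVRC", "EVRC-B",
    "SILK V3", "iLBC", "iSAC", "Speex", "QCELP", "CVSD",
    "MP3", "AAC", "AAC-LD", "EVS", "OPUS",
})

# Listening levels in dB(A) SPL for fullband mode (bounds assumed inclusive).
FULLBAND_LISTENING_LEVEL_DB_A = (53.0, 78.0)


def _normalise(name: str) -> str:
    """Upper-case the name and drop the 'ITU-T' prefix and all spaces."""
    return name.upper().replace("ITU-T", "").replace(" ", "")


_NORMALISED_CODECS = frozenset(_normalise(codec) for codec in TABLE1_CODECS)


def codec_in_table1(name: str) -> bool:
    """Return True if the codec name matches an entry transcribed from Table 1."""
    return _normalise(name) in _NORMALISED_CODECS


def listening_level_in_table1(level_db_a: float) -> bool:
    """Return True if a fullband listening level lies within the Table 1 range."""
    low, high = FULLBAND_LISTENING_LEVEL_DB_A
    return low <= level_db_a <= high


if __name__ == "__main__":
    for codec in ("ITU-T G.722", "opus", "Codec2"):
        print(codec, codec_in_table1(codec))
    print(listening_level_in_table1(73.0))
```

Matching is deliberately loose (case-insensitive, with or without the "ITU-T" prefix) so that "ITU-T G.722" and "g.722" resolve to the same entry; a stricter tool would probably want an explicit alias table instead.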