International Telecommunication Union

ITU-T P.502
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
Amendment 2 (09/2014)

SERIES P: TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS
Objective measuring apparatus

Objective test methods for speech communication systems using complex test signals

Amendment 2: Updated Appendix III
Automated double talk analysis procedure

Recommendation ITU-T P.502 (2000)
Amendment 2

ITU-T P-SERIES RECOMMENDATIONS
TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS

Vocabulary and effects of transmission parameters on customer opinion of transmission quality    Series P.10
Voice terminal characteristics    Series P.30 P.300
Reference systems    Series P.40
Objective measuring apparatus    Series P.50 P.500
Objective electro-acoustical measurements    Series P.60
Measurements related to speech loudness    Series P.70
Methods for objective and subjective assessment of speech quality    Series P.80 P.800
Audiovisual quality in multimedia services    Series P.900
Transmission performance and QoS aspects of IP end-points    Series P.1000
Communications involving vehicles    Series P.1100
Models and tools for quality assessment of streamed media    Series P.1200
Telemeeting assessment    Series P.1300
Statistical analysis, evaluation and reporting guidelines of quality measurements    Series P.1400
Methods for objective and subjective assessment of quality of services other than voice services    Series P.1500

For further details, please refer to the list of ITU-T Recommendations.

Rec. ITU-T P.502 (2000)/Amd.2 (09/2014)

Summary

Amendment 2 to Recommendation ITU-T P.502 updates Appendix III to the Recommendation. The appendix describes the methodology for an objective post analysis of measured double talk curves.

History

Edition   Recommendation               Approval     Study Group   Unique ID*
1.0       ITU-T P.502                  2000-05-18   12            11.1002/1000/5083
1.1       ITU-T P.502 (2000) Amd. 1    2010-05-27   12            11.1002/1000/10867
1.2       ITU-T P.502 (2000) Amd. 2    2014-09-11   12            11.1002/1000/12331

* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web browser, followed by the Recommendation's unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.
FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.
© ITU 2014

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Recommendation ITU-T P.502

Objective test methods for speech communication systems using complex test signals

Amendment 2

Updated Appendix III
Automated double talk analysis procedure

1) Updated Appendix III

Replace Appendix III with the following.

Appendix III

Automated double talk analysis procedure

(This appendix does not form an integral part of this Recommendation.)

III.1 Introduction

ITU-T P.340 describes test methods for double talk analyses of hands-free terminals based on signals and analysis procedures standardized in this Recommendation and in ITU-T P.501. These measurements allow the analysis of the attenuation range in sending and receiving directions based on double talk composite source signals. Basically, the measurements are performed by subtracting the corresponding time-synchronized single talk level from the time-dependent level of the transmitted test signal during double talk. However, due to slight time misalignments, non-linear, time-variant signal processing and other factors, an objective analysis that can be performed reproducibly and unambiguously in different labs is not always possible. The result often depends on the interpretation of the operator.

This appendix describes a new procedure which allows an automated objective analysis in a clearly defined way. The method has been proven to reduce the ambiguity in the interpretation of double talk measurements. The accuracy of this method in predicting the mean value of the attenuation range derived by a jury of test conductors manually calculating the attenuation range is unknown. Currently, this method is only intended to check consistency in measurements between labs. The method was not validated in a formal double talk test.

The procedure described is based on the double talk composite source signal (CSS) and the double talk speech signals described in ITU-T P.501. It should be noted that if the method described here is applied to other test signals, the method needs to be adapted, especially with respect to the histogram limitations.

III.2 Automated double talk analysis procedure using CSS

A typical result of the analysis of modern terminals with respect to the attenuation range in the sending direction is given in Figure III.1.

Figure III.1 – Typical results of a double talk test based on the attenuation range measurement (ITU-T P.340)

It can be seen that the blue (HFT 2) curve is already somewhat difficult to interpret with respect to the amount of attenuation inserted during double talk. An unambiguous interpretation of the red (HFT 1) curve is nearly impossible because different attenuations are inserted at different times; the attenuation versus time is highly time-variant. In such cases, the time history signal, as well as some additional subjective verification of the measured result, is currently used in order to finally determine the attenuation range during double talk. However, this does not always lead to the same interpretation of the results by different operators. In order to achieve a clear, consistent and unambiguous interpretation result, an objective procedure is needed. A proposal for such a procedure, which could also be applied to other types of double talk test signals, is shown in Figure III.2.

Figure III.2 – Principle of the proposed automated double talk analysis
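The principle in Figure III.2, whose individual steps are detailed in the remainder of this clause, can be illustrated with a minimal Python sketch. It is not part of the procedure. The function names, the use of NumPy, the first-order exponential averaging used as a stand-in for the IEC 61672 time weighting, and the reading of "deleting the lower 20% and the upper 15% of the histogram values" as a cut on the cumulative bin counts are all assumptions made for illustration.

```python
import numpy as np


def level_vs_time(x, fs, tau=0.005):
    """Level versus time in dB using exponential averaging.

    Assumption: a first-order exponential average of the squared signal
    with time constant tau (seconds) approximates the time weighting.
    """
    x = np.asarray(x, dtype=float)
    alpha = 1.0 - np.exp(-1.0 / (fs * tau))
    power = np.empty_like(x)
    acc = 0.0
    for n, sample in enumerate(x):
        acc += alpha * (sample * sample - acc)
        power[n] = acc
    # Small offset avoids log10(0) for silent passages.
    return 10.0 * np.log10(power + 1e-12)


def attenuation_range(delta_l, bins=100, lower=0.20, upper=0.15):
    """Attenuation range for one burst from its level differences in dB.

    Returns (range, effective lower limit, effective upper limit).
    Assumption: the effective limits are the bin edges enclosing the
    counts that remain after removing the lowest `lower` and the
    highest `upper` fraction of all histogram counts.
    """
    delta_l = np.asarray(delta_l, dtype=float)
    counts, edges = np.histogram(
        delta_l, bins=bins, range=(delta_l.min(), delta_l.max())
    )
    cum = np.cumsum(counts) / counts.sum()
    lo_bin = np.searchsorted(cum, lower, side="right")
    hi_bin = np.searchsorted(cum, 1.0 - upper, side="left")
    l_min = edges[lo_bin]        # lower edge of the first bin kept
    l_max = edges[hi_bin + 1]    # upper edge of the last bin kept
    return l_max - l_min, l_min, l_max


if __name__ == "__main__":
    fs = 8000
    rng = np.random.default_rng(0)
    reference = rng.standard_normal(fs)             # single talk run
    gain = np.where(np.arange(fs) < fs // 2, 1.0, 0.3)
    double_talk = gain * reference                  # attenuated during double talk
    delta = level_vs_time(double_talk, fs) - level_vs_time(reference, fs)
    print(attenuation_range(delta))
```

In practice the two level curves would come from the time-aligned recordings of the single talk and double talk runs, and `attenuation_range` would be called once per detected burst.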
The double talk signal is time aligned to the delay inserted by the telephone and the test equipment, and presented to the terminal as described in ITU-T P.340 ("t"). Instead of using the unfiltered double talk CSS signal as reference, the reference signal used is the transmitted test signal for the individual direction (sending or receiving) without the double talk signal present. Such a reference signal includes possible automatic gain control (AGC) influence as well as the individual transfer function of the device under test.

Based on the transmitted signal and the new reference signal, the double talk analysis is performed through the individual interpretation of each CSS burst from the transmitted double talk signal (see Figure III.2). For each CSS burst a level histogram is created, and from this level histogram the amount of attenuation is determined ($a_{h,DT,SND}$ and $a_{h,DT,RCV}$ respectively, according to ITU-T P.340). The $a_{h,DT,SND}$ that is finally observed determines the double talk category (for the sending direction in this case).

The histogram creation and the calculation of the attenuation relevant for the final double talk rating are described in more detail in the following, using the analysis example in Figure III.3. It shows the level difference versus time representation of the CSS bursts of a modern terminal in the sending direction (level of the transmitted signal referred to the level of the filtered reference signal):

• The level versus time $L(k)$ is calculated according to [b-IEC 61672] (i.e., DIN EN 61672) with a time constant of 5 ms for both signals, $L_{DT,SND}(k)$ and $L_{Ref}(k)$.
• The difference between both signals, $\Delta L(k)$, is calculated as $\Delta L(k) = L_{DT,SND}(k) - L_{Ref}(k)$.
• Minimum and maximum limits for the histogram are derived from the minimum and maximum level difference: $\Delta L_{min} = \min \Delta L(k)$ and $\Delta L_{max} = \max \Delta L(k)$.
• The histogram is divided into 100 equally spaced bins between the minimum and the maximum histogram limits, $\Delta L_{min}$ and $\Delta L_{max}$.
• The lower 20% and the upper 15% of the histogram values are deleted. New, "effective" histogram limits are given by $\Delta L_{min20\%}$ and $\Delta L_{max15\%}$. This can be interpreted as a smoothing of the curve, which suppresses slight level variations that are not important for the subjective perception, as well as some strong peaks which last only for a short period of time and are also not important for the subjective double talk quality perception.
• The attenuation range $a_{h,DT,SND}$ according to ITU-T P.340 is calculated as the difference between $\Delta L_{min20\%}$ and $\Delta L_{max15\%}$, i.e., $a_{h,DT,SND} = \Delta L_{max15\%} - \Delta L_{min20\%}$.

Figure III.3 – Result of a double talk analysis of a modern terminal displayed as level difference versus time

The result of this level binning is a histogram representation as shown in Figure III.4, which is different for each CSS burst.

Figure III.4 – Histogram representation of the first level versus time representation from Figure III.3 (vertical axis: number of levels per bin; the limits $\Delta L_{max15\%}$ and $\Delta L_{min20\%}$ are marked)

The values 15% and 20% were developed empirically based on a subjective expert evaluation of 58 different types of mobile phones. Comparing this subjective expert evaluation with the described objective procedure, a double talk type class error with a mean = 0.0172 and an RMSI = 0.82 could be achieved.

Although this objective post analysis for ITU-T P.340 signals has not yet been evaluated in a formal subjective listening test, it can be stated that such a defined analysis of double talk would greatly improve the comparability of measurement results in different labs, while at the same time showing some reasonable correlation with subjective expert evaluations. The described post analysis is not limited to the CSS bursts currently defined in ITU-T P.501, the main body of this Recommendation and ITU-T P.340, but can be performed in the same way with different types of
double talk signals.

III.3 Applying the automated double talk analysis to speech signals

The procedure when using the double talk speech signal defined in ITU-T P.501 is basically the same as described for the CSS-based analysis. The following modifications are applied to the procedure as described in III.2:

• The integration time constant for the level calculation is changed from T = 5 ms to T = 30 ms.
• The double talk speech samples have to be selected as illustrated in Figure III.5.

Based on the time signal, the speech parts are detected in the same way as the CSS bursts. In order to harmonize the naming convention between CSS and speech test signals, each located speech part is marked as a "block", a term which can also be used for CSS signals. Thus a block may contain either a CSS burst, a single word or a test sentence.

After the localization of all blocks, the level-vs-time difference between the single talk and double talk runs is calculated. For each block, a histogram of the level difference is created. From this representation, the smoothed maximum level attenuation is determined according to the method described in III.2. The principle is shown in Figure III.6 for one exemplary block of Figure III.5.

Figure III.5 – Time signals of single and double talk measurement (top) and corresponding level difference versus time (bottom)

Figure III.6 – Principle of double talk attenuation per block

Instead of picking the maximum attenuation per block (as it is described for the CSS), the median over all blocks is reported:

$a_{h,DT,SND} = \operatorname{median}_{i}\left(a_{h,DT,SND,i}\right)$

where $a_{h,DT,SND,i}$ is the attenuation determined for block $i$ as described in III.2.

2) Bibliography

Add the following reference:

[b-IEC 61672] IEC 61672, Electroacoustics – Sound level meters. Part 1 (2002): Specifications. Part 2 (2003): Pattern evaluation tests. Part 3 (2006): Periodic tests.

Printed in Switzerland
Geneva, 2014

SERIES OF ITU-T RECOMMENDATIONS

Series A    Organization of the work of ITU-T
Series D    General tariff principles
Series E    Overall network operation, telephone service, service operation and human factors
Series F    Non-telephone telecommunication services
Series G    Transmission systems and media, digital systems and networks
Series H    Audiovisual and multimedia systems
Series I    Integrated