ITU-T Series P Supplement 25 (2011): Parameters describing the interaction with multimodal dialogue systems (Study Group 12)


International Telecommunication Union

ITU-T Series P
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
Supplement 25 (01/2011)

SERIES P: TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS

Parameters describing the interaction with multimodal dialogue systems

ITU-T P-series Recommendations - Supplement 25

ITU-T P-SERIES RECOMMENDATIONS
TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS

Vocabulary and effects of transmission parameters on customer opinion of transmission quality: Series P.10
Voice terminal characteristics: Series P.30, P.300
Reference systems: Series P.40
Objective measuring apparatus: Series P.50, P.500
Objective electro-acoustical measurements: Series P.60
Measurements related to speech loudness: Series P.70
Methods for objective and subjective assessment of speech quality: Series P.80, P.800
Audiovisual quality in multimedia services: Series P.900
Transmission performance and QoS aspects of IP end-points: Series P.1000
Communications involving vehicles: Series P.1100

For further details, please refer to the list of ITU-T Recommendations.

P series Supplement 25 (01/2011)

Supplement 25 to ITU-T P-series Recommendations
Parameters describing the interaction with multimodal dialogue systems

Summary

Supplement 25 to the ITU-T P-series Recommendations provides definitions for a set of parameters which can be extracted from services which rely on multimodal dialogue systems. The parameters can be extracted from logged (test) user interactions with the service under consideration. They quantify the flow of the interaction, the behaviour of the user and the system, and the performance of the speech technology devices involved in the interaction. They provide useful information for system development, optimization and maintenance, and are complementary to subjective quality judgments. The list is an amendment and extension of the respective list of parameters for speech-based services which is given in Supplement 24 to the ITU-T P-series Recommendations.

History

Edition   Recommendation      Approval     Study Group
1.0       ITU-T P Suppl. 25   2011-01-27   12

Keywords

Assessment, automatic speech recognition, automatic speech understanding, dialogue management, gesture recognition, interaction parameter, multimodal dialogue system, speech generation.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this publication, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this publication is voluntary. However, the publication may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the publication is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the publication is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this publication may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the publication development process.

As of the date of approval of this publication, ITU had not received notice of intellectual property, protected by patents, which may be required to implement this publication. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2011

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1 Scope
2 References
3 Definitions
4 Abbreviations
5 Conventions
6 Introduction
7 Characteristics of interaction parameters
8 Review of interaction parameters
  8.1 Dialogue- and communication-related parameters
  8.2 Meta-communication-related parameters
  8.3 Cooperativity-related parameters
  8.4 Task-related parameters
  8.5 Input-related parameters
  8.6 Output-related parameters
  8.7 Further parameters
9 Interpretation of interaction parameter values

1 Scope

This supplement describes parameters providing information on the interaction with services which are based on multimodal dialogue systems, as seen by the system developer and service operator.

Multimodal dialogue systems addressed by this supplement enable a multimodal interaction with a human user. Such systems offer one or more modalities for input (e.g., speech, gesture, touch) and one or more output modalities (e.g., a graphical user interface, spoken output, an embodied conversational agent) and may have automatic speech, gesture or touch recognition, speech understanding, a fusion module, dialogue management, response generation, a fission module and speech, graphical or audiovisual output capabilities. They may provide access to information stored in a database, or allow different types of transactions to be performed, and they are frequently offered on smart-phone platforms.

The parameters defined here quantify the flow of the interaction, the behaviour of the user and the system, and the performance of the devices involved in the interaction. For extracting all parameters, the multimodal dialogue system has to be accessible as a glass box; still, some parameters may also be extracted in a black-box approach, i.e., without access to the individual system components. The extraction can partially be performed automatically, and partially relies on a human expert transcribing and annotating interaction log files.

The parameters address system performance from a system developer's point of view; thus, they provide complementary information to subjective evaluation experiments. Further guidance on subjective evaluation methods in general and on the assessment of speech output devices and spoken dialogue systems is available in ITU-T P.800, ITU-T P.85 and ITU-T P.851, and in the Handbook on Telephonometry. This guidance, however, does not yet cover multimodal systems.

2 References

[ITU-T P.10] Recommendation ITU-T P.10/G.100 (2006), Vocabulary for performance and quality of service.

[ITU-T P.85] Recommendation ITU-T P.85 (1994), A method for subjective performance assessment of the quality of speech voice output devices.

[ITU-T P.800] Recommendation ITU-T P.800 (1996), Methods for subjective determination of transmission quality.

[ITU-T P.851] Recommendation ITU-T P.851 (2003), Subjective quality evaluation of telephone services based on spoken dialogue systems.

[ITU-T Handbook] ITU-T Handbook on Telephonometry (1993).

[IEC 60268-16] IEC Standard 60268-16 (1998), Sound system equipment - Part 16: Objective rating of speech intelligibility by speech transmission index.

[Bernsen-1] Bernsen, N.O., Dybkjær, H., Dybkjær, L. (1998), Designing interactive speech systems: From first ideas to user testing. Springer, DE-Berlin.

[Bernsen-2] Bernsen, N.O. (2002), Multimodality in Language and Speech Systems - From Theory to Design Support Tool. In: Granström, B., House, D., and Karlsson, I. (Eds.): Multimodality in Language and Speech Systems, Dordrecht, Kluwer Academic Publishers, 93-148.

[Billi] Billi, R., Castagneri, G., Danieli, M. (1996), Field trial evaluations of two different information inquiry systems. In: Proc. 3rd IEEE Workshop on Interactive Voice Technology for Telecommunications Applications (IVTTA'96), US-Basking Ridge NJ, 129-134.

[Boros] Boros, M., Eckert, W., Gallwitz, F., Görz, G., Hanrieder, G., Niemann, H. (1996), Towards understanding spontaneous speech: Word accuracy vs. concept accuracy. In: Proc. 4th Int. Conf. on Spoken Language Processing (ICSLP'96), IEEE, US-Piscataway NJ, 2, 1009-1012.

[Carletta] Carletta, J. (1996), Assessing agreement of classification tasks: The kappa statistics. Computational Linguistics, 22(2), 249-254.

[Chu] Chu, M., Peng, H. (2001), An objective measure for estimating MOS of synthesized speech. In: Proc. 7th Europ. Conf. on Speech Communication and Technology (Eurospeech 2001 Scandinavia), DK-Aalborg, 3, 2087-2090.

[Cookson] Cookson, S. (1988), Final evaluation of VODIS - Voice operated data inquiry system. In: Proc. of Speech'88, 7th FASE Symposium, UK-Edinburgh, 4, 1311-1320.

[Danieli] Danieli, M., Gerbino, E. (1995), Metrics for evaluating dialogue strategies in a spoken language system. In: Empirical Methods in Discourse Interpretation and Generation. Papers from the 1995 AAAI Symposium, US-Stanford CA, AAAI Press, US-Menlo Park CA, 34-39.

[Fraser] Fraser, N. (1997), Assessment of interactive systems. In: Handbook on Standards and Resources for Spoken Language Systems (D. Gibbon, R. Moore and R. Winski, eds.), Mouton de Gruyter, DE-Berlin, 564-615.

[Gerbino] Gerbino, E., Baggia, P., Ciaramella, A., Rullent, C. (1993), Test and evaluation of a spoken dialogue system. In: Proc. Int. Conf. Acoustics Speech and Signal Processing (ICASSP'93), IEEE, US-Piscataway NJ, 2, 135-138.

[Gibbon] Gibbon, D., Moore, R., Winski, R., Eds. (2000), Handbook on Standards and Resources for Spoken Language Systems. Mouton de Gruyter, DE-Berlin.

[Glass] Glass, J., Polifroni, J., Seneff, S., Zue, V. (2000), Data collection and performance evaluation of spoken dialogue systems: The MIT experience. In: Proc. 6th Int. Conf. on Spoken Language Processing (ICSLP 2000), CN-Beijing, 4, 1-4.

[Goodine] Goodine, D., Hirschman, L., Polifroni, J., Seneff, S., Zue, V. (1992), Evaluating interactive spoken language systems. In: Proc. 2nd Int. Conf. on Spoken Language Processing (ICSLP'92), CA-Banff, 1, 201-204.

[Grice] Grice, H.P. (1975), Logic and conversation. In: Syntax and Semantics, Vol. 3: Speech Acts (P. Cole and J.L. Morgan, eds.), Academic Press, US-New York NY, 41-58.

[Hirschman] Hirschman, L., Pao, C. (1993), The cost of errors in a spoken language system. In: Proc. 3rd Europ. Conf. on Speech Communication and Technology (Eurospeech'93), DE-Berlin, 2, 1419-1422.

[Kamm] Kamm, C.A., Litman, D.J., Walker, M.A. (1998), From novice to expert: The effect of tutorials on user expertise with spoken dialogue systems. In: Proc. 5th Int. Conf. on Spoken Language Processing (ICSLP'98), AU-Sydney, 4, 1211-1214.

[Kühnel] Kühnel, C., Weiss, B., Möller, S. (2010), Parameters describing multimodal interaction - Definitions and three usage scenarios. In: Proc. of the 11th Annual Conference of the ISCA (Interspeech 2010), JP-Tokyo, 2014-2017.

[Möller] Möller, S. (2005), Quality of telephone-based spoken dialogue systems. Springer, US-New York NY.

[Nigay] Nigay, L., Coutaz, J. (1993), A design space for multimodal systems: concurrent processing and data fusion. In: Proc. of the INTERACT and CHI, 172-178.

[NIST SRST] NIST Speech Recognition Scoring Toolkit (2001), Speech recognition scoring toolkit. National Institute of Standards and Technology, http://www.nist.gov/speech/tools, US-Gaithersburg MD.

[Perakakis] Perakakis, M., Potamianos, A. (2008), Multimodal system evaluation using modality efficiency and synergy metrics. In: Proc. of the 10th international conference on Multimodal interfaces (ICMI '08). ACM, New York, NY, USA, 9-16.

[Picone-1] Picone, J., Doddington, G.R., Pallett, D.S. (1990), Phone-mediated word alignment for speech recognition evaluation. IEEE Trans. Acoustics Speech and Signal Processing, 38(3), 559-562.

[Picone-2] Picone, J., Goudie-Marshall, K.M., Doddington, G.R., Fisher, W. (1986), Automatic text alignment for speech system evaluation. IEEE Trans. Acoustics Speech and Signal Processing, 34(4), 780-784.

[Polifroni] Polifroni, J., Hirschman, L., Seneff, S., Zue, V. (1992), Experiments in evaluating interactive spoken language systems. In: Proc. DARPA Speech and Natural Language Workshop, US-Harriman CA, 28-33.

[Price] Price, P.J., Hirschman, L., Shriberg, E., Wade, E. (1992), Subject-based evaluation measures for interactive spoken language systems. In: Proc. DARPA Speech and Natural Language Workshop, US-Harriman CA, 34-39.

[San-Segundo] San-Segundo, R., Montero, J.M., Colás, J., Gutiérrez, J., Ramos, J.M., Pardo, J.M. (2001), Methodology for dialogue design in telephone-based spoken dialogue systems: A Spanish train information system. In: Proc. 7th Europ. Conf. on Speech Communication and Technology (Eurospeech 2001 Scandinavia), DK-Aalborg, 3, 2165-2168.

[Simpson] Simpson, A., Fraser, N.M. (1993), Black box and glass box evaluation of the SUNDIAL system. In: Proc. 3rd Europ. Conf. on Speech Communication and Technology (Eurospeech'93), DE-Berlin, 2, 1423-1426.

[Skowronek] Skowronek, J. (2002), Entwicklung von Modellierungsansätzen zur Vorhersage der Dienstequalität bei der Interaktion mit einem natürlichsprachlichen Dialogsystem. Diploma thesis (unpublished), Institut für Kommunikationsakustik, Ruhr-Universität, DE-Bochum.

[Strik-1] Strik, H., Cucchiarini, C., Kessens, J.M. (2001), Comparing the performance of two CSRs: How to determine the significance level of the differences. In: Proc. 7th Europ. Conf. on Speech Communication and Technology (Eurospeech 2001 Scandinavia), DK-Aalborg, 3, 2091-2094.

[Strik-2] Strik, H., Cucchiarini, C., Kessens, J.M. (2000), Comparing the recognition performance of CSRs: In search of an adequate metric and statistical significance test. In: Proc. 6th Int. Conf. on Spoken Language Processing (ICSLP 2000), CN-Beijing, 4, 740-743.

[van Leeuwen] van Leeuwen, D., Steeneken, H. (1997), Assessment of recognition systems. In: Handbook on Standards and Resources for Spoken Languag
