International Telecommunication Union

ITU-T F.746.5
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(12/2017)

SERIES F: NON-TELEPHONE TELECOMMUNICATION SERVICES
Multimedia services

Framework for a language learning system based on speech and natural language processing (NLP) technology

Recommendation ITU-T F.746.5

ITU-T F-SERIES RECOMMENDATIONS
NON-TELEPHONE TELECOMMUNICATION SERVICES

TELEGRAPH SERVICE
  Operating methods for the international public telegram service   F.1–F.19
  The gentex network                                                F.20–F.29
  Message switching                                                 F.30–F.39
  The international telemessage service                             F.40–F.58
  The international telex service                                   F.59–F.89
  Statistics and publications on international telegraph services   F.90–F.99
  Scheduled and leased communication services                       F.100–F.104
  Phototelegraph service                                            F.105–F.109
MOBILE SERVICE
  Mobile services and multidestination satellite services           F.110–F.159
TELEMATIC SERVICES
  Public facsimile service                                          F.160–F.199
  Teletex service                                                   F.200–F.299
  Videotex service                                                  F.300–F.349
  General provisions for telematic services                         F.350–F.399
MESSAGE HANDLING SERVICES                                           F.400–F.499
DIRECTORY SERVICES                                                  F.500–F.549
DOCUMENT COMMUNICATION
  Document communication                                            F.550–F.579
  Programming communication interfaces                              F.580–F.599
DATA TRANSMISSION SERVICES                                          F.600–F.699
MULTIMEDIA SERVICES                                                 F.700–F.799
ISDN SERVICES                                                       F.800–F.849
UNIVERSAL PERSONAL TELECOMMUNICATION                                F.850–F.899
ACCESSIBILITY AND HUMAN FACTORS                                     F.900–F.999

For further details, please refer to the list of ITU-T Recommendations.

Rec. ITU-T F.746.5 (12/2017)

Recommendation ITU-T F.746.5

Framework for a language learning system based on speech and natural language processing (NLP) technology

Summary

Recommendation ITU-T F.746.5 describes functional requirements and detailed functions for the framework of a language learning system based on speech and natural language processing technology. It provides a framework for a language learning system that will serve as a reference framework for language learning systems to be developed and used as low-cost tools in many educational situations.

This Recommendation defines the features, general requirements and functionality to support the language learning system based on speech and natural language processing (NLP) technology. The scope covers a high-level description of architecture, terminals, servers, interface and clients.

History

Edition   Recommendation   Approval     Study Group   Unique ID*
1.0       ITU-T F.746.5    2017-12-14   16            11.1002/1000/13427

Keywords

Language learning, natural language processing, speech interface, speech recognition.

* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web browser, followed by the Recommendation's
unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with
this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words “shall” or some other obligatory language such as “must” and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2018

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1 Scope
2 References
3 Definitions
    3.1 Terms defined elsewhere
    3.2 Terms defined in this Recommendation
4 Abbreviations and acronyms
5 Conventions
6 Introduction
    6.1 Basic concept of speech interface and NLP technology
    6.2 Advanced technology for dialogue-based speech interface
7 Requirements for language learning system
8 Functional components and interfaces of a language learning system
    8.1 Speech recognition module
    8.2 Natural language processing module
    8.3 Speech synthesis module
    8.4 Pronunciation fluency evaluation module
    8.5 Dialogue processing module
    8.6 Dialogue understanding evaluation module
    8.7 Dialogue knowledge learning module
    8.8 Language learning module

Recommendation ITU-T F.746.5

Framework for a language learning system based on speech and natural language processing (NLP) technology

1 Scope

This Recommendation presents an overview of the framework for a language learning system based on speech and natural language processing (NLP) technology. It describes the features, general requirements and functionality that together form a framework to support language learning systems. The scope covers a high-level description of architecture, devices, servers and clients.

2 References

The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this document. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this document are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a
document within this Recommendation does not give it, as a stand-alone document, the status of a Recommendation.

[ITU-T F.746.3]   Recommendation ITU-T F.746.3 (2015), Intelligent question answering service framework.
[ITU-T H.703]     Recommendation ITU-T H.703 (2016), Enhanced user interface framework for IPTV terminal devices.

3 Definitions

3.1 Terms defined elsewhere

This Recommendation uses the following terms defined elsewhere:

3.1.1 co-reference resolution [ITU-T F.746.3]: A function that detects the preceding referents of the pronouns which replace the noun phrases of the input sentences.

3.1.2 named entity recognition [ITU-T F.746.3]: A function that recognizes named entities such as PLO, which are people, locations and organizations, from the sentences. The PLO can be decomposed into more specific named entities depending on the applications.

3.1.3 natural language processing [ITU-T F.746.3]: A method that analyses text in natural languages through several processes such as part-of-speech recognition, syntactic analysis and semantic analysis.

3.1.4 part-of-speech recognition [ITU-T F.746.3]: A function that recognizes parts of speech (POS) in the sentences and assigns relevant POS tags considering contextual meaning of the target sentences.

3.1.5 semantic analysis [ITU-T F.746.3]: A function that recognizes the semantic relations among the words around predicates that exist in the same sentence. The semantic analysis function then generates a semantic predicate-argument structure (PAS).
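
As an informal illustration of the predicate-argument structure (PAS) mentioned in 3.1.5, the following Python sketch builds a PAS from tokens that already carry part-of-speech tags. It is not part of this Recommendation or of [ITU-T F.746.3]: the class names, the tag names, the role labels ("agent", "patient") and the positional rule are all hypothetical, chosen only to make the idea concrete for a simple subject-verb-object sentence.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    pos: str  # part-of-speech tag; the tag names used here are illustrative


@dataclass
class PredicateArgumentStructure:
    predicate: str
    # Maps a role label to the argument text (role labels are hypothetical).
    arguments: dict = field(default_factory=dict)


NOMINAL_TAGS = ("NOUN", "PROPN", "PRON")


def toy_semantic_analysis(tokens):
    """Build a PAS from POS-tagged tokens using a naive positional rule.

    Assumption: the input is a simple subject-verb-object clause containing
    a verb. A real semantic analysis function would rely on syntactic
    analysis and trained models rather than on word order alone.
    """
    verb_index = next(i for i, t in enumerate(tokens) if t.pos == "VERB")
    before = [t.text for t in tokens[:verb_index] if t.pos in NOMINAL_TAGS]
    after = [t.text for t in tokens[verb_index + 1:] if t.pos in NOMINAL_TAGS]

    pas = PredicateArgumentStructure(predicate=tokens[verb_index].text)
    if before:
        pas.arguments["agent"] = before[-1]    # nominal closest before the verb
    if after:
        pas.arguments["patient"] = after[-1]   # last nominal after the verb
    return pas


sentence = [
    Token("The", "DET"),
    Token("student", "NOUN"),
    Token("reads", "VERB"),
    Token("a", "DET"),
    Token("book", "NOUN"),
]
pas = toy_semantic_analysis(sentence)
print(pas.predicate, pas.arguments)
# prints: reads {'agent': 'student', 'patient': 'book'}
```

The sketch only shows the shape of the output: one predicate with its labelled arguments. How the semantic relations are actually recognized is outside what this example covers.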

3.1.6 speech [ITU-T H.703]: Speech is the vocalized form of human communication.

3.1.7 speech recognition [ITU-T H.703]: A kind of user interface for translation of spoken words into text.

3.1.8 syntactic analysis [ITU-T F.746.3]: A function that analyses sentence structures and generates dependency relations among words based on dependency grammars.

3.2 Terms defined in this Recommendation

This Recommendation defines the following terms:

3.2.1 dialogue-based speech interface: An interface based on speech, especially dialogues between the user and the device or system.

3.2.2 dialogue act: The user's intention or purpose of the utterances in a dialogue. Example: request for information, command for action, agreement.

4 Abbreviations and acronyms

This Recommendation uses the following abbreviations and acronyms:

DA    Dialogue Act
DNN   Deep Neural Network
HCI   Human Computer Interaction
ICT   Information and Communication Technology
IT    Information Technology
LLS   Language Learning System Based on Speech/NLP Technology
NE    Named Entity
NLP   Natural Language Processing
PC    Personal Computer
POS   Part of Speech
SMS   Short Message Service
SVM   Support Vector Machines
TTS   Text to Speech

5 Conventions

The following conventions are used in this Recommendation:

The keywords “is required to” indicate a requirement which must be strictly followed and from which no deviation is permitted, if conformance to this Recommendation is to be claimed.

The keywords “is prohibited from” indicate a requirement which must be strictly prohibited, if conformance to this Recommendation is to be claimed.

The keywords “is recommended” indicate a requirement which is recommended but which is not absolutely required. Thus, this requirement need not be present to claim conformance.

The keywords “is not recommended” indicate a requirement which is not recommended but which is not specifically prohibited. Thus, conformance with this Recommendation can still be claimed even if this requirement is present.

The keywords “can optionally” indicate an optional requirement which is permissible, without implying any sense of being recommended. This term is not intended to imply that the vendor's implementation must provide the option and the feature can be optionally enabled by the network operator/service provider. Rather, it means the vendor may optionally provide the feature and still claim conformance with this Recommendation.

6 Introduction

Language learning requires a lot of time and cost for individual learners who wish to learn foreign languages, especially when personalized education is needed for each learner with a different level of learning capabilities. The ratio of students to teachers should be minimized to have an effective learning situation for individualized learning. Advances in information and communication technology (ICT), especially those in speech/language areas, have made the language learning experience less expensive and
more effective for individualized learning.

Speech interface and natural language processing (NLP) technology is an advanced technology that allows a computer to understand what a person says and facilitates the exchange of information in a smooth conversation with the person. Speech interface/NLP technology will likely be combined with other technologies in many application areas such as national defence and medical services. Moreover, speech interface/NLP technology can also be applied to language learning systems for conversation training.

NLP is a core technology for human computer interaction (HCI) and a basis for knowledge and information services. It combines syntactic analysis and semantic analysis to understand human languages. NLP technology is used for understanding a user's speech in dialogue practices in various language learning scenarios.

More advanced dialogue-based speech interface technology is also being developed. The basic dialogue processing flow is shown in Figure 3. The technology recognizes and understands the user's speech and generates appropriate responses in limited dialogue situations. The core dialogue processing technology is applied to foreign language learning systems that simulate one-on-one conversation training.

This Recommendation provides requirements, architecture and functions for a language learning system based on speech/NLP technology.

6.1 Basic concept of speech interface and NLP technology

Speech interface and NLP technology form a next-generation interface that allows a computer to understand what a person says and also facilitates the exchange of information in a smooth conversation with the person. Figure 1 shows a basic speech interface and Figure 2 shows some basic NLP modules.

Recently, speech interface has become one of the essential elements in information technology (IT) industry areas such as intelligent robots, telematics, the next-generation personal computer (PC) and the digital home. Speech interface/NLP technology will likely be combined with other technologies in many application areas such as national defence and medical services. Going forward, speech interface/NLP technology will grow into a multimodal interface that unifies input modalities such as voice, pen, mouse and gestures. A high-level interactive speech interface will soon be attainable by incorporating circumstantial information and the speaker's intention.

Moreover, speech interface/NLP technology plays the role of a core technology for mobile web information services, and can also be applied to language learning systems for conversation training.

NLP is a core technology for HCI and the basis for knowledge and information services. It combines syntactic analysis and semantic analysis to understand human languages. NLP technology is used for understanding a user's speech in dialogue practices in various language learning scenarios.

Figure 1 – Basic speech interface

Figure 2 – Basic NLP flows with processing modules

6.2 Advanced technology for dialogue-based speech interface

More advanced technology for dialogue-based speech interface has also been developed. Figure 3 shows the basic dialogue processing flow. The technology re