ISO/IEC TR 29127:2011 Information technology - System Process and Architecture for Multilingual Semantic Reverse Query Expansion

Uploaded by: diecharacter305 | Document ID: 1257433 | Uploaded: 2019-09-02 | Format: PDF | Pages: 38 | Size: 7.98 MB

Reference number ISO/IEC TR 29127:2011(E)

TECHNICAL REPORT ISO/IEC TR 29127
First edition 2011-07-01

Information technology - System Process and Architecture for Multilingual Semantic Reverse Query Expansion

Technologies de l'information - Processus système et architecture pour l'extension multilinguale des requêtes sémantiques inverses

COPYRIGHT PROTECTED DOCUMENT

© ISO/IEC 2011

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56
CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org

Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Terms and definitions
3 Example SRQE Implementation
  3.1 Initialization of the User Interface
  3.2 Select Query Parameters
  3.3 Select Word Senses
  3.4 Selecting and Translating Appropriate Terms
  3.5 Selecting Appropriate Translations and Executing the Query
  3.6 Query Returns
4 Components and Architecture of the SRQE Process
  4.1 SRQE Process Flow
  4.2 Repositories
  4.3 Terms and Results
  4.4 Translators
  4.5 Entity Extraction
  4.6 Terminology for Query Searches
Annex A (informative) Potential Linkage to Current and Future ISO/IEC JTC1 SC 36 Technology Areas
Annex B (informative) Patent Declaration Form for SRQE Process
Annex C (informative) Summary on the Issue of Language Equivalencies
Bibliography

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC
participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national
bodies casting a vote.

In exceptional circumstances, when the joint technical committee has collected data of a different kind from that which is normally published as an International Standard (“state of the art”, for example), it may decide to publish a Technical Report. A Technical Report is entirely informative in nature and shall be subject to review every five years in the same manner as an International Standard.

ISO/IEC TR 29127 was prepared by Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 36, Information technology for learning, education and training.

Introduction

Learning, Education and Training (LET) in the context of multilingual cultures on a local and global scale can be problematic, especially when learners are proficient in only one language. One of the multilingual problems in a LET
environment is how to query LET materials when the requestor cannot understand or is not proficient in the language of the material available. For example, how does a person who is proficient in French search for, find, and readily understand digital LET materials in Arabic, if the person is not proficient in Arabic?

One solution can be found in a process called the Semantic Reverse Query Expander (SRQE). Based on components such as language ontologies, the SRQE process utilizes Java 2 Platform, Enterprise Edition (J2EE) 1) services that can take a term in one language (source language), expand the term conceptually, translate the expanded terms (into a target language), and perform a query on a targeted foreign language document set. Returns are translated into the language of the requestor.

This Technical Report identifies an existing process and architecture used to query foreign language text files. Technologies and ontologies (i.e. thesauri) for undertaking this kind of matching and expansion operation have been available for some time (e.g. the work of CYC Corp, Global WordNet, Global WordGrid). Valuable lessons have been learned about what such technologies can and cannot accomplish. This Technical Report does not discuss these pre-existing technologies, or describe the improvement or change that the proposed process presented might represent.

A particular approach (theory and practice) with respect to the context of difficulties experienced in regard to multilingual equivalencies and translation is presented in Annex C of this Technical Report. In Clause 3 of this Technical Report, an implementation of the SRQE process is described in a web environment to help clarify the architecture described in Clause 4. Annex A contains possible linkages to ISO/IEC JTC 1, SC 36 projects and future areas of study.

The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) draw attention to the fact that it is claimed that the process described in this Technical Report may involve the use of patents.

ISO and IEC take no position concerning the evidence, validity and scope of these patent rights.

The holders of these patent rights have assured ISO and the IEC that they are willing to grant a free of charge license to an unrestricted number of applicants on a worldwide, non-discriminatory basis and under other reasonable terms and conditions to make, use, and sell implementations of the process contained in this Technical Report. In this respect, the statements of the holders of these patent rights are registered with ISO and IEC. Information may be obtained from the companies listed below.

Raytheon Company
Phillip Berestecki
Intellectual Property and Licensing
870 Winter Street
Waltham, Massachusetts 02451-1449
USA

NOTE 1 This Technical Report refers to one particular process or approach for performing reverse semantic queries; there are other approaches and processes that could be developed for these same purposes.

NOTE 2 The process is not dependent on particular database software, protocols, or data sets. Specific components used in the process are an implementation decision.

1) A widely used platform for server programming in the Java programming language.

1 Scope

This Technical Report identifies an example of a system-based process to index, query, translate, and manage components used in querying and translating documents in multiple foreign languages, enabling learners in learning, education, and training areas to effectively find and share documents on a global scale.

2 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

NOTE For this Technical Report, the following terms and definitions are not considered to be normative. They are informative, and apply only within the context of this Technical Report.

2.1
coordinate term
words that have the same hypernym

EXAMPLE Boat, yacht, and shrimper all have the same hypernym, ship.

NOTE Adapted
from ISO 1087-1:2000, definition 3.2.19.

2.2
entity extraction
process that seeks to locate, classify, and tag atomic elements in text into predefined categories

EXAMPLE Names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.

2.3
hypernym
superordinate concept
word that is more generic or broad than another given word

NOTE 1 Another term for a hypernym is a superordinate concept.

NOTE 2 Adapted from ISO 1087-1:2000, definition 3.2.13.

2.4
hyponym
subordinate concept
word that is more specific than another given term

NOTE 1 Another term for hyponym is a subordinate concept.

NOTE 2 Adapted from ISO 1087-1:2000, definition 3.2.14.

2.5
Java servlet
Java programming language objects that dynamically process requests and construct responses

2.6
meronym
constituent part of, or a member of, something

EXAMPLE “Winchester Cathedral” is a meronym of “Church of England”.

2.7
nominalization
use of a verb or an adjective as a noun with or without morphological transformation, so that the word can now act as the head of a noun phrase

2.8
word sense
<linguistics> one of the meanings of
a word

NOTE A dictionary may have over 50 different meanings of the word “play”, with each of these having a different meaning based on the context of the word usage in a sentence.

EXAMPLE We went to see the play Romeo and Juliet at the theater. The children went out to play in the park.

3 Example SRQE Implementation

This clause provides an implemented example of the SRQE process. The simplistic example of the SRQE implementation illustrates a sequence of actions between a learner and the system. The example provided is based on a possible learning assignment made to a learner in producing a report on trucks using foreign language resources. A learner wanting international information in producing a report on trucks could use the system to gather international information related to trucks for possible inclusion in the report. The system can perform a cross-lingual query on documents in a number of languages, and translate the documents into the learner's native language. The learner can optionally access a map of the locations listed in the text files returned by the query for improved comprehension of where the article originates from, or the location of where the subject of the
article can be found.

In this example, the SRQE process utilizes a user interface, interacting with Java servlets to provide a web-based User Interface (UI). The SRQE process flow is illustrated below in Figure 1. The SRQE process is a human-machine interactive process to perform cross-lingual queries. Human inputs are shown in the white box above the green arrow. Java servlet functions are shown below the green arrow. A UML diagram showing the Java servlets in a green box is linked to functions in the white box.

Figure 1 SRQE Process Flow

3.1 Initialization of the User Interface

The user interface application is accessed by URL. The GetMenuBarDataService initializes the user interface application, providing the available repositories and the languages those repositories are in. The user interface application is shown in Figure 2. The GetMenuBarDataService builds the menus and menu choices in the user interface. This includes 1. Enter word to be translated, 2. Select Class, 3. Select Language(s), 4. Select Sources, 5. Execute Query, the results section (blank), and the Map.

Figure 2 Initialized User Interface

NOTE 1 In this example, an Adobe Flash plug-in is required for the use of the SRQE in a browser. The user interface described in the example is based on Adobe Flex.

NOTE 2 The original example uses CaMel CaSe format in this instance and the instances that follow. This formatting is retained for this reason.
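
The menu data that the GetMenuBarDataService supplies can be pictured with a small sketch. The Python below is illustrative only and is not the servlet code of the example implementation: the function name get_menu_bar_data, the Repository record, and the second repository entry are assumptions made for this sketch. Only the five menu labels and the Linguistic Data Consortium source with Arabic as its language are taken from the example.

```python
from dataclasses import dataclass

# Menu labels as listed in 3.1, in the order shown to the learner.
MENU_STEPS = [
    "1. Enter word to be translated",
    "2. Select Class",
    "3. Select Language(s)",
    "4. Select Sources",
    "5. Execute Query",
]


@dataclass(frozen=True)
class Repository:
    """Hypothetical record for a searchable document repository."""
    name: str
    language: str


def get_menu_bar_data(repositories):
    """Build the menu choices from the available repositories.

    Returns a dict holding the fixed menu steps plus the language and
    source choices derived from the repositories.
    """
    languages = sorted({repo.language for repo in repositories})
    sources = sorted({repo.name for repo in repositories})
    return {
        "steps": list(MENU_STEPS),
        "languages": languages,
        "sources": sources,
        "results": [],  # the results section starts blank
    }


# Illustrative data: the first entry is named in the example,
# the second is made up for this sketch.
REPOSITORIES = [
    Repository("Linguistic Data Consortium", "Arabic"),
    Repository("Example News Archive", "French"),
]

menu = get_menu_bar_data(REPOSITORIES)
print(menu["languages"])
print(menu["sources"])
```

Deriving the language and source menus from the repository list, rather than hard-coding them, is one way to reflect the statement that the service reports which repositories are available and in which languages.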

3.2 Select Query Parameters

The learner enters a word in “1. Enter word to be translated”. The learner selects a class of the word in “2. Select Class”. The learner selects a language in “3. Select Language(s)”. The learner selects a source in “4. Select Source(s)”.

In this example, the learner enters Truck in “1. Enter word to be translated”. The learner clicks on Noun in “2. Select Class”. The learner selects Arabic in “3. Select Language(s)”. The learner selects Linguistic Data Consortium as the source in “4. Select Sources”. Figure 3 shows menu items 1 through 4 filled in by the learner.

Figure 3 Learner Inputs Menu Items 1-4

The learner selects Define Term in the Item 5 “Execute Query” menu. The GetSensesService retrieves the noun word senses for Truck. In this example, word senses are retrieved by GetSensesService from WordNet utilizing the Java WordNet Library (JWNL). The GetSensesService returns noun word senses for Truck. The word senses for Truck are displayed in the return section of the interface. Figure 4 shows the noun word senses for Truck returned by the
GetSensesService.

Figure 4 Word Senses for Truck

3.3 Select Word Senses

The learner selects the relevant word sense for Truck. Figure 5 shows the word sense selected by the learner, “an automotive vehicle suitable for hauling”.

Figure 5 Word Sense Selected

The learner selects “Expand Term” at the far right of the word sense display. The GetNymService retrieves coordinate terms, hyponyms, nominalizations, hypernyms, and meronyms for the word sense selected. In this example, the
GetNymService retrieves expanded terms from WordNet utilizing the JWNL. Figure 6 shows the expanded terms returned from the GetNymService.

Figure 6 Expanded Terms List

3.4 Selecting and Translating Appropriate Terms

The learner selects the terms of interest for translation and reverse translation. Terms selected are shown in Figure 7.

Figure 7 Terms Selected

The learner selects “Translate Terms” above and to the right of the part meronyms list. The GetTranslatedWordService sends the terms to the appropriate translator. In this example, the terms are sent to an Arabic word translator (LanguageWeaver). The terms translated into Arabic are reverse translated into the original language, in this example English. The translated and reverse translated terms are returned to the GetTranslatedWordService for display to the learner. Figure 8 shows the translated and reverse translated terms. The terms on the far left are the terms selected from the “Expanded Term” page. The Arabic terms in t
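
The sequence described in 3.2 to 3.4 (retrieve word senses, expand the selected sense, translate the selected terms, then reverse translate them) can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the Java servlets of the example: the toy lexicon, the toy English-Arabic dictionaries, and the function names get_senses, get_nyms and translate_terms are hypothetical stand-ins for WordNet/JWNL and the LanguageWeaver translator. Only the first gloss for Truck is taken from the example; all other entries are made up for illustration.

```python
# Toy stand-in for WordNet: noun senses keyed by word, each with a gloss
# and a few related terms. Only the first gloss comes from the example.
LEXICON = {
    "truck": [
        {
            "gloss": "an automotive vehicle suitable for hauling",
            "hypernyms": ["motor vehicle"],
            "hyponyms": ["lorry", "pickup"],
            "coordinate_terms": ["car"],
        },
        {
            "gloss": "a handcart for moving heavy objects",
            "hypernyms": ["handcart"],
            "hyponyms": [],
            "coordinate_terms": [],
        },
    ]
}

# Toy bilingual dictionaries standing in for a translation service.
EN_TO_AR = {
    "truck": "شاحنة",
    "lorry": "شاحنة",
    "car": "سيارة",
    "motor vehicle": "مركبة",
}
AR_TO_EN = {"شاحنة": "truck", "سيارة": "car", "مركبة": "vehicle"}


def get_senses(word):
    """Return the glosses of all noun senses known for the word."""
    return [sense["gloss"] for sense in LEXICON.get(word.lower(), [])]


def get_nyms(word, sense_index):
    """Return the expanded terms for one selected sense, grouped by relation."""
    sense = LEXICON[word.lower()][sense_index]
    return {rel: terms for rel, terms in sense.items() if rel != "gloss"}


def translate_terms(terms):
    """Translate each term and reverse translate the result.

    Returns (source term, translation, reverse translation) triples;
    terms with no dictionary entry are skipped.
    """
    rows = []
    for term in terms:
        translated = EN_TO_AR.get(term)
        if translated is None:
            continue
        rows.append((term, translated, AR_TO_EN.get(translated)))
    return rows


senses = get_senses("Truck")      # learner enters Truck with class Noun
expanded = get_nyms("Truck", 0)   # learner picks the first sense
selected = ["truck"] + expanded["hyponyms"] + expanded["hypernyms"]
for source, arabic, back in translate_terms(selected):
    print(f"{source} -> {arabic} -> {back}")
```

In this toy data, lorry reverse translates to truck and motor vehicle reverse translates to vehicle. That kind of drift is what a learner who cannot read the target language might look for when deciding which translated terms to keep for the query.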
