BS ISO 24616-2012 Language resources management Multilingual information framework.pdf

Uploaded by: rimleave225 | Document ID: 586683 | Uploaded: 2018-12-15 | Format: PDF | Pages: 52 | Size: 1.39 MB
Resource description

BSI Standards Publication
BS ISO 24616:2012
Language resources management - Multilingual information framework

raising standards worldwide
NO COPYING WITHOUT BSI PERMISSION EXCEPT AS PERMITTED BY COPYRIGHT LAW

BRITISH STANDARD BS ISO 24616:2012

National foreword

This British Standard is the UK implementation of ISO 24616:2012. The UK participation in its preparation was entrusted to Technical Committee TS/1, Terminology.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© The British Standards Institution 2012. Published by BSI Standards Limited 2012.

ISBN 978 0 580 66449 6
ICS 01.020

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 30 September 2012.

Amendments issued since publication
Date    Text affected

INTERNATIONAL STANDARD ISO 24616
First edition 2012-09-01
Reference number ISO 24616:2012(E)

Language resources management - Multilingual information framework
Gestion des ressources langagières - Plateforme d'informations multilingues

COPYRIGHT PROTECTED DOCUMENT

© ISO 2012
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56
CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Specification principles
4.1 Key standard used in the specification: Unified Modeling Language (UML)
4.2 Metamodel and adornment
4.3 XML serialization
5 Metamodel specification
6 MLIF compliance
7 Metamodel adornment
7.1 Introduction
7.2 General principles concerning the use of W3C generic attributes
7.3 Recommended adornment for GI
7.4 Recommended adornment for GroupC
7.5 Recommended adornment for MultiC
7.6 Recommended and mandatory adornment for MonoC
7.7 Recommended adornment for SegC
7.8 Recommended adornment for HistoC
7.9 Recommended online annotation adornment
7.10 Recommended adornment for localization
7.11 Recommended adornment for internationalization
7.12 Recommended adornment for temporal synchronization
8 Relation with other standards
Annex A (informative) Example using MLIF for Computer-Assisted Translation (CAT)
Annex B (informative) Example: representing TMX data
Annex C (informative) Example of XLIFF data representation
Annex D (informative) Example: representing smilText data
Annex E (informative) Example of MLIF usage for subtitles (captioning)
Annex F (informative) Using MLIF for MAF data
Annex G (normative) Detailed specification
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 24616 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 4, Language resource management.

1 Scope

This International Standard provides a generic platform for modelling and managing multilingual information in various domains: localization, translation, multimedia annotation, document management, digital library support, and information or business modelling applications.

MLIF (multilingual information framework) provides a metamodel and a set of generic data categories (ISO 12620:2009) for various application domains. MLIF also provides strategies for the interoperability and/or linking of models including, but not limited to, XLIFF, TMX, smilText and ITS.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 12620:2009, Terminology and other language and content resources - Specification of data categories and management of a Data Category Registry for language resources

ISO 8879, Information processing - Text and office systems - Standard Generalized Markup Language (SGML)

Extensible Markup Language. Fifth Edition, T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, F. Yergeau Editors, W3C Recommendation, 26 November 2008, http://www.w3.org/TR/xml

3 Terms and definitions

For the purposes of this document, the following terms and definitions apply:

3.1
adornment
data category attached to a component of a metamodel

3.2
inline code
inline instructions inserted in a source document
Note to entry: Native code can, for instance, provide presentational instructions (e.g. HTML codes).

3.3
subtitle
textual versions of the dialog in films, television programs, video games, etc., usually displayed at the bottom of the screen

3.4
working language
language in which linguistic sequences are expressed

4 Specification principles

4.1 Key standard used in the specification: Unified Modeling Language (UML)

The MLIF specification complies with the modelling principles of UML as defined by the Object Management Group (OMG) [UML]. The specification uses the UML subset that is relevant for the purposes of MLIF.

4.2 Metamodel and adornment

In line with Terminological Markup Framework (TMF) as defined in ISO 16642, MLIF defines a metamodel that is adorned by data categories, as defined in ISO 12620.

4.3 XML serialization

Associated with the metamodel and its adornment, MLIF proposes a representation in XML called "XML serialization", in line with Extensible Markup Language (XML) as defined in ISO 8879.

5 Metamodel specification

The MLIF metamodel is specified in the UML object diagram in Figure 1.

Figure 1 MLIF metamodel

The MLIF metamodel is defined by the following seven "core components". These components are listed as follows, according to their XML serialization:

- <MLDC> (Multilingual Data Collection), which represents a collection of data containing global information and several multilingual units;
- <GI> (Global Information), which represents technical and administrative information applying to the entire multilingual data collection;
- <GroupC> (Grouping components), which represents a sub-collection of multilingual data that have a common origin or purpose within a given project;
- <MultiC> (Multilingual Component), which groups together all variants of a given textual content;
- <MonoC> (Monolingual Component), which groups together information related to one language and is part of a multilingual component (MultiC);
- <HistoC> (History Component), which traces modifications to the component to which it is anchored (i.e. versioning);
- <SegC> (Segmentation Component), which allows any level of segmentation for textual information, possibly in a recursive manner.

6 MLIF compliance

Any format compliant with this International Standard may use the MLIF metamodel in two possible ways:

- by fully implementing the MLIF metamodel starting at the level of <MLDC>;
- by specifically embedding MLIF-compliant information within another model, by implementing one of the lower level MLIF elements [the three element names are missing from this copy].

7 Metamodel adornment

7.1 Introduction

The MLIF XML serialization proposes a set of XML elements and XML attributes, which are described in the following sections, where the characters "<>" delimit the name of the element. Following the TEI guidelines (http://www.tei-c.org), some attributes are specified by means of a class attribute, with the convention that the name of the class attribute is prefixed by "att." (e.g. "att.xlink"). The other XML attributes are listed with the convention that two quotes delimit the name of the attribute (e.g. "xml:lang"). The specifications in Annex G shall be applied.

7.2 General principles concerning the use of W3C generic attributes

The following W3C attributes are to be used by all MLIF-compliant applications:

- the attribute xml:lang shall be used in accordance with W3C recommendations to represent the working language of any relevant element and, in particular, shall be used systematically for any implementation of MonoC;
- the attribute xml:id shall be used in accordance with W3C recommendations to provide a unique identifier to an element of the MLIF metamodel.
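
The two attribute rules in 7.2 can be illustrated with a small check. The following is only a sketch, not part of the standard: the element nesting, the sample content and the function name check_generic_attributes are assumptions made for illustration, and the element names follow the component abbreviations used in this document (the root name MLDC is assumed). It reports any MonoC element that lacks xml:lang and any xml:id value that is used more than once.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is bound to this namespace by the XML specification itself.
XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG = f"{{{XML_NS}}}lang"
ID = f"{{{XML_NS}}}id"

# Hypothetical MLIF-like fragment; the third MonoC is deliberately faulty.
SAMPLE = """\
<MLDC>
  <GI/>
  <GroupC>
    <MultiC xml:id="m1">
      <MonoC xml:id="m1-en" xml:lang="en"><SegC>The meal is nice.</SegC></MonoC>
      <MonoC xml:id="m1-fr" xml:lang="fr"><SegC>Le repas est bon.</SegC></MonoC>
      <MonoC xml:id="m1-en"><SegC>No language, duplicate id.</SegC></MonoC>
    </MultiC>
  </GroupC>
</MLDC>
"""


def check_generic_attributes(root):
    """Return a list of human-readable problems found under root."""
    problems = []
    seen_ids = set()
    for elem in root.iter():  # document order, root included
        ident = elem.get(ID)
        if ident is not None:
            if ident in seen_ids:
                problems.append(f"duplicate xml:id {ident!r} on <{elem.tag}>")
            seen_ids.add(ident)
        # xml:lang is to be used systematically on MonoC.
        if elem.tag == "MonoC" and not elem.get(LANG):
            problems.append(f"<MonoC> without xml:lang (xml:id={ident!r})")
    return problems


problems = check_generic_attributes(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```

A real validator would more likely rely on the schema given in Annex G; the loop above only shows where the two generic attributes are expected.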

7.3 Recommended adornment for GI

7.4 Recommended adornment for GroupC

7.5 Recommended adornment for MultiC

7.6 Recommended and mandatory adornment for MonoC

att.lang
att.xlink

The language attribute is mandatory on MonoC. All other adornments are optional.

7.7 Recommended adornment for SegC

att.linguistic
att.xlink

7.8 Recommended adornment for HistoC

The HistoC component is a generic component that traces modifications made on the component to which it is anchored (e.g. creation, modification and validation). In the MLIF metamodel, the HistoC component may be anchored to the GI, MultiC or MonoC component. This makes it possible for all evolutions of, or enhancements to, the component to be recorded. HistoC may be adorned by four elements:

7.9 Recommended online annotation adornment

Multilingual text documents are often only one stage in a complex workflow that involves external document sources in a wide variety of formats. From these, it is often necessary to keep inline markup indicating the presentational features that have to be retained in a translated target document. To this end, MLIF-compliant applications should use the following elements, in relation to the [element name missing from this copy] element, that map onto similar subsets in TMX and XLIFF:

7.10 Recommended adornment for localization

All the following elements should be used to provide localization-related information:

7.11 Recommended adornment for internationalization

7.12 Recommended adornment for temporal synchronization

The following elements should be used when textual content has to be conveyed (in written or spoken form) together with some constraints:

8 Relation with other standards

As with the "Terminological Markup Framework" TMF (ISO 16642) in terminology, MLIF introduces a metamodel that combines with selected data categories as a way of ensuring interoperability between several multilingual applications and corpora. MLIF deals with multilingual corpora, multilingual fragments, and the translation relations between them. In each domain where MLIF is applicable, a specific granularity may be considered for segmentation and description. These last two processes may rely on MAF (ISO 24611), SynAF (ISO 24615) and TMF for morphological description, syntactical annotation and terminological description respectively.

MLIF supports the construction and the interoperability of localization and translation memory resources, and also deals with the description of a metamodel for multilingual content. MLIF does not propose a closed list of description features. Rather, it provides a list of data categories that is much easier to update and extend. This list represents a point of reference for multilingual information in the context of various application scenarios. However, MLIF not only describes elementary linguistic segments (e.g. sentence, syntactical fragment, word and part of speech), but may also be used to represent document structure (e.g. title, abstract, paragraph and section). In addition, MLIF allows for external and internal links (annotations and references).

MLIF is designed to provide a common framework that facilitates the interoperability with formats such as TMX (LISA OSCAR) and XLIFF (OASIS). MLIF can be seen as a parent of these formats, since both of them deal with multilingual data expressed in the form of segments or text units. Both can be stored, manipulated and translated in a similar manner.

Examples of using MLIF are given in Annexes A to F.

Annex A (informative)
Example using MLIF for Computer-Assisted Translation (CAT)

The main reason for lemma, part-of-speech and morphological features is to allow CAT tools based on translation memory to produce translations of new words and sentences that are not in the translation database. For example, using a translation memory that contains the English sentence "The meal is nice." and its translation in French "Le repas est bon.", current CAT tools such as SDL TRADOS Translator's Workbench are not able to provide the predicted translation for the sentence "The meals are nice." even though the word lemmas of "The meal is nice." and "The meals are nice." are matching. This weakness is due to the fact that these tools use limited linguistic criteria during
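
The difference between surface matching and lemma matching described in Annex A can be sketched in a few lines. This is an illustration under stated assumptions, not the behaviour of any particular CAT tool: the lemma table LEMMAS is hand-written for the two example sentences, and the function names exact_lookup and lemma_lookup are hypothetical.

```python
import re

# Toy lemma table, hand-written for this example only (an assumption).
LEMMAS = {
    "the": "the",
    "meal": "meal",
    "meals": "meal",
    "is": "be",
    "are": "be",
    "nice": "nice",
}

# One-entry translation memory taken from the example sentences in the text.
TM = {"The meal is nice.": "Le repas est bon."}


def lemma_key(sentence):
    """Lower-case the sentence, split it into words and map each word to its lemma."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return tuple(LEMMAS.get(word, word) for word in words)


def exact_lookup(sentence):
    """Surface-form lookup: succeeds only if the sentence is stored verbatim."""
    return TM.get(sentence)


def lemma_lookup(sentence):
    """Return every stored (source, target) pair whose lemma sequence matches."""
    key = lemma_key(sentence)
    return [(src, tgt) for src, tgt in TM.items() if lemma_key(src) == key]


print(exact_lookup("The meals are nice."))
print(lemma_lookup("The meals are nice."))
```

The lemma lookup only retrieves the related stored pair; the French target it returns is still the singular sentence, so the morphological features mentioned in the text would be needed to adapt it to the plural.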
