BS ISO 24610-1-2010 Language resource management - Feature structures - Feature structure representation.pdf

Uploaded by: orderah291 | Document ID: 586676 | Upload date: 2018-12-15 | Format: PDF | Pages: 90 | Size: 1.22 MB

raising standards worldwide
NO COPYING WITHOUT BSI PERMISSION EXCEPT AS PERMITTED BY COPYRIGHT LAW

BSI Standards Publication
BS ISO 24610-1:2006
Language resource management - Feature structures
Part 1: Feature structure representation

BRITISH STANDARD

National foreword

This British Standard is the UK implementation of ISO 24610-1:2006.

The UK participation in its preparation was entrusted to Technical Committee TS/1, Terminology.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© BSI 2010
ISBN 978 0 580 54233 6
ICS 01.140.20

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee
on 31 July 2010.

Amendments issued since publication
Date    Text affected

Reference number ISO 24610-1:2006(E)

INTERNATIONAL STANDARD ISO 24610-1
First edition 2006-04-15

Language resource management - Feature structures - Part 1: Feature structure representation
Gestion des ressources linguistiques - Structures de traits - Partie 1: Représentation de structures de traits

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO 2006
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 General characteristics of feature structure
  4.1 Overview
  4.2 Use of feature structures
  4.3 Basic concepts
  4.4 Notations
    4.4.1 Overview
    4.4.2 Graph notation
    4.4.3 Matrix notation
    4.4.4 XML-based notation
  4.5 Structure sharing
  4.6 Collections as complex feature values
    4.6.1 Overview
    4.6.2 Lists as feature values
    4.6.3 Sets as feature values
    4.6.4 Multisets as feature values
  4.7 Typed feature structure
    4.7.1 Overview
    4.7.2 Types
    4.7.3 Notations
  4.8 Subsumption: relation on feature structures
    4.8.1 Overview
    4.8.2 Definition
    4.8.3 Condition A on path values
    4.8.4 Condition B on structure sharing
    4.8.5 Condition C on type ordering
  4.9 Operations on feature structures and feature values
    4.9.1 Overview
    4.9.2 Compatibility
    4.9.3 Unification
    4.9.4 Unification of shared structures
  4.10 Operations on feature values and types
    4.10.1 Concatenation and union operations
    4.10.2 Alternation
    4.10.3 Negation
  4.11 Informal semantics of feature structures
5 XML Representation of feature structures
  5.1 Overview
  5.2 Organization
  5.3 Elementary feature structures and the binary feature value
  5.4 Other atomic feature values
  5.5 Feature and feature-value libraries
  5.6 Feature structures as complex feature values
  5.7 Re-entrant feature structures
  5.8 Collections as complex feature values
  5.9 Feature value expressions
    5.9.1 Overview
    5.9.2 Alternation
    5.9.3 Negation
    5.9.4 Collection of values
  5.10 Default values
  5.11 Linking text and analysis
Annex A (informative) Formal definitions and implementation of the XML representation of feature structures
  A.1 Overview
  A.2 RELAX NG specification for the module
Annex B (informative) Examples for illustration
Annex C (informative) Type inheritance hierarchies
  C.1 Overview
  C.2 Definition
  C.3 Multiple inheritance
  C.4 Type constraints
Annex D (informative) Denotational semantics of feature structure
  D.1 Feature structure signatures
  D.2 Feature structure algebra
  D.3 FS domains
  D.4 Feature structure interpretations
  D.5 Satisfiability
  D.6 Subsumption
  D.7 Unification
Annex E (informative) Use of feature structures in applications
  E.1 Overview
  E.2 Phonological representation
  E.3 Grammar formalisms or theories
  E.4 Computational implementations
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 24610-1 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 4, Language resource management.

ISO 24610 consists of the following parts, under the general title Language resource management - Feature structures:
- Part 1: Feature structure representation
The following part is under preparation:
- Part 2: Feature system declaration

Introduction

This part of ISO 24610 results from the agreement between the Text Encoding Initiative Consortium (TEI) and ISO TC 37/SC 4 that a joint activity should take place to revise the two existing chapters on feature structures and feature system declaration in the TEI Guidelines called P4.

It is foreseen that ISO 24610 will have the following two parts.

Part 1, Feature structure representation, describes feature structures and their representation. It provides an informal but explicit overview of their basic characteristics and formal semantics. In addition, part 1 defines a standard XML (eXtensible Markup Language) vocabulary for the representation of untyped feature structures, feature values, and feature libraries. It thus provides a reference format for the exchange of feature structure representations between different application systems.

Part 2, Feature system declaration, discusses ways of validating typed feature structures which are conformant to part 1, and of enforcing application-specific constraints. It proposes an XML vocabulary for the representation of such constraints with reference to a set of features and the range of values appropriate for them, and thus facilitates representation and validation of a type hierarchy as well as other well-formedness conditions for particular applications, in particular those related to the goal of language resource management.

Language resource management - Feature structures - Part 1: Feature structure representation

1 Scope

Feature structures are an essential part of many linguistic formalisms as well as an underlying mechanism for representing the information consumed or produced
by and for language engineering applications. This part of ISO 24610 provides a format for the representation, storage and exchange of feature structures in natural language applications concerned with the annotation, production or analysis of linguistic data. It also defines a computer format for the description of constraints that bear on a set of features, feature values, feature specifications and operations on feature structures, thus offering a means of checking the conformance of each feature structure with regard to a reference specification.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 8879, Information processing - Text and office systems - Standard Generalized Markup Language (SGML), as extended by TC 2 (ISO/IEC JTC 1/SC 34 N029: 1998-12-06).

ISO 19757-2, Information technology - Document Schema Definition Language (DSDL) - Part 2: Regular-grammar-based validation - RELAX NG

NOTE The first reference permits the use of XML and the second, RELAX NG, provides a specification for XML modules. RELAX NG is a schema language for XML; the name stands for REgular LAnguage for XML Next Generation. It simplifies and extends the features of DTDs (Document Type Definitions).

3 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 8879 and ISO 19757-2 and the following apply. This list is provided to clarify the terminology relating to feature structures used throughout this part of ISO 24610. Terminology derived from XML and other formal languages is not defined here.

3.1 alternation
operation on feature values (3.23) that returns one and only one of the values supplied as its argument
NOTE Given a feature specification F : a|b, where a|b denotes the alternation of a and b, F has either the value a or the value b, but not both.

3.2 atomic value
value (3.23) without internal structure, i.e. value other than feature structure (3.10) and collection (3.4)

3.3 boxed label
label in box used in a matrix notation to denote a value shared by several features (3.8)
NOTE The label may be any alphanumeric symbol.

3.4 collection
list, set, or multiset of values (3.23)
NOTE A list is an ordered collection of entities some of which may be identical. A set is an unordered collection of unique entities. A multiset is an unordered collection of entities that may or may not be unique; it is sometimes referred to as a bag.

3.5 complex value
value (3.23) represented either as a feature structure (3.10) or as a collection (3.4)

3.6 concatenation
operation of combining two lists of values (3.23) into a single list

3.7 empty feature structure
feature structure (3.10) containing no feature specifications (3.9)

3.8 feature
property of an entity
NOTE The combination of feature and feature-value constitutes a feature specification (3.9). For example, number is a feature, singular is a value, and the pair <number, singular> is a feature specification.

3.9 feature specification
assignment of a value (3.23) to a feature (3.8)
NOTE Formally, it is treated as a pair of a feature and its value.

3.10 feature structure
set of feature specifications (3.9)
NOTE The minimum feature structure is the empty feature structure (3.7).

3.11 graph notation
notation of feature structure (3.10) in a single rooted graph

3.12 incompatibility
relation between two feature structures (3.10) which have conflicting types (3.19) or at least one common feature (3.8) with incompatible values (3.23)
NOTE Two feature structures that are incompatible cannot be unified. The empty feature structure (3.7) is compatible with any other feature structure.
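
The definitions in 3.8 to 3.12 can be illustrated with a minimal Python sketch. It assumes a hypothetical encoding that the standard does not prescribe: an untyped feature structure is a `dict` mapping feature names to values, and an atomic value is a string. The helper name `compatible` is illustrative only, and the sketch ignores types (3.19) and structure sharing (3.17).

```python
from typing import Any, Dict

# Hypothetical encoding (an assumption, not the standard's XML format):
# a feature structure is a dict of feature specifications, i.e.
# feature name -> value; an atomic value is a plain string.
FS = Dict[str, Any]


def compatible(a: Any, b: Any) -> bool:
    """Return False if a and b share a feature with incompatible values."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Only features present in both structures can conflict.
        common = a.keys() & b.keys()
        return all(compatible(a[f], b[f]) for f in common)
    if isinstance(a, dict) or isinstance(b, dict):
        # A feature structure is never compatible with an atomic value.
        return False
    # Two atomic values are compatible only when they are identical.
    return a == b


empty: FS = {}
singular: FS = {"number": "singular"}
plural: FS = {"number": "plural"}
third: FS = {"person": "third"}

print(compatible(empty, singular))     # the empty structure fits anything
print(compatible(singular, third))     # no common feature, no conflict
print(compatible(singular, plural))    # common feature, conflicting values
```

In this encoding the empty feature structure is simply `{}`, which has no common feature with any other structure and is therefore compatible with all of them, matching the NOTE in 3.12.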

3.13 matrix notation (also: attribute-value matrix, AVM)
notation that uses square brackets to represent feature structures (3.10)
NOTE In a matrix notation, each row represents a feature specification (3.9), with the feature name and the feature value separated by a colon (:), space ( ) or the equals sign (=).

3.14 merge
generic operation that includes union (3.22) of sets or multisets and concatenation (3.6) of lists

3.15 negation
(unary) operation on a value (3.23) denoting any other value incompatible with it
NOTE In this part of ISO 24610, negation applies to values only and is not understood as a truth function as in ordinary bivalent logics.

3.16 path
sequence of labelled arcs connecting nodes in a graph

3.17 structure sharing (also: re-entrancy)
relation between two or more features (3.8) within a feature structure (3.10) that share a value (3.23)

3.18 subsumption
relationship between two feature structures (3.10) in which one is more specific than the other
NOTE A feature structure A is said to subsume a feature structure B if B is at least as informative as A, i.e. B contains all the information in A. Subsumption is a reflexive, antisymmetric, and transitive relation between two feature structures.

3.19 type
name of a class of entities
NOTE Feature structures (3.10) may be characterized by grouping them into certain classes. Types are used to name such classes.

3.20 typed feature structure
feature structure (3.10) labelled by a type (3.19)
NOTE In the graph notation (3.11), each node is labelled with a type. In the matrix notation (3.13), a type is ordinarily placed at the upper left corner of the inside of the pair of square brackets that represents a typed feature structure. In XML notation, the type is supplied as the value (3.23) of a type attribute on the <fs> element.

3.21 unification
operation that combines two compatible feature structures (3.10) into the least informative feature structure that contains the information from the two

3.22 union
operation that combines two sets, or multisets, into one
NOTE
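
As an illustration of definition 3.21 (this sketch is not part of the standard's text), unification can be written in a few lines of Python. It assumes a hypothetical encoding in which an untyped feature structure is a `dict` from feature names to values and atomic values are strings. The function name `unify` and the use of `None` to signal incompatibility are choices made for this example, and structure sharing (3.17) and types (3.19) are deliberately left out.

```python
from typing import Any, Optional


def unify(a: Any, b: Any) -> Optional[Any]:
    """Unify two values; return None when they are incompatible.

    Assumed encoding: feature structures are dicts (feature -> value),
    atomic values are strings. Inputs are not modified.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # start from every specification in a
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflicting values for a common feature
                result[feature] = merged
            else:
                result[feature] = value  # information found only in b
        return result
    # Atomic values unify only with themselves; a dict never equals a string.
    return a if a == b else None


noun_phrase = {"cat": "NP", "agr": {"number": "singular"}}
third_person = {"agr": {"person": "third"}}

print(unify(noun_phrase, third_person))
print(unify({"number": "singular"}, {"number": "plural"}))
```

The result carries every feature specification of both inputs and nothing more, which is what "least informative feature structure that contains the information from the two" asks for in this simplified setting.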
