BSI Standards Publication

BS ISO 24610-2:2011

Language resource management - Feature structures
Part 2: Feature system declaration

NO COPYING WITHOUT BSI PERMISSION EXCEPT AS PERMITTED BY COPYRIGHT LAW

National foreword

This British Standard is
the UK implementation of ISO 24610-2:2011.

The UK participation in its preparation was entrusted to Technical Committee TS/1, Terminology.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© The British Standards Institution 2013. Published by BSI Standards Limited 2013.

ISBN 978 0 580 64013 1
ICS 01.140.20

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 31 January 2013.

Amendments issued since publication
Date    Text affected

INTERNATIONAL STANDARD ISO 24610-2
First edition 2011-10-01
Reference number ISO 24610-2:2011(E)

Language resource management - Feature structures - Part 2: Feature system declaration
Gestion des ressources langagières - Structures de traits - Partie 2: Déclaration de système de structures de traits

COPYRIGHT PROTECTED DOCUMENT

© ISO 2011

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56
CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overall structure
5 Basic concepts
5.1 Typed feature structures reviewed
5.2 Types
5.3 Type inheritance hierarchies
5.4 Type constraints
5.5 Optional (default) values and underspecification
5.6 Subsumption
6 Defining well-formedness versus validity
6.1 Overview
6.2 ISO 24610
7 A feature system for a grammar
7.1 Overview
7.2 Sample FSDs
8 Declaration of a feature system
8.1 Overview
8.2 Linking a text to feature system declarations
8.3 Overall structure of a feature system declaration
8.4 Feature declarations
8.5 Feature structure constraints
Annex A (normative) XML schema for feature structures
Annex B (informative) A complete example
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has
the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 4, Language resource management.

ISO 24610 consists of the following parts, under the general title Language resource management - Feature structures:

- Part 1: Feature structure representation
- Part 2: Feature system declaration

Introduction

ISO 24610 is organized in two separate main parts. Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an
informal and yet explicit outline of their characteristics, as well as an XML-based structured way of representing feature structures in general and typed feature structures in particular. It is designed to lay a basis for constructing an XML-based reference format for exchanging (typed) feature structures between applications.

Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set of features and their respective admissible feature values and finally by introducing a set of validity conditions on feature structures for particular applications, especially related to the goal of language resource management.

A feature structure is a general-purpose data structure that identifies and groups together individual features by assigning a particular value to each. Because of the generality of feature structures, they can be used to represent many different kinds of information. Interrelations among various pieces of information and their instantiation in markup provide a meta-language for representing linguistic content. Moreover, this instantiation allows a specification of a set of features and values associated with specific types and their restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part of ISO 24610.

Some of the statements here are copied from ISO 24610-1:2006
in order to make this part standalone without referring to Part 1.

1 Scope

This part of ISO 24610 provides
a format to represent, store or exchange feature structures in natural language applications, for both annotation and production of linguistic data. It is ultimately designed to provide a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature specifications and operations on feature structures, thus offering means to check the conformance of each feature structure with regard to a reference specification. Feature structures are an essential part of many linguistic formalisms as well as an underlying mechanism for representing the information consumed or produced by and for language engineering applications.

A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that makes use of fs (that is, feature structure) elements. The FSD serves four purposes.

- It provides an encoding by which types and their subtyping and inheritance relationships can be introduced and defined, thus laying the basis for constructing a feature system.
- It provides a mechanism by which the encoder can list all of the feature names and feature values and give a prose description as to what each represents.
- It provides a mechanism by which type constraints can be declared, against which typed feature structures are validated relative to a given theory stated in typed feature logic. These constraints may involve constraints on the range of a feature's value, constraints on which features are permitted within certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value pairs. The source of these constraints is normally the empirical domain being modelled.
- It provides a mechanism by which the encoder can define the intended interpretation of underspecified feature structures. This involves defining default values (whether literal or computed) for missing features.

The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure representations of ISO 24610-1 specify data structures that are subject to the typing conventions and constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used within some of the elements defined in ISO 24610-2.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 24610-1:2006, Language resource management - Feature structures - Part 1: Feature structure representation

ISO/IEC 19757-2, Information technology - Document Schema Definition Language (DSDL) - Part 2: Regular-grammar-based validation - RELAX NG

3 Terms and definitions

For the
purposes of this document, the terms and definitions given in ISO 19757-2 and the following apply.

3.1
admissibility constraint
feature admissibility constraint
specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a specific type (3.24)

3.2
admissible feature
appropriate feature
feature which any feature structure (3.14) of a given type (3.24) may bear a value (3.17) for

NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear a value for every admissible feature. This term does not
imply that the feature is obligatory here.

3.3
admissible feature value
admissible value
value restriction
range restriction
value (3.17) that the value of an admissible feature (3.2) must be subsumed by in feature structures (3.14) of a given type (3.24)

3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited

3.5
bag
multiset
triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S

NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can occur more than once).

3.6
built-in
non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature value (3.17)

NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).

3.7
collection
feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)

3.8
constraint
unit of specification that identifies some collection of feature structures (3.14) as invalid

NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility constraints. See validity (3.31) and 5.4. All feature structures not explicitly excluded as invalid are considered to be valid.

NOTE 2 A feature structure that has not been so identified by any of the constraints in a feature system is considered to be valid.

3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified

EXAMPLE Masculine is the default value of the grammatical
gender in Dutch.

NOTE A feature structure may not bear a feature without a corresponding value.

3.10
empty feature structure
feature structure (3.14) that contains no information

NOTE An empty feature structure subsumes all other feature structures.

3.11
extension
converse of subsumption (3.21)

NOTE
A feature structure F extends G if and only if G subsumes F.

3.12
feature
property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding value (3.17)

3.13
feature specification
pairing of a feature (3.12) with a value (3.17) in a feature structure description

3.14
feature structure
record structure that associates one value (3.17) to each of a collection of features

NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.

NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature structures.

3.15
feature system
type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility constraints (3.1) and implicational constraints (3.18)

NOTE cf. type declaration (3.25)

3.16
feature system declaration
FSD
specification of a particular feature system (3.15)

3.17
feature value
value
entity or aggregation of entities that characterize some property or aspect of another entity

3.18
implicational constraint
constraint of the form, “if G, then H,” where G and H are feature structures (3.14)

NOTE This identifies any feature structure F as invalid for which G subsumes F, and yet F and H have no valid extension in common. See subsumption (3.21) and 8.5. The term is often used to refer to implicational constraints that are not also admissibility constraints.

3.19
interpretation
minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is consistent with a set of constraints declared by an FSD (3.16)

3.20
partial order
partially ordered set
set S equipped with a relation ≤ over S × S that is (1) reflexive (for all s ∈ S, s ≤ s), (2) anti-symmetric (for all p, q ∈ S, if p ≤ q and q ≤ p, then p = q), and (3) transitive (for all p, q, r ∈ S, if p ≤ q and q ≤ r, then p ≤ r)

NOTE The set of integers Z is partially ordered, but it has an additional property: for every p, q ∈ Z, either p ≤ q or q ≤ p. Not all partial orders have this
property. The taxonomical classification of organisms into phyla, genera and species, for example, is a partial order that does not. Type hierarchies do not necessarily have it either. The typed feature structures of a feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or every type is constrained to have exactly one appropriate feature.

3.21
subsumption
property that holds between two feature structures, G and F, such that G