INTERNATIONAL STANDARD ISO 24610-2
First edition 2011-10-01
Reference number ISO 24610-2:2011(E)

Language resource management - Feature structures - Part 2: Feature system declaration

Gestion des ressources langagières - Structures de traits - Partie 2: Déclaration de système de structures de traits

COPYRIGHT PROTECTED DOCUMENT

© ISO 2011

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org

Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overall structure
5 Basic concepts
5.1 Typed feature structures reviewed
5.2 Types
5.3 Type inheritance hierarchies
5.4 Type constraints
5.5 Optional (default) values and underspecification
5.6 Subsumption
6 Defining well-formedness versus validity
6.1 Overview
6.2 ISO 24610
7 A feature system for a grammar
7.1 Overview
7.2 Sample FSDs
8 Declaration of a feature system
8.1 Overview
8.2 Linking a text to feature system declarations
8.3 Overall structure of a feature system declaration
8.4 Feature declarations
8.5 Feature structure constraints
Annex A (normative) XML schema for feature structures
Annex B (informative) A complete example
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a
subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not
be held responsible for identifying any or all such patent rights.

ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 4, Language resource management.

ISO 24610 consists of the following parts, under the general title Language resource management - Feature structures:

- Part 1: Feature structure representation
- Part 2: Feature system declaration

Introduction

ISO 24610 is organized in two separate main parts. Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an informal and yet explicit outline of their characteristics, as well as an XML-based structured way of representing feature structures in general and typed feature structures in particular. It is designed to lay a basis for constructing an XML-based reference format for exchanging (typed) feature structures between applications.

Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set of features and their respective admissible feature values and finally by introducing a set of validity conditions on feature structures for particular applications, especially related to the goal of language resource management.

A feature structure is a general-purpose data structure that identifies and groups together individual features by assigning a particular value to each. Because of the generality of feature structures, they can be used to represent many different kinds of information. Interrelations among various pieces of information and their instantiation in markup provide a meta-language
for representing linguistic content. Moreover, this instantiation allows a specification of a set of features and values associated with specific types and their restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part of ISO 24610.

Some of the statements here are copied from ISO 24610-1:2006 in order to make this part standalone without referring to Part 1.

1 Scope

This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural language applications, for both annotation and production of linguistic data. It is ultimately designed to provide a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature
specifications and operations on feature structures, thus offering means to check the conformance of each feature structure with regard to a reference specification.

Feature structures are an essential part of many linguistic formalisms as well as an underlying mechanism for representing the information consumed or produced by and for language engineering applications.

A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that makes use of fs (that is, feature structure) elements. The FSD serves four purposes.

- It provides an encoding by which types and their subtyping and inheritance relationships can be introduced and defined, thus laying the basis for constructing a feature system.
- It provides a mechanism by which the encoder can list all of the feature names and feature values and give a prose description as to what each represents.
- It provides a mechanism by which type constraints can be declared, against which typed feature structures are validated relative to a given theory stated in typed feature logic. These constraints may involve constraints on the range of a feature's value, constraints on which features are permitted within certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value pairs. The source of these constraints is normally the empirical domain being modelled.
- It provides a mechanism by which the encoder can define the intended interpretation of underspecified feature structures. This involves defining default values (whether literal or computed) for missing features.

The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily intended for use with the typed feature structure representation defined in ISO
24610-1. The feature structure representations of ISO 24610-1 specify data structures that are subject to the typing conventions and constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used within some of the elements defined in ISO 24610-2.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 24610-1:2006, Language resource management - Feature structures - Part 1: Feature structure representation

ISO/IEC 19757-2, Information technology - Document Schema Definition Language (DSDL) - Part 2: Regular-grammar-based validation - RELAX NG

3 Terms and definitions

For the purposes of
this document, the terms and definitions given in ISO/IEC 19757-2 and the following apply.

3.1
admissibility constraint
feature admissibility constraint
specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a specific type (3.24)

3.2
admissible feature
appropriate feature
feature for which any feature structure (3.14) of a given type (3.24) may bear a value (3.17)
NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear a value for every admissible feature. This term does not imply that
the feature is obligatory here.

3.3
admissible feature value
admissible value
value restriction
range restriction
value (3.17) by which the value of an admissible feature (3.2) must be subsumed in feature structures (3.14) of a given type (3.24)

3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited

3.5
bag
multiset
triple of an integer n, a set S and a function that maps the integers in the range 1 to n to elements of S
NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can occur more than
once).

3.6
built-in
non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature value (3.17)
NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).

3.7
collection
feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)

3.8
constraint
unit of specification that identifies some collection of feature structures (3.14) as invalid
NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility constraints. See validity (3.31) and 5.4. All feature structures not explicitly excluded as invalid are considered to be valid.
NOTE 2 A feature structure that has not been so identified by any of the constraints in a feature system is considered to be valid.

3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
NOTE A feature
structure may not bear a feature without a corresponding value.

3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE An empty feature structure subsumes all other feature structures.

3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.

3.12
feature
property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding value (3.17)

3.13
feature specification
pairing of a feature (3.12) with a value (3.17) in a feature structure description

3.14
feature structure
record structure that associates one value (3.17) with each of a collection of features
NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.
NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature structures.

3.15
feature system
type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility constraints (3.1) and implicational constraints (3.18)
NOTE Cf. type declaration (3.25).

3.16
feature system declaration
FSD
specification of a particular feature system (3.15)

3.17
feature value
value
entity or aggregation of entities that characterizes some property or aspect of another entity

3.18
implicational constraint
constraint of the form, “if G, then H,” where G and H are feature structures (3.14)
NOTE
This identifies as invalid any feature structure F such that G subsumes F and yet F and H have no valid extension in common. See subsumption (3.21) and 8.5. The term is often used to refer to implicational constraints that are not also admissibility constraints.

3.19
interpretation
minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is consistent with a set of constraints declared by an FSD (3.16)

3.20
partial order
partially ordered set
set $S$ equipped with a relation $\leq$ over $S \times S$ that is (1) reflexive (for all $s \in S$, $s \leq s$), (2) anti-symmetric (for all $p, q \in S$, if $p \leq q$ and $q \leq p$, then $p = q$), and (3) transitive (for all $p, q, r \in S$, if $p \leq q$ and $q \leq r$, then $p \leq r$)
NOTE The set of integers $\mathbb{Z}$ is partially ordered, but it has an additional property: for every $p, q \in \mathbb{Z}$, either $p \leq q$ or $q \leq p$. Not all partial orders have this property. The taxonomical classification of organisms
into phyla, genera and species, for example, is a partial order that does not. Type hierarchies do not necessarily have this property. The typed feature structures of a feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or every type is constrained to have exactly one appropriate feature.

3.21
subsumption
property that holds between two feature structures, G and F, such that G is said to subsume F if and only if F carries all of the information that G does
NOTE A formal definition is provided in 5.6.

3.22
subtype
type (3.24) to which another type confers its constraints and appropriate features

3.23
supertype
base type
type (3.24) from which another type inherits constraints and appropriate features
NOTE s is a subtype of t if and only if t is a supertype of s. Every type is a subtype and supertype of itself.

3.24
semantic type
type
referring expression that distinguishes a collection of feature structures (3.14) as an identifiable and conceptually significant class
NOTE As implied by the name semantic type, types in this part of ISO 24610 do not serve to distinguish feature structures or their specifications syntactically.

3.25
type declaration
structure that declares the supertypes (3.23), admissible features (3.2), admissible feature values (3.3), admissibility constraints (3.1) and implicational constraints (3.18) for a given type (3.24)
NOTE The constraints on a type in the resulting feature system are those that have been declared in its declaration, in addition to those that it has inherited from its supertypes.

3.26
type hierarchy
partial order (3.20)