BS ISO 24615-1:2014 Language resource management - Syntactic annotation framework (SynAF) - Part 1: Syntactic model

BSI Standards Publication
BS ISO 24615-1:2014

BRITISH STANDARD

National foreword

This British Standard is the UK implementation of ISO 24615-1:2014. It supersedes BS ISO 24615:2010, which is withdrawn.

The UK participation in its preparation was entrusted to Technical Committee TS/1, Terminology.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© The British Standards Institution 2014. Published by BSI Standards Limited 2014
ISBN 978 0 580 80104 4
ICS 01.020

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 30 April 2014.

Amendments issued since publication
Date    Text affected

INTERNATIONAL STANDARD ISO 24615-1
First edition 2014-02-01

Language resource management - Syntactic annotation framework (SynAF) - Part 1: Syntactic model
Gestion de ressources langagières - Cadre d'annotation syntaxique (SynAF) - Partie 1: Modèle syntaxique

Reference number ISO 24615-1:2014(E)

COPYRIGHT PROTECTED DOCUMENT
© ISO 2014
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced
or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 SynAF metamodel
4.1 Introduction
4.2 SynAF metamodel
Annex A (normative) Data categories for SynAF
Annex B (informative) Relation to the Linguistic Annotation Framework
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has
the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights
identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation on the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the WTO principles in the Technical Barriers to Trade (TBT) see the following URL: Foreword - Supplementary information.

The committee responsible for this document is ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 4, Language resource management.

This first edition of ISO 24615-1 cancels and replaces ISO 24615:2010, of which it constitutes a minor revision.

ISO 24615 (all parts) is designed to coordinate closely with ISO 24612, Language resource management - Linguistic annotation framework (LAF), ISO 24613:2008, Language resource management - Lexical markup framework (LMF), and ISO 24611, Language resource management - Morpho-syntactic annotation framework.

ISO 24615 consists of the following parts, under the general title Language resource management - Syntactic annotation framework (SynAF):
- Part 1: Syntactic model
The following part is under preparation:
- Part 2: XML serialization ()

Introduction

ISO 24615 is based on numerous projects and pre-standardisation activities that have taken place in the last few years (see Abeillé, 2001[9]), to provide reference models and formats for the representation of syntactic information, whether as the output of a syntactic parser, or as annotations of language resources (treebanks). For several years, the Penn Treebank initiative has served as a de facto standard for treebanking, but more recent works, e.g. the Negra/Tiger initiative (see http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/) in Germany or the ISST initiative in Italy (see Montemagni (2003)[18]), demonstrate the viability of a more coherent framework that can account for both (hierarchical) constituency and dependency phenomena in syntactic annotation.

The eContent project "LIRICS" has been seminal in gathering a group of experts, who initiated the ISO 24615 (SynAF) project. While preparing SynAF, this group confirmed that existing initiatives indeed share a common data model that offers a good basis for the SynAF metamodel (see the study made in Deliverable D.3.1 "Evaluation of initiatives for morpho-syntactic and syntactic annotation" of the EU project LIRICS, available at http://lirics.loria.fr/doc_pub/Del3_1_V2.pdf).

This part of
ISO 24615 proposes a metamodel for syntactic annotation together with a list of relevant data categories for syntactic annotation. The data categories are available on the ISOCat server (http://www.isocat.org/) in the syntax profile (as defined in ISO 12620:2009).

1 Scope

This part of ISO 24615 describes the syntactic annotation framework (SynAF), a high level model for representing the syntactic annotation of linguistic data, with the objective of supporting interoperability across language resources or language processing components. This part of ISO 24615 is complementary and closely related to ISO 24611 (MAF, morpho-syntactic annotation framework) and provides a metamodel for syntactic representations as well as reference data categories for representing both constituency and dependency information in sentences or other comparable utterances and segments.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 1087-1:2000, Terminology work - Vocabulary - Part 1: Theory and application
ISO 12620:2009, Terminology and other language and content resources - Specification of data categories and management of a Data Category Registry for language resources
ISO 24611:2012, Language resource management - Morpho-syntactic annotation framework

3 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 1087-1:2000, ISO 12620:2009, ISO 24611:2012 and the following apply.

3.1
adjunct
non-essential element associated with a verb as opposed to syntactic arguments (3.19)
Note 1 to entry: Adverbs are possible adjuncts for a sentence.

3.2
chunk
non-recursive constituent (3.4)

3.3
clause
group of phrases (3.14), usually containing a predicate
Note 1 to entry: A clause can be either a main clause (3.10) or a subordinate clause (3.17). In languages distinguishing finiteness, clauses whose predicate is a verb can be either finite or non-finite, depending on the form of the verb. A main clause alone can build a complete sentence (3.15). In the SynAF model, a clause is a special case of a constituent (3.4).

3.4
constituent
syntactic grouping of words into phrases (3.14), phrases into clauses (3.3) or other phrases or clauses into a sentence (3.15) on the basis of structural (or hierarchical) properties

3.5
dependency
dependency relation
syntactic relation between word forms (3.24) or constituents (3.4) on the basis of the grammatical functions (3.7) that constituents play in relation to each other

3.6
syntactic edge
edge
triplet with a source node (3.12), a target node, and optional annotations (3.9)
Note 1 to entry: Non-terminal nodes (3.13) have an outgoing constituency syntactic edge.

3.7
grammatical function
grammatical role of a word form (3.24) or constituent (3.4) within its embedding syntactic environment
Note 1 to entry: For example, a noun phrase (NP) can act as a subject within a sentence (3.15), or a noun may act as a subject dependent of a verb in a dependency graph. There is a grammatical relation between the subject NP and the main verb in a sentence. All grammatical relations (subject predicate, head modifier, etc.) are subsumed under the concept of dependency relations (3.5), whether between terminal or non-terminal nodes.

3.8
syntactic head
head
part of a constituent (3.4) which determines its distribution (the syntactic environments in which the constituent may appear) and its grammatical properties (e.g. if the grammatical gender of the head is feminine, then the gender of the entire constituent will be feminine)
Note 1 to entry: The head of a constituent usually cannot be left out.

3.9
linguistic annotation
annotation
feature-value pair denoting a linguistic property of a linguistic segment

3.10
main clause
clause (3.3), which can act on its own as a complete sentence (3.15)
Note 1 to entry: In languages distinguishing finiteness, the main clause is usually finite. Example: The train is late.

3.11
modifier
part of a constituent (3.4) which ascribes a property to the head (3.8) of the constituent
Note 1 to entry: A modifier can be placed before or after the
head of the phrase (3.14) (pre-modifier or post-modifier). Modifiers are optional in a constituent.

3.12
node
syntactic node
word form (3.24) or constituent (3.4) seen as an elementary syntactic component of a syntactic analysis

3.13
non-terminal node
syntactic node (3.12) which is not a word form (3.24)
Note 1 to entry: A non-terminal node has an outgoing constituency edge (3.6).

3.14
phrase
group of word forms (3.24) (usually containing one or more words) which can fulfill a grammatical function (3.7), e.g. in a clause (3.3)
Note 1 to entry: Empty phrases are permitted (being non-realised pronouns, sometimes marked as "pro", and having the role of subjects in clauses). A phrase is typically named after its head (3.8), for example noun phrases, verb phrases, adjective phrases, adverbial phrases and prepositional phrases. Phrases have been informally described as "bloated words", in that the parts of the phrase added to the head elaborate and specify the reference of the head. In our model, a phrase is a special case of a constituent (3.4).

3.15
sentence
related group of word forms (3.24) containing a predication, usually expressing a complete thought and forming the basic unit of discourse structure
Note 1 to entry: A sentence consists of one or more clauses (3.3). When describing speech, it is common to talk about "utterances" rather than sentences.

3.16
span
pair of points (p1, p2), where p1 ≤ p2, identifying the segment of the document to which an annotation (3.9) is applied
Note 1 to entry: A multiple span is a sequence of spans where the ending point of each span is less than or equal to the starting point of the subsequent span.

3.17
subordinate clause
clause which fulfils a grammatical function (3.7) in a phrase (3.14),
for example a relative clause (3.3) modifying the head (3.8) noun of a nominal phrase, or in another clause
Note 1 to entry: A subordinate clause usually does not act on its own as a sentence, but is part of a larger sentence.

3.18
subcategorization frame
set of restrictions indicating the properties of
the syntactic arguments (3.19) that can or must occur with a verb
EXAMPLE Alfred (/syntacticArgument/) reads a book (/syntacticArgument/) today (/adjunct/).
Note 1 to entry: The subject, indirect object and direct object are subcategorized grammatical functions (3.7) within a sentence; they are dependents of the verb (i.e. they can appear in subcategorization frames).

3.19
syntactic argument
functionally essential element that is required and given its interpretation by the head of its phrase (3.14) or the node (3.12) of which it is a dependent (e.g. the nominal argument of a prepositional phrase or verb)
Note 1 to entry: For verbs and verbal phrases, arguments identify the participants in the process referred to by the verb. In some frameworks, syntactic arguments are called complements.

3.20
syntactic graph
graph
connected set of syntactic nodes (3.12) and edges (3.6)

3.21
syntactic tree
syntactic graph (3.20) in which each node has a single parent

3.22
syntax
way in which word forms (3.24) are interrelated and/or grouped together into phrases, thus capturing the relations that exist between those units

3.23
terminal node
syntactic node (3.12) which is a single word form (3.24) or an empty element involved in a syntactic relation

3.24
word form
contiguous or non-contiguous entity from a speech or text sequence identified as an autonomous lexical item

4 SynAF metamodel

4.1 Introduction

Syntactic annotations have at least two functions in language processing:
a) to represent linguistic constituency, as in noun phrases (NP), describing a structured sequence of morpho-syntactically annotated items (including empty elements or traces generated by movements at the constituency level), as well as constituents built from non-contiguous elements, and
b) to represent dependency relations, such as head-modifier relations, and also including relations between categories of the same kind (such as the head-head relations between nouns in appositi
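As an informal illustration of the terms defined in Clause 3, the following Python sketch models spans (3.16), nodes (3.12), edges (3.6) and the single-parent condition of a syntactic tree (3.21), using the sentence from the example in 3.18. It is not part of the standard. The class and function names, the "function" annotation key, the use of character offsets as span points, and the choice to attach the argument label to the word "book" are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """Pair of points (p1, p2) with p1 <= p2, as in 3.16.
    Points are taken to be character offsets (an assumption)."""
    p1: int
    p2: int

    def __post_init__(self):
        if self.p1 > self.p2:
            raise ValueError("a span requires p1 <= p2")


def is_multiple_span(spans):
    """Each span ends at or before the start of the next (Note 1 to 3.16)."""
    return all(a.p2 <= b.p1 for a, b in zip(spans, spans[1:]))


@dataclass
class Node:
    """Syntactic node (3.12); `terminal` separates 3.23 from 3.13."""
    ident: str
    terminal: bool
    spans: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)


@dataclass
class Edge:
    """Syntactic edge (3.6): source node, target node, optional annotations."""
    source: str
    target: str
    annotations: dict = field(default_factory=dict)


def is_tree(nodes, edges):
    """One reading of 3.21: exactly one node has no parent (the root),
    no node has more than one parent, and every node is reachable
    from the root."""
    parents = {n.ident: [] for n in nodes}
    children = {n.ident: [] for n in nodes}
    for e in edges:
        parents[e.target].append(e.source)
        children[e.source].append(e.target)
    roots = [i for i, p in parents.items() if not p]
    if len(roots) != 1 or any(len(p) > 1 for p in parents.values()):
        return False
    seen, stack = set(), [roots[0]]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(children[cur])
    return len(seen) == len(parents)


# Sentence from the example in 3.18, annotated as a dependency structure.
text = "Alfred reads a book today"
nodes = [
    Node("Alfred", True, [Span(0, 6)]),
    Node("reads", True, [Span(7, 12)]),
    Node("a", True, [Span(13, 14)]),
    Node("book", True, [Span(15, 19)]),
    Node("today", True, [Span(20, 25)]),
]
edges = [
    # "function" is a hypothetical annotation key; the values echo 3.18.
    Edge("reads", "Alfred", {"function": "syntacticArgument"}),
    Edge("reads", "book", {"function": "syntacticArgument"}),
    Edge("reads", "today", {"function": "adjunct"}),
    Edge("book", "a"),  # annotations on an edge are optional
]

print(is_tree(nodes, edges))  # prints True
```

Adding a second incoming edge to any node, for example from "today" to "Alfred", makes `is_tree` return False while the structure remains a syntactic graph in the sense of 3.20. A non-contiguous word form (3.24) could be represented by giving a node several spans that satisfy `is_multiple_span`.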
