BS ISO 24617-8:2016

Language resource management - Semantic annotation framework (SemAF)

Part 8: Semantic relations in discourse, core annotation schema (DR-core)

BSI Standards Publication

BRITISH STANDARD

National foreword

This British Standard is the UK implementation of ISO 24617-8:2016.

The UK participation in its preparation was entrusted to Technical Committee TS/1, Terminology.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© The British Standards Institution 2016. Published by BSI Standards Limited 2016

ISBN 978 0 580 81286 6

ICS 01.020

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 31 December 2016.

Amendments/corrigenda issued since publication

Date    Text affected

INTERNATIONAL STANDARD ISO 24617-8
First edition 2016-12-15
Reference number ISO 24617-8:2016(E)

Gestion des ressources langagières - Cadre d'annotation sémantique (SemAF) - Partie 8: Relations sémantiques dans le discours, schéma d'annotation de base (DR-core)

COPYRIGHT PROTECTED DOCUMENT

© ISO 2016, Published in Switzerland

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Ch. de Blandonnet 8, CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel.
+41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Basic concepts and metamodel
  4.1 Overview
  4.2 Representation of discourse structure
  4.3 Semantic description of discourse relations
  4.4 Pragmatic variants of discourse relations
  4.5 Hierarchical classification of discourse relations
  4.6 Inference of multiple relations between two segments
  4.7 Representation of (a)symmetry of relations
  4.8 Representation of the relative importance of arguments for discourse meaning/structure
  4.9 Arity of arguments
  4.10 Syntactic form, extent, and (non-)adjacency of argument realizations
  4.11 Triggers of discourse relations
  4.12 Representation of attribution as a discourse relation
  4.13 Representation of entity-based relations
  4.14 Representation of non-existence of a discourse relation
  4.15 Summary: Assumptions of the DR-core annotation scheme
  4.16 Issues to be taken up in the follow-up of DR-core
  4.17 Metamodel
5 Core discourse relations
6 Current approaches and annotation schemes
  6.1 Overview
  6.2 Rhetorical structure theory (RST)
  6.3 RST Treebank
  6.4 Hobbs' Theory of Discourse Coherence (HTDC)
  6.5 GraphBank
  6.6 SDRT
  6.7 CCR
  6.8 Penn Discourse Treebank (PDTB)
  6.9 Mapping of DR-core discourse relations to existing classifications
7 Interactions of this document with other annotation schemes
  7.1 Overlapping annotation schemes
  7.2 Discourse relations and semantic roles
  7.3 Discourse relations and temporal relations
  7.4 Discourse relations and semantic relations between dialogue acts
8 DRelML: Discourse Relations Markup Language
  8.1 Overview
  8.2 DRelML abstract syntax and semantics
  8.3 Concrete syntax
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work
of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of
patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation on the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see the following URL: www.iso.org/iso/foreword.html.

The committee responsible for this document is ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 4, Language resource management.

A list of all parts in the ISO 24617 series can be found on the ISO website.

Introduction

The last decade has seen a proliferation of linguistically annotated corpora coding many phenomena in support
of empirical natural language research, both computational and theoretical. At the level of discourse, interest in discourse processing has led to the development of several corpora annotated for discourse relations. Discourse relations, also called “coherence relations” or “rhetorical relations”, are relations, expressed explicitly or implicitly, between situations mentioned in a discourse and are key to a complete understanding of the discourse, beyond the meaning conveyed by clauses and sentences. Discourse relations and discourse structure are considered to be key ingredients for NLP tasks such as summarization,[39][41] complex question answering,[74] natural language generation,[19][47][56] machine translation,[42] opinion mining and sentiment analysis,[11][12] and information retrieval.[38] A recent overview[76] includes a description of the state of the art in discourse and computation. Several international
and collaborative efforts have resulted in annotated resources of discourse relations, across languages as well as genres, to support the development of such applications.

Existing annotation frameworks exhibit two major differences in their underlying assumptions, one of which concerns the representation of discourse structure, while the other has to do with the semantic classification of discourse relations. As a result, annotations constructed using one framework are not easily interpreted in another framework, and annotated resources are limited in their interoperability. Notwithstanding their differences, however, there are strong compatibilities between them that can be clarified and used as the basis for mappings and comparisons between the resources, and as a basis for future annotation.

In a coherent (written or spoken) discourse, the situations mentioned in the discourse, such as events, states, facts, propositions, and dialogue acts, are semantically linked through causal, contrastive, temporal and other relations, called “discourse relations”, “rhetorical relations”, or “coherence relations”. Although discourse relations hold most prominently between the meanings of successive sentences or utterances in a discourse, they may also occur between the meanings of smaller or larger units (nominalizations, clauses, paragraphs, dialogue segments), and they may occur between situations that are not explicitly described but that can be inferred.

This document aims to specify an interoperable approach to the annotation of local semantic relations in discourse (DRels), following the Linguistic Annotation Framework (LAF, ISO 24612-2; see also Reference [23]) and the general principles for semantic annotation established in ISO 24617-6. It reflects the view that strong underlying compatibilities with respect to the semantic description of discourse relations can be observed in the various discourse relation frameworks being used to support data annotation, e.g. Rhetorical Structure Theory (RST),[40] Segmented Discourse Representation Theory (SDRT),[3] the Penn
Discourse Treebank,[59] Hobbs' Theory of Discourse Coherence (HTDC)[17][18] and the Cognitive Approach to Coherence Relations (CCR).[66] This document aims to provide an explanation of these compatibilities and a loose mapping between definitions of individual discourse relations, as specified in the different
frameworks that will benefit the community as a whole.

The main aims of this document are to (1) establish a set of desiderata for interoperable DRel annotation; (2) specify a way of annotating DRels that is compatible with existing and emerging ISO standard annotation schemes for semantic information; and (3) provide clear and mutually consistent definitions of a set of “core” discourse relations which are commonly found in some form in many existing discourse relation frameworks. Together, (2) and (3) form a “core annotation scheme” for DRels.

This document does not aim at providing a fixed and exhaustive set of discourse relations, but rather at providing an open, extensible set of core relations. The core annotation scheme also discusses certain issues in discourse relation annotation that it leaves open, as they require further study in collaboration with other efforts in multilingual discourse annotation, in particular the European COST action TextLink. A future part of ISO 24617 is envisaged that will complement this document by providing a complete interoperable annotation scheme for DRels, while also addressing the multilingual dimension of the standard. The issues to be taken up for this complementary part are listed in 4.16.

1 Scope

This document establishes the
representation and annotation of local, “low-level” discourse relations between situations mentioned in discourse, where each relation is annotated independently of other relations in the same discourse.

This document provides a basis for annotating discourse relations by specifying a set of core discourse relations, many of which have similar definitions in different frameworks. To the extent possible, this document provides mappings of the semantics across the different frameworks.

This document is applicable to two different situations:
- for annotating discourse relations in natural language corpora;
- as a target representation of automatic methods for shallow discourse parsing, for summarization, and for other applications.

The objectives of this specification are to provide:
- a reference set of data categories that define a collection of discourse relation types with an explicit semantics;
- a pivot representation based on a framework for defining discourse relations that can facilitate mapping between different frameworks;
- a basis for developing guidelines for creating new resources that will be immediately interoperable with pre-existing resources.

With respect to discourse structure, the limitation of this document to specifications for annotating local, “low-level” discourse relations is based on the view that (a) the analysis at this level is what is well understood and can be clearly defined; (b) further extensions to represent higher-level, global discourse structure are possible where desired; and (c) it allows for the resulting annotations to be compatible across frameworks, even when they are based on different theories of discourse structure.

As a part of the ISO 24617 semantic annotation framework (“SemAF”), the present DR-core standard aims to be transparent
in its relation to existing frameworks for discourse relation annotation, but also to be compatible with other ISO 24617 parts. Some discourse relations are specific to interactive discourse, and give rise to an overlap with ISO 24617-2, the ISO standard for dialogue act annotation. Other discourse relations relate to time, and their annotation forms part of ISO 24617-1 (time and events); still other discourse relations are very similar to certain predicate-argument relations (“semantic roles”), whose annotation is the subject matter of ISO 24617-4. Since the various parts are required to form a consistent whole, this document pays special attention to the interactions of discourse relation annotation and other semantic annotation schemes (see Clause 7).

This document does not consider global, higher-level discourse structure representation, which involves linking local discourse relations to form one or more composite global structures.

This document is, moreover, restricted to strictly semantic relations, to the exclusion of, for example, presentational relations, which concern the way in which a text is presented to its readers or the way in which speakers structure their contributions in a spoken dialogue.

2 Normative references

There are no normative references in this document.

3 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:
- IEC Electropedia: available at http://www.electropedia.org/
- ISO Online browsing platform: available at http://www.iso.org/obp

3.1
discourse
sequence of clauses or sentences in written text or of utterances in oral speech

3.2
situation
eventuality, fact, proposition, condition, belief or dialogue act, that can be realized by a linguistically simple or complex expression, such as a clause, a nominal