raising standards worldwide
NO COPYING WITHOUT BSI PERMISSION EXCEPT AS PERMITTED BY COPYRIGHT LAW

BSI Standards Publication
BS ISO 25964-1:2011

Information and documentation
Thesauri and interoperability with other vocabularies
Part 1: Thesauri for information retrieval

BRITISH STANDARD

National foreword

This British Standard is the UK implementation of ISO 25964-1:2011. It supersedes BS 8723-2:2005 and DD 8723-5:2008, which are withdrawn. Together with BS ISO 25964-2, it supersedes BS 8723-1:2005 and BS 8723-4:2007, which will be withdrawn on publication of BS ISO 25964-2.

The UK participation in its preparation was entrusted to Technical Committee IDT/2/2, Information description, source identification and indexing.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© BSI 2011
ISBN 978 0 580 58905 8
ICS 01.140.20

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 30 September 2011.

Amendments issued since publication
Date    Text affected

INTERNATIONAL STANDARD
ISO 25964-1
First edition
2011-08-15
Reference number ISO 25964-1:2011(E)
© ISO 2011

Information and documentation
Thesauri and interoperability with other vocabularies
Part 1: Thesauri for information retrieval

Information et documentation
Thésaurus et interopérabilité avec d'autres vocabulaires
Partie 1: Thésaurus pour la recherche documentaire

COPYRIGHT PROTECTED DOCUMENT

© ISO 2011
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56
CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Terms and definitions
3 Symbols, abbreviated terms and other conventions
4 Thesaurus overview and objectives
4.1 Overall objective
4.2 Vocabulary control and its purpose
4.3 Paradigmatic versus syntagmatic relationships
4.4 Types of paradigmatic relationship
5 Concepts and their scope in a thesaurus
5.1 Conceptual basis
5.2 Scope notes
5.3 Reciprocal scope notes
6 Thesaurus terms
6.1 Form of terms
6.2 Clarification and disambiguation of thesaurus terms
6.3 Grammatical form of terms
6.4 Capitalization, punctuation and special characters
6.5 Singular or plural forms
6.6 Selection of the preferred form
7 Complex concepts
7.1 General
7.2 The nature of compound terms
7.3 Deciding whether or not to admit a complex concept
7.4 How to split a complex concept
7.5 Retention of constituent concepts
7.6 Consistency in the treatment of complex concepts
7.7 Order of words in multi-word terms
8 The equivalence relationship, in a monolingual context
8.1 General
8.2 Synonyms
8.3 Quasi-synonyms
8.4 Specific terms subsumed in a broader concept
8.5 Representation of complex concepts by a combination of terms
9 Equivalence across languages
9.1 General
9.2 Degrees of equivalence
9.3 Typical problems and solutions
9.4 Representation of cross-language equivalence between preferred terms
9.5 Cross-language equivalence between non-preferred terms
10 Relationships between concepts
10.1 Introduction
10.2 The hierarchical relationship
10.3 The associative relationship
10.4 Customized relationships
11 Facet analysis
12 Presentation and layout
12.1 General
12.2 Alternative display styles
12.3 Presentation and layout of multilingual thesauri
12.4 Language and character encoding issues
13 Managing thesaurus construction and maintenance
13.1 Planning a thesaurus
13.2 Early stages of compilation
13.3 Construction
13.4 Introduction to the thesaurus
13.5 Dissemination
13.6 Updating
14 Guidelines for thesaurus management software
14.1 General
14.2 Size and character limitations
14.3 Relationships between terms and between concepts
14.4 Notes applying to terms or concepts
14.5 Codes and notation
14.6 Node labels
14.7 Status of languages
14.8 Data import/export
14.9 Editorial navigation and support
14.10 Editorial safeguards
14.11 Housekeeping tools
15 Data model
15.1 General
15.2 Notes on the model
15.3 Tabular presentation
16 Integration of thesauri with applications
16.1 Introduction
16.2 Interoperability needs for thesauri
16.3 Integration with indexing and searching applications
17 Exchange formats
18 Protocols
18.1 General
18.2 Purposes and use cases
18.3 Application environment and architecture
18.4 Thesaurus-specific protocols
18.5 General-purpose web database protocols used with thesauri
Annex A (informative) Examples of displays found in published thesauri
Annex B (informative) XML Schema for data exchange
Bibliography
Index

Table 1 Symbols and abbreviations
Table 2 English language tags and their equivalents in other languages
Table A.1 Tags used in Inspec Thesaurus alphabetical display
Figure 1 Paradigmatic and syntagmatic relationships

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies).
The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 25964-1 was prepared by Technical Committee ISO/TC 46, Information and documentation, Subcommittee SC 9, Identification and description.

This first edition of ISO 25964-1 cancels and replaces ISO 2788:1986 and ISO 5964:1985, which have been merged and technically revised. Clauses 1 to 13 of this part of ISO 25964 correspond broadly to the content of ISO 2788:1986 and ISO 5964:1985. The remaining clauses cover new material.

ISO 25964 consists of the following parts, under the general title Information and documentation - Thesauri and interoperability with other vocabularies:
- Part 1: Thesauri for information retrieval
The following parts are under preparation:
- Part 2: Interoperability with other vocabularies

This part of ISO 25964 covers the development and maintenance of thesauri, both monolingual and multilingual, including formats and protocols for data exchange. ISO 25964-2 will cover interoperability between different thesauri and with other types of structured vocabulary, such as classification schemes, name authority lists, ontologies, etc., not previously covered in any International Standard.

Introduction

Today's thesauri are mostly electronic tools, having moved on from the paper-based era when thesaurus standards were first developed. They are built and maintained with the support of software and need to integrate with other software, such as search engines and content management systems. (For example, data from the thesaurus database might need to be presented in combination with the number of postings found by a search application.) Whereas in the past thesauri were designed for information professionals trained in indexing and searching, today there is a demand for vocabularies that untrained users will find to be intuitive, and for vocabularies that enable inferencing by machines.
ISO 25964 makes the transition that is needed in order to be compatible with the world of electronic information management. However, this part of ISO 25964 retains the assumption that human intellect is usually involved in the selection of indexing terms and in the selection of search terms. If both the indexer and the searcher are guided to choose the same term for the same concept, then relevant documents will be retrieved. This is the main principle underlying thesaurus design, even though a thesaurus may also be applied in situations where computers make the choices.

Efficient exchange of data is a vital component of thesaurus management and exploitation. This part of ISO 25964 therefore includes recommendations for exchange formats and protocols. Adoption of these will facilitate interoperability between thesaurus management systems and other computer applications, such as indexing and retrieval systems, that will utilize the data.

This part of ISO 25964 covers development and maintenance of thesauri rather than how to use them in indexing. Where multilingual issues and examples are addressed, efforts have been made to cover as wide a selection of languages as possible, consistent with clarity and comprehensibility.

Thesauri are typically used in post-coordinate retrieval systems, but may also be applied to hierarchical directories, pre-coordinate indexes and classification systems. Increasingly, thesaurus applications need to mesh with others, such as automatic categorization schemes, free-text search systems, etc. ISO 25964-2 will address additional types of structured vocabulary (such as classification schemes, name authority lists, ontologies, etc.) and give recommendations to enable interoperation of the vocabularies at all stages of the information storage and retrieval process.

INTERNATIONAL STANDARD ISO 25964-1:2011(E)

Information and documentation
Thesauri and interoperability with other vocabularies
Part 1: Thesauri for information retrieval

1 Scope

This part of ISO 25964 gives recommendations for the development and maintenance of thesauri intended for information retrieval applications. It is applicable to vocabularies used for retrieving information from all types of information resources, irrespective of the media used (text, sound, still or moving image, physical object or multimedia) including knowledge bases and portals, bibliographic databases, text, museum or multimedia collections, and the items within them.

This part of ISO 25964 also provides a data model and recommended format for the import and export of thesaurus data.

This part of ISO 25964 is applicable to monolingual and multilingual thesauri.

This part of ISO 25964 is not applicable to the preparation of back-of-the-book indexes, although many of its recommendations could be useful for that purpose.

This part of ISO 25964 is not applicable to the databases or software used directly in search or indexing applications, but does anticipate the needs of such applications among its recommendations for thesaurus management.

2 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

2.1 array
group of sibling concepts (2.52)
EXAMPLE In the following, the sibling concepts outerwear and underwear form an array within the concept “clothing”.
clothing
  outerwear
    overcoats
  underwear

2.2 associative relationship
relationship between a pair of concepts (2.11) that are not related hierarchically but share a strong semantic connection
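As an informal illustration only, and not part of the definitions, the array in the example under 2.1 can be sketched in Python. The dictionary name BROADER, the function name arrays and the single-parent model are hypothetical choices made for this sketch; the placement of "overcoats" under "outerwear" is an assumption read from the example layout.

```python
from collections import defaultdict

# Hypothetical minimal model: each concept label maps to its one broader concept.
# Placing "overcoats" under "outerwear" is an assumption based on the example in 2.1.
BROADER = {
    "outerwear": "clothing",
    "underwear": "clothing",
    "overcoats": "outerwear",
}


def arrays(broader):
    """Group sibling concepts, i.e. concepts that share the same broader concept."""
    groups = defaultdict(list)
    for concept, parent in broader.items():
        groups[parent].append(concept)
    # Sort each group so the result does not depend on insertion order.
    return {parent: sorted(children) for parent, children in groups.items()}


ARRAYS = arrays(BROADER)
print(ARRAYS["clothing"])  # the sibling concepts that form an array within "clothing"
```

In this sketch each key of ARRAYS is a broader concept and each value is one array of sibling concepts. A real thesaurus can hold several arrays under one concept and concepts with more than one broader concept, which this single-parent model does not cover.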
2.3 broader term
preferred term (2.45) representing a concept (2.11) that is broader than the one in question
NOTE The scope of the narrower concept falls completely within the scope of the broader. The relationship between the two is commonly indicated with the tag BT. For more explanation see 10.2.1.

2.4 characteristic of division
attribute by which a concept (2.11) can be subdivided into an array (2.1) of narrower concepts (2.11), each having a distinct value of that attribute
cf. facet analysis (2.21), node label (2.38)
EXAMPLE In the following, age group is the characteristic of division applied to the concept of people:
people
  (people by age group)
    children
    youths
    adults

2.5 classification
classifying
activity involving the components of grouping similar or related things together; separating dissimilar or unrelated things; and arranging the resulting groups in a logical and helpful sequence

2.6 classification scheme
schedule (2.49) of concepts (2.11) and pre-coordinated combinations of concepts (2.11), arranged by classification (2.5)
NOTE A classification scheme often also includes an index.

2.7 coined term
new term (2.61) created to express a concept (2.11) for which no suitable term (2.61) exists in the required language
NOTE For a further explanation and examples, see 6.6.5 and 9.3.3.3.

2.8 compound equivalence
relationship or mapping in which one term (2.61) or concept (2.11) in one context is represented by two or more terms (2.61) or concepts (2.11) in another

2.9 compound term
term (2.61) that can be split morphologically into separate components
EXAMPLE
In English: “copper mines” can be split into “copper” and “mines”; “lawnmowers” can be split into “lawns” and “mowers”
In French: “mine de cuivre” can be split into “mine” and “cuivre”; “biodiversité” can be split into “biologie” and “diversité”
NOTE Compound terms can be multi-word terms, or can consist of only one word.

2.10 computer application
computer program or set of programs that provides high-level processing related to a specific user need
NOTE In ISO 25964, a computer application is sometimes referred to as an application.

2.11 concept
unit of thought
NOTE Concepts can often be expressed in a variety of different w