ANSI INCITS TR-11-1992
(formerly ANSI X3/TR-11-1992)

Information Processing Systems
Technical Report

Information Resource Dictionary System (IRDS)
Support for Naming Convention Verification (NCV)

Copyright American National Standards Institute. Provided by IHS under license with ANSI. Not for resale. No reproduction or networking permitted without license from IHS.

X3's Technical Report Series

This Technical Report is one in a series produced by the American National Standards Committee, X3, Information Processing Systems. The Secretariat for X3 is held by the Computer and Business Equipment Manufacturers Association (CBEMA), 1250 Eye Street NW, Suite 200, Washington, DC 20005.

As a by-product of the standards development process and the resources of knowledge devoted to it, X3 from time to time produces Technical Reports. Such Technical Reports are not standards, nor are they intended to be used
as such. X3 Technical Reports are produced in some cases to disseminate the technical and logical concepts reflected in standards already published or under development. In other cases, they derive from studies in areas where it is found premature to develop a standard due to a still-changing technology, or inappropriate to develop a rigorous standard due to the existence of a number of viable options, the choice of which depends on the user's particular requirements. These Technical Reports, thus, provide guidelines, the use of which can result in greater consistency and coherence of information processing systems.

When the draft Technical Report is completed, the Technical Committee approval process is the same as for a draft standard. Processing by X3 is also similar to that for a draft standard.

Published by
American National Standards Institute
11 West 42nd Street, New York, New York 10036

Copyright © 1992 by American National Standards Institute
All rights reserved.

No part of this publication may be reproduced in any form, in an electronic retrieval system or otherwise, without prior written permission of the publisher.

Printed in the United States of America

APSI C992/60
Foreword

At the time it approved this Technical Report, the Technical Committee X3H4 on Information Resource Dictionary System had the following members:

Anthony J. Winkler, Chair
Bruce Bargmeyer
John Bestwick
Roger Burkhart
Chi Chen
Twyla Courtot
Edd Cutway
Richard Desmond
Alan Goldfine
Beverly Hacker
Mark Jones
Douglas Mann
Dana Marks
Sandra Perez
Mike Reynolds
Cliff Sundberg
Manoo Urs
Mel Bing (Alt.)
Jim Fulton (Alt.)
Bob Hodges (Alt.)
Julia McCreary (Alt.)
Judith Newton (Alt.)
Burt Parker (Alt.)
Woody Pidcock (Alt.)
Jim Pipher (Alt.)
Mohan Prabandham (Alt.)
Gary Rokey (Alt.)
Anthony Sarris (Alt.)
John Sowa (Alt.)

Task Group X3H4.4, IRDS Administration, had the following members:

Judith Newton, NIST, Chair
Twyla Courtot, MITRE Corp., Vice-Chair

file or record names were checked for adherence to certain format or syntax. Because of space limitations and simplistic name validation mechanisms, the need for sophisticated naming conventions was minimal. Systems were built by developers as stand-alone applications
with little or no direct access by end users.

Technological advances have moved users closer to development. The 80-column card has given way to tapes and disks with more space for data representation. Hardware architectures and capabilities have improved performance, throughput, and storage capability. Distribution of applications and data across multiple homogeneous or heterogeneous platforms and locations, coupled with these enhanced capabilities, has made familiar problems worse and created new ones in identifying and locating data. Now the same data structure is used in local and distributed systems. New constraints are placed on identifying redundant data and ensuring that semantic properties are preserved.

Organizations have increased in size and the amount of data being processed has astronomically increased. To support this automation and distribution there is now a requirement to understand and integrate systems within and across organizations. The need for standard names becomes apparent, and naming convention methodologies have been developed to assist in unambiguously identifying data.

Within organizations today there is increased communication and use of automated data. Organizations now manage information as a corporate resource just like the application systems that process the data. The need for data sharing and management has been institutionally recognized. Data sharing occurs both horizontally and vertically within an organization and extends outside the organization as well. The importance and problem of communicating and understanding data with the semantics intact has received
much attention. Data naming and name verification can assist significantly with the preservation of semantic data integrity.

To

2. Sets, and membership in sets, of the name or any of its components (lexical, semantic, and syntactical); and,

3. Constraints on the size of the name or any of its components (lexical).

As indicated, these categories of naming characteristics concern lexical, semantic, and syntactical rules. Each of the categories may contain one or more specific rule types. Specific rule types associated with each category are provided below. Each rule type may have one or more converse rules. A rule and its converse are grouped together in the list below for clarity. Converse rules should be treated as independent rule types. The categories and rule types associated with them are not mutually exclusive. When a naming convention is developed, the various rules are specified in some combination that is not logically contradictory.

Category 1: Positioning

Type 1a: A word or symbol is required to be placed in a specific relative or absolute position, order, or sequence within the name, e.g., "a keyword or symbol must always appear in the 1st position of the name", "the object or noun of the phrase must always exist and must be positioned immediately preceding the end-of-phrase delimiter", or "the component must begin with an alphabetic character or national special character, e.g., "$", and be immediately followed by an alphabetic character".

Type 1b: A word or symbol is not required to be placed in a specific relative or absolute position, order, or sequence within the name.

Type 1c: A space or designated symbols may be required to separate or delimit specified components of a name, e.g., "an asterisk will precede the keyword, hyphens will separate words of a term, all other words and designated symbol sets will be separated by an underline".

Type 1d: No separators or delimiters are specified, default is a space between all words or designated components.

Type 1e: The relationship of the name to one or more other entities in the metadata structure may require incorporation of some form of parent entity or configuration/version identification, e.g., record code included as a component of each of its contained field names.

Category 2: Sets

Type 2a: A name, word, or symbol is required to be a member of a specific, designated set(s), e.g., must match keyword list, or must be a specified connector symbol.

Type 2b: A name, word, or symbol is required not to be a member of a specific, designated set(s), e.g., "stop" list, "dirty word" list, or "unique name" list.

Type 2c: A name, word, or symbol is not required to be a member of a specific, designated set(s).

Type 2d: Set membership is controlled, e.g., a keyword set is established for selected name components and changes to the set must be approved by an individual specified by name or position.

Type 2e: Set membership is not controlled.

Type 2f: A component of a name may be restricted to words of a specific form or part of speech, e.g., no plural nouns, or no articles.

Type 2g: A component of a name may be any word of a language.

Type 2h: A component may be a member of one and only one component type set.

Type 2i: A component may be a member of more than one component type set.

Category 3: Size

Type 3a: The name, word, or symbol set contains a specified minimum, constant, or maximum number of symbols, e.g., "a name may be no more than 30 characters", or "a mnemonic name or code must always be 6 characters long".

Type 3b: The name, word, or symbol set contains no specified minimum, constant, or maximum number of symbols.

Type 3c: A shortened form of one or more word components of a name may be required to be substituted for the original word(s). These may be in the form of an abbreviation, contraction, truncation, or acronym. Rules or algorithms may be specified for the purpose of name and component shortening.

Type 3d: There is no requirement for shortening a name or word component.

A discussion of name verification and validation as related to these categories and rules is provided in Annex C.
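The positioning, set, and size categories can be illustrated with a small sketch. It is not part of this Technical Report: the convention it encodes is hypothetical (underscore-separated words, a class word in the last position, a stop list, and a 30-character limit), and the names CLASS_WORDS, STOP_WORDS, MAX_LENGTH, Rule, and verify_name are assumptions introduced only for illustration. It shows one way externally defined rules of Types 1a, 1c, 2a, 2b, and 3a might be combined and applied to a candidate name.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical convention, invented for illustration only:
#   words are separated by underscores            (Type 1c)
#   the first word must begin with a letter       (Type 1a)
#   the last word must be a class word            (Types 1a and 2a)
#   no word may appear on the stop list           (Type 2b)
#   the whole name is at most 30 characters long  (Type 3a)
CLASS_WORDS = {"CODE", "DATE", "NAME", "AMOUNT"}
STOP_WORDS = {"THE", "OF", "A"}
MAX_LENGTH = 30


@dataclass(frozen=True)
class Rule:
    rule_type: str                             # rule type label, e.g. "2b"
    description: str                           # human-readable statement of the rule
    check: Callable[[List[str], str], bool]    # returns True when the rule is satisfied


# The rules are data, not code paths, so a different convention can be
# expressed by supplying a different list.
RULES = [
    Rule("1a", "first word begins with a letter",
         lambda words, name: bool(words) and words[0][:1].isalpha()),
    Rule("1a+2a", "last word is a class word",
         lambda words, name: bool(words) and words[-1] in CLASS_WORDS),
    Rule("2b", "no word is on the stop list",
         lambda words, name: not any(w in STOP_WORDS for w in words)),
    Rule("3a", "name has at most 30 characters",
         lambda words, name: len(name) <= MAX_LENGTH),
]


def verify_name(name: str) -> List[str]:
    """Return a description of every rule that `name` violates."""
    words = name.upper().split("_")  # Type 1c: underscore is the delimiter
    return [f"Type {r.rule_type}: {r.description}"
            for r in RULES if not r.check(words, name)]
```

Under these assumed rules, verify_name("EMPLOYEE_BIRTH_DATE") returns an empty list, while verify_name("DATE_OF_THE_EMPLOYEE") reports two violations: the last word is not a class word, and stop-list words are present.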
4.0 NCVR Features

Features required to be provided and supported by the NCVM are listed below. These features were derived from the analysis done to identify naming convention paradigms, task group experience in naming, naming verification support currently available in name verification software, and interaction with users. User features and requirements are documented more fully in Annex F.

For verification, these features have been mapped to the requirements listed in clause 1.3. Some features support more than one requirement, and some requirements are supported by more than one feature. The requirements that relate to each feature are provided in the parentheses at the end of each numbered feature.

4.1 Required Features

1. Specific naming conventions and name verification rules shall be external to the NCVM. The NCVM shall support the definition and maintenance of naming convention rules and the verification of IRDS names according to the rules defined by an organization. The NCVM shall not be limited to a predefined set of naming convention or name verification rules. (Requirement 2).

2. The NCVM shall be able to detect and reject inconsistencies between the naming rules defined to it. (Requirement 14).

3. The NCVM shall verify correct names and identify nonstandard names, abbreviations, terms, etc. either for names existing in
the IRD or for name(s) entered by a user. (Requirement 1).

4. The NCVM shall assist in the generation of allowable names from a given or proposed name based on a set of consistent rules. (Requirements 4, 8, 15).

   Standard names, including access and descriptive names, shall be suggested by the NCVM when a proposed standard name is determined to be incorrect according to the rules described to the NCVM for standard names. Alternate names shall be generated from standard names where the alternate name is based on a set of naming rules different from those used to verify the standard name. Alternate names can include programming names (e.g., COBOL names) as well as other more user-oriented names.

5. The NCVM shall analyze names based on content of components and relative and absolute format arrangements. It shall also analyze the semantic content of connectors. The NCVM shall associate word types within a given context. (Requirements 5, 6, 7, 8, 9).

   Since naming convention paradigms separate names into parts and assign meaning to those parts, the NCVM shall support the verification of names based on components and words arranged in a particular order for various contexts. This is the lowest level of semantic analysis necessary to verify relationships between name components. The NCVM shall identify components by both absolute and relative position (context) in a name.

6. The NCVM shall provide thesaurus capability to support name generation and semantic identification. (Requirements 12, 13, and 15).

7. The NCVM shall support synonym identification of name components. This shall include non-exact matches that incorporate identical terms presented in different order. (Requirements 12, 13).

   The NCVM shall identify and maintain synonyms and near-synonyms used in components and words in the given name structure.

8. The NCVM shall allow for different rules for different object types and life cycle phases. Thus, it shall support various methods of naming across different data types. (Requirement 4).

9. The NCVM shall support rule maintenance. (Requirement 3).

   The NCVM shall support adding, changing, and deleting of rules according to the security and permissions established by the user organization.

11. The NCVM shall provide the capability to enter names directly to the dictionary and check for duplicates from the dictionary. (Requirement 1).

12. The NCVM shall prohibit entry of duplicate names within an object type. (Requirement 1).

13. The NCVM shall provide for lexical, semantic, and syntactic checks. (Requirement 5).

14. The NCVM shall have the capability to automatically insert abbreviations or acronyms, or automatically insert fully spelled out words or terms from abbreviations or acronyms. (Requirement 11).

15. The NCVM shall have the capability to tailor display formats for metadata relevant to the organization and support the application of the naming convention. (Requirement 10).

4.2 Possible Additional Capabilities

Capabilities that should be considered for the NCVM to support, but not required to provide, are those that support additional user interface, browse, data thesaurus, security, and name analysis. The capabilities identified are given below, grouped by the capability area:

User Interface:

• Provide interactive and batch modes for NCVM functions
• P