June 2005
DEUTSCHE NORM
DIN EN 14603
Normenausschuss Informationstechnik (NI) im DIN
ICS 35.040
Price group 14
Supersedes DIN 66009:1977-09

Informationstechnik - Alphanumerischer Bildzeichensatz für optische Zeichenerkennung OCR-B - Formen und Abmessungen des gedruckten Bildes; Englische Fassung EN 14603:2004
Information technology - Alphanumeric glyph image set for optical character recognition OCR-B - Shapes and dimensions of the printed image; English version EN 14603:2004
Technologies de l'information - Jeu d'images de glyphe alphanumérique pour la reconnaissance optique de caractères OCR-B - Formes et dimensions de l'image imprimée; Version anglaise EN 14603:2004

© DIN Deutsches Institut für Normung e.V. Reproduction of any kind, even in part, is permitted only with the approval of DIN Deutsches Institut für Normung e. V., Berlin.
Exclusive sale of standards through Beuth Verlag GmbH, 10772 Berlin
www.din.de
www.beuth.de
Total length: 35 pages

DIN EN 14603:2005-06

National foreword
This standard is based on work by CEN/TC 304 "ICT European Localisation Requirements". The national mirror committee is NI-29.01 "Codierte Zeichensätze" (Coded character sets) of the Normenausschuss Informationstechnik (NI) im DIN.
In its Resolution 1/2004, the DIN Presidial Board determined that, in justified exceptional cases, the principle laid down in the CEN/CENELEC rules for European standardization work, under which European Standards are published in the three official languages German, English and French, may be departed from and the German language version dispensed with. At the request of the NI, DIN management granted this approval for the present standard as a case-by-case decision, in accordance with its criteria laid down in Annex 1 to DIN Circular A 5/2004, particularly since the documents underlying this standard, which are predominantly in English, are already applied by German market participants as well.

Amendments
The following changes have been made with respect to DIN 66009:1977-09:
a) the character repertoire has been normatively extended by the Euro sign;
b) the character repertoire has been supplemented with special letters of European languages (informative).

Previous editions
DIN 66009: 1977-09

EUROPEAN STANDARD
NORME EUROPÉENNE
EUROPÄISCHE NORM
EN 14603
December 2004

English version
Information technology - Alphanumeric glyph image set for optical character recognition OCR-B - Shapes and dimensions of the printed image
Technologies de l'information - Jeu d'images de glyphe alphanumérique pour la reconnaissance optique de caractères OCR-B - Formes et dimensions de l'image imprimée
Informationstechnik - Alphanumerischer Bildzeichensatz für optische Zeichenerkennung OCR-B - Formen und Abmessungen des gedruckten Bildes

This European Standard was approved by CEN on 17 June 2004.
CEN members are bound to comply with the CEN/CENELEC Internal Regulations which stipulate the conditions for giving this European Standard the status of a national standard without any alteration. Up-to-date lists and bibliographical references concerning such national standards may be obtained on application to the Central Secretariat or to any CEN member.
This European Standard exists in three official versions (English, French, German). A version in any other language made by translation under the responsibility
of a CEN member into its own language and notified to the Central Secretariat has the same status as the official versions.
CEN members are the national standards bodies of Austria, Belgium, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Slovakia, Slovenia, Spain, Sweden, Switzerland and United Kingdom.

EUROPEAN COMMITTEE FOR STANDARDIZATION
COMITÉ EUROPÉEN DE NORMALISATION
EUROPÄISCHES KOMITEE FÜR NORMUNG
Management Centre: rue de Stassart, 36 B-1050 Brussels
© 2004 CEN All rights of exploitation in any form and by any means reserved worldwide for CEN national Members.
Ref. No. EN 14603:2004 E

EN 14603:2005 (D)

Foreword
This document (EN 14603:2004) has been prepared by Technical Committee CEN/TC 304, "Information and communication technologies - European localization requirements", the secretariat of which is held by SIS.
This European Standard shall be given the status of a national standard, either by publication of an identical text or by endorsement, at the latest by June 2005, and conflicting national standards shall be withdrawn at the latest by June 2005.
The document is based on the International Standard ISO 1073/II, Alphanumeric character set for optical recognition - Part II: Character set OCR-B - Shapes and dimensions of the printed image.
According to the CEN/CENELEC Internal Regulations, the national standards organizations of the following countries are
bound to implement this European Standard: Austria, Belgium, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Slovakia, Slovenia, Spain, Sweden, Switzerland and United Kingdom.

EN 14603:2004 (E)

Contents
1 Scope
2 Conformance
3 Normative references
4 Terms and definitions
5 Coding in OCR applications
6 OCR-B styles
7 OCR-B sizes
8 Typical dimensions of the nominal printed image
9 OCR-B glyph image set
9.1 Subset 1: Minimal alphanumeric subset
9.2 Subset 2: Basic alphanumeric subset
9.3 Subset 3: Extended alphanumeric subset
9.4 Subset 4: Options subset
10 Index table
10.1 Availability of glyph images
10.2 Identification of drawings
10.3 Application considerations
11 Use of diacritical marks
11.1 Diacritical mark repertoire
11.2 Composite glyph images
11.3 Rules for glyph image combinations
12 Use of the LOW LINE glyph
13 SPACE
14 Glyph image shape definitions
14.1 Reference drawings
14.2 Availability of duplicates
14.3 Type dimensions
14.4 Constant-strokewidth font, size I
14.5 Constant-strokewidth font, size III
14.6 Constant-strokewidth font, size IV
14.7 Letterpress font, size I
15 Printing the letterpress and constant-strokewidth fonts
16 Illustration of OCR-B
Annex A (normative) Definition of Euro sign glyph image (ISO/IEC 9541-3 syntax)
Annex B (informative) Main differences between ISO 1073/II-1976 and this European Standard
Annex C (informative) Notes on the implementation of OCR-B
Annex D (informative) Glyph-repertoire extension needs identified in JTC 1/SC 2 revision process
Annex E (informative) Illustrations of reference drawings
Annex F (informative) Availability of reference drawings
Bibliography

Introduction
Optical Character Recognition technology, OCR, came into use in the 1960s, and some specialized OCR fonts were designed at the time. In 1976 two such fonts were formally standardized by ISO, designated OCR-A and OCR-B, in the standard ISO 1073 parts I and II, respectively.
ISO 1073 was developed by the ISO Technical Committee ISO/TC97, Computers and information processing. At the creation of ISO/IEC JTC 1, responsibility for ISO 1073 was transferred to JTC 1/SC 2, Coded character sets.
In order to enlarge the set of characters covered by the standard, especially with special letters used in European-origin languages, a revision of the standard was initiated in 1994 by JTC 1/SC 2, and progressed through three consecutive Committee Drafts. However, since testing of the proposed character set extensions could not be accomplished, the JTC 1/SC 2 revision was discontinued in 1999.
With the introduction of the Euro sign a need, primarily European, to add that character to the OCR-B set was recognized. CEN/TC304 therefore decided to develop an OCR-B glyph image shape for the character, verify its recognition properties, and include it in a European version of the OCR-B standard; see CEN/TC304 reports referenced in the Bibliography. The decided-on glyph image shape is specified in Annex A.
For reasons of continuity, and also to facilitate possible future CEN - ISO/IEC cooperation on OCR-B, it was decided to use the current
ISO text with only the necessary minimum of changes as a basis for the CEN standard, even though the ISO text was developed in an OCR-technology situation rather different from the one existing when this CEN standard is published. In particular, the ISO standard text's division into clauses was kept as far as possible, although some restructuring might have been desirable.
A description of the main differences between this European Standard and ISO 1073/II is given in Annex B. General information on the implementation of the OCR-B shapes, taken from ISO 1073/II, has been included in Annex C.
In connection with the verification of the recognition properties of the Euro sign, some limited verification was also done on special letters identified during the JTC 1/SC 2 revision work as needed in OCR-B. The extent of this verification is not sufficient for the inclusion of the letters in the OCR-B repertoire at present, but the issue is described in Annex D, as a basis for possible future inclusion work.

1 Scope
This European Standard defines a set of glyph images designated OCR-B, intended primarily for use in Optical Character Recognition (OCR) applications, but suitable also for visual, i.e. human, reading. It does not relate any coding scheme with these images (see clause 5).
This European Standard is based on the ISO standard 1073 part II. It differs from that standard in extending normatively the set of glyph images with the Euro currency sign; but also in deleting some glyphs not relevant in present-day OCR processing. It further adds information on a number of glyph images corresponding to characters specific to some European-origin languages.
NOTE In ISO 1073 Part II the term "character" is used not only in its strict sense, but also to mean the printed images used for their visual, i.e. printed, representations. In this European Standard the term "glyph image" is used in the latter sense.
This European Standard contains information on nominal dimensions for the glyph images. Tolerances, printing quality and other characteristics of the formats needed to satisfy interchange requirements are covered in other standards (see clause 3).
The glyph image set contains 117 glyph images comprising digits, capital and small letters, diacritical marks, and symbols. It also contains a definition for SPACE.
The diacritical marks are designed for combination with small letters to produce composite glyph images complementing the basic image repertoire.

2 Conformance
A printing or OCR reading device is in conformance with this standard if it can generate/recognize, for either or both of the defined styles (see clause 6) and in one or more of the specified sizes (see clause 7), all or part of the specified glyph image subsets (see clause 9).
A claim of conformance shall specify all the images in (each of) the style(s) and size(s) generated/recognized. Such a specification shall take the form of a reference to one of the subsets, a list of the images generated/recognized,
or a combination of those.
Additionally, a printing or OCR reading device must claim conformance to International Standard ISO 1831 (see clause 3).
Printed images produced by an OCR-B printing device are in conformance with this standard if their nominal shapes and dimensions are in accordance with their respective reference drawing(s) and, in the case of the Euro sign glyph image, with Annex A (see clause 14); with the claimed conformance to tolerances and printing quality factors specified in standard ISO 1831 considered.

3 Normative references
This European Standard incorporates, by dated or undated reference, provisions from other publications. These normative references are cited at the appropriate places in the text and the publications are listed hereafter. For dated references, subsequent amendments to or revisions of any of these publications apply to this European Standard only when incorporated in it by amendment or revision. For undated references the latest edition of the publication referred to applies.
ISO 1831-1980, Printing specifications for optical character recognition.
ISO/IEC 9541-3:1994, Information technology - Font information interchange - Part 3: Glyph shape representation.
OCR-B character reference drawings and glyph definition (see clause 14).

4 Terms and definitions
For the purposes of this European Standard, the following terms and definitions apply:

4.1
character
a member of a set of elements used for the organisation, control or representation of data.

4.2
coded character set
a set of characters, defined by unambiguous rules that establish the character set and the relationship between the characters of the set and their coded representations.

4.3
composite glyph image
an image printed on paper or any other medium intended for OCR applications, obtained by superimposing two or more glyph images on the same area.

4.4
glyph
a recognizable abstract graphic symbol which is independent of any specific design.

4.5
glyph image
an image of a glyph, as obtained from a glyph representation printed on paper or any other medium intended for OCR applications.

NOTE The definition above of "coded character set" differs slightly from definitions in ISO/IEC standards, and the definition of "glyph image" is more limited. The definition of "composite glyph image" is specific to this standard (at the time of its publication).

5 Coding in OCR applications
This standard defines a set of glyph images, but does not specify corresponding characters, and relates no coding with the images. The images have been named as far as possible in the same way as the characters with corresponding glyphs in the ISO/IEC standard 10646-1 (see Bibliography), but this does not imply any normative association between the OCR-B glyph images according to this European Standard and the characters of either ISO/IEC 10646-1 or any other standard for coded character sets.
Printing and/or OCR applications based on this European Standard must therefore define, through reference to other standards or otherwise, the set of glyph images which is available for printing and/or shall be recognized, and for each image the corresponding character and its coding.

6 OCR-B styles
The OCR-B glyph images are defined by this standard in two different styles.
The "constant-strokewidth" style is intended primarily for printer equipment in which the width of the strokes of the images is less controllable. This is for instance the case for some types of mechanical printers.
The "letterpress" style is intended for printing equipment which can reproduce fine details with high accuracy. For aesthetic reasons, the strokewidths of the letterpress images are varied deliberately, and the stroke endings are specially designed.
The shapes of the glyph images for the two styles are specified (with the exception of the Euro sign glyph) by