BRITISH STANDARD
BS 5464-3:1983
ISO 2033:1983

Optical character recognition

Part 3: Method of coding machine readable characters (MICR and OCR) for information processing

UDC 681.327.12:681.3.04+003:681.327.63

This British Standard, having been prepared under the direction of the Office and Information Standards Committee, was published under the authority of the Board of BSI and comes into effect on 30 December 1983.

BSI 12-1999

First published as BS 4869-2 (and subsequently amended to BS 5464-3) September 1972
First revision December 1983

The following BSI references relate to the work on this standard:
Committee reference OIS/2
Draft for comment 82/61070 DC

ISBN 0 580 13510 1

Committees responsible for this British Standard

The preparation of this British Standard was entrusted by the Office and Information Standards Committee (OIS/-) to Technical Committee OIS/2 upon which the following bodies were represented:

British Federation of Printing Machinery and Supplies Ltd.
British Radio and Electronic Equipment Manufacturers Association
British Telecom
Business Equipment Trade Association
Computing Services Association
Independent Broadcasting Authority
Institution of Electrical Engineers
National Computing Centre Ltd.
Pira (The Research Association for the Paper and Board, Printing and Packaging Industries)
Coopted experts

Amendments issued since publication

Amd. No.    Date of issue    Comments

Contents

Committees responsible
National foreword
1 Scope
2 Field of application
3 References
4 Coding
5 General considerations
6 Font CMC7
7 Font OCR-A
8 Font OCR-B
9 Font E13B
Annex Main differences between ISO 2033:1972 and the present edition
Table 1 8-bit coding of the characters of all fonts
Table 2 7-bit coding of the characters of the CMC7 font
Table 3 7-bit coding of the characters of the OCR-A font
Table 4 7-bit coding of the characters of the OCR-B font
Table 5 7-bit coding of the characters of the E13B font
Table 6 Characters of the CMC7 font and their assignment to positions in the code tables
Table 7 Characters of the OCR-A font and their assignment to positions in the code tables
Table 8 Characters of the OCR-B font and their assignment to positions in the code tables
Table 9 Characters of the E13B font and their assignment to positions in the code tables
Publications referred to

National foreword

This British Standard has been prepared under the direction of the Office and Information Standards Committee and is identical with ISO 2033:1983 “Information processing - Coding of machine readable characters (MICR and OCR)” published by the International Organization for Standardization (ISO).

This British Standard supersedes BS 5464-3, first published as BS 4869-2 in 1972 and amended in 1977, which is withdrawn.

Terminology and conventions. The text of the International Standard has been approved as suitable for publication as a British Standard without deviation. Some terminology and certain conventions are not identical with those used in British Standards; attention is drawn especially to the following.

Wherever the words “International Standard” appear, referring to this standard, they should be read as “British Standard”.
The forms of expression used throughout this standard are not consistent with those normally used for British Standards. In particular, wherever the auxiliary verb “will” is used (for example, in 5.2: “. . . will be coded”) this should be read as the auxiliary verb “shall” (for example: “shall be coded”) denoting a requirement.

The Technical Committee has reviewed the provisions of ISO 646:1983 and ISO 2022:1982, to which reference is made in the text, and has decided that they are acceptable for use in conjunction with this standard.

Additional information. A revision of BS 4730 is in course of preparation to implement ISO 646:1983 for use in the UK. The first edition of ISO 2022, published in 1973, is technically equivalent to BS 4953. It has been decided to withdraw BS 4953, pending further revision of ISO 2022:1982, and it is intended to publish a new British Standard when the next edition of ISO 2022 has been finalized.

A British Standard does not purport to include all the necessary provisions of a contract. Users of British Standards are responsible for their correct application.

Compliance with a British Standard does not of itself confer immunity from legal obligations.

Cross-references

International Standard    Corresponding British Standard
ISO 1004:1977             BS 4810:1980 Specification for print for magnetic ink character recognition (Identical)
                          BS 5464 Specification for optical character recognition
ISO 1073-I:1976           Part 1:1977 Character set OCR-A. Shapes and dimensions of the printed image (Identical)
ISO 1073-II:1976          Part 2:1977 Character set OCR-B. Shapes and dimensions of the printed image (Identical)

Summary of pages

This document comprises a front cover, an inside front cover, pages i and ii, pages 1 to 12, an inside back cover and a back cover.

This standard has been updated (see copyright date) and may have had amendments incorporated. This will be indicated in the amendment table on the inside front cover.

1 Scope

This International Standard defines the coded representation of printed characters recognized by reading equipment. It includes the fonts:

E13B as covered in ISO 1004
CMC7 as covered in ISO 1004
OCR-A as covered in ISO 1073-1
OCR-B as covered in ISO 1073-2

2 Field of application

This International Standard assigns bit-patterns to characters recognized by reading equipment. This coded information generated by the reading equipment is given to the recipient by different media, such as magnetic tape, by data transmission or a direct link. This coded representation can also be used by printing devices to print the information which shall later be read. It is not intended for general information interchange.

Two different applications are considered:

Single-font reader: The reading equipment is only capable of recognizing one font at a time.
Multiple-font reader: The reading equipment is capable of recognizing multiple fonts at the same time.

3 References

ISO 646, Information processing - 7-bit coded character set for information interchange.
ISO 1004, Information processing - Magnetic ink character recognition - Print specifications.
ISO 1073, Alphanumeric character sets for optical recognition
    Part 1: Character set OCR-A - Shapes and dimensions of the printed image
    Part 2: Character set OCR-B - Shapes and dimensions of the printed image.
ISO 2022, Information processing - ISO 7-bit and 8-bit coded character sets - Code extension techniques.

4 Coding

The coding given in this International Standard is based on the 7-bit code described in ISO 646 and on its extension to 8 bits according to ISO 2022. The empty positions in code Table 1 to Table 5 are reserved for future standardization. This International Standard does not define the character set to be read by the reading equipment.

Two codings are shown. The 8-bit coding is primarily intended for use with multi-font readers in the case where the 7-bit coding is not sufficient to represent the needed characters.

Independent of the coding shown in this International Standard, the code extension techniques given in ISO 2022 are applicable, i.e. the 7-bit coding shown in this International Standard may be transformed into 8-bit coding and the 8-bit coding shown in this International Standard may be transformed into 7-bit coding according to the rules of ISO 2022. Furthermore, the characters of columns 10 to 15 can equally be designated as a G1, G2 or G3 set.

References to positions of Table 1 to Table 5 in Table 6 to Table 9 are given by the notation “column number/row number”. The column numbers for 7-bit coding consist of one digit and those for 8-bit coding consist of two digits.

The notations b1 to b8 refer to the 7 bits or 8 bits of the coding, where b1 is the low-order bit.

Example

Capital letter F is shown in position 4/6 of the 7-bit table and 04/6 of the 8-bit table. This corresponds to bit pattern 1000110 and 01000110 respectively.

4.1 7-bit coding

The 7-bit coding can be used whenever the number of characters shown is sufficient for the application. This coding can also be used within an 8-bit environment by adding an eighth bit with the value 0, as defined in ISO 2022.

4.2 8-bit coding

The 8-bit coding can be used whenever the number of characters in a 7-bit table is insufficient for the application. The 8-bit coding can also be used within a 7-bit environment as defined in ISO 2022.

Table 1 8-bit coding of the characters of all fonts
NOTE The empty positions are reserved for future standardization.

Table 2 7-bit coding of the characters of the CMC7 font
NOTE The empty positions are reserved for future standardization.

Table 3 7-bit coding of the characters of the OCR-A font
NOTE The empty positions are reserved for future standardization.

Table 4 7-bit coding of the characters of the OCR-B font
NOTE The empty positions are reserved for future standardization.

Table 5 7-bit coding of the characters of the E13B font
NOTE The empty positions
are reserved for future standardization.

5 General considerations

5.1 End of line

If the information read by the equipment is structured in lines and if this structure should be maintained, two possibilities are given:

- if the information is handled on a basis of records, then every line will form one record;
- if the information is handled character-by-character (data stream), the end of a line will be coded by means of control characters CR and LF.

5.2 Characters in error

If a character is recognized but cannot be identified as one character of the character set, control character SUB will be coded.

6 Font CMC7

The characters of the CMC7 font will be assigned to the positions of the code tables as specified in Table 6 (see also Table 1 and Table 2).

Table 6 Characters of the CMC7 font and their assignment to positions in the code tables

Characters of CMC7                          8-bit code       7-bit code
Digits 0 to 9                               03/0 to 03/9     3/0 to 3/9
Capital letters A to Z                      04/1 to 05/10    4/1 to 5/10
Symbol SI                                   10/10            3/10
Symbol SII                                  10/11            3/11
Symbol SIII                                 10/12            3/12
Symbol SIV                                  10/13            3/13
Symbol SV                                   10/14            3/14

7 Font OCR-A

The characters of the OCR-A font will be assigned to the positions of the code tables as specified in Table 7 (see also Table 1 and Table 3).

Table 7 Characters of the OCR-A font and their assignment to positions in the code tables

Characters of OCR-A                         8-bit code       7-bit code
Digits 0 to 9                               03/0 to 03/9     3/0 to 3/9
Capital letters A to Z                      04/1 to 05/10    4/1 to 5/10
Full stop (period) [a]                      02/14            2/14
Comma [a]                                   02/12            2/12
Equals sign                                 03/13            3/13
Plus sign                                   02/11            2/11
Hyphen, minus sign [a]                      02/13            2/13
Solidus                                     02/15            2/15
Asterisk                                    02/10            2/10
Abstract symbol H1 (hook)                   11/12            3/12
Abstract symbol H2 (fork) [a]               11/13            5/13
Abstract symbol H3 (chair)                  11/14            3/14
Abstract symbol H4 (long vertical mark)     07/12            7/12
Character erase [a]                         07/15            7/15
Group erase [a]                             07/15            7/15
Capital letter [glyph missing]              14/0
Capital letter [glyph missing]              14/2
Capital letter [glyph missing]              14/1
Capital letter [glyph missing]              14/8
Capital letter [glyph missing]              14/9
Capital letter [glyph missing]              14/10
Capital letter [glyph missing]              14/13
Colon                                       03/10            3/10
Semicolon                                   03/11            3/11
Question mark [a]                           03/15            3/15
Quotation mark                              02/2             2/2
Apostrophe [a]                              02/7             2/7
Left curly bracket                          07/11            2/8
Right curly bracket                         07/13            2/9
Percent sign                                02/5             2/5
Ampersand                                   02/6             2/6
Dollar sign                                 10/4             2/4
Pound sign                                  10/3             2/3
Yen sign                                    10/5

[a] See special description in the following clauses.

7.1 Erase characters

Erase characters will normally be ignored by the reading equipment. If there is a requirement to code these characters, control character DEL shall be coded. For the character Group Erase one or more DEL may be coded.

7.2 Alternative characters

If there are alternative representations of one character, all of them will be assigned to the same position. This applies to characters Full stop, Comma, Hyphen, Question mark and Apostrophe.

7.3 Abstract symbol H2 (fork)

In general, this symbol is not used in conjunction with the alphabetic characters because of its potential interference with the letter Y. However, when it is possible to recognize both characters correctly, the indicated coding shall be used.

8 Font OCR-B

The characters of the OCR-B font will be assigned to the positions of the code tables as specified in Table 8 (see also Table 1 and Table 4).

Table 8 Characters of the OCR-B font and their assignment to positions in the code tables

Characters of OCR-B (number according to ISO 1073)     8-bit code       7-bit code
1 to 10    Digits 0 to 9                               03/0 to 03/9     3/0 to 3/9
11 to 36   Capital letters A to Z                      04/1 to 05/10    4/1 to 5/10
37 to 62   Small letters a to z                        06/1 to 07/10    6/1 to 7/10
63         Asterisk                                    02/10            2/10
64         Plus sign                                   02/11            2/11
65         Hyphen (minus sign)                         02/13            2/13
66         Equals sign                                 03/13            3/13
67         Solidus                                     02/15            2/15
68         Full stop (period)                          02/14            2/14
69         Comma                                       02/12            2/12
70         Colon                                       03/10            3/10
71         Semicolon                                   03/11            3/11
72         Quotation mark                              02/2             2/2
73         Apostrophe                                  02/7             2/7
74         Discontinuous underline [a]                 05/15            5/15
75         Question mark                               03/15            3/15
76         Exclamation mark                            02/1             2/1
77         Left parenthesis                            02/8             2/8
78         Right parenthesis                           02/9             2/9
79         Less than sign                              03/12            3/12
80         Greater than sign                           03/14            3/14
81         Left square bracket                         05/11            5/11
82         Right square bracket                        05/13            5/13
83         Percent sign                                02/5             2/5
84         Number sign                                 02/3             2/3

[a] See special description in the following clauses.

8.1 Underline characters

Two characters are provided for underlining: Discontinuous underline and Continuous underline. The latter, Continuous underline, is not intended for use in OCR applications. The character Discontinuous underline shall be used in OCR applications as a free-standing character only, and shall not be printed under another character. It shall be coded as indicated.

8.2 Diacritical marks

The characters will be used as a diacritical mark combined with a letter. In general, reading equipment recognizes such a combined character as one single character and will code it as one single character. If such a combined character is required and it is not contained in the code table as a single character, it shall be included in future revisions of this International Standard.

Table 8 Characters of the OCR-B font and their assignment to positions in the code tables (continued)

Characters of OCR-B (numbers according to ISO 1073)    8-bit code       7-bit code
85         Ampersand                                   02/6             2/6
86         Commercial at                               04/0             4/0
87         Upward arrow head                           05/14            5/14
88         Currency sign                               02/4             2/4
89         Pound sign                                  10/3
90         Dollar sign                                 10/4
91         Vertical line [a]                           07/12