ANSI/AIIM MS52-1991

Standard for Information and Image Management - Recommended Practice for the Requirements and Characteristics of Documents Intended for Optical Scanning

Association for Information and Image Management
1100 Wayne Avenue, Suite 1100
Silver Spring, Maryland 20910
Telephone 301/587-8202

This standard describes the physical characteristics of paper documents which facilitate black-and-white optical scanning, and the characteristics which make scanning either difficult or impossible.

Contents

Foreword
1 Scope and purpose
2 Normative references
3 Definitions
4 Physical characteristics of documents
5 Printed documents
6 Typography used in documents
7 Related document page elements
8 Sample testing
9 Summary

Figures
1 Example of vertical (portrait) mode for a 216 mm X 279 mm (8.5 inch X 11.0 inch) document and an A4 document: 210 mm X 297 mm (8.3 inch X 11.7 inch)
2 Example of preferred orientation for a half-size sheet with lines of print parallel to the longer dimension

Foreword

(This foreword is not part of the American National Standard for Information and Image Management - Recommended Practice for the Requirements and Characteristics of Original Documents Intended for Optical Scanning, ANSI/AIIM MS52-1991.)

This standard describes the physical characteristics of original documents which will facilitate their later scanning. These physical characteristics are largely independent of the method used to create the documents, and apply whether a document was created on a manual typewriter, a computer printer, a typesetter, or by hand printing.

This standard also identifies those characteristics which will make scanning difficult or impossible. Moreover, certain characteristics of original documents will, when the documents are scanned, produce either no image or an illegible image. Finally, this standard lists physical characteristics both to use and to avoid.

When the user has no control over the documents to be scanned, the scanner operator can be guided by a list of characteristics for special handling. When an organization is aware that its documents will be scanned into document image systems or read by optical character recognition (OCR) scanners, it should take special care to design documents and forms which will facilitate rather than impede scanning. This is particularly important when original documents will be destroyed after scanning, and the electronic copy will be the only record.

Some of the recommendations in this standard apply to documents which will be scanned with OCR systems; however, this document does not cover all of the complexities of OCR in detail. Various facets of OCR scanning are covered in other American National Standards Institute (ANSI) standards.

Suggestions for improvements of this standard are welcome. They should be sent to the Chair, AIIM Standards Board, Association for Information and Image Management, 1100 Wayne Avenue, Suite 1100, Silver Spring, MD 20910.

At the time it approved this standard, the AIIM Standards Board had the following members:

Marilyn Courtot, Chairman
Thomas C. Bagg
Thomas E. Berney
Loretta D'Agnolo
Bruce A. Holroyd
Don Klosterboer
E. Brien Lewis
Alan S. Linden
Robert C. Novak
Charles A. Plesums
L. Don Prince
George Thoma
Charles F. Touchton
Herbert J. White II

The AIIM Electronic Imaging Committee, C13, processed and approved this standard. The committee had the following members at the time this standard was approved:

Organization Represented: Name of Representative
Polaroid Corporation: Richard Leslie, Chairman
ADVENT Technologies: Mark Chastain
Alabama Department of Archives: Clara Jehle
Blueridge Technologies: John Jamieson
Computer Microfilm Corp.: Don S. Kyser
Consultant: E. Brien Lewis
Eastman Kodak: Shahzad Qazi
FileNet Corporation: Doug Stewart
Genealogical Society of Utah: Eric Erickson
Genealogical Society of Utah: Brent Reber
I-Net, Inc.: William Hooton
Information International, Inc.: Richard Gershbock
Kofax Image Products: Dean Hough
Mobile Oil: Robert Starbird
Moore Business Forms: Delmer H. Oddy
National Institute of Standards and Technology: Thomas C. Bagg
Smithsonian Archives: Alan L. Bain
3M Company: Bruce Evans
U.S. Department of the Army: Joseph G. Hardy
U.S. Department of the ...: Walter A. Orlof
U.S. Library of Congress: Felix Krayeski
U.S. Library of Congress: Basil Manns
U.S. Patent and Trademark Office: Kent Hughes
Wang Laboratories: Phil Eckell

American National Standard for Information and Image Management - Recommended Practice for the Requirements and Characteristics of Documents Intended for Optical Scanning, ANSI/AIIM MS52-1991

1 Scope and purpose

1.1 Scope

This standard describes the physical characteristics of paper documents which facilitate
black-and-white optical scanning, and the characteristics which make scanning either difficult or impossible. It provides general recommendations for the design of documents in order to make those documents easier to scan.

Document processing (storage, retrieval, reproduction) is the focal point for this standard. Its audience is the user of a scanner which captures one data bit (black/white) per picture element (PEL). Pixels and PELS are commonly used abbreviations for the term picture element. Throughout this standard, the term PEL(S) will be used.

This standard does not cover specific scanning applications, such as scanning of checks, scanning of engineering drawings, or scanning of bar codes, which are the subjects of other standards. It also does not address the technical details for OCR, which are the subject of other standards. Moreover, oversized documents and tiling techniques are not specifically addressed in this standard, although many of the same principles apply.

1.2 Purpose

The purpose of this standard is to describe characteristics of paper documents that may be scanned. This standard provides general recommendations for the design of such paper documents.

2 Normative references

All standards are subject to revision. When the following documents are superseded by an approved revision, that revision may apply.

2.1 Referenced American National Standards

ANSI/AIIM MS32-1987, Microrecording of Engineering Source Documents on 35mm Microfilm
ANSI/AIIM MS35-1990, Recommended Practice for the Requirements and Characteristics of Original Documents That May Be Microfilmed
ANSI/AIIM MS44-1988, Recommended Practice for the Quality Control of Image Scanners
ANSI X3.62-1987(R1990), Optical Character Recognition (OCR) - Paper Used in OCR Systems
ANSI X3.86-1980(R1987), Optical Character Recognition (OCR) Inks
ANSI X3.93M-1981(R1989), Optical Character Recognition (OCR) Character Positioning
ANSI X3.99-1983(R1991), Optical Character Recognition (OCR) Print Quality, Guidelines for
ANSI X3.151-1987, Bond Papers and Index Bristols - Common Sheet Sizes
ANSI Y14.2M-1979(R1987), Line Conventions and Lettering

2.2 Other referenced standards

CAN2-9.60M-76, Canadian National Standard for Paper Sizes for Correspondence
ISO 216-1975, International Standard for Writing Paper and Certain Classes of Printed Matter - Trimmed Sizes - A and B Series

2.3 Referenced publications

AIIM TR2-1980, Technical Report for Information and Image Management - Glossary of Micrographics
CPPA E1-1986, Canadian Pulp and Paper Association Standard for Brightness of Pulp, Paper, and Paperboard

2.4 Related standards

ANSI/AIIM MS23-1991, Practice for Operational Procedures/Inspection and Quality Control of First-Generation, Silver Gelatin Microfilm of Documents
ANSI X3.17-1981(R1989), Character Set for Optical Character Recognition (OCR-A)
ANSI X3.49-1975(R1989), Character Set for Optical Character Recognition (OCR-B)
CPPA D-29-1976, Canadian Pulp and Paper Association Standard Air Leak Roughness Test (Sheffield Method)
TAPPI T452 OM-1987, Technical Association of the Pulp and Paper Industry Standard for Brightness of Pulp, Paper, and Paperboard (Directional Reflectance at 457 nm)
TAPPI T538-OM-1988, Technical Association of the Pulp and Paper Industry Standard for Smoothness of Paper and Paperboard (Sheffield Method)

3 Definitions

The following definitions apply to terms that appear in this standard. Other terms are defined in AIIM TR2, Technical Report for Information and Image Management - Glossary of Micrographics.

3.1 Dots per inch (DPI): The number of individual dots per linear inch. Digital scanner, printer, and plotter resolutions (perceivable detail or ability of an imaging system to reproduce fine detail) are usually given in dots per inch, i.e., DPI.

3.2 Halftone dot pattern: The matrix of individual dots that is used to reproduce continuous tone images with printing processes, such as lithography, gravure, and letterpress.

3.3 Intelligent Character Recognition (ICR): The process of recognizing character data from digital bit map data. A number of scanners on the market today scan an entire image, and create a digital bit map, then examine the bit map data to recognize characters. ICR differs from OCR in that it is not limited to recognizing characters as they are scanned, or limited to certain fonts. ICR techniques can be used at any time on a digital file. See 3.5.

3.4 Moire pattern: An interference pattern between one screen pattern and another, such as a halftone screen and the screen created by a raster scan. A moire pattern appears in an image as a series of light and dark waves across the image.

3.5 Optical Character Recognition (OCR): The process of scanning text characters and producing a digital file that represents the characters. As originally defined, OCR was limited to capturing certain fonts (OCR-A and OCR-B) under specific conditions. As the term OCR is used today, and as used in this document, it includes many or all of the functions of ICR (see 3.3). Many or most of the systems and software on the market today which call themselves OCR are, perhaps, actually ICR rather than traditional OCR.

3.6 Oversized document: Any document larger than ledger size, i.e., an A3 document: 297 mm X 420 mm (11.7 inch X 16.5 inch).

3.7 PEL: Picture element, the
smallest area that can be transmitted for any given bandwidth used within a scanning or printing system or that can be turned individually on and off on a display. Digital display resolution is usually given as the number of PELS per scan line and the number of scan lines.

3.8 Picture element: Synonymous with PEL.

3.9 Pixel: Synonymous with PEL.

3.10 Scanned image: The electronic digital representation produced as a result of the scanning of an original document, or a representation of that data on a computer screen or on a hardcopy output device.

3.11 Smoothness: The property of a paper surface determined by the degree to which it is free of irregularities. Such irregularities (hills and depressions) affect image resolution.

4 Physical characteristics of documents

4.1 Paper sizes

The most common paper sizes for correspondence are the North American “A” size, which is described
in ANSI X3.151 and CAN2-9.60M-76, and is 216 mm X 279 mm (8.5 inch X 11.0 inch), and the ISO A4 size, which is described in ISO 216, and is 210 mm X 297 mm (8.3 inch X 11.7 inch). Business forms, such as invoices and statements, tend to conform to these sizes, or they are one-half of one of these sizes, i.e., “half-size sheet.”

Other document sizes include those that an organization may issue to employees, e.g., notepads or action memo forms. These will be much easier to scan if they are 216 mm X 140 mm (8.5 inch X 5.5 inch), width by height, as opposed to a common notepad size of 127 mm X 203
mm (5.0 inch X 8.0 inch), width by height. The 216 mm wide sheets are easier to scan because they can be fed with the original scanner setup for width, i.e., the scanner will not have to be changed from its original setup.

Many document scanners have automatic feed mechanisms which will accommodate conventional sizes of paper. Documents which are of odd sizes often must be hand-placed into the scanner mechanism, and thus slow the scanning process. It is therefore recommended that where control over the document size is possible, conventional sizes be selected. While this
is not possible for documents originating outside of an organization, internal documents should be designed for easy scanning.

Oversized documents pose special problems, as they cannot be handled by the scanner directly. Possible ways of scanning oversized documents include sectioning the document and reducing the document. Reduction can be done on some photocopiers, or it may be done photographically. It is important to ensure that no data are lost in the electronic copy of the document. The techniques for sectioning or reducing these documents are beyond the scope of this standard and are
described in ANSI/AIIM MS32.

Checks with attached vouchers pose an unavoidable problem in scanning, as the size of the check plus the attached voucher is not standardized. While the actual check must conform to standards that are used in the banking industry, outgoing checks are typically scanned with the voucher attached. If outgoing checks and vouchers are to be scanned, it is recommended that they be designed into a size which will facilitate feeding them into the scanner.

4.2 Orientation

The preferred orientation for most documents is the vertical mode (portrait mode), where, for a 216 mm X 279 mm (8.5 inch X 11.0 inch) document, the lines of print are parallel to the short side of the document. See figure 1, Example of vertical (portrait) mode for a 216 mm X 279 mm (8.5 inch X 11.0 inch) document and an A4 document: 210 mm X 297 mm (8.3 inch
X 11.7 inch).

Figure 1 - Example of vertical (portrait) mode for a 216 mm X 279 mm (8.5 inch X 11.0 inch) document and an A4 document: 210 mm X 297 mm (8.3 inch X 11.7 inch)

For smaller documents, which have one dimension of 216 mm (8.5 inch) or 210 mm (8.3 inch), the preferred orientation for image
scanning is with the lines of print parallel to the edge with the longer dimension. This allows a document to be scanned with the scanner feeder set for the standard width and with the lines of print parallel to the top of the document. This is an important consideration for OCR scanners. See figure 2, Example of preferred orientation for a half-size sheet with lines of print parallel to the longer dimension.

Figure 2 - Example of preferred orientation for a half-size sheet with lines of print parallel to the longer dimension

It is recognized that all documents, e.g., char