ANSI/AIIM TR32-1994

Technical Report for Information and Image Management

Paper Forms Design Optimization for Electronic Image Management (EIM)

An ANSI Technical Report prepared by the Association for Information and Image Management

Association for Information and Image Management
1100 Wayne Avenue, Suite 1100
Silver Spring, Maryland 20910
Telephone 301/587-8202

Abstract

This technical report provides guidelines for the design of printed forms that are used in EIM systems. The primary audience for this document is individuals involved with designing or specifying requirements for printed paper forms.

Contents

Foreword
1 Scope and purpose
2 References
3 Definitions
4 Layout and design
5 Providing user instructions
6 Physical considerations
7 Forms testing
8 New technology

Figures
1a Sample constraint boxes
1b Sample constraint boxes
2 Sample optical character recognition (OCR) code
3 Sample Magnetic Ink Character Recognition (MICR) code
4 Sample bar code
5 Samples of Kermit code
6 Handprint example

Annexes
A Suggested readings
B Examples of forms designs guides
C Generation degradation table

Foreword

Forms may be used to collect information and convert it into computer-processable data. Therefore, it is necessary to make a form usable, marketable, and scannable. Making a form appealing with color or graphics may increase the probability of it being filled out, but may also decrease its scannability and increase the amount of data storage required for its scanned image. This document provides guidance for optimizing the design of printed forms used with electronic image management (EIM) systems.

Data captured from forms represents a significant market, tens of billions of dollars in size, and establishing general methods for forms design will greatly assist insurance, financial, and other large organizations that depend on data capture and manipulation from scanned forms on a regular basis. Today's banking industry could come to a halt if the relatively simple transfer authorization, the personal check, did not easily convert to hundreds of bits of electronically-readable data. Most business forms carry little or no such functional data.

In a specific, but familiar case, the United States Internal Revenue Service (IRS) is responsible for processing data from billions of pieces of paper each year, yet a
very small percentage of that paper contains electronically-readable data for easy processing by an EIM system. The majority of documents received by the IRS contain handwritten or handprinted information that is manually keyed. Many government agencies and large organizations are setting goals for
optimizing the design of their forms using electronic imaging, increasing data capture using electronic recognition technology, and reducing the emphasis on manual keying.

This technical report covers different forms design characteristics including the following:
a) type fonts;
b) handprint characters;
c) weight, quality, and size of paper;
d) layout; and
e) print clarity and quality.

References to additional reading material can be found in Annex A, Suggested Readings.

Publication of this ANSI Technical Report has been approved by the Accredited Standards Developer, Association for Information and Image Management (AIIM). This document is registered as a Technical Report series of publications according to the Procedures for the Registration of ANSI Technical Reports. This document is not an American National Standard and the material contained herein is not normative in nature. Comments on the content of this document should be sent to Association for Information and Image Management (AIIM), 1100 Wayne Avenue, Suite 1100, Silver Spring, Maryland, 20910.

At the time it approved this technical report, the AIIM Standards Board had the following members:

Marilyn Courtot,
Chair, Association for Information and Image Management
Thomas C. Bagg, National Institute of Standards and Technology
Thomas E. Berney, Consultant
Loretta D'Agnolo, American Express Company
Bruce Evans, 3M Company
Bruce A. Holroyd, Eastman Kodak Company
Don Klosterboer, Anacomp, Inc.
E. Brien Lewis, I-NET, Inc.
Alan S. Linden, Wang Laboratories
Charles A. Plesums, USAA
George Thoma, National Library of Medicine
Charles E. Touchton, IBM Corporation
Herbert J. White II, Genealogical Society of Utah

The AIIM Electronic Imaging/Input Committee, C13, approved this technical report. The committee had the following members at the time this technical report was approved:

Name of Representative: Linda Wallace (Chair); Larry Albertson; Tom Atwood; Thomas C. Bagg; Alan Bain; Gerald Bensi; Chuck Biss; Bob Blackwelder; Sylvie Bokshorn; Robert W. Bristol; Bill Cox; Wayne Doran; Jack Eisen; Eric Erickson; Jon M. Fech; Tom Fine; Richard Gershbock; Scot Gilkenson; Steven Gilheany; Ralph Grant; Pawan Gupta; Joseph G. Hardy; William Hooton; Paul Horowitz; Dean Hough; Jonah Howells; John Jamieson; Clara F. Jehle; Don S. Kyser; Richard Leslie; E. Brien Lewis; Basil Manns

Organization Represented: FileNet Corporation; Applied Image, Inc.; IMNET Corporation; National Institute of Standards and Technology; Smithsonian Institution; Bell

X, Y, or Y, Dominant Wavelength and Excitation Purity.

TAPPI T538 OM-1988, Technical Association of the Pulp and Paper Industry - Standard For Smoothness of Paper and Paperboard (Sheffield Method).

ISO 216-1975, International Standard for Writing Paper and Certain Classes of Printed Matter - Trimmed Sizes - A and B Series.

3 Definitions

The following definitions apply to terms that appear in this technical report. Other terms are defined in AIIM TR2. In addition,
relevant definitions may be found in the documents listed in Annex A, Suggested Readings.

3.1 Alphanumeric
Pertaining to a character set that contains letters, numbers, and usually other characters, such as punctuation marks and symbols.

3.2 Bar code
Array of vertical rectangular marks and spaces
in a predetermined pattern.

3.3 Code density
Number of code elements that can appear per unit of length.

3.4 Density
Measurement of an image (printed using lithographic ink or toner) by determining the amount of incident radiant energy (light) reflected by a test sample.

3.5 Dropout ink
Ink that
cannot be detected by a scanner. A scanner's light source (used to illuminate a form) may be “blind” to this ink color, or the ink color may be filtered out of the light source. There are no “typical” dropout ink colors, and those used should be specified by an imaging equipment manufacturer.

3.6 Font
Complete family of a given size of type, including capitals, small capitals, and lower-case, together with figures, punctuation marks, ligatures, etc. Italics are spoken of as a separate font.

3.7 Generation
One of the successive stages of photographic reproduction of an original or master. Copies made from a first generation are second generation, etc. For example, the first generation is either a printed or electronic version of a general purpose business form. See generation, Nth.

3.8 Generation, Nth (of an electronic image)
The number of generations from the original. Example: the 2nd generation (A-2, B-2, etc.) is a copy of the printed or electronic form. See generation.

3.9 Generation, printing
Use of copies from an original or master to make additional copies. See generation, Nth.

3.10 Generation test
Means of determining the number of times usable copies may be reproduced from succeeding generations. In this test, copies are successively reproduced until a print has been generated that is unusable. This indicates the anticipated range of generation copies that may be reasonably expected. For example, the generation test covers converting a form through printed or electronic versions, i.e., creating subsequent generations by printing, scanning, and re-scanning.

3.11 Landscape orientation
Mode of rendering an image in which the vertical dimension of the presentation is smaller than the horizontal dimension. Contrast with portrait orientation.

3.12 Magnetic Ink Character Recognition (MICR)
Machine recognition of digits printed with ink that can be magnetized.

3.13 Optical Character Recognition (OCR)
An optical technique by which characters can be machine-identified, then converted into computer-processable codes, e.g., American
Standard Code for Information Interchange (ASCII), Extended Binary Coded Decimal Interchange Code (EBCDIC).

3.14 Pitch (character)
The number of characters per inch measured horizontally. Fixed spacing printers have the same pitch for every letter, regardless of the letter's width. Proportional spacing has varying pitch, depending on the letter.

3.15 Portrait orientation
Mode of rendering an image in which the vertical dimension of the presentation is greater than the horizontal dimension. Contrast with landscape orientation.

3.16 Print Contrast Signal (PCS)
A measurement of the print
contrast, where R(b) is the average reflectance of the background measured in white light and R(p) is the average reflectance of the print measured in white light. PCS is the result of subtracting R(p) from R(b) and dividing the difference by R(b), that is, PCS = (R(b) - R(p)) / R(b).

3.17 Recognition zone
The area around a recognition data field that is free of other data.

3.18 Sans serif
A typeface without a serif.

3.19 Serif
An ending stroke on the arm, stem, and tail of some typeface designs.

3.20 Symbol
Collection of graphic entities (lines, circles, points, text) that are used to make up an object (e.g., a door, chair, machine, etc.).

4 Layout and design

Forms design aspects included in the following sections on layout and design need to be addressed during the forms design process.

4.1 Forms layout
Corporate forms design guidelines should be followed whenever possible, and such guidance should encompass EIM considerations. A form that is designed so that it is easy to complete may not be the best for EIM data capture, and layout practices typically used for forms printing may not optimize the features of EIM systems. For example, a layout designed for “user-friendliness” may have large print, color coding, and mixed numeric and alphanumeric fill-ins (interspersed with printed instructions next to boxes). In contrast, a form designed specifically for data capture and character recognition may have strict segregation of spaces for numeric and alphanumeric characters, and structured division
of instructional text with carefully crafted dropout color areas, as its essential qualities.

For EIM applications, a form should be designed so that its logically-connected information is on the same page. Further, there are strong arguments for limiting the content of some data capture fields
to essential information only.

4.2 Type fonts
A number of type fonts and type sizes lend themselves to easy reading and understanding. The type fonts used on a form contribute to its attractiveness, which is certainly a feature to consider when designing a form to encourage user response. Based on readability and clarity, type fonts that facilitate scanning are those with visual characteristics similar to Palatino, Helvetica, ITC Bookman, New Century Schoolbook, and Courier. The advantage of these type fonts, and others like them, is that they offer clear differentiation between character shapes. On the other hand, type fonts with designs similar to Times Roman and Monterey are not as easily distinguished by a scanner. See Section 2.3, Referenced publications.

4.2.1 Typeface
There are two basic typeface classes, serif and sans serif. A serif typeface is designed for legibility and is commonly used for text. Serif typefaces are universally familiar, due to their use in newspapers, books, and other text documents that are typically typeset in 8, 9, 10, 12, and 14 point.

Sans serif typefaces are more commonly used in headlines of manuals, short instruction text blocks, and forms captions. Sans serif typefaces generally require less horizontal space and more vertical height than a serif typeface that is set in the same point size.

Serif typefaces will typically require more storage in a compressed image, because more data is recorded for each character. With the most common compression techniques, approximately 10% more storage is required for a page printed in a serif typeface, as compared with one printed using a sans serif typeface. (This sentence is printed in a serif typeface.)

When a form's text (or other information) is to be dropped out (not captured) or removed using forms removal software, it is often printed using an ink color that is not recognized by the scanner. The typeface selected for this material is not as critical and should be selected primarily to enhance the readability of the form.

4.2.2 Symbol set
A symbol set is made up of characters such as $, %, #, etc., in addition to numerals and punctuation marks. Occasionally, a program for optical character recognition (OCR) may key on a special symbol to prompt an action. The forms designer should be aware of any special symbols that prompt such a response and avoid their use.
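
As a minimal sketch of the check described in 4.2.2, a forms designer could scan the printed captions of a draft form for symbols that the recognition program treats as triggers. The trigger set RESERVED_SYMBOLS, the function name find_reserved_symbols, and the sample captions are all hypothetical; the actual symbols depend on the OCR software in use and would have to be taken from its documentation.

```python
# Hypothetical set of symbols that an OCR program keys on to prompt an action.
# Real values depend on the recognition software and must come from its vendor.
RESERVED_SYMBOLS = frozenset({"#", "@"})


def find_reserved_symbols(captions, reserved=RESERVED_SYMBOLS):
    """Return (caption_index, character_position, symbol) for each reserved symbol found."""
    hits = []
    for index, caption in enumerate(captions):
        for position, char in enumerate(caption):
            if char in reserved:
                hits.append((index, position, char))
    return hits


if __name__ == "__main__":
    # Illustrative captions only; they are not taken from any real form.
    sample_captions = ["Account # (digits only)", "Amount in $", "Date"]
    for index, position, symbol in find_reserved_symbols(sample_captions):
        print(f"caption {index}: reserved symbol {symbol!r} at position {position}")
```

Keeping the reserved set as a parameter lets the same check be rerun whenever the recognition software, and therefore its list of trigger symbols, changes.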

4.2.3 Spacing
In computer typesetting, spacing may be either fixed or proportional. In fixed typesetting, each character takes up the same horizontal space. In proportional typesetting, different characters take up different amounts of horizontal space (width), e.g., a different character width is
used for an “i” and a “w.” In proportional typesetting, the computer automatically spaces the individual characters as they are typeset.

The forms designer should be concerned with spacing requirements for typesetting used for printed instructions and data entry areas on a form and should optimize its design specifically for imaging and OCR, since spacing affects scan speed, storage requirements, and other system functions.

The forms designer is also concerned with vertical spacing requirements for OCR. Although 0.166 inch (4.23 mm) vertical spacing is often sufficient for typewritten data entries, at least twice that amount of space is necessary for handprint entries and for separating entries for OCR.

4.2.4 Pitch
Pitch refers to the number of characters typeset per inch, or per millimeter, within a fixed space. Forms design should allow enough space for the fewest number of characters per inch (usually ten) when the pitch of the data to be entered is unknown.

4.2.5 Size
Typesize, or height, is usually specified in points. A point is 0.0138 inch (0.352 mm). Typesize is important to the user's readability as well as to