Designation: F 149 – 92b (Reapproved 2003)

Standard Terminology Relating to Optical Character Recognition¹

This standard is issued under the fixed designation F 149; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

¹ This terminology is under the jurisdiction of ASTM Committee F05 on Business Imaging Products and is the direct responsibility of Subcommittee F05.01 on Nomenclature and Definitions. Current edition approved Dec. 15, 1992. Published February 1993. Originally published as F 149 – 72. Last previous edition F 149 – 92a.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

1. Scope

1.1 This set of definitions is intended for use by persons who, in the course of their duties, make use of OCR equipment or interact with operators of such equipment.

2. Referenced Documents

2.1 ANSI Standards:
ANSI X3.17 Character Set for Optical Character Recognition (OCR-A)²
ANSI X3.49 Character Set for Optical Character Recognition (OCR-B)²

² Available from American National Standards Institute, 11 W. 42nd St., 13th Floor, New York, NY 10036.

3. Terminology

3.1 Definitions:

adjacency: two OCR characters printed on the same line with character spacing reference lines separated by the proper space for the font and system.
alphameric: See alphanumeric.
alphanumeric: pertaining to a character set that contains letters, digits, and usually other characters such as punctuation marks. Syn. alphameric.
alphanumeric character set: a character set that contains both letters and digits and may contain control characters, special characters, and the space character.
alphanumeric character subset: a character subset that contains both letters and digits and may contain control characters, special characters, and the space character.
average background reflectance: expressed as a percent, is the simple arithmetic average of the background reflection readings from at least five different points on a sheet.
average edge: an imaginary line bisecting the irregularities of the character edge.
backer printing: printing on the reverse side of the sheet. For OCR forms, the paper should have sufficient opacity so that printing on the back can't be seen on the front by the optical scanner.
background reflectance: a measurement of the brightness of paper referring to the amount of light reflected back from the paper at a particular point when that point is flooded with light, as compared with the known value representing absolute white (such as BaSO4).
band: the light frequency spectrum between two defined limits; also light band.
banking: the alignment of the first graphic shape in a line with respect to the left (right) margin, by certain devices (that is, typewriters, line printers, etc.).
bar code: a binary coding system consisting of vertical marks or bars that, when read by an optical scanner, can be converted to machine language.
barium sulfate (BaSO4): a standard reflecting agent used to calibrate instruments for measuring the whiteness and reflectance of papers.
base line: a reference line used to specify the nominal relative vertical position of OCR characters printed on the same line.
basis weight: the weight in pounds of a ream cut to a specified basic size. The number of sheets in a ream is usually 500. The basic size for writing papers commonly used in OCR applications is 17 by 22 in. Also measured metrically in grams per square metre (g/m²) and referred to as grammage.
blind ink: See reflective ink.
bridging: enlargement of a graphic shape beyond the COL, which produces undesired character fill in.
brightness: in paper, a characteristic of white paper measured in terms of reflectance in the blue and violet portions of the spectrum.
caliper: the thickness of a sheet of paper measured under specified conditions and usually expressed in thousandths of an inch (mils).
carbon paper: a sheet composed of a supporting substrate on one or both sides of which is a coating containing a transferable (usually colored) material. The coating is of such nature that it will transfer in part or entirely to a copy sheet at the point of pressure contact.
centerline: the vertical axis around which character elements are located for letters, numerals, or symbols of an OCR font.
character: (1) a member of a set of elements upon which agreement has been reached and that is used for the organization, control, or representation of information. Characters may be letters, digits, punctuation marks, or other symbols, often represented in the form of a spatial arrangement of adjacent or connected strokes or in the form of other physical conditions in data media.
(2) a letter, digit, or other symbol that is used as part of the organization, control, or representation of data. A character is often in the form of a spatial arrangement of adjacent or connected strokes.
character alignment: the vertical or horizontal position of characters with respect to a given reference line.
character boundary: in character recognition, the largest rectangle with a side parallel to the document reference edge, each of whose sides is tangential to a given character outline.
character erase: an OCR graphic shape that will cover a single character or a single space and will be read by the interpreter as a deletion.
character outline limit (COL): the minimum, nominal, and maximum limits of a given graphic shape.
character reader: an input unit that performs character recognition.
character reading: machine reading of alpha or numeric characters, or symbols, or both, by optical means (as opposed to optical mark reading).
character recognition: (1) The identification of characters by automatic means.
(2) See magnetic ink character recognition; optical character recognition.
character set: (1) a finite set of different characters upon which agreement has been reached and that is considered complete for some purpose, for example, each of the character sets contained in ANSI X3.17-81 Character Set for Optical Character Recognition (OCR-A) and ANSI X3.49-75 (R 1982) Character Set for Optical Character Recognition (OCR-B).²
(2) an ordered set of unique representations called characters, for example, the 26 letters of the English alphabet, Boolean 0 and 1, the set of symbols in the Morse code, and the 128 ASCII characters.
character skew: the rotational deviation of the printed image from its intended orientation relative to a document reference edge.
character spacing: the pitch distance between adjacent characters.
character stroke width: the distance between the average edges of a character element.
character subset: a selection of characters from a character set, comprising all characters that have a specified common feature, for example, in each of the character sets contained in ANSI X3.17-81 Character Set for Optical Character Recognition (OCR-A) and ANSI X3.49-75, (R 1982) Character Set for Optical Character Recognition (OCR-B),² the digits 0 to 9 may constitute a character subset.
clear area: that region of a document reserved for OCR characters and the required clear space around these characters.
COL: See character outline limit.
contrast: (1) in optical character recognition, the difference between color or shading of the printed material on a document and the background on which it is printed.
(2) See print contrast ratio.
crowding: improper horizontal character spacing.
CVR: contrast variation ratio is the ratio between the maximum and minimum PCS within a graphic shape:
$$\mathrm{CVR} = \frac{\mathrm{PCS}_{\max}}{\mathrm{PCS}_{\min}}$$
debossment: the depth of a print impression into the surface of a document.
dirt: in paper, refers to the presence of relatively nonreflective foreign particles embedded in the sheet. The size and lack of reflectance of the particles may be such that they will be mistaken for inked areas by an optical scanner.
document: a form designed as input to a document reader.
document reader: a scanning device that scans one to five lines of data in fixed locations on a document at a single pass. Generally, re-scanning of a portion of the document is not possible, one direction of the scan being provided by movement of the form past the reading head. The forms used generally don't exceed 8 to34 in. in width by 4 to14 in. in depth. Also see page reader.
drop out colors: See reflective ink.
drop out ink: See reflective ink.
edge irregularity: a variation in the stroke width of a printed character.
embossment: the height of raised print or raised surface on a document.
error: the substitution of one character for another.
error rate: the ratio of the number of character substitutions to the total number of characters read.
extraneous ink: any spot appearing within the “read” area, but outside the COL, caused by smear, tracking, or splatter that can be caused either in the manufacturing or while entering data on the form and can result in less optimum readability.
felt side: the top side of the paper in the paper manufacturing process as opposed to wire side. Optical scanning forms should be printed on the felt side.
field: any group of characters defined as a unit of information.
field delimiter: See field separator.
field mark: See field separator.
field separator: a mark or symbol printed in scan ink that identifies fields to the scanner (Syn. field mark).
fluorescence: the property of emitting radiation in the visible range as a result of absorption of radiation in the ultraviolet range from some other source. Optical brighteners that have this property are sometimes added to paper to enhance its whiteness or brightness to the eye in normal lighting. The emitted radiation can cause erratic reflectance values.
flying spot scanning: in optical character recognition, a device employing a moving spot of light to scan a sample space, the intensity of the transmitted or reflected light being sensed by the photoelectric transducer.
font: a set of graphic shapes that may be alphabetic, numeric, or both and may include other symbols.
format: preprogrammed identification of fields to be read by an optical scanner.
free form (unformatted form): a form on which the data appears in variable length fields. Preprinted symbols and guides are absent or minimal. Field delimiters are entered with the data.
grain long: paper grain direction in sheets of paper is parallel to the long dimension of the sheet.
grain short: paper grain direction in sheets of paper is parallel to the short dimension of the sheet.
group erase: an OCR graphic shape that will delete a group or string of three or more characters.
hand print boxes: restraints for controlling entry of scannable information by hand. Controlling of size, shape, and configuration of hand printed entries on an optical scanning form.
hand print character set: Refer to ANSI X3.45-182.²
infinite pad method: in optical character recognition, a method of measuring reflectance of a paper stock such that doubling the number of backing sheets of the same stock will not change measured reflectance.
infrared response: a particular type of optical system used in some scanners. As a general rule, nonscan inks for this purpose are in the red portion of the color spectrum.
ink, OCR: Refer to ANSI X3.86-180.²
interpreter: that part of the OCR system which analyzes the input data and determines what the individual characters are and what their relation is to each other.
ion deposition printer: a printer where ion charges are gated onto a dielectric drum. Toner is picked up by the charge, then transferred to the paper. Once the toner is deposited on the paper, it can be affixed by either pressure or heat fusion.
laser scanner: an optical scanning device that uses the intense monochromatic light beam given off by a laser as its source of illumination.
leading edge: the edge of a form that is used as a base for locating the first line of data to be scanned.
length/depth: the distance between the two edges of a form, reached by moving at right angle to a nominal data line.
light stability: in optical character recognition, the resistance to change of the color of the image when exposed to radiant energy.
line skew: the angular displacement of a line in relation to its intended position.
line spacing: the distance between the average base line of one line to the average base line of the next line.
machine language: a language designed for use by a machine, without translation.
magnetic ink character recognition, MICR: a recognition technology that utilizes ink capable of being magnetized and sensed. A practical application is E-13B, which is used primarily within the North American financial industry. E-13B consists of 14 characters printed to high specifications using ink with iron oxide pigments, or other inks utilizing ingredients capable of being magnetized.
magnetic printer: a printer in which magnetic signals are recorded onto a magnetic belt or drum. A magnetic toner is attracted to the drum and transferred to paper where it is fused to the sheet.
margin: the distance between any boundary of the printing area and the nearest parallel paper edge.
mark reading: machine reading, by optical means, of marks (usually vertical or horizontal bars) that have been manually entered.
mark scanning: the automatic optical sensing of marks usually recorded manually on a data medium.
mark sensing: machine reading of marks (usually pencil strokes) on a punched card, by using the conductive properties of the mark itself.
marking position: the area designated to mark information on a mark read form. Also called a response position.
mechanical disk scanner: a rotating scanning disk that breaks light reflection during the optical reading operation into a series of light points that are directed through the slit of a fixed aperture and onto the surface of a photomultiplier tube.
MICR: (1) An abbreviation commonly applied to the character set (E-13B) contained in ANSI X3.2-76 and X9.13-83.²
(2) See magnetic ink character recognition.
millimicron (Mu): a unit of length used in measuring light waves. The peak spectral response of a scanner is expressed in Mu.
moisture resistant paper: a category of optical scanning paper developed to meet unusual ambient or climatic conditions, for example, census forms or meter reading forms.
multifont reader: a reading device that can read forms containing intermixed characters printed in a number of fonts. Multifont reading eliminates the need to prebatch the input data by font prior to submission to the scanner.
multiple font reader: a reading device that can read more than one type font, but only one font may be read at a time.
MR-8: the original optical scanning test device that measures the amount of reflected light in millivolts.
noise: (1) random variations of one or more characteristics of any entity such as voltage, current, or data.
(2) a random signal of known statistical properties of amplitude, distribution, and spectral density.
(3) loosely, any disturbance tending to interfere with the normal operation of a device or system.
nonreflective ink: See scan ink.
nonread ink
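Illustrative sketch (not part of the standard): several of the definitions above reduce to simple arithmetic, namely average background reflectance, basis weight expressed as grammage, CVR, and error rate. The Python below is a minimal sketch of those four calculations. All function and constant names are assumptions made for illustration, and PCS values are taken as given inputs; how they are measured is outside this sketch. The unit conversions use the exact factors 1 lb = 453.59237 g and 1 in. = 0.0254 m.

```python
from statistics import fmean

# Hypothetical names; defaults follow the values quoted in the definitions.
SHEETS_PER_REAM = 500            # "usually 500"
BASIC_SIZE_IN = (17.0, 22.0)     # basic size for writing papers, in inches
GRAMS_PER_POUND = 453.59237      # exact
SQ_M_PER_SQ_IN = 0.0254 ** 2     # exact


def average_background_reflectance(readings_percent):
    """Arithmetic mean of reflectance readings (percent) from >= 5 points."""
    if len(readings_percent) < 5:
        raise ValueError("need readings from at least five points on the sheet")
    return fmean(readings_percent)


def grammage_from_basis_weight(pounds, basic_size_in=BASIC_SIZE_IN,
                               sheets=SHEETS_PER_REAM):
    """Convert basis weight (lb per ream at the basic size) to g/m^2."""
    width_in, height_in = basic_size_in
    ream_area_m2 = width_in * height_in * sheets * SQ_M_PER_SQ_IN
    return pounds * GRAMS_PER_POUND / ream_area_m2


def contrast_variation_ratio(pcs_values):
    """CVR = maximum PCS / minimum PCS within one graphic shape."""
    lowest = min(pcs_values)
    if lowest <= 0:
        raise ValueError("minimum PCS must be positive")
    return max(pcs_values) / lowest


def error_rate(substitutions, characters_read):
    """Character substitutions divided by total characters read."""
    if characters_read <= 0:
        raise ValueError("characters_read must be positive")
    return substitutions / characters_read


if __name__ == "__main__":
    print(average_background_reflectance([80, 82, 81, 79, 83]))
    print(round(grammage_from_basis_weight(20), 1))
    print(contrast_variation_ratio([0.5, 0.6, 0.75]))
    print(error_rate(3, 1000))
```

Under these assumptions a 20 lb sheet at the 17 by 22 in. basic size works out to roughly 75 g/m². The five-reading minimum is enforced because the definition requires at least five points; a caller with fewer readings gets an error rather than a misleading average.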