Reference number ISO/IEC 14492:2001/Amd.1:2004(E)
© ISO/IEC 2004

INTERNATIONAL STANDARD ISO/IEC 14492
First edition 2001-12-15
AMENDMENT 1 2004-12-15

Information technology
Lossy/lossless coding of bi-level images
AMENDMENT 1: Encoder

Technologies de l'information
Codage avec ou sans perte des images au trait
AMENDEMENT 1: Codeur

Amendment 1:2006 to National Standard of Canada CAN/CSA-ISO/IEC 14492:04

Amendment 1:2004 to International Standard ISO/IEC 14492:2001 has been adopted without modification (IDT) as Amendment 1:2006 to CAN/CSA-ISO/IEC 14492:04. This Amendment was reviewed by the CSA Technical Committee on Information Technology (TCIT) under the jurisdiction of the Strategic Steering Committee on Information Technology and deemed acceptable for use in Canada.

December 2006

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been
taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO/IEC 2004

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56
CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org

Published by ISO in 2005

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held
responsible for identifying any or all such patent rights.

Amendment 1 to ISO/IEC 14492:2001 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration with ITU-T. The identical text is published as ITU-T Rec. T.88 (2000)/Amd.1.

Introduction

In this amendment, the following new material is added:

a) new clauses 9, 10, and 11 to describe the required architecture and procedures for JBIG2 encoding; and
b) a new Annex J to document optional JBIG2 encoding methods.

The encoding procedures in clauses 9 and 10 are essentially the inverse of the decoding procedures already described in clauses 6 and 7 of ITU-T Rec. T.88 | ISO/IEC 14492. To simplify the required new documentation, description of each of
the encoding procedures is given by referring to the corresponding decoding procedures in clauses 6 and 7, wherever applicable. Clause 11 and Annex J, however, are new material and thus contain more detailed documentation. In clause 11 (although the encoding complements that of clause 8 of ITU-T Rec. T.88 | ISO/IEC 14492), JBIG2 encoding architecture as well as its technical components are described, and their corresponding implementation methods are given by reference. In J.1, compliant example encoding methods are summarized in table form.

ITU-T Rec. T.88 (2000)/Amd.1 (06/2003)

INTERNATIONAL STANDARD
ITU-T RECOMMENDATION

Information technology
Lossy/lossless coding of bi-level images

Amendment 1
Encoder

1) New clauses 9, 10, and 11

Add the following clauses:

9 Encoding procedures

The encoding procedures in this clause are essentially the inverse of the decoding procedures already described in clause 6 and will not be duplicated here.

The inverse of generic region encoding is described in 6.2.
The inverse of generic refinement encoding is described in 6.3.
The inverse of text region encoding is described in 6.4.
The inverse of symbol dictionary encoding is described in 6.5.
The inverse of halftone region encoding is described in 6.6.
The inverse of pattern dictionary encoding is described in 6.7.

10 Control encoding procedures

The control encoding procedures in this clause are essentially the inverse of the decoding control procedures already described in clause 7 and will not be duplicated here.

The inverse of segment header syntax encoding is described in 7.2.
The inverse of segment type encoding is described in 7.3.

The segment types syntax for the region segment information field, symbol dictionary segment, text region segment, pattern dictionary segment, halftone region segment, generic region, generic refinement segment, end of page segment, end of stripe segment, end of file segment, profiles segment, code table segment and extension segment are described in detail in 7.4.1 to 7.4.15 respectively.

11 Page break-up

The page break-up ("Front end") procedures in this clause are conceptually the inverse of the page make-up ("Back end") procedures already described in clause 8. However, page break-up also requires additional page and document decomposition steps prior to encoding.

11.1 Page break-up architecture

This clause describes the JBIG2 encoder break-up defined by compliant, but optional, technical components (with a range of algorithms possible to implement each of these components). These JBIG2 page break-up components are a set of processing steps labelled: Capture, Filter, Orient (de-skew), Identify, eXtract, Screen, Align (register), Match, Post-match, Dictionary (optimize), and Refine. An example sequence of this component set is illustrated in the Architectural Components figure below as the horizontal axis with abbreviated labels C F O I X S A M P D R (leading from input on the left to a compressed data stream on the right). The vertical dimension above each label represents the range of possible algorithms that may be used to implement each component. The horizontal band illustrates an example JBIG2 compliant page break-up method, using some algorithm for each architectural component and
spanning over these components.

A compliant JBIG2 encoder need not include all architectural components, nor use them in exactly the above sequence.

11.2 Page image decomposition

A page image is decomposed into several groups
of sub-images such as marks [J2], line-arts, residues and halftones. Each group is identified and then compressed using an appropriate set of processing steps (architectural components) from those summarized in 11.2.1 to 11.2.12. Processing may include one or more of these component techniques prior to bitstream creation. The specific algorithm selected for each processing step is left up to the implementer, but compliant examples for each processing step are provided in J.1. Implementing a full combination of these components, each using a compliant example encoding method, will result in an encoder
capable of producing reasonable near-lossless quality for most 300+ dpi images.

11.2.1 Capture (rasterize)

Capture (rasterization) is a process by which an image source is converted into a two-dimensional bi-level raster image. This is done by mapping a region of the image source to a set of pixels
of the raster image, and then assigning a 1-bit colour value to each pixel. In the scope of this amendment, two types of images are defined: generated and scanned images. A generated image is an image converted from a computer-generated metafile or vector graphic (e.g., a bitmap rasterized from a document created using a typical word processor), whereas a scanned image is an image obtained from a paper document by means of imaging hardware such as a scanner or facsimile.

11.2.2 Filter

In most cases, a scanning process is noisy, and the resulting scanned image may contain random pixel values not representative of the original source. These pixels or small groups of pixels are called flyspecks. It is often desirable to remove flyspecks in a scanned image to improve compression efficiency as well as visual quality of the reconstructed image. A scanned image also contains quantization errors, i.e., identical marks in the original image may be slightly different in the scanned image. Smoothing the edges of the marks helps to recover the equivalence of such identical marks in the scanned image and also improves compression efficiency. These filtering techniques are shown as a reference
in J.1. Filtering is seldom required for generated images, although these techniques may still be applied.

11.2.3 Orient (de-skew)

A scanned image may be skewed when it is scanned or photocopied at a slight angle, and it is often beneficial to identify and adjust any skew prior to compression. In most texts, marks (characters) are aligned in straight lines, and examining the slope of the lines that align pairs of marks yields the skew angle. Several methods of de-skewing are shown as a reference in J.1.

11.2.4 Identify

Identification of sub-image categories involves two processes: segmentation and classification. First, an image is segmented into groups of sub-images or regions having similar characteristics. These regions (segments) are then classified into pre-defined categories such as textual data, line-art and halftones, to which appropriate compression methods are applied.

[Figure: ARCHITECTURAL COMPONENTS. Page Break-up: C F O I X S A M P D R; Encoding Procedure]

11.2.5 Extract

A symbol (character) is a mark consisting of black pixels. A symbol boundary is first traced by observing the connectivity of black pixels, and the adjacent black pixels are extracted to form a symbol. Although simply extracting all the pixels confined by the boundary may work in most cases, it does not handle nested marks. Several methods are shown as a reference in J.1.

11.2.6 Screen

Comparing an extracted mark against all the symbols
in the dictionary is inefficient, especially when the dictionary is large and relatively complex matching criteria, as described in 11.2.8, are used. Simple methods, such as restricting comparisons to marks and dictionary symbols with similar width and height, can be used to
find possible matching candidates. More detailed approaches are shown as a reference in J.1.

11.2.7 Align (register)

Symbols are often aligned (registered) in the dictionary using the same criteria selected for the screening method in 11.2.6. When the distribution of black pixels is tested against symbols in a dictionary to find matching candidates, aligning symbols along their centroids can enhance the screening rate. More detailed approaches are shown as a reference in J.1.

11.2.8 Match

Marks are extracted from a region containing textual data and compared with existing symbols in a dictionary,
in order to exploit any similarities between them for better compression. Each mark is tested to determine whether it is similar enough to be considered a match to one of the existing symbols. One way of matching is to first obtain a difference bitmap between the mark and a symbol and then compare the number of black pixels in the difference bitmap against a pre-defined threshold. Giving more weight to clustered black pixels in the difference bitmap usually improves matching results. When a close match is found, a reference to the matching symbol in the dictionary is coded. When there is no
close match, the extracted mark is stored as a new symbol in the dictionary.

11.2.9 Post-match

Several additional criteria and processing steps may be applied to the symbol dictionary to improve image quality. A best dictionary symbol shape may be determined by examining several similar symbols that have already passed the matching step. Direct encoding of a symbol or alignment of symbol bottoms may also be used to improve symbol dictionary accuracy.

11.2.10 Dictionary (optimize)

After a symbol dictionary has been generated, it may be examined further to identify any singletons [J2]. Singletons are symbols in the dictionary that have not been referenced by more than one mark. One may sometimes wish to remove such symbols from the dictionary and place them back into the residue sub-images (which contain any residual marks). Such a residue image is compressed using a JBIG2 generic entropy
encoder.

11.2.11 Refine

Encoded image (or symbol) bitmaps may also be subsequently refined to similar (but different) bitmaps [J1], [J4]. For example, where images are first encoded in a near-lossless manner (e.g., when scanned image symbols are encoded using dictionaries), they can be subsequently refinement-encoded to a fully lossless representation of the original image. Also, successive dictionary symbols may be more efficiently encoded as refinements of symbols encoded previously.

11.3 Multi-page document composition

An encoder may organize multi-page document segments using a sequential, random-access or embedded organization as described in Annex D (File Formats). Dictionary segments may be organized into one global segment, one or more segments per page or stripe, or a combination of global and page-specific dictiona