ITU-T T.88 Amd. 1 (2003): Information technology - Lossy/lossless coding of bi-level images - Amendment 1: Encoder (Study Group 16), issued 16 December 2004


INTERNATIONAL TELECOMMUNICATION UNION

ITU-T T.88, Amendment 1 (06/2003)
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU

SERIES T: TERMINALS FOR TELEMATIC SERVICES

Information technology - Lossy/lossless coding of bi-level images
Amendment 1: Encoder

ITU-T Recommendation T.88 (2000) - Amendment 1

ITU-T Rec. T.88 (2000)/Amd.1 (06/2003)

INTERNATIONAL STANDARD ISO/IEC 14492
ITU-T RECOMMENDATION T.88

Information technology - Lossy/lossless coding of bi-level images
Amendment 1: Encoder

Summary

The objective of this amendment is to extend the ITU-T Rec. T.88 | ISO/IEC 14492 (JBIG2) standard for compression of bi-level images, currently specified in terms of compliant data streams and decoding, to also cover compliant encoding. In particular, this amendment:

1) specifies a compliant architecture for JBIG2 encoding; and
2) illustrates example encoding methods for each optional architecture component.

Specification of the compliant architecture facilitates applications needing the benefits of a combined encoding/decoding standard, while still permitting competition in the encoding method chosen for any component. The compliant encoding method(s) chosen as examples for each architectural component were selected to target expired (or expiring) patents, openly published methods or royalty-free patents. Selection of a representative method for each optional component should result in reasonable encoder performance in both compression and quality. Using the full combination of components, together with their example encoding methods, provides new users with a default encoder design and a benchmark for JBIG2 encoding that illustrates reasonable near-lossless quality for documents with a resolution of 300 dpi or higher.

Source

Amendment 1 to ITU-T Recommendation T.88 (2000) was approved on 29 June 2003 by ITU-T Study Group 16 (2001-2004) under the ITU-T Recommendation A.8 procedure. An identical text is also published as ISO/IEC 14492, Amendment 1.

This amendment includes the modifications introduced by erratum 1 on 16 December 2004.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications. The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure e.g. interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementors are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database.

© ITU 2005

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

CONTENTS

1) New clauses 9, 10, and 11
2) New Annex J

Introduction

In this amendment, the following new materials are added:

a) new clauses 9, 10, and 11 to describe the required architecture and procedures for JBIG2 encoding; and
b) a new Annex J to document optional JBIG2 encoding methods.

The encoding procedures in clauses 9 and 10 are essentially the inverse of the decoding procedures already described in clauses 6 and 7 of ITU-T Rec. T.88 | ISO/IEC 14492. To simplify the required new documentation, each encoding procedure is described by reference to the corresponding decoding procedure in clauses 6 and 7, wherever applicable. Clause 11 and Annex J, however, are new material and thus contain more detailed documentation. In clause 11 (although the encoding complements that of clause 8 of ITU-T Rec. T.88 | ISO/IEC 14492), JBIG2 encoding architecture as well as its technical components are described, and their corresponding implementation methods are given by reference. In J.1, compliant example encoding methods are summarized in table form.

ISO/IEC 14492:2001/Amd.1:2004 (E)

INTERNATIONAL STANDARD
ITU-T RECOMMENDATION

Information technology - Lossy/lossless coding of bi-level images
Amendment 1: Encoder

1) New clauses 9, 10, and 11

Add the following clauses:

9 Encoding procedures

The encoding procedures in this clause are essentially the inverse of the decoding procedures already described in clause 6 and will not be duplicated here.

The inverse of generic region encoding is described in 6.2.
The inverse of generic refinement encoding is described in 6.3.
The inverse of text region encoding is described in 6.4.
The inverse of symbol dictionary encoding is described in 6.5.
The inverse of halftone region encoding is described in 6.6.
The inverse of pattern dictionary encoding is described in 6.7.

10 Control encoding procedures

The control encoding procedures in this clause are essentially the inverse of the decoding control procedures already described in clause 7 and will not be duplicated here.

The inverse of segment header syntax encoding is described in 7.2.
The inverse of segment type encoding is described in 7.3.

The segment types syntax for the region segment information field, symbol dictionary segment, text region segment, pattern dictionary segment, halftone region segment, generic region segment, generic refinement segment, end of page segment, end of stripe segment, end of file segment, profiles segment, code table segment and extension segment are described in detail in 7.4.1 to 7.4.15 respectively.

11 Page break-up

The page break-up ("Front end") procedures in this clause are conceptually the inverse of the page make-up ("Back end") procedures already described in clause 8. However, page break-up also requires additional page and document decomposition steps prior to encoding.

11.1 Page break-up architecture (encoder model)

This clause describes the JBIG2 encoding architecture, defined by compliant, but optional, technical components (with a range of algorithms possible to implement each of these components). These JBIG2 encoding components are a set of processing steps labelled: Capture, Filter, Orient (de-skew), Identify, eXtract, Screen, Align (register), Match, Post-match, Dictionary (optimize), Refine, and Encode (bitstream generation).

An example sequence of this component set is illustrated in the Encoder Architectural Components figure below as the horizontal axis with abbreviated labels C F O I X S A M P D R E (leading from input on the left to a compressed data stream on the right). The vertical dimension above each label represents the range of possible algorithms that may be used to implement each component. The horizontal band illustrates an example JBIG2-compliant encoding method, using some algorithm for each architectural component and spanning all of these components.

[Figure T.88AMD.1_F11.1: Encoder architectural components. Horizontal axis: components C F O I X S A M P D R E, leading to compressed data. Other labels: Encoder, Normative, Informative, Scope, Architecture, Optional method, Technology components.]

A compliant JBIG2 encoder need not include all architectural components, nor use them in exactly the above sequence.

11.2 Page image decomposition

A page image is decomposed into several groups of sub-images such as marks J2, line-arts, residues and halftones. Each group is identified and then compressed using an appropriate set of processing steps (architectural components) from those summarized in 11.2.1 to 11.2.12. Processing may include one or more of these component techniques prior to bitstream creation. The specific algorithm selected for each processing step is left up to the implementer, but compliant examples for each processing step are provided in J.1. Implementing a full combination of these components, each using a compliant example encoding method, will result in an encoder capable of producing reasonable near-lossless quality for most images of 300 dpi or higher.

11.2.1 Capture (rasterize)

Capture (rasterization) is a process by which an image source is converted into a two-dimensional bi-level raster image. This is done by mapping a region of the image source to a set of pixels of the raster image, and then assigning a 1-bit colour value to each pixel. In the scope of this amendment, two types of images are defined: generated and scanned images. A generated image is an image converted from a computer-generated metafile or vector graphic (e.g., a bitmap rasterized from a document created using a typical word processor), whereas a scanned image is an image obtained from a paper document by means of imaging hardware such as a scanner or facsimile.

11.2.2 Filter

In most cases, a scanning process is noisy, and the resulting scanned image may contain random pixel values not representative of the original source. These pixels or small groups of pixels are called flyspecks. It is often desirable to remove flyspecks in a scanned image to improve compression efficiency as well as the visual quality of the reconstructed image. A scanned image also contains quantization errors, i.e., identical marks in the original image may be slightly different in the scanned image. Smoothing the edges of the marks helps to recover the equivalence of such identical marks in the scanned image and also improves compression efficiency. These filtering techniques are shown as a reference in J.1. Filtering is seldom required for generated images, although these techniques may still be applied.

11.2.3 Orient (de-skew)

A scanned image may be skewed when it is scanned or photocopied at a slight angle, and it is often beneficial to identify and adjust any skew prior to compression. In most texts, marks (characters) are aligned in straight lines, and examining the slope of the lines that align pairs of marks yields the skew angle. Several methods of de-skewing are shown as a reference in J.1.

11.2.4 Identify

Identification of sub-image categories involves two processes: segmentation and classification. First, an image is segmented into groups of sub-images or regions having similar characteristics. These regions (segments) are then classified into pre-defined categories such as textual data, line-arts and halftones, to which appropriate compression methods are applied.

11.2.5 Extract

A symbol (character) is a mark consisting of black pixels. A symbol boundary is first traced by observing the connectivity of black pixels, and the adjacent black pixels are extracted to form a symbol. Although simply extracting all the pixels confined by the boundary may work in most cases, it does not handle nested marks. Several methods are shown as a reference in J.1.

11.2.6 Screen

Comparing an extracted mark against all the symbols in the dictionary is inefficient, especially when the dictionary is large and relatively complex matching criteria, as described in 11.2.8, are used. Simple methods, such as restricting comparisons to marks and dictionary symbols with similar width and height, can be used to find possible matching candidates. More detailed approaches are shown as a reference in J.1.

11.2.7 Align (register)

Symbols are often aligned (registered) in the dictionary using the same criteria selected for the screening method in 11.2.6. When the distribution of black pixels is tested against symbols in a dictionary to find matching candidates, aligning symbols along their centroids can enhance the screening rate. More detailed approaches are shown as a reference in J.1.

11.2.8 Match

Marks are extracted from a region containing textual data and compared with existing symbols in a dictionary, in order to exploit any similarities between them for better compression. Basically, each mark is tested to determine whether it is similar enough to be considered a match to one of the existing symbols. One way of matching is to first obtain a difference bitmap between the mark and a symbol, and then compare the number of black pixels in the difference bitmap against a pre-defined threshold. Giving more weight to clustered black pixels in the difference bitmap usually improves matching results. When a close match is found, a reference to the matching symbol in the dictionary is coded. When there is no close match, the extracted mark is stored as a new symbol in the dictionary.

11.2.9 Post-match

Several additional criteria and processing steps may be applied to the symbol dictionary to improve image quality. A best dictionary symbol shape may be determined by examining several similar symbols, which have already passed the matching step. Direct encoding of a symbol or alignment of symbol bottoms may also be used to improve symbol dictionary accuracy.

11.2.10 Dictionary (optimize)

After a symbol dictionary has been generated, it may be examined further to identify any singletons J2. Singletons are symbols in the dictionary that have not been referenced by more than one mark. One may sometimes wish to remove such symbols from the dictionary and place them back into the residue sub-images (which contain any residual marks). Such a residue image is compressed
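The screening step of 11.2.6 and the difference-bitmap test of 11.2.8 can be sketched together as follows. This is a minimal illustration under stated assumptions, not one of the example methods of J.1: bitmaps are plain lists of rows of 0/1 values (1 = black), marks are aligned at their top-left corners rather than at their centroids, every differing pixel has equal weight (no extra weight for clusters), and the names `screen`, `difference_count`, `match_or_add`, `tol` and `threshold` are hypothetical.

```python
from typing import List, Sequence, Tuple

Bitmap = Sequence[Sequence[int]]  # rows of 0/1 pixels, 1 = black (assumed layout)


def size(bm: Bitmap) -> Tuple[int, int]:
    """Return (height, width) of a bitmap."""
    return len(bm), (len(bm[0]) if bm else 0)


def screen(mark: Bitmap, dictionary: Sequence[Bitmap], tol: int = 1) -> List[int]:
    """Screening: keep only symbols whose height and width are within
    `tol` pixels of the mark's (tol is a hypothetical parameter)."""
    mh, mw = size(mark)
    return [i for i, sym in enumerate(dictionary)
            if abs(size(sym)[0] - mh) <= tol and abs(size(sym)[1] - mw) <= tol]


def pixel(bm: Bitmap, r: int, c: int) -> int:
    """Read a pixel, treating anything outside the bitmap as white."""
    return bm[r][c] if r < len(bm) and c < len(bm[r]) else 0


def difference_count(a: Bitmap, b: Bitmap) -> int:
    """Number of black pixels in the XOR (difference) bitmap of a and b,
    with both bitmaps aligned at the top-left corner (a simplification)."""
    h = max(len(a), len(b))
    w = max(size(a)[1], size(b)[1])
    return sum(pixel(a, r, c) ^ pixel(b, r, c) for r in range(h) for c in range(w))


def match_or_add(mark: Bitmap, dictionary: List[Bitmap],
                 threshold: int = 2, tol: int = 1) -> Tuple[int, bool]:
    """Return (symbol index, is_new). A screened symbol matches when the
    difference count is at most `threshold`; otherwise the mark is stored
    as a new dictionary symbol."""
    best, best_diff = None, threshold + 1
    for i in screen(mark, dictionary, tol):
        d = difference_count(mark, dictionary[i])
        if d < best_diff:
            best, best_diff = i, d
    if best is None:
        dictionary.append(mark)
        return len(dictionary) - 1, True
    return best, False
```

Calling `match_or_add` once per extracted mark builds the dictionary incrementally: the returned index is what would be coded as the symbol reference, and the flag tells the caller whether the symbol bitmap itself must also be coded.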

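The singleton check of 11.2.10 can be sketched as a reference count over the marks. This is only an illustration under assumptions: a singleton is read here as a symbol referenced by at most one mark, each mark is represented by the index of the symbol it refers to, and the function name `remove_singletons` and its return layout are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def remove_singletons(num_symbols: int, mark_refs: Sequence[int]
                      ) -> Tuple[List[int], List[Tuple[int, int]], List[int]]:
    """Drop symbols referenced by at most one mark.

    mark_refs[i] is the dictionary index used by mark i (assumed layout).
    Returns (kept, new_refs, residue):
      kept     - old indices of the symbols that stay in the dictionary
      new_refs - (mark position, new symbol index) for marks still coded
                 as dictionary references
      residue  - positions of marks moved back to the residue sub-image
    """
    counts = Counter(mark_refs)
    # Keep only symbols used by two or more marks.
    kept = [s for s in range(num_symbols) if counts[s] > 1]
    # Renumber the surviving symbols so the dictionary stays dense.
    remap = {old: new for new, old in enumerate(kept)}
    new_refs: List[Tuple[int, int]] = []
    residue: List[int] = []
    for pos, sym in enumerate(mark_refs):
        if sym in remap:
            new_refs.append((pos, remap[sym]))
        else:
            residue.append(pos)
    return kept, new_refs, residue
```

Whether removing singletons actually helps depends on how the residue image is coded, so an encoder would normally treat this step as optional, as the clause suggests.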