1、ANSI INCITS 241-1994 (R1999)(formerly ANSI X3.241-1994 (R1999)for Information Systems Data Compression Method Adaptive Coding with Sliding Windowfor Information InterchangeCopyright American National Standards Institute Provided by IHS under license with ANSINot for ResaleNo reproduction or networki
2、ng permitted without license from IHS-,-,-AmericanNationalStandardApproval of an American National Standard requires review by ANSI that therequirements for due process, consensus, and other criteria for approval havebeen met by the standards developer.Consensus is established when, in the judgment
3、of the ANSI Board of StandardsReview, substantial agreement has been reached by directly and materiallyaffected interests. Substantial agreement means much more than a simplemajority, but not necessarily unanimity. Consensus requires that all views andobjections be considered, and that a concerted e
4、ffort be made toward theirresolution.The use of American National Standards is completely voluntary; their existencedoes not in any respect preclude anyone, whether he has approved the standardsor not, from manufacturing, marketing, purchasing, or using products, processes,or procedures not conformi
5、ng to the standards.The American National Standards Institute does not develop standards and will inno circumstances give an interpretation of any American National Standard.Moreover, no person shall have the right or authority to issue an interpretation ofan American National Standard in the name o
6、f the American National StandardsInstitute. Requests for interpretations should be addressed to the secretariat orsponsor whose name appears on the title page of this standard.CAUTION NOTICE: This American National Standard may be revised orwithdrawn at any time. The procedures of the American Natio
7、nal StandardsInstitute require that action be taken periodically to reaffirm, revise, or withdrawthis standard. Purchasers of American National Standards may receive currentinformation on all standards by calling or writing the American National StandardsInstitute.CAUTION: The developers of this sta
8、ndard have requested that holders of patents that may be required for theimplementation of the standard disclose such patents to the publisher. However, neither the developers nor the publisherhave undertaken a patent search in order to identify which, if any, patents may apply to this standard. As
9、of the date ofpublication of this standard, following calls for the identification of patents that may be required for the implementation ofthe standard, notice of one or more such claims has been received. By publication of this standard, no position is takenwith respect to the validity of this cla
10、im or of any rights in connection therewith. The known patent holder(s) has (have),however, filed a statement of willingness to grant a license under these rights on reasonable and nondiscriminatory termsand conditions to applicants desiring to obtain such a license. Details may be obtained from the
11、 publisher. No furtherpatent search is conducted by the developer or publisher in respect to any standard it processes. No representation ismade or implied that this is the only license that may be required to avoid infringement in the use of this standard.Published byAmerican National Standards Ins
12、titute11 West 42nd Street, New York, New York 10036Copyright 1994 by Information Technology Industry Council (ITI)All rights reserved.No part of this publication may be reproduced in anyform, in an electronic retrieval system or otherwise,without prior written permission of ITI, 1250 Eye Street NW,W
13、ashington, DC 20005.Printed in the United States of AmericaCopyright American National Standards Institute Provided by IHS under license with ANSINot for ResaleNo reproduction or networking permitted without license from IHS-,-,-ANSIX3.241-1994American National Standardfor Information Systems Data C
14、ompression Method Adaptive Coding with Sliding Windowfor Information InterchangeSecretariatComputer and Business Equipment Manufacturers AssociationApproved August 30, 1994American National Standards Institute, Inc.Copyright American National Standards Institute Provided by IHS under license with AN
15、SINot for ResaleNo reproduction or networking permitted without license from IHS-,-,-iiContentsPageForewordii1 Scope and conformance 12 Normative references.13 Definitions.14 Algorithm identifier.15 Data format .2AnnexesA Encoding overview.5B Example compression encoding method.8Copyright American N
16、ational Standards Institute Provided by IHS under license with ANSINot for ResaleNo reproduction or networking permitted without license from IHS-,-,-iiiForeword (This foreword is not part of American National Standard X3.241-1994.)This standard specifies a lossless data compression method that is i
17、ntendedfor general purposes. It contains features that make it particularly applicableto systems for recording information on interchangeable media.This standard was developed by Technical Committee X3B5, by X3 project882. The first draft was produced in November 1991. The second draft wasproduced i
18、n March 1992.There are two annexes in this standard. Both are informative and are notconsidered part of this standard.Requests for interpretation, suggestions for improvement or addenda, ordefect reports are welcome. They should be sent to the X3 Secretariat,Computer and Business Equipment Manufactu
19、rers Association, 1250 EyeStreet NW, Suite 200, Washington, DC 20005.This standard was processed and approved for submittal to ANSI by theAccredited Standards Committee on Information Processing Systems, X3.Committee approval of this standard does not necessarily imply that allcommittee members vote
20、d for its approval. At the time it approved thisstandard, the X3 Committee had the following members:James D. Converse, ChairDonald C. Loughry, Vice-ChairJoanne Flanagan, SecretaryOrganization Represented Name of RepresentativeAmerican Nuclear Society Geraldine C. MainSally Hartzell (Alt.)AMP, IncEd
21、ward KellyCharles Brill (Alt.)Apple Computer, Inc. .Karen HigginbottomDavid K. Michael (Alt.)AT an offset field (described in 5.5); a length field (described in 5.6).5.5 Offset fieldThe offset field of a string token is a variable-length bit pattern that represents the distance5 Data format5.1 Overv
22、iewThe data compression encoding method isdesigned to support an adaptive string com-pression algorithm that can find redundantmultiple byte patterns in the input data streamand replace them with shorter tokens in thecompressed output data stream. The outputdata stream alone may be used to reconstru
23、ctthe original input stream completely andexactly.5.2 Principle of operationThe input data stream and the output datastream shall consist of a stream of bytes.Within a byte, the bits shall be arranged withbit b8 as the most significant, and bit b1 asthe least significant.The output data bytes are co
24、mposed of astream of fields with a variable number of bitsin each field. The most significant bit of a fieldshall be placed into the most significantunused bit location of an output byte. Theother bits of a field shall be placed in bit loca-tions in an output byte by proceeding insequence towards th
25、e least significant end ofthe output byte. The end marker (see 5.7)may be used to force the output data streamto a byte-boundary.The output data stream consists of the follow-ing three field types: a raw byte token; a string token; an end marker.The structure of these fields is describedlater. Each
26、field may be present multiple timesand may appear in any order. Each field shallbe completed before a new field may begin.The encoding algorithm shall maintain a 2048-byte history buffer. Each byte that enters theencoding algorithm from the input data streamshall be placed into the history buffer. W
27、henthe history buffer becomes full, the newestbytes shall replace the oldest bytes. The histo-ry buffer shall always contain the last 2048bytes that have entered the encoding algorithmsince the last clearing of the history buffer.ANSI X3.241-19942Copyright American National Standards Institute Provi
28、ded by IHS under license with ANSINot for ResaleNo reproduction or networking permitted without license from IHS-,-,-ANSI X3.241-1994in bytes within the history buffer from the firstbyte of the matching pattern to the first byte ofthe source pattern. The number of bytes isreferred to as offset. The
29、minimum value ofoffset is 1, which represents the byte thatentered the history buffer just before thesource pattern. The maximum value of offsetis 2047.If the value of offset is less than or equal to127, an 8-bit pattern shall be generated. Bitb8 shall be a 1. Bits b1 through b7 shall bethe binary v
30、alue of offset.If the value of offset is greater than 127, a 12-bit pattern shall be generated. Bit b12 shallalways be a 0. Bits b11 through b1 shall bethe binary value of offset.5.6 Length fieldThe length field is a variable-length bit patternthat represents the length in bytes of thesource pattern
31、. The number of bytes isreferred to as length. The minimum value oflength is 2. The maximum value of length isunbounded.If the value of length is less than or equal to4, a 2-bit pattern shall be generated. These 2bits shall be the binary value of:(length 2).If the value of length is greater than 4,
32、andless than or equal to 7, a 4-bit pattern shall begenerated. Bits b3 and b4 shall always be avalue of 1. Bits b1 and b2 shall be the binaryvalue of:(length 5).If the value of length is greater than 7, a vari-able-bit pattern shall be used. Multiple 4-bitpatterns shall be generated with all bits se
33、t to1. The number of 4-bit patterns shall be:(N + 1), where N is the integer result of(length 8) 15).Then a 4-bit pattern shall be generated with abinary value of the remainder from the divi-sion operation.5.7 End markerThe end marker is a unique 9-bit pattern thatshall be generated at the end of a
34、block ofdata. The history buffer may be optionallycleared at the end of a block of data. The min-imum number of bytes that may be included ina block of data is 0. The maximum number ofbytes is unbounded.Bits b8 and b9 shall be 1s. Bits b1 through b7shall be 0s. Additional bits with a value of 0shall
35、 be written to the output bit stream untilan output byte-boundary is reached.3Copyright American National Standards Institute Provided by IHS under license with ANSINot for ResaleNo reproduction or networking permitted without license from IHS-,-,-ANSI X3.241-19944A.1 Encoding overviewThis annex con
36、tains an example algorithm that may be used to encode an input stream of data. Thisalgorithm consists of several processes. The process Encode is used to encode a complete block ofdata. This process will invoke other processes throughout its execution.This algorithm assumes that two buffers are avai
37、lable that are infinite in size. Every new byte thatenters the algorithm is inserted at the beginning of the history_buffer (while the entire contents ofthe history_buffer is moved by 1 byte to make space available). Every new byte is also inserted atthe beginning of the holding_buffer, but the hold
38、ing_buffer is emptied whenever a token is gener-ated that represents the data stored in the holding_buffer. The most recent 2048 bytes in the his-tory_buffer represent the history buffer as defined in 3.3. The holding_buffer represents thesource pattern as defined in 3.7.Other algorithms may be used
39、 to encode an input steam of data. Other algorithms need notrequire buffers of infinite size.A.1.1 EncodeThis process is used to compress an entire block of data. When this process is finished, the inputbyte stream will have been fully encoded and sent to the output byte stream.Figure A.1 Encode pro
40、cessA.1.2 Read_byteThis process is invoked by the Encode process. It accepts a single byte from the input bytestream and pushes that byte onto both the history_buffer and the holding_buffer.Figure A.2 Read_byte processProcess = Read_byte.Get 8-bit byte from input stream.Insert byte into history_buff
41、er.Insert byte into holding_buffer.Endprocess.Process = Encode.While (input data exists).Read_byte.If (there is no matching pattern in the history_buffer that exactly matches the source pat- tern in the holding_buffer, that also satisfies the condition that offset is less than 2047).Output_token.End
42、if.Endwhile.Flush.Endprocess.Annex A(informative)Example encoding algorithmCopyright American National Standards Institute Provided by IHS under license with ANSINot for ResaleNo reproduction or networking permitted without license from IHS-,-,-ANSI X3.241-19945A.1.3 Output_tokenThis process is invo
43、ked by the Encode process. If a single byte is being processed, it will be out-put as a raw byte token. If multiple bytes are being processed, they will be output as a stringtoken. A string token consists of an offset and a length.Figure A.3 Output_token processProcess = Output_token.If (number of b
44、ytes in holding_buffer 2).Put single 0 bit to output bit stream.Put oldest byte in holding_buffer to output bit stream.Clear the oldest byte from the holding_buffer.Elseif.Put single 1 bit to output bit stream.If (Offset 127).Put single 1 bit to output bit stream.Put 7-bit binary value of offset to
45、output stream.Elseif.Put single 0 bit to output bit stream.Put 11-bit binary value of offset to output stream.Endif.Output_length.Clear all bytes from the holding_buffer except the newest byte.Endif.Endprocess.Copyright American National Standards Institute Provided by IHS under license with ANSINot
46、 for ResaleNo reproduction or networking permitted without license from IHS-,-,-ANSI X3.241-19946A.1.4 Output_lengthThis process is invoked by the Output_token process. It will output the length portion of the stringtoken being processedFigure A.4 Output_length processA.1.5 OffsetThis process is inv
47、oked by the Output_token process. This process will calculate and return theoffset portion of the string token to the Output_token process.Figure A.5 Offset processProcess = Offset.Return the value of (the distance in bytes within the history_buffer from the first byte of the source pattern to the f
48、irst byte of the matching pattern).Endprocess.Process = Output_length.Set X to (number of bytes in holding_buffer 1).If (X 4).Put 2-bit binary value of (X 2) to output stream.Elseif.If (X 7).Put 2-bit pattern with all bits set to a 1 bit to output stream.Put (2-bit binary value of (X 5) to output st
49、ream.Elseif.Put 4-bit pattern with all bits set to a 1 bit to output stream.Set X to (X 8).While (X 15).Put 4-bit pattern with all bits set to a 1 bit to output stream.Set X to (X 15).Endwhile.Put 4-bit binary value of X to output stream.Endif.Endif.Endprocess.Copyright American National Standards Institute Provided by IHS under license with ANSINot for ResaleNo reproduction or networkin