Reference number: ISO/IEC 22091:2002(E)
© ISO/IEC 2002

INTERNATIONAL STANDARD ISO/IEC 22091
First edition 2002-09-15

Information technology - Streaming Lossless Data Compression algorithm (SLDC)
Technologies de l'information - Algorithme de compression sans perte de données en mode continu (SDLC)

Adopted by INCITS (InterNational Committee for Information Technology Standards) as an American National Standard.
Date of ANSI Approval: 7/7/2003
Published by American National Standards Institute, 25 West 43rd Street, New York, New York 10036

Copyright 2003 by Information Technology Industry Council (ITI). All rights reserved.

These materials are subject to copyright claims of International Standardization Organization (ISO), International Electrotechnical Commission (IEC), American National Standards Institute (ANSI), and Information Technology Industry Council (ITI). Not for resale. No part of this publication may be reproduced in any form, including an electronic retrieval system, without the prior written permission of ITI. All requests pertaining to this standard should be submitted to ITI, 1250 Eye Street NW, Washington, DC 20005.

Printed in the United States of America

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO/IEC 2002
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland

Contents

1 Scope
2 Conformance
3 Normative reference
4 Terms and definitions
4.1 Access Point
4.2 Control Symbol
4.3 Copy Pointer
4.4 data byte
4.5 Data Symbol
4.6 Displacement Field
4.7 Encoded Data Stream
4.8 Encoded Record
4.9 End Marker
4.10 End Of Record Symbol (EOR Symbol)
4.11 File Mark
4.12 File Mark Symbol
4.13 Flush Symbol
4.14 History Buffer
4.15 Literal 1
4.16 Literal 2
4.17 Matching String
4.18 Match Count
4.19 Match Count Field
4.20 Pad
4.21 Record
4.22 Record Segment
4.23 Reset X Symbol
4.24 Reset 1 Symbol
4.25 Reset 2 Symbol
4.26 scheme 1
4.27 Scheme 1 Symbol
4.28 scheme 2
4.29 Scheme 2 Symbol
4.30 user data
5 Conventions and Notations
5.1 Representation of numbers
5.2 Names
6 Acronyms
7 Algorithm Overview
7.1 Scheme 1 Encoding
7.2 Scheme 2 Encoding
7.3 History Buffer
8 Encoding Specification
8.1 User Data
8.2 History Buffer
8.3 Encoded Data Stream
8.3.1 Access Point
8.4 Data Symbols
8.4.1 Literal 1 Data Symbols
8.4.2 Copy Pointer Data Symbols
8.4.3 Literal 2 Data Symbols
8.5 Control Symbols
8.6 Pad

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards
through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this International Standard may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO/IEC 22091 was prepared by ECMA (as ECMA-321) and was adopted, under a special “fast-track procedure”, by Joint Technical Committee ISO/IEC JTC 1, Information Technology, in parallel with its approval by national bodies of ISO and IEC.

1 Scope

This International Standard specifies a lossless compression algorithm to reduce the number of 8-bit bytes required to represent data records and File Marks. The algorithm is known as Streaming Lossless Data Compression algorithm (SLDC). One buffer size (1 024 bytes) is specified.

The numerical identifier according to ISO/IEC 11576 allocated to this algorithm is 6.

2 Conformance

A compression algorithm shall be in conformance with this International Standard if its Encoded Data Stream satisfies the requirements of this International Standard.

3 Normative reference

The following normative document contains provisions which, through reference in this text, constitute provisions of this International Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the normative document indicated below. For undated references, the latest edition of the normative document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.

ISO/IEC 11576:1994,
Information technology - Procedure for the registration of algorithms for the lossless compression of data

4 Terms and definitions

For the purpose of this International Standard the following terms and definitions apply.

4.1 Access Point
A location in the Encoded Data Stream at which data may be decoded.

4.2 Control Symbol
A Control Symbol may change the compression scheme, reset the History Buffer, mark the end of a Record, indicate a File Mark, or indicate the termination of an Encoded Data Stream.

4.3 Copy Pointer
A part of the Encoded Data Stream output in scheme 1 that replaces a string of data bytes with a specification of a Matching String.

4.4 data byte
An element of user data that is to be encoded.

4.5 Data Symbol
An element of an Encoded Record that represents one or more data bytes.

4.6 Displacement Field
A field in the Copy Pointer that specifies the location within the History Buffer of the first byte of a Matching String.

4.7 Encoded Data Stream
The output stream after encoding user data.

4.8 Encoded Record
The output stream after encoding one Record of user data.

4.9 End Marker
A Control Symbol that denotes termination of an Encoded Data Stream.

4.10 End Of Record Symbol (EOR Symbol)
A Control Symbol that denotes the end of a Record in the Encoded Data Stream.

4.11 File Mark
A recorded element used to mark organisational boundaries (e.g. directory boundaries) in user data.

4.12 File Mark Symbol
A Control Symbol in the Encoded Data Stream that denotes a File Mark in user data.

4.13 Flush Symbol
A Control Symbol that, if required, is followed by Pad to make the size of the Encoded Data Stream an integer multiple of 32 bits.

4.14 History Buffer
A data structure where incoming data bytes are stored for use by scheme 1 compression and decompression.

4.15 Literal 1
A part of the Encoded Data Stream, output in scheme 1, that represents a single data byte not encoded into any Copy Pointer.

4.16 Literal 2
A part of the Encoded Data Stream, output in scheme 2, that represents a single data byte.

4.17 Matching String
A sequence of two or more bytes in the History Buffer that is identical with a sequence of bytes in the user data.

4.18 Match Count
The length, in bytes, of a Matching String.

4.19 Match Count Field
That part of a Copy Pointer that specifies the Match Count.

4.20 Pad
A number of bits inserted
into the Encoded Data Stream so that the size of the Encoded Data Stream is an integer multiple of 32 bits.

4.21 Record
An element of user data that contains at least one data byte.

4.22 Record Segment
A section of a Record encoded in a given scheme.

4.23 Reset X Symbol
A generic reference to either the Reset 1 Symbol or the Reset 2 Symbol.

4.24 Reset 1 Symbol
A Control Symbol that indicates History Buffer reset, and that subsequent symbols are encoded in scheme 1.

4.25 Reset 2 Symbol
A Control Symbol that indicates History Buffer reset, and that subsequent symbols are encoded in scheme 2.

4.26 scheme 1
A compression scheme that uses a History Buffer to achieve data compression.

4.27 Scheme 1 Symbol
A Control Symbol that indicates subsequent Data Symbols are either Copy Pointers or Literal 1s.

4.28 scheme 2
A packing scheme designed to encode uncompressible data with minimal expansion.

4.29 Scheme 2 Symbol
A Control Symbol that indicates subsequent Data Symbols are encoded in scheme 2.

4.30 user data
Information that is to be encoded, according to this compression algorithm.

5 Conventions and Notations

5.1 Representation of numbers
The following conventions and notations apply in this document unless otherwise stated.
The setting of bits is denoted by ZERO or ONE.
Numbers in binary notation and bit combinations are strings of digits represented by ZEROs and ONEs with the most significant bit to the left.
Letters and digits
in parentheses represent numbers in hexadecimal notation.
All other numbers are in decimal form.

5.2 Names
The names of basic elements, e.g. specific fields, are written with a capital initial letter.

6 Acronyms

EOR End Of Record
lsb least significant bit
msb most significant bit

7 Algorithm Overview

User data that is to be compressed according to this International Standard consists of Records and File Marks. Records consist of 8-bit data bytes, and may be of any non-zero length. Data bytes may be encoded in either scheme 1 or scheme 2.

7.1 Scheme 1 Encoding
There may exist within Records repeating strings of two or more data bytes such that information about the length and position of one string may be substituted in place of a subsequent copy or copies of that same string. This information is known as a Copy Pointer. This International Standard allows Copy Pointer substitution when corresponding bytes of the two strings are offset by 1 to 1 023 data bytes within user data. Where string matches occur, the number of bits of encoded data can be less than the number of bits of user data, and data compression is possible. Any data bytes that are part
of a repeated string may be encoded as a Copy Pointer. Any data byte that is not encoded as a Copy Pointer is encoded as a Literal 1, in which a leading bit set to ZERO is added to the data byte, thereby indicating that this is a Literal 1. Regions over which Copy Pointers and literal values are encoded are defined as being encoded according to scheme 1. Scheme 1 encoding is identical with that of ISO/IEC 15200, except for the addition of Control Symbols. These are both implementations of the Lempel-Ziv 1 (LZ1) class of data compression algorithms.

Following a Reset 1 Symbol or a Scheme 1 Symbol, all bytes of user data shall be encoded according to scheme 1.

7.2 Scheme 2 Encoding
There may also exist within user data regions in which few such repeating strings exist. Where there are no repeating strings, scheme 1 encoding requires a 9-bit Literal 1 value in the Encoded Data Stream for every data byte. This results in an Encoded Data Stream that has 12,5 % more bits than the user data. In order to avoid this data expansion, scheme 2 encoding may be used. In scheme 2 encoding, data bytes are copied to the output bit stream. In order for a decoder to distinguish a data byte set to (FF) from a Control Symbol, a trailing bit set to ZERO is encoded following every data byte of (FF). For random data, this tends to produce an Encoded Data Stream that has about 0,05 % more bits than the user data.

Following a Reset 2 Symbol or a Scheme 2 Symbol, all bytes of user data shall be encoded
according to scheme 2.

7.3 History Buffer
Matching Strings are found within a 1 024-byte History Buffer. Prior to a Reset X Symbol in the Encoded Data Stream, the History Buffer is undefined. Immediately following a Reset X Symbol, the History Buffer is defined as containing no data.

As the first 1 024 data bytes following a Reset X Symbol are recorded, each byte is stored in a subsequent location in the History Buffer, from 0 to 1 023. For each data byte N, comparisons may be made with each of the data bytes at locations 0 to N-1 to test for Matching Strings.

Once the History Buffer is filled, new bytes replace previously stored bytes in locations 0 to 1 023. The storage location wraps from 1 023 to 0. For a data byte stored at location N, comparisons may be made with each of the data bytes at locations other than N, to test for Matching Strings. Matching Strings may wrap around the end of the History Buffer (e.g. Offset 1 022, Length 10).

By updating the History Buffer identically during decoding, the decoder History Buffer shall be identical, after outputting any specific data byte, with the encoder History Buffer after encoding that same data byte. It is, therefore, not necessary to separately include history content information within the Encoded Data Stream. This International Standard does not specify the conditions under which to reset the History Buffer