ETSI TS 123 042 V15.0.0 (2018-06): Digital cellular telecommunications system (Phase 2+) (GSM); Universal Mobile Telecommunications System (UMTS); LTE; Compression algorithm for text messaging services


ETSI TS 123 042 V15.0.0 (2018-06)

Digital cellular telecommunications system (Phase 2+) (GSM); Universal Mobile Telecommunications System (UMTS); LTE; Compression algorithm for text messaging services (3GPP TS 23.042 version 15.0.0 Release 15)

TECHNICAL SPECIFICATION

Reference
RTS/TSGC-0123042vf00

Keywords
GSM, LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2018. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members. 3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. oneM2M logo is protected for the benefit of its Members. GSM® and the GSM logo are trademarks registered and owned by the GSM Association.

Intellectual Property Rights

Essential patents

IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Trademarks

The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners. ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal
forms for the expression of provisions).

"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Abbreviations
4 Algorithms
4.1 Huffman Coding
4.2 Character Groups
4.3 UCS2
4.4 Keywords
4.5 Punctuation
4.6 Character Sets
5 Compressed Data Streams
5.1 Structure
5.2 Compression Header
5.2.1 Compression Header - Octet 1
5.2.2 Compression Header - Octets 2 to n
5.2.2.1 Compression Header reserved extension types and values
5.2.3 Identifying unique parameter sets
5.3 Compressed Data
5.4 Compression Footer
6 Compression processes
6.1 Overview
6.1.1 Compression
6.1.2 Decompression
6.2 Character sets
6.2.1 Initialization
6.2.2 Character set conversion
6.2.3 Character case conversion
6.3 Punctuation processing
6.3.1 Initialization
6.3.2 Compression
6.3.3 Decompression
6.4 Keywords
6.4.1 Dictionaries
6.4.2 Groups
6.4.3 Matches
6.4.4 Initialization
6.4.5 Compression
6.4.6 Decompression
6.5 UCS2
6.5.1 Initialization
6.5.2 Compression
6.5.3 Decompression
6.6 Character group processing
6.6.1 Character Groups
6.6.2 Initialization
6.6.3 Compression
6.6.4 Decompression
6.7 Huffman coding
6.7.1 Initialization Overview
6.7.2 Initialization
6.7.3 Build Tree
6.7.4 Update Tree
6.7.5 Add New Node
6.7.6 Compression
6.7.7 Decompression
7 Test Vectors
Annex A (normative): German Language parameters
A.1 Compression Language Context
A.2 Punctuators
A.3 Keyword Dictionaries
A.4 Character Groups
A.5 Huffman Initializations
Annex B (normative): English language parameters
B.1 Compression Language Context
B.2 Punctuators
B.3 Keyword Dictionaries
B.4 Character Groups
B.5 Huffman Initializations
Annex C (normative): Italian Language parameters
Annex D (normative): French Language parameters
Annex E (normative): Spanish Language parameters
Annex F (normative): Dutch Language parameters
Annex G (normative): Swedish Language parameters
Annex H (normative): Danish Language parameters
Annex J (normative): Portuguese Language parameters
Annex K (normative): Finnish Language parameters
Annex L (normative): Norwegian Language parameters
Annex M (normative): Greek Language parameters
Annex N (normative): Turkish Language parameters
Annex P (normative): Reserved
Annex Q (normative): Reserved
Annex R (normative): Default Parameters for Unspecified Language
R.1 Compression Language Context
R.2 Punctuators
R.3 Keyword Dictionaries
R.4 Character Groups
R.5 Huffman Initializations
Annex S (informative): Change history
History

Foreword

This Technical Specification has been produced by the 3GPP.

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of this TS, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 Indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the specification;

Introduction

This clause introduces the concepts and mechanisms involved in the compression and decompression of a stream of data.

Overview

Central to the compression of a stream of data and the subsequent recovery of the original data is that both sender and receiver have information that describes not only the content of the data stream, but also how the stream is encoded. For example, a simple rule such as "it's 8 bit data" is enough to transport any character value in the range 0 to 255, with 8 bits being required for each and every character. In contrast, if both sender and receiver know that some characters are more frequent than others, then the more frequent might be encoded in fewer bits while the less frequent in more, resulting in a net reduction of the total number of bits used to express the data stream.

This knowledge of the nature of the data stream can be established in two ways. Either both sender and receiver can agree some key aspects of the data stream prior to it being processed, or key aspects of the data can be garnered dynamically during its processing.

The disadvantage of an approach based on "prior information" is that it must be known. It can either be carried as a header to the data stream, in which case it adds to the net size of the compressed stream, or it can be fixed and known to the (de)compression algorithm itself, in which case compression performance degrades as a given stream diverges in nature from these fixed and known states.

In contrast, the disadvantage of "dynamic information" is that it must be discovered; typically this means a greater processing requirement for the (de)compressor. It also implies that compression performance is initially poor as the algorithm has to "learn" about the data stream before it can apply this knowledge. It will also require greater working memory to store its knowledge about the data stream.

The choice of compression algorithms is always a balancing of compression rate (in terms of fewer output bits), working memory requirements of the (de)compressor and CPU bandwidth. For the compression of SMS messages, there is the additional requirement that it should work well (in terms of compression rate) even on short data streams.

Compression / Decompression is an optional feature but when implemented, the only mandatory requirement is Raw Untrained Dynamic Huffman. The default initialisation for the Huffman Encoder / Decoder operating in the Raw Untrained Dynamic Huffman mode is defined in annex R (see also subclause 4.1), i.e. there is no need for any pre-defined attributes such as language dependency to be included. This is of particular significance for entities such as an MS which may have memory storage constraints.

1 Scope

The present document introduces the concepts and mechanisms involved in the compression and decompression of a stream of data.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

2.1 Normative references

[1] 3GPP TS 23.038: "Alphabets and language-specific information".

2.2 Informative references

[2] "The Data Compression Handbook 2nd Edition" by Mark Nelson and Jean-Loup Gailly, published by M

[...]

or b) the (de)coder can adapt the frequency distribution it uses to (de)code characters based on the incidence of previous characters within the input stream. In both cases, the
character frequency distribution is represented in a "tree" structure, an example of which is shown in figure 1.

[Figure 1: Character frequency distribution. Character nodes: "Z" f=1, "W" f=1, "T" f=4, "R" f=6, "A" f=10, "O" f=10, "E" f=40. Intermediate nodes: f=2, f=6, f=12, f=20, f=32. Root node: f=72.]

The tree represents the characters Z, W, T, R, A, O and E, which have frequencies of 1, 1, 4, 6, 10, 10 and 40 respectively. The characters may be coded as variable length bit streams by starting at the "character node" and ascending to the "root node". At each stage, if a left hand path is traversed, a 0 bit is emitted and if a right hand path is traversed a 1 bit is emitted. Thus the infrequent Z and W would require 5 bits, whereas the most frequent character E requires just 1 bit.

The resulting bit stream is decoded by starting at the "root node" and descending the tree, to the left or right depending on the value of the current bit, until a "character node" is reached.

It is a requirement that at any time the trees expressing the character frequencies shall be identical for both coder and decoder. This can be achieved in a number of ways.

Firstly, both coder and decoder could use a fixed and pre-agreed frequency distribution that includes all possible characters but, as noted above, this use of "prior information" suffers when a given input stream has a significantly different character frequency distribution.

Secondly, the coder may calculate the character frequency distribution for the entire input stream and prepend this information to the encoded bit stream. The decoder would then generate the appropriate tree prior to processing the bitstream. This approach offers good compression, especially if the character frequency information may itself be compressed in some manner. Approaches of this type are common but the cost of the prepended information for a potentially small data stream makes it less attractive.

Thirdly, the algorithm may be extended such that both coder and decoder start with known frequency distributions and subsequently adapt these distributions to reflect the addition of each character in the input stream. One possibility is to have initial distributions that encompass all possible characters so that all that is required, as each input character is processed, is to increment the appropriate frequency and update the tree. However, the inclusion of all possible ch
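
The bit counts quoted for figure 1 can be checked with a short sketch. The Python below is illustrative only and is not taken from the specification: it builds a tree from the figure 1 frequencies using the textbook "merge the two least frequent nodes" rule (an assumption about how the figure was constructed) and reports how many bits each character needs. The names `huffman_code_lengths` and `FIGURE_1` are hypothetical.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table.

    Textbook Huffman construction: repeatedly merge the two least
    frequent nodes. Every merge adds one level above the symbols in
    the merged nodes, so their code length grows by one bit.
    """
    tiebreak = count()  # unique counter so heap entries never compare lists
    heap = [(f, next(tiebreak), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in freqs}
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), syms1 + syms2))
    return lengths


# Character frequencies as listed for figure 1.
FIGURE_1 = {"Z": 1, "W": 1, "T": 4, "R": 6, "A": 10, "O": 10, "E": 40}

lengths = huffman_code_lengths(FIGURE_1)
total_bits = sum(FIGURE_1[sym] * lengths[sym] for sym in FIGURE_1)
fixed_bits = 8 * sum(FIGURE_1.values())

print(lengths)
print(total_bits, "bits with variable length codes,", fixed_bits, "bits at 8 bits per character")
```

Under these assumptions Z and W come out at 5 bits and E at 1 bit, matching the text, and the 72 characters need 144 bits in total rather than the 576 bits of a fixed 8 bit encoding.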

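
The third, adaptive approach can be sketched in the same way. This is a simplified illustration and not the tree update procedure of the specification: it assumes a small pre-agreed alphabet in which every character starts with frequency 1, rebuilds the whole code table before each character instead of updating a tree incrementally, and passes the character count to the decoder out of band. The functions `build_codes`, `encode` and `decode` are hypothetical names.

```python
import heapq
from itertools import count


def build_codes(freqs):
    """Derive a prefix code {symbol: bit string} from a frequency table.

    Symbols are visited in sorted order and ties are broken by a counter,
    so coder and decoder always derive exactly the same code table.
    Assumes the alphabet holds at least two symbols.
    """
    tiebreak = count()
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, left = heapq.heappop(heap)
        f1, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]


def encode(text, alphabet):
    """Code each character with the current table, then update the counts."""
    freqs = {ch: 1 for ch in alphabet}  # assumed initial distribution
    bits = []
    for ch in text:
        bits.append(build_codes(freqs)[ch])
        freqs[ch] += 1
    return "".join(bits)


def decode(bits, alphabet, n_chars):
    """Mirror of encode(): same initial counts, same update after each character."""
    freqs = {ch: 1 for ch in alphabet}
    out, pos = [], 0
    for _ in range(n_chars):
        inverse = {code: sym for sym, code in build_codes(freqs).items()}
        code = ""
        while code not in inverse:  # prefix-free, so the first match is the symbol
            code += bits[pos]
            pos += 1
        out.append(inverse[code])
        freqs[inverse[code]] += 1
    return "".join(out)


ALPHABET = "EHST "  # hypothetical pre-agreed alphabet
MESSAGE = "SEE THE TEST"
bits = encode(MESSAGE, ALPHABET)
assert decode(bits, ALPHABET, len(MESSAGE)) == MESSAGE
print(len(bits), "bits versus", 8 * len(MESSAGE), "bits at 8 bits per character")
```

Because both sides apply the same update after every character, their frequency tables stay identical throughout, which is the requirement stated above for coder and decoder trees. Rebuilding the full table at each step keeps the sketch short at the price of extra processing.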