Standard ECMA-128
2nd Edition - December 1999

Standardizing Information and Communication Systems

8-Bit Single-Byte Coded Graphic Character Sets: Latin Alphabet No. 5

Phone: +41 22 849.60.00 - Fax: +41 22 849.60.01 - URL: http://www.ecma.ch - Internet: helpdesk@ecma.ch

Brief History

The adoption of ECMA-6 (ISO/IEC 646) as the agreed international 7-bit code for information interchange had led to the development of many national, international and application-oriented versions of this code. These versions had a number of limitations generally inherent to the size of the code:

- they did not provide all graphic characters which were needed;
- for some characters, especially for accented letters, it was necessary to resort to BACKSPACE sequences, which created problems when processing data containing such composite characters;
- interchange among different versions was practically limited to the 82 common graphic characters.

With the advent of 8-bit coding it was possible to increase the number of graphic characters. ISO/IEC 6937, for example, provided a character set covering the requirements of most languages based on the Latin alphabet. This character set, although well suited for text communication, was difficult to use for
processing as some graphic characters were represented by one and others by two bit combinations.

Thus the need was recognized for coded graphic character sets, each of which:

- is the same for all users of a given area,
- provides single-byte coding of all graphic characters, thus permitting easy processing,
- takes into account character sets used in the industry.

In 1982 the urgency of the need for an 8-bit single-byte coded character set was recognized in ECMA as well as in ANSI/X3L2 and numerous working papers were exchanged between the two groups. In February 1984 ECMA TC1 submitted to ISO/TC97/SC2
a proposal for such a coded character set. At its meeting of April 1984 SC2 decided to submit to TC97 a proposal for a new item of work for this topic. Technical discussions during and after this meeting led TC1 to adopt the coding scheme proposed by X3L2. International Standard ISO/IEC 8859-1 is based on this joint ANSI/ECMA proposal. ECMA published its corresponding Standard ECMA-94 in March 1985.

After this first publication, the work of ECMA TC1 on further coded graphic character sets has led to the following results:

i. The first Edition, dated June 1988, of a Standard for Latin Alphabet No. 5. This ECMA Standard has been derived from Latin Alphabet No. 1 in which six Icelandic letters were replaced by letters required for the Turkish language.

ii. The second Edition of Standard ECMA-94, dated June 1986, comprising four coded graphic character sets for the Latin script, identified as Latin Alphabets No. 1 to No. 4. These alphabets have a number of characters in common, in particular those allocated to columns 02 to 07. They have all been submitted to ISO/IEC JTC 1 - the successor of ISO/TC97 - and are the subject of ISO/IEC 8859, Parts 1 to 4.

iii. A series of ECMA Standards for coded graphic character sets comprising those characters of the Latin Alphabets allocated to columns 02 to 07 and characters of another script for multiple-language applications. These Standards ECMA-113, ECMA-114, ECMA-118 and ECMA-121 cover the Cyrillic, Arabic, Greek and Hebrew scripts, respectively. They have been submitted to JTC 1 for further processing as ISO/IEC standards and have been published as Part 5, Part 6, Part 7 and Part 8, respectively, of ISO/IEC 8859.

In 1999 the 2nd Edition of ISO/IEC 8859-9 was published, as a technical revision of the 1st Edition of this International Standard. The
2nd Edition of ECMA-128 has been made technically identical with the 2nd Edition of ISO/IEC 8859-9.

This 2nd edition of Standard ECMA-128 has been adopted by the ECMA General Assembly of December 1999.

Table of contents

1 Scope
2 Conformance
2.1 Conformance of information interchange
2.2 Conformance of devices
2.2.1 Device description
2.2.2 Originating devices
2.2.3 Receiving devices
3 References
4 Definitions
4.1 bit combination
4.2 byte
4.3 character
4.4 code table
4.5 coded character set; code
4.6 coded-character-data-element (CC-data-element)
4.7 graphic character
4.8 graphic symbol
4.9 position
5 Notation, code table and names
5.1 Notation
5.2 Layout of the code table
5.3 Names and meanings
5.3.1 SPACE (SP)
5.3.2 NO-BREAK SPACE (NBSP)
5.3.3 SOFT HYPHEN (SHY)
6 Specification of the coded character set
6.1 Characters of the set and their coded representation
6.2 Code table
7 Identification of the character set
7.1 Identification according to ECMA-35 and ECMA-43
7.2 Identification using the ISO International register of coded character sets to be used with escape sequences
Annex A - Coverage of languages
Annex B - Main differences between the first edition and this second edition of ECMA-128
Annex C - Bibliography
Annex D - Identification according to ISO/IEC 8824-1 (ASN.1)

1 Scope

This ECMA Standard specifies a set of 191 coded graphic characters identified as Latin alphabet No. 5.

This set of coded graphic characters is intended for use in data and text processing applications and also for information interchange. The set contains graphic characters used for general purpose applications in typical office environments in at least the following languages:

Albanian, Basque, Breton, Catalan, Danish, Dutch, English, Faroese, Finnish, French (with restrictions, see annex A.1, Notes), Frisian, Galician, German, Greenlandic, Irish Gaelic (new orthography), Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish and Turkish.

This set of coded graphic characters may be regarded as a version of an 8-bit code according to Standard ECMA-35 or Standard ECMA-43 at level 1.

This ECMA Standard may not be used in conjunction with any other ECMA Standards for 8-bit single-byte coded graphic character sets. If coded characters from more than one ECMA Standard are to be used together, by means of code extension techniques, the equivalent coded character sets from
ISO/IEC 10367 should be used instead within a version of Standard ECMA-43 at level 2 or level 3.

The coded characters in this set may be used in conjunction with coded control functions selected from ECMA-48. However, control functions are not used to create composite graphic symbols from two or more graphic characters (see clause 6).

NOTE
This ECMA Standard is not intended for use with Telematic services defined by ITU-T. If information coded according to this ECMA Standard is to be transferred to such services, it will have to conform to the requirements of those services at the access-point.

2 Conformance

2.1 Conformance of information interchange

A coded-character-data-element (CC-data-element) within coded information for interchange is in conformance with this ECMA Standard if all the coded representations of graphic characters within that CC-data-element conform to the requirements of clause 6.

2.2 Conformance of devices

A device is in conformance with this ECMA Standard if it conforms to the requirements of 2.2.1, and either or both of 2.2.2 and 2.2.3. A claim of conformance shall identify the document which contains the description specified in 2.2.1.

2.2.1 Device description

A device that conforms to this ECMA Standard shall be the subject of a description that identifies the means by which the user may supply characters to the device, or may recognize them when they are made available to him, as specified respectively in 2.2.2 and 2.2.3.

2.2.2 Originating devices

An originating device shall
allow its user to supply any sequence of characters from those specified in clause 6, and shall be capable of transmitting their coded representations within a CC-data-element.

2.2.3 Receiving devices

A receiving device shall be capable of receiving and interpreting any coded representations of characters that are within a CC-data-element, and that conform to clause 6, and shall make the corresponding characters available to its user in such a way that the user can identify them from among those specified there, and can distinguish them from each other.

3 References

ECMA-6     7-Bit Input/Output Coded Character Set
ECMA-35    Code Extension Techniques
ECMA-43    8-Bit Coded Character Set Structure and Rules
ECMA-48    Control Functions for Coded Character Sets
ECMA-94    8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4
ECMA-113   8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet
ECMA-118   8-Bit Single-Byte Coded Graphic Character Sets - Latin/Greek Alphabet
ECMA-144   8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabet No. 6
ISO/IEC 8824-1:1995, Information technology - Abstract Syntax Notation One (ASN.1): Specification of basic notation.

4 Definitions

For the purpose of this Standard the following definitions apply.

4.1 bit combination
An ordered set of bits used for the representation of characters.

4.2 byte
A bit string that is operated upon as a unit.

4.3 character
A member of a set of elements used for the organization, control, or representation of data.

4.4 code table
A table showing the characters allocated to each bit combination in a code.

4.5 coded character set; code
A set of unambiguous rules that establishes a character set and the one-to-one relationship between the characters of the set and their bit combinations.

4.6 coded-character-data-element (CC-data-element)
An element of interchanged information that is specified to consist of a sequence of coded representations of characters, in accordance with one or more identified standards for coded character sets.

4.7 graphic character
A character, other than a control function, that has a visual representation normally hand-written, printed or displayed, and that has a coded representation consisting of one or more bit combinations.

NOTE
In the 8-bit single-byte coded graphic character sets a single bit combination is used to represent each character.

4.8 graphic symbol
A visual representation of a graphic character or of a control function.

4.9 position
That part of a code table identified by its column and row co-ordinates.

5 Notation, code table and names

5.1 Notation

The bits of the bit combinations of the 8-bit code are identified by b8, b7, b6, b5, b4, b3, b2 and b1, where b8 is the highest-order, or most-significant bit and b1 is the lowest-order, or least-significant bit.

The bit combinations may be interpreted to represent numbers in binary notation by attributing the following weights to the individual bits:

Bit      b8   b7   b6   b5   b4   b3   b2   b1
Weight   128  64   32   16   8    4    2    1

Using these weights, the bit combinations are identified by notations of the form xx/yy, where xx and yy are numbers in the range 00 to 15. The correspondence between the notations of the form xx/yy and the bit combinations consisting of the bits b8 to b1 is as follows:

- xx is the number represented by b8, b7, b6 and b5
where these bits are given the weights 8, 4, 2, and 1, respectively.
- yy is the number represented by b4, b3, b2 and b1 where these bits are given the weights 8, 4, 2, and 1, respectively.

The bit combinations are also identified by notations of the form hk, where h and k are numbers in the range 0 to F in hexadecimal notation. The number h is the same as the number xx described above, and the number k the same as the number yy described above.

5.2 Layout of the code table

An 8-bit code table consists of 256 positions arranged in 16 columns and 16 rows. The columns and the rows are numbered 00 to 15. In hexadecimal notation the columns and the rows are numbered 0 to F.

The code table positions are identified by notations of the form xx/yy, where xx is the column number and yy is the row number. The column and row numbers are shown at the top and left edges of the table, respectively. The code table positions are also identified by notations of the form hk, where h is the column number and k is the row number in hexadecimal notation. The column and row numbers are shown at the bottom and right edges of the table, respectively.

The positions of the code table are in one-to-one correspondence with the
bit combinations of the code. The notation of a code table position, of the form xx/yy, or of the form hk, is the same as that of the corresponding bit combination.

5.3 Names and meanings

This ECMA Standard assigns a unique name and a unique identifier to each graphic character. These names and identifiers have been taken from ISO/IEC 10646-1. This ECMA Standard also specifies an acronym for each of the characters SPACE, NO-BREAK SPACE and SOFT HYPHEN. For acronyms only Latin capital letters A to Z are used. It is intended that the acronyms be retained in all translations of the text.

Except for SPACE (SP), NO-BREAK SPACE (NBSP), and SOFT HYPHEN (SHY), this ECMA Standard does not define and does not restrict the meanings of graphic characters.

This ECMA Standard specifies a graphic symbol for each graphic character. This symbol is shown in the corresponding position of the code table. However, this Standard does not specify a particular style or font design for imaging graphic characters.

5.3.1 SPACE (SP)
A graphic character the visual representation of which consists of the absence of a graphic symbol.

5.3.2 NO-BREAK SPACE (NBSP)
A graphic character the visual representation of which consists of the absence of a graphic symbol, for use when a line break is to be prevented in the text as presented.

5.3.3 SOFT HYPHEN (SHY)
A graphic character that is imaged by a graphic symbol identical with, or similar to, that representing HYPHEN, for use when a line break has been established within a word.

6 Specification of the coded character set

This ECMA Standard specifies 191 characters allocated to the bit combinations of the code table (table 2). None of these characters are combining characters.

NOTE
Combining characters are described in Standard ECMA-35 in 6.3.3.

6.1 Characters of the set and their coded representation

See table 1.

Table 1 - Character set, coded representation

Bit combination   Hex   Identifier   Name
02/00             20    U+0020       SPACE
02/01             21    U+0021       EXCLAMATION MARK
02/02             22    U+0022       QUOTATION MARK
02/03             23    U+0023       NUMBER SIGN
02/04             24    U+0024       DOLLAR SIGN
02/05             25    U+0025       PERCENT SIGN
02/06             26    U+0026       AMPERSAND
02/07             27    U+0027       APOSTROPHE
02/08             28    U+0028       LEFT PARENTHESIS
02/09             29    U+0029       RIGHT PARENTHESIS
02/10             2A    U+002A       ASTERISK
02/11             2B    U+002B       PLUS SIGN
02/12             2C    U+002C       COMMA
02/13             2D    U+002D       HYPHEN-MINUS
02/14             2E    U+002E       FULL STOP
02/15             2F    U+002F       SOLIDUS
03/00             30    U+0030       DIGIT ZERO
03/01             31    U+0031       DIGIT ONE
03/02             32    U+0032       DIGIT TWO
03/03             33    U+0033       DIGIT THREE
03/04             34    U+0034       DIGIT FOUR
03/05             35    U+0035       DIGIT FIVE
03/06             36    U+0036       DIGIT SIX
03/07             37    U+0037       DIGIT SEVEN
03/08             38    U+0038       DIGIT EIGHT
03/09             39    U+0039       DIGIT NINE
03/10             3A    U+003A       COLON
03/11             3B    U+003B       SEMICOLON
03/12             3C    U+003C       LESS-THAN SIGN
03/13
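
As a minimal sketch of the notation in clause 5.1, as it appears in the first three columns of table 1, the Python below converts between the xx/yy form, the byte value and the hk form. The function names are hypothetical and not part of this Standard. Using Python's built-in "iso8859_9" codec to obtain the identifier is an assumption: it presumes that the codec matches this edition's code table, which the Brief History describes as technically identical with ISO/IEC 8859-9.

```python
def position_to_byte(position: str) -> int:
    """Convert an 'xx/yy' notation (column/row, each 00 to 15) to a byte value."""
    column_text, row_text = position.split("/")
    column, row = int(column_text), int(row_text)
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError(f"column and row must be in the range 00 to 15: {position!r}")
    # xx carries bits b8..b5 (weights 128, 64, 32, 16);
    # yy carries bits b4..b1 (weights 8, 4, 2, 1).
    return column * 16 + row


def byte_to_position(value: int) -> str:
    """Convert a byte value (0 to 255) back to the 'xx/yy' notation."""
    if not 0 <= value <= 255:
        raise ValueError(f"byte value out of range: {value}")
    column, row = divmod(value, 16)
    return f"{column:02d}/{row:02d}"


def byte_to_hex(value: int) -> str:
    """Render a byte value in the two-digit 'hk' hexadecimal notation."""
    if not 0 <= value <= 255:
        raise ValueError(f"byte value out of range: {value}")
    return f"{value:02X}"


def identifier(value: int) -> str:
    """Look up the U+ identifier for a byte value.

    Assumption: Python's 'iso8859_9' codec is taken to agree with table 1.
    """
    character = bytes([value]).decode("iso8859_9")
    return f"U+{ord(character):04X}"
```

For example, position_to_byte("02/10") gives 42, which byte_to_hex renders as "2A" and identifier renders as "U+002A", in agreement with the ASTERISK row of table 1.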