ECMA 94-1986 8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabets No 1 to No 4 (2nd Edition).pdf

Uploaded by: appealoxygen216 Document ID: 704850 Upload date: 2019-01-03 Format: PDF Pages: 37 Size: 1.47 MB
Resource description

ECMA
EUROPEAN COMPUTER MANUFACTURERS ASSOCIATION

STANDARD ECMA-94

8-BIT SINGLE-BYTE CODED GRAPHIC CHARACTER SETS
LATIN ALPHABETS No. 1 TO No. 4

2nd Edition - June 1986

BRIEF HISTORY

The adoption of ECMA-6 (ISO 646) as the agreed international 7-bit code for information interchange had led to the development of many national, international and application-oriented versions of this code which are in wide use today. These versions have a number of limitations generally inherent to the size of the code:

- they do not provide all graphic characters which may be needed,
- for some characters, specially for accented letters, it is necessary to resort to BACKSPACE sequences, which creates problems when processing data containing such composite characters,
- interchange among different versions is practically limited to the 82 common graphic characters.

With the advent of 8-bit coding it was possible to increase the number of graphic characters. ISO 6937/2, for example, provides a character set covering the requirements of most languages based on the Latin alphabet. This character set, although well suited for text communication, is difficult to use for processing as some graphic characters are represented by one and others by two bit combinations.

Thus the need was recognized for coded graphic character sets, each of which:

- is the same for all users of a given area,
- provides single-byte coding of all graphic characters thus permitting easy processing,
- takes into account character sets used in the industry.

Since 1982 the urgency of the need for an 8-bit single-byte coded character set was recognized in ECMA as well as in ANSI/X3L2 and numerous working papers were exchanged between the two groups. In February 1984 ECMA TC1 submitted to ISO/TC97/SC2 a proposal for such a coded character set. At its meeting of April 1984 SC2 decided to submit to TC97 a proposal for a new item of work for this topic. Technical discussions during and after this meeting led TC1 to adopt the coding scheme proposed by X3L2.

International Standard ISO 8859/1 is based on this joint ANSI/ECMA proposal. ECMA published its corresponding Standard ECMA-94 in March 1985.

After this first publication, the work of ECMA TC1 on further coded graphic character sets has led to the following results:

i) The present second Edition of Standard ECMA-94 comprising four coded graphic character sets for the Latin script, identified as Latin Alphabets No 1 to No 4. These alphabets have a number of characters in common, in particular those allocated to columns 02 to 07. Latin Alphabet No 2 has been submitted to ISO and is the subject of ISO 8859/2.

ii) A series of new ECMA Standards for coded graphic character sets comprising those characters of the Latin Alphabets allocated to columns 02 to 07 and characters of another script for multiple-language applications. These ECMA standards cover the Cyrillic, Greek and Arabic scripts. It is intended to submit them to ISO for further processing as ISO standards.

Adopted as an ECMA Standard by the General Assembly of June 26, 1986.

TABLE OF CONTENTS

1. SCOPE
2. FIELD OF APPLICATION
3. CONFORMANCE
4. REFERENCES
5. DEFINITIONS
   5.1 Bit Combinations; Byte
   5.2 Character
   5.3 Coded Character Set; Code
   5.4 Code Table
   5.5 Graphic Character
   5.6 Graphic Symbol
   5.7 Position
6. NOTATION, CODE TABLE AND NAMES
   6.1 Notation
   6.2 Layout of the Code Tables
   6.3 Names and Meanings
       6.3.1 SPACE (SP)
       6.3.2 NO-BREAK SPACE (NBSP)
       6.3.3 SOFT HYPHEN (SHY)
7. SPECIFICATION OF THE FOUR LATIN ALPHABETS
   7.1 Characters Allocated to Columns 02 to 07 in each Latin Alphabet
8. LATIN ALPHABET No 1
   8.1 Field of Application
   8.2 Characters Allocated to Columns 10 to 15
   8.3 Code Table
   8.4 Designation of the Character Set
9. LATIN ALPHABET No 2
   9.1 Field of Application
   9.2 Characters Allocated to Columns 10 to 15
   9.3 Code Table
   9.4 Designation of the Character Set
10. LATIN ALPHABET No 3
   10.1 Field of Application
   10.2 Characters Allocated to Columns 10 to 15
   10.3 Code Table
   10.4 Designation of the Character Set
   10.5 Bit Combinations Not To Be Used
11. LATIN ALPHABET No 4
   11.1 Field of Application
   11.2 Characters Allocated to Columns 10 to 15
   11.3 Code Table
   11.4 Designation of the Character Set
APPENDIX A - SYNOPSIS OF THE CHARACTERS COMMON TO ALL FOUR ALPHABETS
APPENDIX B - SYNOPSIS OF THE CHARACTERS UNIQUE TO EACH ALPHABET

1. SCOPE

This ECMA Standard defines four 8-bit coded graphic character sets identified as Latin Alphabets No 1 to No 4, and specifies the coded representation of each character by means of a single 8-bit byte. Characters common to two or more sets are allocated to the same code position. None of these characters are "non-spacing". The use of control functions, such as BACKSPACE or CARRIAGE RETURN, for the coded representation of composite characters is prohibited by this Standard.

2. FIELD OF APPLICATION

These sets of graphic characters are intended for use in data and text processing applications and may also be used for information interchange. They are suitable for use in a version of an 8-bit code according to ECMA-35 or ECMA-43.
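As an aside on the Scope clause, the single-byte rule can be illustrated with a minimal Python sketch. It uses Python's built-in "latin-1" codec (ISO 8859-1) as a stand-in for Latin Alphabet No 1; treating the two as equivalent for graphic characters is an assumption of this example, not something this text states. The helper name encode_single_byte and the BACKSPACE sequence are hypothetical and purely illustrative.

```python
def encode_single_byte(text: str) -> bytes:
    """Encode text so that every character becomes exactly one 8-bit byte.

    Assumption: Python's "latin-1" codec stands in for Latin Alphabet No 1.
    Characters outside the set raise UnicodeEncodeError.
    """
    data = text.encode("latin-1")
    # One character, one byte: there are no multi-byte or composite codings.
    assert len(data) == len(text)
    return data


# An accented letter is a single bit combination.
print(len(encode_single_byte("é")))

# The older 7-bit practice built the same letter as a composite sequence:
# letter, BACKSPACE, accent. This is the usage the Scope clause prohibits.
composite = b"e" + b"\x08" + b"'"
print(len(composite))

# A non-spacing (combining) accent has no coded representation in the set.
try:
    "e\u0301".encode("latin-1")
except UnicodeEncodeError:
    print("combining acute accent is not part of the set")
```

Because every character occupies one byte, the byte offset of a character equals its character index, which is the "easy processing" property the Brief History asks for.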

A specific field of application is indicated for each character set.

3. CONFORMANCE

A set of graphic characters is in conformance with this Standard if it comprises all graphic characters specified for one of the Latin Alphabets to the exclusion of any other and if their coded representations are those specified by this Standard.

Any claim of conformance shall specify the Latin Alphabet implemented.

4. REFERENCES

ECMA-6   : 7-bit Input/Output Coded Character Set
ECMA-35  : Code Extension Techniques
ECMA-43  : 8-bit Coded Character Set - Structure and Rules
ECMA-48  : Control Functions
ECMA-113 : 8-bit Single-Byte Coded Graphic Character Set - Latin/Cyrillic Alphabet
ECMA-114 : 8-bit Single-Byte Coded Graphic Character Set - Latin/Arabic Alphabet
ECMA-    : 8-bit Single-Byte Coded Graphic Character Set - Latin/Greek Alphabet (in preparation)

5. DEFINITIONS

For the purpose of this Standard the following definitions apply:

5.1 Bit Combinations; Byte
An ordered set of bits that represents a character or is used as a part of the representation of a character.

5.2 Character
A member of a set of elements used for the organization, control or representation of data.

5.3 Coded Character Set; Code
A set of unambiguous rules that establishes a character set and the one-to-one relationship between each character of the set and its coded representation.

5.4 Code Table
A table showing the character allocated to each bit combination in a code.

5.5 Graphic Character
A character, other than a control function, that has a visual representation normally handwritten, printed or displayed, and that has a coded representation consisting of one or more bit combinations.

Note 1
In this Standard a single bit combination is used to represent each graphic character.

5.6 Graphic Symbol
A visual representation of a graphic character.

5.7 Position
That part of a code table identified by its column and row co-ordinates.

6. NOTATION, CODE TABLE AND NAMES

6.1 Notation

The bits of the bit combinations of the 8-bit code are identified by b8, b7, b6, b5, b4, b3, b2 and b1, where b8 is the highest-order, or most-significant bit and b1 is the lowest-order, or least-significant bit.

The bit combinations may be interpreted to represent numbers in binary notation by attributing the following weights to the individual bits:

Bit      b8   b7   b6   b5   b4   b3   b2   b1
Weight   128  64   32   16   8    4    2    1

Using these weights, the bit combinations of the 8-bit code are interpreted to represent numbers in the range 0 to 255.

In this Standard, the bit combinations are identified by notations of the form xx/yy, where xx and yy are numbers in the range 00 to 15. The correspondence between the notations of the form xx/yy and the bit combinations consisting of the bits b8 to b1 is as follows:

- xx is the number represented by b8, b7, b6 and b5, where these bits are given the weights 8, 4, 2 and 1 respectively;
- yy is the number represented by b4, b3, b2 and b1, where these bits are given the weights 8, 4, 2 and 1 respectively.

6.2 Layout of the Code Tables

Each 8-bit code table consists of 256 positions arranged in 16 columns and 16 rows. The columns and the rows are numbered 00 to 15.

The code table positions are identified by notations of the form xx/yy, where xx is the column number and yy is the row number.

The positions of the code table are in one-to-one correspondence with the bit combinations of the code. The notation of a code table position, of the form xx/yy, is the same as that of the corresponding bit combination.

6.3 Names and Meanings

This Standard assigns at least one name to each character. In addition, it specifies a graphic symbol for each graphic character. By convention only capital letters, the graphic symbols of small letters and hyphens are used for writing the names of the characters.

The names chosen to denote graphic characters are intended to reflect their customary meaning. However, except for SPACE (SP), NO-BREAK SPACE (NBSP) and SOFT HYPHEN (SHY), this Standard does not define and does not restrict the meanings of graphic characters. Neither does it specify a particular style or font design for imaging graphic characters.

6.3.1 SPACE (SP)
This character may be interpreted as a graphic character, a control character or as both. As a graphic character it has the visual representation consisting of the absence of a graphic symbol.

6.3.2 NO-BREAK SPACE (NBSP)
A graphic character the visual representation of which consists of the absence of a graphic symbol, for use when a line break is to be prevented in the text as presented.

6.3.3 SOFT HYPHEN (SHY)
A graphic character that is imaged by a graphic symbol identical with, or similar to, that representing HYPHEN, for use when a line break is permitted in the text as presented.

7. SPECIFICATION OF THE FOUR LATIN ALPHABETS

The four Latin Alphabets have 128 common characters, namely the 95 characters allocated to columns 02 to 07 and 33 characters allocated to code positions in columns 10 to 15.

For the sake of simplicity the characters allocated to columns 02 to 07 are listed once only.

The characters in columns 10 to 15 are listed for each Latin Alphabet and the complete code table is shown for each of them.

7.1 Characters Allocated to Columns 02 to 07 in each Latin Alphabet

Bit Combination   Name
02/00             SPACE
02/01             EXCLAMATION MARK
02/02             QUOTATION MARK
02/03             NUMBER SIGN
02/04             DOLLAR SIGN
02/05             PERCENT SIGN
02/06             AMPERSAND
02/07             APOSTROPHE
02/08             LEFT PARENTHESIS
02/09             RIGHT PARENTHESIS
02/10             ASTERISK
02/11             PLUS SIGN
02/12             COMMA
02/13             HYPHEN, MINUS SIGN
02/14             FULL STOP
02/15             SOLIDUS
03/00             DIGIT ZERO
03/01             DIGIT ONE
03/02             DIGIT TWO
03/03             DIGIT THREE
03/04             DIGIT FOUR
03/05             DIGIT FIVE
03/06             DIGIT SIX
03/07             DIGIT SEVEN
03/08             DIGIT EIGHT
03/09             DIGIT NINE
03/10             COLON
03/11             SEMICOLON
03/12             LESS-THAN SIGN
03/13             EQUALS SIGN
03/14             GREATER-THAN SIGN
03/15             QUESTION MARK
04/00             COMMERCIAL AT
04/01             CAPITAL LETTER A
04/02             CAPITAL LETTER B
04/03             CAPITAL LETTER C
04/04             CAPITAL LETTER D
04/05             CAPITAL LETTER E
04/06             CAPITAL LETTER F
04/07             CAPITAL LETTER G
04/08             CAPITAL LETTER H
04/09             CAPITAL LETTER I
04/10             CAPITAL LETTER J
04/11             CAPITAL LETTER K
04/12             CAPITAL LETTER L
04/13             CAPITAL LETTER M
04/14             CAPITAL LETTER N
04/15             CAPITAL LETTER O
05/00             CAPITAL LETTER P
05/01             CAPITAL LETTER Q
05/02             CAPITAL LETTER R
05/03             CAPITAL LETTER S
05/04             CAPITAL LETTER T
05/05             CAPITAL LETTER U
05/06             CAPITAL LETTER V
05/07             CAPITAL LETTER W
05/08             CAPITAL LETTER X
05/09             CAPITAL LETTER Y
05/10             CAPITAL LETTER Z
05/11             LEFT SQUARE BRACKET
05/12             REVERSE SOLIDUS
05/13             RIGHT SQUARE BRACKET
05/14             CIRCUMFLEX ACCENT
05/15             LOW LINE
06/00             GRAVE ACCENT
06/01             SMALL LETTER a
06/02             SMALL LETTER b
06/03             SMALL LETTER c
06/04             SMALL LETTER d
06/05             SMALL LETTER e
06/06             SMALL LETTER f
06/07             SMALL LETTER g
06/08             SMALL LETTER h
06/09             SMALL LETTER i
06/10             SMALL LETTER j
06/11             SMALL LETTER k
06/12             SMALL LETTER l
06/13             SMALL LETTER m
06/14             SMALL LETTER n
06/15             SMALL LETTER o
07/00             SMALL LETTER p
07/01             SMALL LETTER q
07/02             SMALL LETTER r
07/03             SMALL LETTER s
07/04             SMALL LETTER t
07/05             SMALL LETTER u
07/06             SMALL LETTER v
07/07             SMALL LETTER w
07/08             SMALL LETTER x
07/09             SMALL LETTER y
07/10             SMALL LETTER z
07/11             LEFT CURLY BRACKET
07/12             VERTICAL LINE
07/13             RIGHT CURLY BRACKET
07/14             TILDE

8. LATIN ALPHABET No. 1

This coded character set shall consist of 191 graphic characters:

- the 95 characters listed in 7.1, and
- the 96 characters listed in 8.2

8.1 Field of Application

The set contains graphic characters used for general purpose applications in typical office environments in at least the following languages: Danish, Dutch, English, Faeroese, Finnish, French, German, Icelandic, Irish, Italian, Norwegian, Portuguese, Spanish and Swedish.

8.2 Characters Allocated to Columns 10 to 15

Bit Combination   Name
10/00             NO-BREAK SPACE
10/01             INVERTED EXCLAMATION MARK
10/02             CENT SIGN
10/03             POUND SIGN
10/04             CURRENCY SIGN
10/05             YEN SIGN
10/06             BROKEN BAR
10/07             PARAGRAPH SIGN
10/08             DIAERESIS
10/09             COPYRIGHT SIGN
10/10             FEMININE ORDINAL INDICATOR
10/11             LEFT ANGLE QUOTATION MARK
10/12             NOT SIGN
10/13             SOFT HYPHEN
10/14             REGISTERED TRADE MARK SIGN
10/15             MACRON
11/00             DEGREE SIGN, RING ABOVE
11/01             PLUS-MINUS SIGN
11/02             SUPERSCRIPT TWO
11/03             SUPERSCRIPT THREE
11/04             ACUTE ACCENT
11/05             MICRO SIGN
11/06             PILCROW SIGN
11/07             MIDDLE DOT
11/08             CEDILLA
11/09             SUPERSCRIPT ONE
11/10             MASCULINE ORDINAL INDICATOR
11/11             RIGHT ANGLE QUOTATION MARK
11/12             VULGAR FRACTION ONE QUARTER
11/13             VULGAR FRACTION ONE HALF
11/14             VULGAR FRACTION THREE QUARTERS
11/15             INVERTED QUESTION MARK
12/00             CAPITAL LETTER A WITH GRAVE ACCENT
12/01             CAPITAL LETTER A WITH ACUTE ACCENT
12/02             CAPITAL LETTER A WITH CIRCUMFLEX ACCENT
12/03             CAPITAL LETTER A WITH TILDE
12/04             CAPITAL LETTER A WITH DIAERESIS
12/05             CAPITAL LETTER A WITH RING ABOVE
12/06             CAPITAL DIPHTHONG A WITH E
12/07             CAPITAL LETTER C WITH CEDILLA
12/08             CAPITAL LETTER E WITH GRAVE ACCENT
12/09             CAPITAL LETTER E WITH ACUTE ACCENT
12/10             CAPITAL LETTER E WITH CIRCUMFLEX ACCENT
12/11             CAPITAL LETTER E WITH DIAERESIS
12/12             CAPITAL LETTER I WITH GRAVE ACCENT
12/13             CAPITAL LETTER I WITH ACUTE ACCENT
12/14             CAPITAL LETTER I WITH CIRCUM
12/15
13/00
13/01
13/02
13/03
13/04
13/05
13/06
13/07
13/08
13/09
13/10
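The xx/yy notation of 6.1 and the column/row layout of 6.2 can be sketched in a few lines of Python. This is a minimal illustration under the weights given in 6.1 (b8 to b5 form the column number xx, b4 to b1 form the row number yy); the function names to_notation and from_notation are hypothetical and do not come from the Standard.

```python
def to_notation(byte: int) -> str:
    """Return the xx/yy notation of a bit combination given as a number 0..255."""
    if not 0 <= byte <= 255:
        raise ValueError("an 8-bit combination represents a number from 0 to 255")
    column = byte >> 4    # b8, b7, b6, b5 with weights 8, 4, 2, 1
    row = byte & 0x0F     # b4, b3, b2, b1 with weights 8, 4, 2, 1
    return f"{column:02d}/{row:02d}"


def from_notation(notation: str) -> int:
    """Return the number 0..255 represented by a position written as xx/yy."""
    column_text, row_text = notation.split("/")
    column, row = int(column_text), int(row_text)
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError("xx and yy must each be in the range 00 to 15")
    # The column carries the weights 128, 64, 32, 16; the row carries 8, 4, 2, 1.
    return column * 16 + row


print(to_notation(0x41))       # 04/01, CAPITAL LETTER A in 7.1
print(to_notation(0xA0))       # 10/00, NO-BREAK SPACE in 8.2
print(from_notation("07/14"))  # 126, TILDE in 7.1
```

Because positions and bit combinations are in one-to-one correspondence, converting in either direction loses nothing: from_notation(to_notation(n)) returns n for every n from 0 to 255.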
