INTERNATIONAL STANDARD ISO/IEC 8859-1
First edition 1998-04-15

Information technology - 8-bit single-byte coded graphic character sets - Part 1: Latin alphabet No. 1

Technologies de l'information - Jeux de caractères graphiques cod[...]

[...] code: A set of unambiguous rules that establishes a character set and the one-to-one relationship between the characters of the set and their bit combinations.

4.6 coded-character-data-element (CC-data-element): An element of interchanged information that is specified to consist of a sequence of coded representations of characters, in accordance with one or more identified standards for coded character sets.

4.7 graphic character: A character, other than a control function, that has a visual representation normally handwritten, printed or displayed, and that has a coded representation consisting of one or more bit combinations.

NOTE - In ISO/IEC 8859 a single
bit combination is used to represent each character.

4.8 graphic symbol: A visual representation of a graphic character or of a control function.

4.9 position: That part of a code table identified by its column and row coordinates.

5 Notation, code table, and names

5.1 Notation

The bits of the bit combinations of the 8-bit code are identified by b8, b7, b6, b5, b4, b3, b2 and b1, where b8 is the highest-order, or most-significant bit and b1 is the lowest-order, or least-significant bit.

The bit combinations may be interpreted to represent numbers in binary notation by attributing the following weights to the individual bits:

Bit      b8   b7   b6   b5   b4   b3   b2   b1
Weight   128  64   32   16   8    4    2    1

Using these weights, the bit combinations are identified by notations of the form xx/yy, where xx and yy are numbers in the range 00 to 15. The correspondence between the notations of the form xx/yy and the bit combinations consisting of the bits b8 to b1 is as follows:

- xx is the number represented by b8, b7, b6 and b5 where these bits are given the weights 8, 4, 2, and 1 respectively.
- yy is the number represented by b4, b3, b2 and b1 where these bits are given the weights 8, 4, 2, and 1 respectively.

The bit combinations are also identified by notations of the form hk, where h and k are numbers in the range 0 to F in hexadecimal notation. The number h is the same as the number xx described above, and the number k the same as the number yy described above.

5.2 Layout of the code table

An 8-bit code table consists of 256 positions arranged in 16 columns and 16 rows. The columns and the rows are numbered 00 to 15. In hexadecimal notation the columns and the rows are numbered 0 to F.

The code table positions are identified by notations of the form xx/yy, where xx is the column number and yy is the row number. The column and row numbers are shown at the top and left edges of the table respectively.

The code table positions are also identified by notations of the form hk, where h is the column number and k is the row number in hexadecimal notation. The column and row numbers are shown at the bottom and right edges of the table respectively.

The positions of the code table are in one-to-one correspondence with the bit combinations of the code. The notation of a code table position, of the form xx[...]

[...] it is specified in other International Standards, for example ISO/IEC 6429.

Table 2 - Code table of Latin alphabet No. 1

      02   03   04   05   06   07   10    11   12   13   14   15
00    SP   0    @    P    `    p    NBSP  °    À    Ð    à    ð     0
01    !    1    A    Q    a    q    ¡     ±    Á    Ñ    á    ñ     1
02    "    2    B    R    b    r    ¢     ²    Â    Ò    â    ò     2
03    #    3    C    S    c    s    £     ³    Ã    Ó    ã    ó     3
04    $    4    D    T    d    t    ¤     ´    Ä    Ô    ä    ô     4
05    %    5    E    U    e    u    ¥     µ    Å    Õ    å    õ     5
06    &    6    F    V    f    v    ¦     ¶    Æ    Ö    æ    ö     6
07    '    7    G    W    g    w    §     ·    Ç    ×    ç    ÷     7
08    (    8    H    X    h    x    ¨     ¸    È    Ø    è    ø     8
09    )    9    I    Y    i    y    ©     ¹    É    Ù    é    ù     9
10    *    :    J    Z    j    z    ª     º    Ê    Ú    ê    ú     A
11    +    ;    K    [    k    {    «     »    Ë    Û    ë    û     B
12    ,    <    L    \    l    |    ¬     ¼    Ì    Ü    ì    ü     C
13    -    =    M    ]    m    }    SHY   ½    Í    Ý    í    ý     D
14    .    >    N    ^    n    ~    ®     ¾    Î    Þ    î    þ     E
15    /    ?    O    _    o         ¯     ¿    Ï    ß    ï    ÿ     F
      2    3    4    5    6    7    A     B    C    D    E    F

Columns 00, 01, 08 and 09 (not shown) and position 07/15 contain no graphic characters.

ISO/IEC 8859-1:1998(E) © ISO/IEC

7 Identification of the character set

7.1 Identification according to ISO/IEC 2022 and ISO/IEC 4873

The graphic characters of this part of ISO/IEC 8859 constitute a single coded character set. However in accordance with ISO/IEC 2022 and ISO/IEC 4873
the code table of this part of ISO/IEC 8859 may be considered to consist of the following components:

- the character SPACE represented by bit combination 02/00;
- a 94-character G0 graphic character set represented by bit combinations 02/01 to 07/14;
- a 96-character G1 graphic character set represented by bit combinations 10/00 to 15/15.

When the identification methods of ISO/IEC 2022 or ISO/IEC 4873 are used this part of ISO/IEC 8859 shall be identified by the following pair of designation functions:

GZD4 04/02 (ESC 02/08 04/02)
G1D6 04/01 (ESC 02/13 04/01)

NOTE - The corresponding escape sequences are shown in parentheses.

7.2 Identification according to ISO/IEC 8824-1 (ASN.1)

In the terminology of ISO/IEC 8824-1 the character set of this part of ISO/IEC 8859 and the corresponding coded representations are distinct, and are known as the “character abstract syntax” and the “character transfer syntax” respectively.

When the identification methods of ISO/IEC 8824-1 are used this part of ISO/IEC 8859 shall be identified by the following object identifiers:

- character set            iso standard 8859 1 abstract-syntax (1)
- coded representations    iso standard 8859 1 transfer-syntax (0)

The
corresponding object descriptors shall be:

- character set            “ISO 8859 part 1 repertoire”
- coded representations    “ISO 8859 part 1 code”

7.3 Identification using the ISO International register of coded character sets to be used with escape sequences

According to 7.1 above the character set of this part of ISO/IEC 8859 may be considered to consist of the character SPACE, a 94-character G0 graphic character set, and a 96-character G1 graphic character set.

The G0 and G1 graphic character sets may be identified by the use of the Registration Numbers from the ISO International register of coded character sets to be used with escape sequences. When these registration numbers are used this part of ISO/IEC 8859 shall be identified by the following pair of registration numbers:

- G0 graphic character set    ISO-IR 6
- G1 graphic character set    ISO-IR 100

Annex A (informative)
Coverage of languages by parts 1 to 10 of ISO/IEC 8859

A.1 Languages of European origin written in Latin script

The following parts of ISO/IEC 8859 specify coded character sets which comprise various different selections of characters based on the Latin alphabet. These sets are identified by
the numbers 1 to 6 as shown:

ISO/IEC 8859-1    Latin alphabet No. 1
ISO/IEC 8859-2    Latin alphabet No. 2
ISO/IEC 8859-3    Latin alphabet No. 3
ISO/IEC 8859-4    Latin alphabet No. 4
ISO/IEC 8859-9    Latin alphabet No. 5
ISO/IEC 8859-10   Latin alphabet No. 6

The following official and regional languages written in Europe are covered by the Latin alphabets 1-6 as indicated by number in table A.1:

Table A.1 - Language coverage

Language      Covered by     Language             Covered by     Language          Covered by
              alphabet(s)                         alphabet(s)                      alphabet(s)
Albanian      1 2 5          Frisian              1 5            Norwegian         1 4 5 6
Basque        1 5            Galician             1 5            Polish            2
Breton        1 5            German               1 2 3 4 5 6    Portuguese        1 3 5
Catalan       1 5            Greenlandic          1 4 5 6        Rhaeto-Romanic    1 5
Croat         2              Hungarian            2              Romanian          2
Czech         2              Icelandic            1 6            Sami              4 6
Danish        1 4 5 6        Irish Gaelic         1 5 6          Scottish Gaelic   1 5
Dutch         1 5            (new orthography)                   Slovak            2
English       1 2 3 4 5 6    Italian              1 3 5          Slovene           2 4 6
Esperanto     3              Latin                1 2 3 4 5 6    Sorbian           2
Estonian      4 6            Latvian              4              Spanish           1 5
Faroese       1 6            Lithuanian           4 6            Swedish           1 4 5 6
Finnish       4 5 6          Luxemburgish         1 5            Turkish           (3) 5
French        (3) (5)        Maltese              3

NOTES
1 The list of languages in table A.1 is not exhaustive. It shows the languages that are included in the Scope clause
of each part of ISO/IEC 8859.
2 For writing French three characters (Œ, œ, Ÿ) not specified in parts 1, 3 and 9, are also needed.
3 The various Sami languages use partly differing orthographies. The character sets in parts 4 and 10 cover the requirements of the Sami languages most commonly used in Finland, Norway and Sweden. For the Skolt Sami language used in Finland and Norway additional characters are needed. These are included in ISO-IR 158 and 197.
4 There are several official written languages outside Europe that are covered by Latin alphabet No. 1. Examples are Indonesian/Malay, Tagalog (Philippines), Swahili, Afrikaans.
5 Use of Latin alphabet No. 3 for Turkish is deprecated.

A.2 Languages written in non-Latin scripts

The following parts of ISO/IEC 8859 specify coded character sets which include graphic characters from alphabets other than the Latin alphabet:

ISO/IEC 8859-5    Latin/Cyrillic alphabet
ISO/IEC 8859-6    Latin/Arabic alphabet
ISO/IEC 8859-7    Latin/Greek alphabet
ISO/IEC 8859-8    Latin/Hebrew alphabet

The following official and regional languages are covered by these alphabets:

The Cyrillic characters included in part 5 cover Bulgarian, Byelorussian, (Slavic) Macedonian, Russian, Serbian and Ukrainian (as written up to 1990, see also Scope of part 5).
The Arabic characters included in part 6 cover Arabic.
The Greek characters included in part 7 cover Greek (monotoniko orthography).
The Hebrew characters included in part 8 cover
Hebrew.

Annex B (informative)
Main differences between ISO 8859-1:1987 and this first edition of this part of ISO/IEC 8859

B.1 The names of the graphic characters have been amended where necessary to align them with the names of characters adopted for all standards on coded character sets developed under the responsibility of ISO/IEC JTC 1. For each character the short identifiers specified in ISO/IEC 10646-1 Amendment 9 have been added to table 1.

B.2 The new style of conformance clause, adopted for all standards on coded character sets, has been introduced.

B.3 Object identifiers conforming to Abstract Syntax Notation One (ASN.1, see ISO/IEC 8824-1) are specified in 7.2 for the character set, and the corresponding coded representations, of this part of ISO/IEC 8859. Registration numbers from the International register of coded character sets to be used
with escape sequences, have been included as an additional method of identifying the coded character set of this part of ISO/IEC 8859.

B.4 The previous Annex A (Geographical areas of application of the coded character set of this part of ISO 8859) has been replaced by a new Annex A that identifies the coverage of languages by parts 1-10 of ISO/IEC 8859. The previous Annex B (Relationship with ISO 6937/2) has been deleted.

B.5 Various editorial adjustments and clarifications have been made to the text of the standard. The hexadecimal equivalents of the bit combinations have been added to tables 1 and 2, and a revised font has been used for the graphic symbols in table 2.

B.6 Annex C, Bibliography, has been added.

Annex C (informative)
Bibliography

ISO/IEC 6429:1992, Information technology - Control functions for coded character sets.
ISO/IEC 10367:1991, Information technology - Standardized coded graphic character sets for use in 8-bit codes.
ISO/IEC 10646-1:1993, Information technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane.
ISO International register of coded character sets to be used with escape sequences.

ICS 35.040
Descriptors: data processing, text processing, information interchange, data transmission, data codes, ISO eight-bit code, coded character sets, Latin characters.
Price based on 10 pages
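
Illustration (not part of the standard). The notations of 5.1 and the components listed in 7.1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names to_column_row, to_hex_notation and component are hypothetical, chosen here for illustration only, and the language is not prescribed anywhere in this document.

```python
def to_column_row(byte: int) -> str:
    """Return the xx/yy notation of 5.1 for an 8-bit value (hypothetical helper)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0 to 255")
    column = byte >> 4    # bits b8, b7, b6, b5 with weights 8, 4, 2, 1
    row = byte & 0x0F     # bits b4, b3, b2, b1 with weights 8, 4, 2, 1
    return f"{column:02d}/{row:02d}"


def to_hex_notation(byte: int) -> str:
    """Return the hk notation of 5.1: h is the column, k is the row, both in hexadecimal."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0 to 255")
    return f"{byte:02X}"


def component(byte: int) -> str:
    """Classify a bit combination by the components listed in 7.1."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0 to 255")
    if byte == 0x20:                # 02/00
        return "SPACE"
    if 0x21 <= byte <= 0x7E:        # 02/01 to 07/14
        return "G0"
    if 0xA0 <= byte <= 0xFF:        # 10/00 to 15/15
        return "G1"
    return "unused"                 # no graphic character at this position


if __name__ == "__main__":
    value = 0xE9
    # Python's built-in "iso-8859-1" codec is used only to display the character.
    print(to_column_row(value), to_hex_notation(value),
          component(value), bytes([value]).decode("iso-8859-1"))
```

For the value E9 this prints 14/09, E9, G1 and the character é, which agrees with position 14/09 of the code table.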