Standard ECMA-43
3rd Edition - December 1991
Reprinted in electronic form in January 1999

Standardizing Information and Communication Systems

Phone: +41 22 849.60.00 - Fax: +41 22 849.60.01 - URL: http://www.ecma.ch - Internet: helpdesk@ecma.ch

8-Bit Coded Character Set
Structure and Rules

Brief History

ECMA published the first edition of this Standard ECMA-43 for an 8-bit coded character set in December 1974. It was a very general standard based on the facilities offered by the code extension techniques of Standard ECMA-35.

Since 1974 these techniques have evolved considerably and, as a consequence, a 4th edition of Standard ECMA-35 was published in March 1985. It was then decided to revise Standard ECMA-43 so as to take advantage of the additional facilities provided by Standard ECMA-35 and at the same time to specify a definite structure and precise rules for the definition of an 8-bit coded character set. The 2nd edition of Standard ECMA-43 was technically identical with the 2nd edition of ISO 4873.

Further developments of ISO 4873 led to the introduction of a new feature, the “G Set hierarchy”, which allows the presence of a coded character in more than one G set. Moreover the G0 set is now fully specified. It corresponds to the graphic part of the International Reference Version (IRV) of Standard ECMA-6 (sixth edition of December 1991).

Adopted by the General Assembly of ECMA in December 1991.

Table of contents

1 Scope
2 Conformance and implementation
2.1 Conformance
2.1.1 Conformance of information interchange
2.1.2 Conformance of devices
2.2 Implementation
3 Normative references
4 Definitions
4.1 Active position
4.2 Bit combination
4.3 Byte
4.4 Character
4.5 Character position
4.6 Coded-character-data-element (CC-data-element)
4.7 Coded character set; code
4.8 Code extension
4.9 Code table
4.10 Control character
4.11 Control function
4.12 Device
4.13 Escape sequence
4.14 Final byte
4.15 Graphic character
4.16 Graphic symbol
4.17 Repertoire
4.18 User
5 Notation, code table and names
5.1 Notation
5.2 Code table
5.3 Names
6 Structure of the 8-bit code
6.1 Elements of the 8-bit code
6.2 Identification of the elements of the 8-bit code
6.3 Invocation
6.3.1 C0 set
6.3.2 Character SPACE
6.3.3 G0 set
6.3.4 Character DELETE
6.3.5 C1 set
6.3.6 G1 set
6.3.7 G2 set
6.3.8 G3 set
7 Specification of the characters of the 8-bit code
7.1 C0 set
7.2 Character ESCAPE
7.3 Character SPACE
7.4 G0 set
7.5 Character DELETE
7.6 C1 set
7.7 G1 set
7.8 G2 set
7.9 G3 set
7.10 Summary of the specification of the 8-bit code
8 Levels
8.1 Level 1
8.2 Level 2
8.3 Level 3
9 Version of the 8-bit code
9.1 Contents of a version
9.2 Unique coding of characters
10 Identification of version and level
10.1 Purpose and context of identification
10.2 Identification of level
10.3 Identification of a version
10.4 Switching from one version to another
10.5 Switching from one level to another
Annex A - Restrictions applicable to the C0 and C1 sets
Annex B - Shift functions
Annex C - Composite graphic characters
Annex D - Use of bit combinations 00/14 and 00/15
Annex E - Main differences between the 2nd edition (1985) and the present (third) edition

1 Scope
This ECMA Standard specifies an 8-bit code derived from, and compatible with, the 7-bit coded
character set specified in ECMA-6.

The characteristics of this code are also in conformance with the code extension techniques specified in ECMA-35.

This ECMA Standard specifies an 8-bit code with a number of options. It also provides guidance on how to exercise the options to define specific versions.

This code is primarily intended for general information interchange within an 8-bit environment among data processing systems and associated equipment, and within data communication systems. The need for graphic characters and control functions in data processing has also been taken into account.

The code includes the ten digits as well as the 52 small and capital letters of the basic Latin alphabet and may include accented letters, special Latin letters and/or the letters of one or several non-Latin alphabet(s).

2 Conformance and implementation

2.1 Conformance

2.1.1 Conformance of information interchange
A coded-character-data-element (CC-data-element) within coded information for interchange is in conformance with a version of this ECMA Standard if all the coded representations of characters within that CC-data-element conform to the requirements of clause 9.

A claim of conformance shall identify
the version adopted.

2.1.2 Conformance of devices
A device is in conformance with this ECMA Standard if it conforms to the requirements of 2.1.2.1, and either or both of 2.1.2.2 and 2.1.2.3 below. A claim of conformance shall identify the document which contains the description specified in 2.1.2.1, and
shall identify the version adopted.

2.1.2.1 Device description
A device that conforms to this ECMA Standard shall be the subject of a description that identifies the means by which the user may supply characters to the device, or may recognize them when they are made available to him, as specified respectively in 2.1.2.2 and 2.1.2.3 below.

2.1.2.2 Originating devices
An originating device shall allow its user to supply any sequence of characters from the version adopted, and shall be capable of transmitting their coded representations within a CC-data-element.

2.1.2.3 Receiving devices
A receiving device shall be capable of receiving and interpreting any coded representations of characters that are within a CC-data-element, and that conform to 2.1.1 of this ECMA Standard, and shall make the corresponding characters available to its user in such a way that the user can identify them from among those
of the version adopted, and can distinguish them from each other.

2.2 Implementation
The use of this code requires definitions of its implementation in various media. For example, these could include punched tapes, punched cards, magnetic and optical media and transmission channels, thus permitting interchange of data to take place either indirectly by means of an intermediate recording in a physical medium, or by local connection of various units (such as input and output devices and computers) or by means of data transmission equipment.

The implementation of this code in physical media and for transmission, taking into account the need for error checking, is the subject of other international standards.

3 Normative references
ECMA-6   7-bit coded character set for information interchange (1991).
ECMA-35  Code extension techniques (1985).
ECMA-48  Control functions for 7-bit and 8-bit coded character sets (1991).
ISO      International Register of Coded Character Sets to be Used with Escape Sequences (ISO 2375).

4 Definitions
For the purpose of this ECMA Standard the following definitions apply.

4.1 Active position
The character position which is to image the graphic symbol representing the next graphic character or relative to which the next control function is to be executed.

NOTE 1
In general, the active position is indicated in a display by a cursor.

4.2 Bit combination
An ordered set of bits used for the representation of characters.

4.3 Byte
A bit string that is operated upon as a unit.

4.4 Character
A member of a set of elements used for the organization, control or representation of data.

4.5 Character position
The portion of a display that is imaging or is capable of imaging a graphic symbol.

4.6 Coded-character-data-element (CC-data-element)
An element of interchanged information that is specified to consist of a sequence of coded representations of characters, in accordance with one or more identified standards for coded character sets.

NOTE 2
In a communication environment according to the Reference Model for Open Systems Interconnection of ISO 7498, a CC-data-element will form all or part of the information that corresponds to the Presentation-Protocol-Data-Unit (PPDU) defined in that International Standard.

NOTE 3
When information interchange is accomplished by means of interchangeable media, a CC-data-element will form all or part of the information that corresponds to the user data, and not that recorded during formatting and initialization.

4.7 Coded character set; code
A set of unambiguous rules that establishes a character set and the one-to-one relationship between the characters of the set and their bit combinations.

4.8 Code extension
The techniques for the encoding of characters that are not included in the character set of a given code.

4.9 Code table
A table showing the character allocated to each bit combination in a code.

4.10 Control character
A control function the coded representation of which consists of a single bit combination.

4.11 Control function
An action
that affects the recording, processing, transmission, or interpretation of data, and that has a coded representation consisting of one or more bit combinations.

4.12 Device
A component of information processing equipment which can transmit, and/or receive, coded information within CC-data-elements.

NOTE 4
It may be an input/output device in the conventional sense, or a process such as an application program or gateway function.

4.13 Escape sequence
A string of bit combinations that is used for control purposes in code extension procedures. The first of these bit combinations represents the control function ESCAPE.

4.14 Final byte
The bit combination that terminates an escape sequence or a control sequence.

4.15 Graphic character
A character, other than a control function, that has a visual representation normally handwritten, printed or displayed, and that has a coded representation consisting of one or more bit combinations.

4.16 Graphic symbol
A visual representation of a graphic character or of a control function.

4.17 Repertoire
A specified set of characters that are represented by means of one or more bit combinations of a coded character set.

4.18 User
A person or other entity that invokes the services provided by a device.

NOTE 5
This entity may be a process such as an application program if the “device” is a code convertor or a gateway function, for example.

NOTE 6
The characters, as supplied by the user or made available to him, may be in the form of codes local to the device, or of non-conventional visible representations, provided that 2.1.2 above is satisfied.

5 Notation, code table and names

5.1 Notation
The bits of the bit combinations of the 8-bit code are identified by b8, b7, b6, b5, b4, b3, b2 and b1, where b8 is the highest-order, or most-significant bit, and b1 is the lowest-order, or
least-significant, bit.

The bit combinations may be interpreted to represent integers in the range 0 to 255 in binary notation by attributing the following weights to the individual bits:

Bit     b8   b7   b6   b5   b4   b3   b2   b1
Weight  128  64   32   16   8    4    2    1

In this ECMA Standard, the bit combinations are identified by notations of the form xx/yy, where xx and yy are numbers in the range 00 to 15. The correspondence between the notations of the form xx/yy and the bit combinations consisting of the bits b8 to b1, is as follows:

- xx is the number represented by b8, b7, b6 and b5 where these bits are given the weights 8, 4, 2 and 1 respectively;
- yy is the number represented by b4, b3, b2 and b1 where these bits are given the weights 8, 4, 2 and 1 respectively.

The notations of the form xx/yy are the same as the ones used to identify code table positions, where xx is the column number and yy is the row number (see 5.2).

5.2 Code table
An 8-bit code table consists of 256 positions arranged in 16 columns and 16 rows. The columns and rows are numbered 00 to 15.

The code table positions are identified by notations of the form xx/yy, where xx is the column number and yy is the row number.

The positions of the code table are in
one-to-one correspondence with the bit combinations of the code.

The notation of a code table position, of the form xx/yy, is the same as that of the corresponding bit combination.

5.3 Names
This ECMA Standard assigns a unique name to each character. In addition, it specifies an acronym for control characters and for the characters SPACE and DELETE, and a graphic symbol for each graphic character. By convention, only capital letters, space and hyphen are used for writing the names of the characters. For acronyms only capital letters and digits are used. It is intended that the acronyms and this convention be retained in all translations of the text.

The names chosen to denote graphic characters are intended to reflect their customary meaning. However, this ECMA Standard does not define and does not restrict the meanings of graphic characters. Neither does it specify a particular style or font design for the graphic characters when imaged.

6 Structure of the 8-bit code

6.1 Elements of the 8-bit code
The 8-bit code consists of the following parts (see figure 1).

a) A C0 set
   A set of up to 30 control characters represented by bit combinations 00/00 to 01/15, except 00/14 and 00/15 which shall be unused.
b) The character SPACE
   A graphic character represented by bit combination 02/00.
c) A G0 set
   A set of 94 graphic characters represented by bit combinations 02/01 to 07/14.
d) The character DELETE
   A character represented by bit combination 07/15.
e) A C1 set
   A set of up to 32 control characters represented by bit combinations 08/00 to 09/15.
f) A G1 set
   A set of up to 96 graphic characters represented by bit combinations 10/00 to 15/15.
g) A G2 set
   A set of up to 96 graphic characters.
h) A G3 set
   A set of up to 96 graphic characters.

6.2 Identification of the elements of the 8-bit code
The method of identification of the code elements listed in 6.1 is specified in clause 10.

6.3 Invocation

6.3.1 C0 set
The identification of the C0 set also invokes that set.

6.3.2 Character SPACE
The character SPACE shall be represented by bit combination 02/00. It is not explicitly invoked.

6.3.3 G0 set
The G0 set shall be as specified in 7.4. It is not explicitly invoked.

6.3.4 Character DELETE
The character DELETE shall be represented by bit combination 07/15. It is not explicitly invoked.

6.3.5 C1 set
The identification of the C1 set also invokes that set.

6.3.6 G1 set
The identification of the G1 set also invokes that set. The locking-shift function LS1R shall also invoke the G1 set.

6.3.7 G2 set
Either the set as a whole shall be invoked by the locking-shift function LS2R (see annex B) into columns 10 to 15, or individual characters of it shall be invoked by means of the single-shift function SS2 (see 7.6).

6.3.8 G3 set
Either the set as a whole shall be invoked by the locking-shift function LS3R (see annex B) into columns 10 to 15, or individual characters of the set shall be invoked by means of the single-shift function SS3 (see 7.6).

7 Specification of the characters of the 8-bit code
The use of cont