Designation: E2078 - 00 (Reapproved 2010)

Standard Guide for Analytical Data Interchange Protocol for Mass Spectrometric Data¹

This standard is issued under the fixed designation E2078; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 This guide covers the implementation of the Mass Spectrometric Data Protocol in analytical software applications. Implementation of this protocol requires:

1.1.1 Specification E2077, which contains the full set of data definitions. The mass spectrometric data protocol is not based upon any specific implementation; it is designed to be independent of any particular implementation so that implementations can change as technology evolves. The protocol is implemented in categories to speed its acceptance through actual use.

1.1.2 Specification E2077 contains a full description of the contents of the data communications protocol, including the analytical information categories with data elements and their attributes for most aspects of mass spectrometric tests.

1.2 The analytical information categories are a practical convenience for breaking down the standardization process into smaller, more manageable pieces. It is easier for developers to build consensus and produce working systems based on smaller information sets, without the burden and complexity of the hundreds of data elements contained in all the categories. The categories also assist vendors and end users in using the guide in their computing environments.

1.3 The network common data format (NetCDF) data interchange system is the container used to communicate data between applications in a way that is independent of both computer architectures and end-user applications. In essence, it is a special type of application designed for data interchange.

1.4 The common data language (CDL) template for mass spectrometry is a language specification of the mass spectrometry dataset being interchanged. With the use of the NetCDF utilities, this human-readable template can be used to generate an equivalent binary file and the software subroutine calls needed for input and output of data in analytical applications.

2. Referenced Documents

2.1 ASTM Standards:²
E2077 Specification for Analytical Data Interchange Protocol for Mass Spectrometric Data

2.2 Other Standard:
NetCDF User's Guide³

2.3 ISO Standards:⁴
8601:1988 Data elements and interchange formats (First edition published 1988-06-15; with Technical Corrigendum 1 published 1991-05-01)

¹ This guide is under the jurisdiction of ASTM Committee E13 on Molecular Spectroscopy and Separation Science and is the direct responsibility of Subcommittee E13.15 on Analytical Data. Current edition approved Nov. 1, 2010. Published November 2010. Originally approved in 2000. Last previous edition approved in 2005 as E2078 - 00 (2005). DOI: 10.1520/E2078-00R10.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Available from Russell K. Rew, Unidata Program Center, University Corporation for Atmospheric Research, P.O. Box 3000, Boulder, CO 80307-3000.
⁴ Available from ISO, 1 Rue de Varembé, Case Postale 56, CH-1211, Geneva, Switzerland.
Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

3. List of Contents and Use

3.1 NetCDF Toolkit: The protocol is an application programming interface (API) layered on top of the public domain NetCDF toolkit. NetCDF is a set of tools that facilitate reading or writing platform-independent, self-describing data files. All data in a NetCDF file is written using the external data representation (XDR). XDR was developed by Sun Microsystems and is used for platform-independent file systems for all workstations and personal computers. Each NetCDF data element is self-describing: it has a name, type, and dimensionality. A NetCDF file contains three parts: a dimensions section, which defines the names and size of all dimensions used to describe variables; a variables section, which defines the names, data types, dimensionality, and attributes for all variables used in the file; and finally, a data section, which contains the actual values assigned to the variables. Attributes are numbers or strings which augment the description of variables or the file as a whole.

3.1.1 For example, a variable “x_axis_values” might contain an array of numbers representing the abscissa of a two-dimensional data set. It would have a dimension, possibly named “x_axis_size,” which would specify the number of abscissa points. The variable might have some descriptive attributes, such as “units” (with a value of “Seconds,” perhaps), “scale_factor” (with a value of 1000.0, specifying that all stored abscissa values should be multiplied by 1000.0 to get the actual value), or “long_name” (with value “Time”, which might be used to label the abscissa when drawing a plot).

3.1.2 The NetCDF toolkit has been placed in the public domain by the Unidata Program Center, a non-profit software support organization for the University Corporation for Atmospheric Research. The Unidata Program Center is funded by the National Science Foundation, National Center for Atmospheric Research, and other organizations and provides ongoing development and support of NetCDF and related tools.

3.1.3 The NetCDF version currently supported in this implementation is 2.3.2.
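The variable, dimension, and attribute relationship described in 3.1.1 can be pictured with a small in-memory model. The following is only a sketch: the type variable_model and the function actual_value() are hypothetical names chosen for illustration, and they are not part of the NetCDF library or of the protocol API. It shows how the “scale_factor” attribute would be applied to a stored abscissa value.

```c
#include <stddef.h>

/* Hypothetical in-memory model of one NetCDF variable. Illustrative only:
   this is not a type from the NetCDF library or from the protocol API. */
typedef struct {
    const char   *name;         /* variable name, e.g. "x_axis_values"      */
    size_t        size;         /* length of its dimension ("x_axis_size")  */
    const double *stored;       /* values as held in the data section       */
    const char   *units;        /* "units" attribute, e.g. "Seconds"        */
    const char   *long_name;    /* "long_name" attribute, used as a label   */
    double        scale_factor; /* "scale_factor" attribute, e.g. 1000.0    */
} variable_model;

/* Return the actual value of element i: the stored value multiplied
   by the variable's scale_factor attribute. The caller must ensure
   that i is less than v->size. */
double actual_value(const variable_model *v, size_t i)
{
    return v->stored[i] * v->scale_factor;
}
```

With a scale_factor of 1000.0, a stored value of 0.5 would be reported as 500.0; the “units” and “long_name” strings travel with the variable so that a reader needs no outside information to label it.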
3.2 Data Structures: Each of the analytical information class tables in the specification document has a corresponding data structure; however, not every field in each table has a corresponding data element in a structure, and the data structures may have elements that do not appear in any class table. Most of these differences are due to details of the implementation which could not be hidden.

3.2.1 The data structures provide the mapping between the attribute name and data type described in the specification and the field and actual data type in the file. The actual NetCDF dimension, variable, and attribute names are hidden from the API level. These names are in fact irrelevant to application programs; it is the data structure which provides the information interchange between the application and the file.

3.2.2 Each data structure and its mapping to an analytical information class are described in detail later in this guide.

3.2.3 Application Programming Interface Functions:

3.2.3.1 The application programming interface provides programmatic access to the contents of the files. Mass spectral data occurs in three forms: global information, which relates to the contents of the entire file; information which describes each part of a multi-component instrument; and information which changes on a scan-by-scan basis for spectra and library entries. API functions are provided for opening a file for reading or writing; closing a file; reading and writing global, per-component instrument, and per-scan spectral and library information; initializing and clearing data structure contents; and a few miscellaneous utility functions. Each of these functions is described in detail in a later section of this guide.

3.2.4 Enumerated Sets: Many of the attributes listed in the Analytical Data Interchange Protocol for Mass Spectrometric Data specification have an enumerated set of associated values. The attribute may take only one value from that restricted set. In the implementation, each such attribute is defined as a formal C type, and the allowed values are defined as an enumerated set of that formal type. Each enumerated value is associated with a unique string literal, and it is these string literals, not the enumeration values, which are written to or read from the file. This practice both enforces the use of the proper enumeration values and follows the NetCDF dictum that files be self-describing. If the enumeration values were written instead of the strings, then some lookup mechanism would be required external to the NetCDF file to translate the number into something meaningful.
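The enumerated-set practice described in 3.2.4 can be sketched as a C enumeration paired with a table of string literals. The type polarity_t, its values, and the strings below are hypothetical and are not taken from Specification E2077; the sketch only illustrates how a string, rather than the numeric enumeration value, would be written to and read from the file.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enumerated attribute. The names and strings are
   illustrative and are not taken from Specification E2077. */
typedef enum {
    POLARITY_POSITIVE,
    POLARITY_NEGATIVE,
    POLARITY_COUNT      /* number of valid values; also the "not found" result */
} polarity_t;

/* One unique string literal per enumeration value, in declaration order. */
static const char *const polarity_strings[POLARITY_COUNT] = {
    "Positive Polarity",
    "Negative Polarity"
};

/* Enumeration value -> string literal to be written to the file.
   Returns NULL for a value outside the enumerated set. */
const char *polarity_to_string(polarity_t p)
{
    return ((int)p >= 0 && p < POLARITY_COUNT) ? polarity_strings[p] : NULL;
}

/* String read from the file -> enumeration value.
   Returns POLARITY_COUNT if the string is not in the set. */
polarity_t polarity_from_string(const char *s)
{
    int i;
    for (i = 0; i < POLARITY_COUNT; i++) {
        if (strcmp(s, polarity_strings[i]) == 0)
            return (polarity_t)i;
    }
    return POLARITY_COUNT;
}
```

Because only strings from the table can be produced or accepted, a file written this way stays readable without any lookup table external to the file.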
4. Conventions

4.1 The format convention adopted in this guide is as follows:
(1) Normal text is presented in this font (Times New Roman).
(2) API symbols (functions, formal types, etc.) are presented in boldface Helvetica font.
(3) Parameters to API functions are presented in italic Helvetica font.
(4) Example code is presented in normal Helvetica font.

4.2 Other Conventions: All indices begin at zero (C convention). In several data structures, a scan_no or inst_no element must be loaded before reading or writing. This identifies the scan or instrument component number for which data will be read or written. In all cases, scan or instrument component numbers begin at zero.

4.2.1 All date/time stamps are formatted using the ISO standard 8601 format referenced in the specification. An API utility function is provided for conversion between date/time information in numeric form and ISO-8601 string format (see ms_convert_date(), below).
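The kind of conversion described in 4.2.1 can be sketched with the standard C library. The two helpers below are hypothetical and are not the protocol's ms_convert_date(), whose signature is not given in this part of the guide. They also assume the ISO 8601 basic layout YYYYMMDDhhmmss with no time zone designator; the exact layout required by the protocol is defined in the specification.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helpers for illustration; NOT the protocol's
   ms_convert_date(). Assumed layout: ISO 8601 basic "YYYYMMDDhhmmss". */

/* Numeric fields -> string. Returns 0 on success, or -1 if the buffer
   is too small or the result is not exactly 14 characters. */
int iso8601_from_tm(const struct tm *t, char *buf, size_t buflen)
{
    return strftime(buf, buflen, "%Y%m%d%H%M%S", t) == 14 ? 0 : -1;
}

/* String -> numeric fields. Returns 0 on success, -1 on malformed input. */
int iso8601_to_tm(const char *s, struct tm *t)
{
    int y, mo, d, h, mi, sec;

    if (sscanf(s, "%4d%2d%2d%2d%2d%2d", &y, &mo, &d, &h, &mi, &sec) != 6)
        return -1;
    if (mo < 1 || mo > 12 || d < 1 || d > 31 ||
        h < 0 || h > 23 || mi < 0 || mi > 59 || sec < 0 || sec > 60)
        return -1;

    memset(t, 0, sizeof *t);
    t->tm_year  = y - 1900;   /* struct tm counts years from 1900 */
    t->tm_mon   = mo - 1;     /* struct tm counts months from 0   */
    t->tm_mday  = d;
    t->tm_hour  = h;
    t->tm_min   = mi;
    t->tm_sec   = sec;
    t->tm_isdst = -1;         /* daylight saving status unknown   */
    return 0;
}
```

A round trip through both helpers should return the original fields, which makes the pair easy to check before it is used on real date/time stamps.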
5. Mass Spectrometric Data Protocol Distribution Kit

5.1 It is intended that potential users of this implementation can obtain a complete NetCDF and API distribution kit from various instrument vendors' Web sites. Information on how to obtain the kit will be posted on the ASTM website (www.astm.org) under Committee E01.25.

5.2 The Analytical Data Interchange Protocol for Mass Spectrometric Data distribution kit contains:
5.2.1 Software: NetCDF distribution kit from Unidata (with the modified makefile needed to make the kit compile out of the box).
5.2.2 NetCDF User's Guide: supplied by Unidata Program Center.
5.2.3 Specification E2077.
5.2.4 Guide E2078.

6. Hardware and Software

6.1 This section describes the hardware and software configurations used for testing. In general, the NetCDF system puts very few requirements on the hardware because most routines are left on disk. Only routines being used at any particular time are kept in memory. Any limitations found were typically imposed not by NetCDF but by the operating system or environment.

6.1.1 Hardware (Personal Computers): The personal computer system hardware originally used for testing was:
6.1.1.1 Intel 80286 processor,
6.1.1.2 640K minimum,
6.1.1.3 Monochrome, EGA, VGA graphics,
6.1.1.4 20 megabyte minimum, 80 megabyte hard-disk is typical, and
6.1.1.5 A mouse (optional).
6.1.1.6 NetCDF works well on AT-class machines and higher. NetCDF does not have the items in 6.1.1.1-6.1.1.5 as requirements. These are just the minimum, base-level systems that were used.

6.1.2 Software: NetCDF runs on MS-DOS, OS/2, Macintosh, Windows 95, and Windows NT operating systems for personal computers. NetCDF was originally ported from UNIX to DOS running on an IBM PS/2 Model 80. It was recently ported to the Macintosh OS. NetCDF is written in the C programming language, and there are FORTRAN jackets available for applications that want to use FORTRAN calls. The personal computer software originally employed for testing and developing NetCDF applications was:
6.1.2.1 Microsoft DOS V3.3 or above,
6.1.2.2 Microsoft C Compiler V6.0,
6.1.2.3 Microsoft Windows V3.0,
6.1.2.4 Microsoft Windows SDK, and
6.1.2.5 NetCDF Version 2.0.1.

6.1.3 Workstations and Servers: NetCDF runs easily on UNIX workstations such as Sun 3, Sun 4, VAXstations, DECstation 3100, VAXstation II running ULTRIX or VMS, and IBM RS/6000. There are no particular hardware requirements for workstation class machines, since all workstations have the minimum hardware outlined for personal computers in 6.1.1.

7. Significance and Use

7.1 General Coding Guidelines: The NetCDF libraries are supplied to developers as source code. End users receive the libraries in compiled binary form as part of a vendor's application.

7.1.1 Developers setting out to write a program to convert their data files to the Mass Spectrometric Data Protocol should consider using the NetCDF utilities ncgen and ncdump. After developers create the NetCDF file, they should use the ncdump program to generate the ASCII representation of the data file and examine it to ensure the data are being correctly put into the file.

7.2 Make Files for NetCDF Libraries and Utilities: In general the compilation is straightforward. The make files were modified after they were received from the Unidata Program Center, because they did not compile the first time on PCs. The changes needed to get the Unidata distribution to run on DOS are (1) rename the file MAKEFILE to UNIX.MK, and (2) rename MSOFT.MK to MAKEFILE, and then run NMAKE. The default switches in the Unidata distribution use the switches for the floating point coprocessor and Microsoft Windows options.

7.2.1 The protocol kit contains some complete makefile examples for Microsoft C V6.0 running on DOS. The Microsoft C V6.0 compiler manual should be consulted for the exact meaning of the compiler and linker options.

7.2.2 The VMS and SunOS compilation instructions are in directories for those operating systems.

7.3 NetCDF Library Build Order: The NetCDF libraries must be built in a specific order. The correct order to build the NetCDF directories is:
UTIL
XDR
SRC
NCDUMP
NCGEN
NCTEST

7.3.1 The UTIL and XDR makefiles work as distributed using NMAKE with Microsoft C V6.0.

8. CDL Template Structure

8.1 A NetCDF template is built from CDL statements and is structured into three sections: (1) dimension declarations, (2) variable declarations, and (3) the data section.

8.2 A few points of clarification about the CDL language are given here to facilitate its understanding. For more in-depth information on CDL, please consult the NetCDF User's Guide.

8.2.1 A NetCDF template starts with the keyword “netcdf” followed by the dataset name.

8.2.2 CDL comments are indicated by two forward slash characters (//).

8.2.3 Section indicators (dimensions:, variables:, and data:) end with a colon character (:). These are the only tokens that end with a colon character.

8.2.4 Statements within sections end with the semicolon character (;).

8.2.5 Variable names beginning with numbers must be preceded by an underline character (_). Otherwise the ncgen parser will flag an error.

8.2.5.1 Underline characters were chosen for this protocol over hyphen characters, because some compilers may interpret hyphens as subtraction operators. The feature of CDL that allows implicit numerical datatyping of attributes is not being used in the first version of the template. Instead, all floating point attributes are being handled as strings. This forces programmers to explicitly type variables, thereby encouraging more deliberate programming styles. For example:
:aia_template_revision = "0.8"; //M12345
:netcdf_revision = "2.0.1"; //M12345
Consult the NetCDF User's Guide for more complete information on CDL syntax and usage.

8.2.6 Underline characters only can be used as separators between words within var