Designation: E 2310 - 04

Standard Guide for
Use of Spectral Searching by Curve Matching Algorithms with Data Recorded Using Mid-infrared Spectroscopy¹

This standard is issued under the fixed designation E 2310; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 Spectral searching is the process whereby a spectrum of an unknown material is evaluated against a library (database) of digitally recorded reference spectra. The purpose of this evaluation is classification of the unknown and, where possible, identification of the unknown. Spectral searching is intended as a screening method to assist the analyst and is not an absolute identification technique. Spectral searching is not intended to replace an expert in infrared spectroscopy. Spectral searching should not be used without suitable training.

1.2 The user of this document should be aware that the results of a spectral search can be affected by the following factors described in Section 5: (1) baselines, (2) sample purity, (3) absorbance linearity (Beer's Law), (4) sample thickness, (5) sample technique and preparation, (6) physical state of the sample, (7) wavenumber range, (8) spectral resolution, and (9) choice of algorithm.

1.2.1 Many other factors can affect spectral searching results.

1.3 The scope of this document is to provide a guide for the use of search algorithms for mid-infrared spectroscopy. The methods described herein may be applicable to the use of these algorithms for other types of spectroscopic data, but each type of data search should be assessed separately.

1.4 The
Euclidean distance algorithm and the first derivative Euclidean distance algorithm are described and their use discussed. The theory and common assumptions made when using search algorithms are also discussed, along with guidelines for the use and interpretation of the search results.

2. Referenced Documents

2.1 ASTM Standards:²
E 131 Terminology Relating to Molecular Spectroscopy
E 334 Practice for General Techniques of Infrared Microanalysis
E 573 Practices for Internal Reflectance Spectroscopy
E 1252 Practice for General Techniques of Qualitative Infrared Analysis
E 1642 Practice for General Techniques of Gas Chromatography Infrared (GC/IR) Analysis
E 2105 Practice for General Techniques of Thermogravimetric Analysis (TGA) Coupled with Infrared Analysis (TGA/IR)
E 2106 Practice for General Techniques of Liquid Chromatography-Infrared (LC/IR) and Size Exclusion Chromatography-Infrared (SEC/IR)

3. Terminology

3.1 Definitions: For general definitions of terms and symbols, refer to Terminology E 131.

3.1.1 reference spectrum: an established spectrum of a known compound or chemical sample.

3.1.1.1 Discussion: This spectrum is typically stored in retrievable format so that it may be compared against the sample spectrum of an analyte.

3.1.1.2 Discussion: This term has sometimes been used to refer to a background spectrum; such usage is not recommended.

3.1.2 spectral searching: the process whereby a spectrum of an unknown material is evaluated against a library of digital reference spectra. Each reference spectrum in the library is individually compared to the spectrum of the unknown, and assigned a numerical value as to the goodness of fit. To perform this comparison, each data point in the unknown spectrum is compared to the corresponding point in the reference spectrum.

3.1.3 peak searching: the process whereby the peak table of the spectrum of an unknown material is evaluated against a library of peak tables. Each reference spectrum in the library contains a peak table, and the peak table is individually compared to the peak table of the unknown, and assigned a numerical value as to the goodness of fit.

3.1.4 spectral library: a collection of reference spectra stored in a computer readable form, also called a library, database, or spectral database.

¹ This guide is under the jurisdiction of ASTM Committee E13 on Molecular Spectroscopy and is the direct responsibility of Subcommittee E13.03 on Infrared Spectroscopy. Current edition approved Feb. 1, 2004. Published Feb. 2004.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

3.1.5 search algorithm: the mathematical formula used to make a point-by-point comparison of two spectra.

3.1.6 hit quality value: the spectral search software compares each spectrum in
the database to that of the unknown, and assigns a numeric value for each library entry demonstrating how similar the two spectra are.

3.1.6.1 Discussion: There are several methods for assigning Hit Quality values, and either a high or a low value can be assigned as the best match. Refer to the software manufacturer's documentation.

3.1.7 hit quality index (HQI): a table which ranks the library spectra in the database according to their Hit Quality values (see 7.5).

3.1.8 Euclidean Distance algorithm: the Euclidean Distance algorithm measures the Euclidean distance between each library spectrum and the unknown spectrum by treating the spectra as normalized vectors. The closeness of the match, or HQI, is calculated from the square root of the sum of the squares of the differences between the vectors for the unknown spectrum and each library spectrum.

3.1.9 First Derivative Euclidean Distance algorithm: in the First Derivative Euclidean Distance algorithm the Euclidean distance is also computed, except that the derivative of each spectrum is calculated prior to the Euclidean distance calculation.

3.1.10 normalization: the mathematical technique used to compensate for an intensity difference between two spectra (see
5.1).

4. Theory

4.1 Beer's Law: One of the basic principles that make spectral searching possible is Beer's Law (see Terminology E 131), which states that A = abc, where A is the absorbance, a is the absorptivity, b is the sample pathlength, and c is the concentration of the analyte of interest. As long as Beer's Law applies, two spectra of the same material recorded under similar conditions can be made to appear the same by normalization of the data.

NOTE 1: In an ideal case, this is true for transmittance spectra, but there are differences in the spectral peak intensities when reflectance spectra are compared to transmittance spectra.

5. Spectral Data Pre-Treatment

5.1 Normalization:

5.1.1 Normalization of spectra compensates for the differences in sample quantity (concentration or pathlength, or both) used to generate the reference spectra in the library and that of the unknown. The spectra are normalized over the complete spectral range of the library. When searching less than the full spectral range of the library, the spectra must be re-normalized over the new range before an accurate comparison can be made. Normalization of a spectrum for library searching is a two-step process. First, the minimum absorbance value in the selected spectral range is subtracted from all the absorbances in the same range. The resulting values are then scaled by dividing by the maximum resulting value in the range. The end result is a spectrum (or a sub-range portion of a spectrum) where the minimum value is zero (0) and the maximum is one (1) absorbance. If the range chosen for normalization has only one or two strong bands and a few medium intensity bands, the range of the spectrum must be reselected or the spectrum will be dominated by the strong bands and the HQI will be insensitive to weaker fingerprint bands necessary for identification of a specific compound. Successful compound identification may require that the spectral match exclude the strongest bands; the normalization will then be based on a medium intensity band, and weak fingerprint bands will be emphasized in the HQI.

5.2 Data Point Matching:

5.2.1 The algorithms used for searching a spectrum against a library use a calculation that mathematically compares the data points of the spectrum being searched to the data points of the spectra in the library. This requires that the data points in both the sample and library spectra occur at the same frequency. If the data points in the sample and library spectra are not aligned in this manner, then one of the spectra must be mathematically altered (interpolated) to make the data points match. Typically the unknown spectrum being searched is altered to match the data point spacing of the spectra in
the library.

5.2.2 Data point matching is commonly accomplished using a linear data point interpolation method. In this method, the slope and offset of a line segment are calculated between the absorbances of every pair of adjacent data points in the spectrum. A new set of absorbances is calculated by locating the values that occur on the line segments at positions corresponding to the data point frequencies of the library spectrum.

6. Conditions or Issues Affecting Results

6.1 Spectral quality is one of the primary conditions or issues that can affect search results. There is no substitute for a carefully recorded spectrum. There are several conditions or issues that affect spectral quality as it pertains to spectral searching. These conditions or issues apply both to the spectra used to create the reference database and to the unknown spectrum.

6.2 Baselines:

6.2.1 A flat baseline is preferred for the Euclidean distance algorithm, as the Euclidean distance algorithm compares each data point in the unknown spectrum to the corresponding data point in the reference spectrum. The effect of an offset or slope in the baseline is interpreted as a difference between the two spectra. Therefore, when a spectrum with a sloping baseline or offset is evaluated using the Euclidean distance algorithm, a simple baseline correction should be used.

NOTE 2: Negative bands can also produce an offset in the baseline as a result of the data normalization process.

6.2.2 The first derivative Euclidean distance algorithm minimizes the effect
of an offset or sloping baseline. In this algorithm, the comparison is made between the difference of a pair of adjacent points in the unknown spectrum and the difference between the corresponding pair of adjacent points in the reference spectrum. In effect, this causes the first derivative Euclidean distance algorithm to look only at the differences in the slope of adjacent data points between the two spectra. Fig. 1 shows how the two algorithms view the same two spectra.

NOTE 3: The first derivative algorithm converts a sloping baseline into an offset that is then eliminated by the normalization procedure.

6.3 Sample Purity:

6.3.1 The physical state of the sample should be as close as possible to the physical state of the reference materials used to obtain the library. For example, a pure liquid sample would ideally be searched against a library of spectra of only liquid reference materials. A sample which is probably a mixture, such as a commercial formulation, should be compared to a library of commercial formulations.

6.3.2 In some cases the nature of the sample may not be well understood. An unknown sample may be a pure material or a mixture. It may have additional contaminants that will affect its spectrum by adding spurious bands. In addition, there are several other sources of spurious spectral features that may appear as either positive or negative bands. Several of these are listed below:³

6.3.2.1 Features due to variations in the carbon dioxide or water vapor levels in the optical path,

6.3.2.2 Bands from a mulling agent,

6.3.2.3 Halide salts used as window material and as the diluent for both pellets and diffuse reflection analysis often contain contaminants such as adsorbed water, hydrocarbons, and nitrates. Always use dry halide salts and keep unused halide salts in a desiccator,

6.3.2.4 Water can alter the spectrum of the sample from its dry state. Spectra of inorganic samples with waters of hydration are particularly sensitive to adsorbed water,

6.3.2.5 Solvent bands from samples run in solution, and

6.3.2.6 Bands from solvents left over from an extraction or from casting a film from a solution.

NOTE 4: Retain spectra of any solvents used, so that bands due to the solvent can be identified in the spectrum of the unknown.

NOTE 5: If the solvent bands in a region of the spectrum cannot be removed from the spectrum (by either re-recording the spectrum, using an uncontaminated sample, or
by spectral subtraction using the solvent reference spectrum), then that region of the spectrum should be excluded during a search. It is not sufficient to remove the offending bands digitally by drawing a straight line through the region before the search. The search algorithm will calculate a poor match in this region for any reference spectrum containing features in the region. It should be realized that the removal of the solvent bands may also remove underlying features in the sample spectrum.

6.4 Absorbance Linearity (Beer's Law):

6.4.1 A spectrum recorded using good practices (see Practices E 334, E 1252, E 1642, E 2105, and E 2106) should follow Beer's Law, and so maintain the relative absorbance intensities of its bands, independently of sample thickness. As long as this ratio between the bands is maintained, the spectra can be normalized and a good comparison between spectra can be made. For
a spectrum to meet this requirement, each ray of light of a given frequency must pass through the same amount of sample. There are at least two general cases where this may not happen.

6.4.1.1 One case occurs when there is an uneven thickness of sample in the beam. For example, if the sample is wedge shaped in thickness, or irregular in shape, some rays of light pass through the thin part and some rays pass through the thicker part of the wedge. A similar concern arises when making KBr pellets for analysis. Unless the powder is carefully spread in the pellet die, the pellet can be pressed with a density gradient across the diameter. The sample must also be evenly distributed by thorough mixing of the sample and pellet matrix. This is of particular concern when the beam geometry is smaller than the sample diameter, and is a common problem when using a beam condensing accessory or an infrared transmitting microscope.

³ Coleman, Patricia B., Practical Sample Techniques for Infrared Analysis, CRC Press, ISBN# 0849342031: 8/26/93.

FIG. 1 The bottom two spectra demonstrate the results of the 1st derivative of a spectrum with a sloping baseline as compared to a spectrum with a flat baseline. The two spectra in the bottom trace are almost completely overlapped.

6.4.1.2 A second case is when the sample does not completely cover the entire beam cross-section. This occurs with a film that has a void in it, or when a spectrum of a liquid is recorded with an air bubble