Errata

ANSI/AIIM TR39-1996
Guidelines for the Use of Media Error Monitoring and Reporting Techniques for the Verification of Stored Data on Optical Digital Data Disks

Page 20, Subclause 7.3.3, Monitoring media error topography, first column, eleventh line:
Instead of “codeword per band of sectors,” read “codeword per band of tracks.”

Page 20, Subclause 7.3.3, Monitoring media error topography, second column, third line:
Instead of “of sectors using a 3-D surface plot,” read “of tracks using a 3-D surface plot.”

Association for Information and Image Management
1100 Wayne Avenue, Suite 1100
Silver Spring, Maryland 20910-5603
Telephone 301-587-8202

Approved as an American National Standards Institute (ANSI) Technical Report January 23, 1996

© by Association for Information and Image Management International
1100 Wayne Avenue, Suite 1100
Silver Spring, MD 20910-5603
Tel: 301/587-8202
Fax: 301/587-2711
ISBN 0-89258-306-1
Printed in the United States of America

ANSI/AIIM TR39-1996

Technical Report for Information and Image Management - Guidelines for the Use of Media Error Monitoring and Reporting Techniques for the Verification of Stored Data on Optical Digital Data Disks

An ANSI Technical Report prepared by the Association for Information
and Image Management International

Abstract: This document provides guidelines for the use of ANSI/AIIM MS59 (and other similar) media error monitoring and reporting tools. The media error monitoring procedures addressed in this technical report include the maintenance of information on media errors
and data degradation, the use of statistical models and sampling plans, the automation of testing procedures, and the use of generated data in making policy decisions involving data backup and transfer.

Contents

Foreword
1 Scope and purpose
2 References
3 Definitions
4 Abbreviations
5 Media error reporting tools provided by the ANSI/AIIM MS59 standard
6 Deciding what to test
7 Analyzing the media error reports provided by MS59-compliant optical disk subsystems or devices, or similar media error reports
8 Using error distributions and statistical models to evaluate data integrity

Annexes

Annex A A summary of the ANSI/AIIM MS59 command set
Annex B The modified Gilbert model
Annex C Uniform, random error distributions

Figures

Figure 1 An example of a 5-way interleaved data field format
Figure 2 An example of a BER control chart
Figure 3 Sector reallocations
Figure 4 Monitoring bad sector IDs
Figure 5 Monitoring BERs
Figure 6 Monitoring the maximum number of bytes in error per codeword per sector
Figure 7 Relative frequency distribution of the maximum number of bytes in error per codeword per sector
Figure 8 Relative frequency distribution of the number of bytes in error per sector
Figure 9 Maximum number of bytes in error per codeword per sector per radial area
Figure 10 Maximum number of bytes in error per codeword per sector per band of tracks
Figure 11 Maximum number of bytes in error per codeword per sector
Figure 12 Maximum number of bytes in error per sector
Figure 13 Three-dimensional BER per band of sectors over the disk
Figure 14 Contour graph of frequency distribution of sectors with 1 or more bytes in error
Figure 15 Combined 3-D graphics
Figure 16 Comparing burst length profiles
Figure 17 Examples of the uniform random error model for predicting the maximum bytes in error per codeword per sector distributions
Figure B1 State diagram of the modified Gilbert model

Tables

Table A1 Media Error Levels
Table A2 Verify Media Error Level

Foreword

Data and records managers in many organizations are already using optical disk-based information systems for storing and retrieving large sets of valuable information. The optical disk drives that are used by these systems incorporate powerful, but not unlimited, error correction capabilities. If the level of errors in an optical digital data disk sector exceeds the capability of the error detection and correction mechanisms implemented in the optical disk drive controller, the sector becomes uncorrectable and data
loss occurs.

ANSI/AIIM MS59, Media Error Monitoring and Reporting Techniques for the Verification of Stored Data on Optical Digital Data Disks, provides a comprehensive and standard set of media error monitoring and reporting tools. These tools will enable managers and users of optical disk libraries to verify the integrity of stored data both initially (when data are transferred to the media) and periodically (when monitoring the status of data).

The media error information that can be obtained using drives that are fully compliant with the standard includes

- a list of reallocated sectors
- warnings about exceeded verify media error levels
- corrections above media levels
- the total number of bytes in error, the number of bytes in error per sector, and the maximum number of bytes in error in any sector codeword
- the corrected or uncorrected sector content
- errors encountered when header information such as the sector address, sector marks, and synchronization signals is read
- the maximum length of contiguous defective bytes

In addition, the ANSI/AIIM MS59 standard provides information about media error levels that are set in the drive and the ability to modify these media error levels. The media error monitoring and reporting techniques allow users to obtain media error information whenever required and at different levels of sophistication.

Depending upon the importance of stored data, the organization's data storage requirements, and the required data retention period, users might want to implement some or all of the tools provided by ANSI/AIIM MS59-compliant drives. The use of statistical techniques and the maintenance of reported media error data should also be considered. Users whose population of optical digital data disks is very large might require the application of sampling techniques to test disks from their population. In addition, consideration should be given to
defining appropriate early warning indicators of data degradation and to the development of backup and data transfer policies.

This technical report provides an introduction to the MS59 command set and useful information on

- sampling data
- displaying measured values
- using modeled distributions as baselines for comparing media error results

Data retirement and transfer policies should balance user requirements of data integrity and storage costs. TR39 describes recommended methods and guidelines for media error monitoring, including the maintenance of data degradation information. It also provides information on the use of statistical data that would allow users to make more informed decisions when they develop policies and procedures for data backup and data transfer. However, this technical report cannot provide specific intervals for media testing or data transfer because these activities are application-dependent.

Suggestions for improving this technical report or requests for further information should be sent to the Chair, AIIM Standards Board, Association for Information and Image Management International, 1100 Wayne Avenue, Suite 1100, Silver Spring, Maryland, 20910-5603.

At the time it approved this technical report, the AIIM Standards Board had the following members:

Name of Representative      Organization Represented
Judy Kilpatrick, Chair      Association for Information and Image Management International
Bell

one that is representative of the entire population
from which the sample items are drawn. The desired measurements are performed only on the sample, and conclusions are drawn for the entire population. Samples might be sectors, tracks, or bands of tracks from a specific disk or disks taken from a population. Practically, however, the time required to obtain media error information for each sample item should be considered. Where sampling is used to reduce test time, sampling individual sectors or even individual tracks could defeat this objective.

Inherent in any sampling operation is the risk of estimation errors that could lead to the acceptance of defective items or the rejection of non-defective items. Users of the MS59 standard might find that rejecting non-defective items is acceptable; it could result only in unnecessary data transfer. However, the risk of accepting defective items might be unacceptable because it could lead to data loss.

If the limitations of sampling are appropriately considered, sampling methods can be used to provide useful estimations of population data integrity. This information might not otherwise be available. In addition, sampling can be used as a screening technique to determine the need for additional tests.

6.1 Sampling methods

6.1.1 Random sampling

Items can be sampled from a population in a number of ways; some involve “random” number generators, which might or might not be confirmed to be fully random. For example, random(N), the C-language random number generator, chooses an integer from 0 to N - 1. The items comprising the population should be numbered by their physical location. The
generated values could then be used to select a random sample of items from the population.

An alternative to random population sampling is to partition the population into M physically contiguous locations, each containing m items. Forcing the selection of x samples from each of the M locations would then generate a sample of size M × x = n. This sampling decreases the likelihood of neglecting any physical area of the population, especially if the “random” number generator deviates from randomness. In fact, it might be acceptable in some circumstances to dispense with the random number generator altogether. The sample could be distributed evenly throughout the population by sampling items at regular physical intervals.

New disks that have been stored and handled according to the manufacturer's specifications might have very sparse and seemingly random error profiles. However, disk defect distributions can become less random with age and handling. In addition, storage or usage/handling conditions within populations of disks might be significantly different. The samples could then violate the assumption of statistical equivalence, that is, the samples might not be representative of the population.

Disk defects can result from improper user handling, storage conditions, or usage. Once created, the resultant damage and effects can spread preferentially to adjacent areas. Sectors in a particular area might have a high byte error rate (BER) while the average BER over the entire disk might
be too low to result in general degradation of data integrity within the media.

In the case of data errors localized in particular defective sectors, three situations can occur.

- The number of defective sectors is relatively low and the level of bytes in error within these sectors is low enough to be handled by the ECC. Only the defective sectors are reallocated, and hence no data loss occurs.
- The number of defective sectors is relatively low but the level of errors within at least one sector exceeds the ECC capacity. Data might be lost.
- The number of defective sectors is excessive, whatever the byte error rate. The reallocation capability of the disk drive might be exceeded, and data could be lost.

These kinds of defects might not be detected by random sampling methods. Scanning the entire disk could be necessary. Restricting the investigation to particular areas where one suspects the data are affected might also be useful.

One criterion for determining whether defects are localized or randomly distributed over the entire disk could be the clustering of bytes in error or error bursts over the disk (the variation in the distribution of errors over different areas of the media
). Variations can be evaluated by representing the distribution of the mean BER per sector over the entire disk or by using lag plots to identify correlations in sector or track error events.

6.1.2 Adaptive error-driven sampling

The sampling procedures discussed in 6.1.1 do not vary, no matter how many defective items are detected. By contrast, an adaptive error-driven method forces the retrieval of items adjacent to defective items. For example, if an item randomly selected to be a part of a sample is in error, then neighboring items along or across tracks might be included in the sample. This adaptive error-driven sampling is more complex than random sampling procedures, but, where errors are localized, this method would tend to detect more errors. It would produce a sample that might be of more interest to users of the MS59 standard, whose priority is the integrity of stored data.

6.2 Estimation from sampling methods

6.2.1 Estimating confidence intervals

Applying sampling methods to data storage media carries an inherent risk. Because not all data are tested, the state of data within the submitted disk or population of disks could be estimated incorrectly. The loss of stored data could result, which would certainly be unacceptable.

Statistics associated with a sampling method should provide a confidence interval, A. For example, if the measured parameter is the BER, a specified probability exists that the true BER exceeds the measured BER by more than A. The user is responsible for choosing an appropriate sampling procedure, including the size of the sample and the way it
is drawn from the total population. The user must ensure that the uncertainty of the measured parameter is acceptable for data integrity requirements.

6.2.2 Control charts

Control charts are frequently used to monitor quality (that is, uniformity and conformance to a standard) of manufactured products over time. They are also used to monitor the ability of measurement systems to yield reproducible results over time by periodically measuring and comparing an unchanging artifact. Historical data are needed to set the target level and the acceptabl