In an era of heightened laboratory performance expectations and staff shortages, protocols covering all aspects of laboratory activity are essential. Yet there is no consensus procedure among regulatory bodies for performing systematic method and linearity validation.
We evaluated basic statistics to identify the most suitable one for each situation encountered in method validation protocols. The mean, the median and the mode were chosen as measures of central tendency. The standard deviation (SD), coefficient of variation (CV) and the F-ratio were selected as measures of variation. The paired and unpaired Student’s t-tests were used to evaluate differences between two means. The ANalysis Of VAriance (ANOVA) was used to compare more than two instruments or lots of reagent or calibrator. Four graphical representations were also considered: the scatter plot, the difference plot, the Analysis of Means and the Analysis of Variance.
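As a minimal sketch of the measures of central tendency and variation named above, the following uses Python's standard `statistics` module. The sodium values are hypothetical and serve only as an illustration; they are not data from this study.

```python
import statistics

# Hypothetical replicate sodium results (mmol/L); illustrative values only.
results = [140, 141, 139, 140, 142, 140, 138, 141, 140, 139]

mean = statistics.mean(results)
median = statistics.median(results)
mode = statistics.mode(results)

# Sample standard deviation (n - 1 in the denominator).
sd = statistics.stdev(results)

# Coefficient of variation, expressed as a percentage of the mean.
cv_percent = 100 * sd / mean

print(f"mean={mean}, median={median}, mode={mode}")
print(f"SD={sd:.3f}, CV={cv_percent:.2f}%")
```

Expressing the SD as a CV makes precision comparable across analytes measured at different concentrations.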
We performed an in-depth review of a sodium comparison between two methods using a total of 40 patient samples. Both statistical calculation and an assessment of the clinical utility of the tests are necessary to reach an appropriate conclusion. Comparison of the means between two methods is best made with the unpaired t-test, as the paired t-test will report a statistical difference where the difference is not clinically significant. Within-run and between-run reproducibility is best evaluated with the SD and the F-ratio. A limited number of 4-8 sample replicates is satisfactory to assess reproducibility when the assay CV is less than 10%. More replicates are necessary for tests with poorer precision. The F-ratio is the appropriate statistical tool to assess reproducibility when two instruments are compared. When more than two instruments are compared for bias, the ANOVA statistics are appropriate, as is the Analysis of Means graphic. An additional step is needed to compare reproducibility statistically when more than two instruments or methods are involved. The Analysis of Variance graphic is helpful in these situations.
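The F-ratio comparison of reproducibility between two instruments can be sketched as a ratio of sample variances. This is an illustrative sketch, not the authors' Excel template: the helper `f_ratio` and the potassium replicates are hypothetical, and the larger variance is placed in the numerator by convention.

```python
import statistics

def f_ratio(a, b):
    """Return (F, df_numerator, df_denominator).

    The larger sample variance goes in the numerator so that F >= 1.
    """
    var_a = statistics.variance(a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1

# Hypothetical potassium replicates (mmol/L), 8 per instrument.
instrument_a = [4.0, 4.1, 3.9, 4.0, 4.2, 3.8, 4.0, 4.0]
instrument_b = [4.0, 4.3, 3.7, 4.1, 4.4, 3.6, 3.9, 4.0]

f_value, df_num, df_den = f_ratio(instrument_a, instrument_b)
print(f"F = {f_value:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

The resulting F is then compared against a tabulated critical value for the stated degrees of freedom at the chosen significance level; an F above that value suggests the two instruments differ in reproducibility.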
Standardization of laboratory protocols is an asset to every laboratory. This approach provides detailed procedures to the laboratory staff and helps to optimize the time spent on method validation. In an effort to standardize our laboratory protocols for method and linearity validation in terms of both bias and reproducibility, we sought aid from the CLSI (NCCLS), CLIA, the Ontario Quality Management Program - Laboratory Services (QMPLS), as well as the current text by Burtis et al. (Todd and Sanford). We provide in Tables 1 (for bias) and 2 (for precision) a summary of the recommendations made by the various organizations for a number of parameters. The reader will see from the tables that there is no consensus among the groups. Some groups do not even mention parameters that are key to a successful validation. We also found no assistance for dealing with lot-to-lot variation.
Table 1: Summary of Recommendations from Various Organizations on Evaluation of Bias
| Source | Samples (n) | Days | Patients/day | Calculation | Runs per day |
|---|---|---|---|---|---|
| Burtis (2012 ed.) | | | | | |
| CLSI (EP 15-A) | 20 | 3-4 | 5-7/d | t-test, difference plot | 1 each day for 3-4 days |
NG: No recommendation
Table 2: Summary of Recommendations from Various Organizations on Evaluation of Precision
| Source | Samples (n) | Days | Levels | Calculation | Runs per day |
|---|---|---|---|---|---|
| CLSI (EP 15-A) | 15 | 5 | 2 | ANOVA | 1 |
Given the current situation we chose to try to shed light on the following questions:
1. What is the number of samples needed to detect bias between two methods?
2. What is the number of samples needed to detect bias with more than two methods (such as glucose meters)?
3. What is a satisfactory number of replicates to establish a mean for a new lot of control, and what is a satisfactory way to determine the SD for a new lot of control?
4. What is the best approach to detect lot-to-lot variation over time within a method and how best to cope with such variation?
5. What is the number of samples and target concentrations within and between runs to measure precision?
Our goals are to provide useful tools that minimize the time laboratory technical management staff spend reviewing correlation data, and to ensure that the data are interpreted adequately before a new lot of reagent or calibrator is judged acceptable and implemented in the laboratory.
Materials and Methods
Analytes and Validation Data
To address each of the five questions listed above, we used patient samples, supplemented by computer simulation to clarify and support the patient data. It should be noted that the computer cannot differentiate between patient data and simulated data. The analytes we studied were PSA, cyclosporine, sodium, potassium, ionized calcium, hemoglobin, oxygen, TSH, and glucose measured on point-of-care testing (POCT) devices. These analytes were chosen to reflect a wide variety of instruments, methodologies and patient sample types. In each case we applied a variety of statistical calculations to clarify which statistics provide the most help in answering the five questions.
Statistical Analysis and Graphs
To assess their usefulness in the context of method validation, we used a number of classical statistics as well as four graphical representations as aids. The statistics were the mean, the median and the mode as measures of central tendency, and the standard deviation (SD), coefficient of variation (CV) and the first and third quartiles (Q1 and Q3) as measures of variation. To detect differences between two means we used the paired Student’s t-test as the most sensitive and clear statistic for this purpose (although, as we pointed out above, it may be ‘too’ sensitive in the clinical setting). We collected data on the slope and intercept in a number of cases to illustrate their pros and cons. Similarly, we will show data from the correlation equation to illustrate its value and pitfalls.
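The sensitivity of the paired t-test relative to the unpaired test can be illustrated with a short sketch. The data below are simulated, not patient results: 40 hypothetical sodium values spanning 130 to 149.5 mmol/L, with the second method reading 0.5 mmol/L higher plus a small alternating error. The 0.5 mmol/L bias is assumed here to be clinically unimportant, and the function names `paired_t` and `unpaired_t` are illustrative.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic on the per-sample differences y - x."""
    diffs = [b - a for a, b in zip(x, y)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

def unpaired_t(x, y):
    """Pooled-variance (Student's) two-sample t statistic."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(y) - statistics.mean(x)) / se, nx + ny - 2

# Simulated sodium results (mmol/L); hypothetical values for illustration.
method_a = [130 + 0.5 * i for i in range(40)]
noise = [0.3, -0.3, 0.1, -0.1] * 10
method_b = [a + 0.5 + e for a, e in zip(method_a, noise)]

t_paired, df_paired = paired_t(method_a, method_b)
t_unpaired, df_unpaired = unpaired_t(method_a, method_b)
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
```

With these simulated values the paired statistic is roughly 14 while the unpaired statistic is roughly 0.4. The paired test removes the between-patient spread and so flags a small, consistent bias as highly significant, whereas the unpaired test compares that bias against the full range of patient values.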
To compare more than two instruments or lots of reagent or calibrator we used the ANalysis Of VAriance (ANOVA). Although this excellent tool is often misunderstood, and perhaps misnamed, we will illustrate how to use it and discuss its advantages and disadvantages.
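A one-way ANOVA across several instruments can be sketched in a few lines. This is an illustrative calculation on hypothetical glucose meter readings, not the authors' template; the function name `one_way_anova` and the meter labels are assumptions. Despite its name, the test compares group means by contrasting between-group and within-group variability.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of readings."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Variability of the readings around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical glucose readings (mmol/L) of one sample on three meters.
meters = {
    "meter_1": [5.0, 5.2, 4.8, 5.0],
    "meter_2": [5.1, 5.3, 4.9, 5.1],
    "meter_3": [5.6, 5.8, 5.4, 5.6],
}

f_value, df_between, df_within = one_way_anova(list(meters.values()))
print(f"F = {f_value:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F relative to the tabulated critical value indicates that at least one instrument's mean differs from the others; it does not say which one, which is where a graphic such as the Analysis of Means helps.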
Two of the graphs we selected for evaluation, the scatter plot and the difference plot, are well known and quite useful when two instruments or reagent lots are being compared. Two other graphs that we have found particularly helpful when more than two instruments or lots of reagent or calibrator are studied are the Analysis of Means and the Analysis of Variance. These will be illustrated when appropriate. All the statistics and graphical representations were done using Excel. [Note: Templates are available that take your data and aid you in performing method validations, assessing linearity, and validating a reference range (including dividing the data into males and females and by age in decades).]
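The numbers behind a difference plot can be computed as follows. This is a minimal sketch in Python rather than Excel, using hypothetical sodium pairs. Plotting the difference against the average of the two methods and drawing limits at the mean difference ± 1.96 SD is a common convention assumed here; it is not necessarily the layout used in the authors' templates.

```python
import statistics

def difference_plot_data(method_a, method_b):
    """Return plot points, mean bias, and (lower, upper) limits.

    Each point is (average of the two results, difference b - a).
    """
    points = [((a + b) / 2, b - a) for a, b in zip(method_a, method_b)]
    diffs = [d for _, d in points]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    # Assumed convention: limits at bias +/- 1.96 SD of the differences.
    return points, bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical paired sodium results (mmol/L) from two methods.
method_a = [136, 138, 140, 142, 144, 146]
method_b = [137, 138, 141, 143, 144, 147]

points, bias, (lower, upper) = difference_plot_data(method_a, method_b)
print(f"mean bias = {bias:.2f} mmol/L, limits = ({lower:.2f}, {upper:.2f})")
for x, y in points:
    print(f"average = {x:.1f}, difference = {y}")
```

The points can be passed to any charting tool; a horizontal line at the mean bias makes a constant offset visible, while a trend in the points suggests a concentration-dependent difference.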
Download the full-length PDF, “Method and Linearity Validation: A Concensus Approach” for much more.