Vol. 25 • Issue 7 • Page 48
As discussed in the first part of this series,1 IQCP is an important part of the lab’s approach to reporting useful data. We now present five templates built in Microsoft Excel. These templates, and the examples that accompany them, will help you calculate a variety of statistics required when validating a new method.
The more data you collect, the more confidence you will have in your statistics. Two of the three measures of central tendency, the mean (average) and the median, will help you determine whether the data are “normal” or “Gaussian”: in a symmetrical distribution the two are nearly equal. The standard deviation (SD) is usually used to set the limits of the reference interval (RI), which usually covers the central 95% of values. Rounded off, the mean ±2 SD is the 95% range (Figure 1); in a Gaussian distribution, approximately 95% of the data fall within 2 SD of the mean.
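As a rough illustration of the calculations described above, the following sketch computes the mean, median, SD, and the mean ±2 SD limits outside of Excel. The function name `describe` and the ten results are hypothetical, chosen only for illustration; a real reference interval study would use far more values.

```python
import statistics

def describe(values):
    """Return the mean, median, sample SD, and the mean +/- 2 SD limits."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return {
        "n": len(values),
        "mean": mean,
        "median": median,
        "sd": sd,
        "ri_low": mean - 2 * sd,   # lower limit of the approximate 95% range
        "ri_high": mean + 2 * sd,  # upper limit of the approximate 95% range
    }

# Hypothetical analyte results, for illustration only.
results = [88, 92, 95, 90, 101, 97, 93, 89, 94, 96]
summary = describe(results)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A mean and median that are close to each other, as they are for these made-up values, are consistent with symmetrical data, although they do not prove that the data are Gaussian.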
Sorting the data from lowest to highest will help you decide whether the data are symmetrical, and drawing a histogram of the sorted data will show it. Note that the histogram is not perfectly smooth, while the curve is (Figure 2). The histogram is built by measuring the level of the analyte in patient samples and plotting the results. A histogram with only 30 values (and sometimes even with 50) will not be as smooth as the curve, which is based on a formula.
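The binning behind such a histogram can be sketched in a few lines. This is a minimal example, assuming a fixed bin width of 5 units and the same hypothetical results as before; the helper `histogram_counts` is an illustrative name, not part of the templates.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count values per bin; each bin is keyed by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical analyte results, for illustration only.
results = [88, 92, 95, 90, 101, 97, 93, 89, 94, 96]
bins = histogram_counts(results, bin_width=5)

# Print a simple text histogram: one asterisk per result in the bin.
for lower_edge, count in bins.items():
    print(f"{lower_edge:>4} to <{lower_edge + 5:<4} {'*' * count}")
```

With so few values the bars are uneven, which is the point made above: a small data set gives a ragged histogram even when the underlying distribution is smooth.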
In this template, we suggest using five or six calibrators and four replicates for each calibrator to assess the linearity of a method. Three calibrators would be the minimum number to provide a line that might be linear (Figures 3a, 3b, and 3c). Four or more calibrators provide a higher degree of confidence when stating the upper limit of linearity. In Template 2, we show examples of various outcomes when assessing linearity.
Using four replicates for each calibrator provides an estimate of the precision at those calibrator values. The template displays the SD and coefficient of variation (CV) for each of the chosen calibrators, based on the replicate data entered for each. The standard error of the estimate (SEE) is a measure of precision: how close all the points are to the calculated line (based on the slope and intercept). This statistic is similar to the SD in that it is a measure of variation; in this case, it applies to all the calibrators together. We prefer the SDs for each calibrator because they give a better idea of precision at each level.
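The statistics named above can be sketched as follows. This is a minimal example, assuming an ordinary least-squares line fitted through all replicate points and an SEE with n - 2 degrees of freedom; the Excel template may compute these differently. The function name `linearity_summary` and the calibrator values are hypothetical.

```python
import math
import statistics

def linearity_summary(replicates_by_level):
    """Map of assigned calibrator value -> list of measured replicates."""
    per_level = {}
    xs, ys = [], []
    for assigned, reps in replicates_by_level.items():
        mean = statistics.mean(reps)
        sd = statistics.stdev(reps)
        per_level[assigned] = {"mean": mean, "sd": sd, "cv_pct": 100 * sd / mean}
        xs.extend([assigned] * len(reps))
        ys.extend(reps)
    # Ordinary least-squares line through all replicate points.
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # SEE: scatter of the points about the fitted line (n - 2 degrees of freedom).
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    see = math.sqrt(residual_ss / (len(xs) - 2))
    return per_level, slope, intercept, see

# Hypothetical data: five calibrators, four replicates each.
calibrators = {
    50: [49, 51, 50, 50],
    100: [99, 101, 100, 102],
    200: [198, 202, 201, 199],
    300: [297, 303, 300, 301],
    400: [395, 404, 399, 401],
}
per_level, slope, intercept, see = linearity_summary(calibrators)
for assigned, stats in per_level.items():
    print(f"{assigned}: SD={stats['sd']:.2f}, CV={stats['cv_pct']:.2f}%")
print(f"slope={slope:.4f}, intercept={intercept:.3f}, SEE={see:.3f}")
```

The per-calibrator SD and CV show where imprecision sits along the range, while the single SEE value summarizes scatter about the line for all calibrators at once.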
Method validation studies should uncover how a new method compares to the present method, or how a new assay performs. Template 3 answers questions such as:
• What is the bias between the two methods?
• Does the difference between the two methods change significantly as the values increase or decrease?
• Is a new reference interval needed if the new method is adopted?
We suggest that at least 30 patient samples be used for the comparison. The samples should span the measuring range and be divided into three equal-sized sets: low, middle, and high (Figures 4a and 4b).
Template 3 has several statistics that are also found in Templates 1 and 2. As you study the “definitions” of the statistics in Template 3, you will see that there are t-tests. These might be called tests for “overlap.” The slope and intercept provide information similar to that of the t-test and the percent bias. You will see this when you look at the exercises in Template 3. We think that the paired t-test is a solid test to compare the differences between the two methods. The percent bias and the combination of the slope and intercept are more difficult to interpret than the t-test.
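To make the comparison statistics concrete, here is a minimal sketch that computes the mean bias, percent bias, paired t statistic, and least-squares slope and intercept for paired results. It is not the template's own calculation; the function name `compare_methods` and the ten paired values are hypothetical, and only ten pairs are used to keep the example short (at least 30 are suggested above).

```python
import math
import statistics

def compare_methods(current, new):
    """Summarize paired results from the current and the new method."""
    diffs = [n - c for c, n in zip(current, new)]
    count = len(diffs)
    mean_bias = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    # Paired t statistic: mean difference divided by its standard error.
    t_stat = mean_bias / (sd_diff / math.sqrt(count))
    # Least-squares line with the current method on x and the new method on y.
    x_bar, y_bar = statistics.mean(current), statistics.mean(new)
    sxx = sum((x - x_bar) ** 2 for x in current)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(current, new))
    slope = sxy / sxx
    return {
        "mean_bias": mean_bias,
        "pct_bias": 100 * mean_bias / x_bar,
        "t_stat": t_stat,
        "df": count - 1,
        "slope": slope,
        "intercept": y_bar - slope * x_bar,
    }

# Hypothetical paired patient results, for illustration only.
current_method = [52, 68, 75, 90, 104, 121, 138, 155, 170, 188]
new_method = [54, 69, 78, 91, 107, 123, 141, 156, 174, 190]
comparison = compare_methods(current_method, new_method)
for name, value in comparison.items():
    print(f"{name}: {value:.3f}")
```

For 9 degrees of freedom, the two-sided critical value of t at the 0.05 level is 2.262, so a t statistic larger than that in absolute value suggests a statistically significant difference between the methods. Whether the difference matters clinically is a separate judgment.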
There are several ways to determine the precision of a method. Template 4 illustrates two of them. The first is to measure four or more patient samples covering the linear range, each in triplicate, and then calculate the F-ratio. The other is to run several samples on both methods two or three times within a run and across runs. This method is illustrated with Template 4.
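One way the F-ratio might be calculated is sketched below. This is an assumption, not necessarily what Template 4 does: the within-sample variances from the triplicates are pooled for each method, and the larger pooled variance is divided by the smaller. The function names and all values are hypothetical.

```python
import statistics

def pooled_variance(triplicates):
    """Average the within-sample variances (equal replicate counts assumed)."""
    return statistics.mean([statistics.variance(reps) for reps in triplicates])

def f_ratio(method_a, method_b):
    """Ratio of the larger pooled variance to the smaller, so F >= 1."""
    var_a = pooled_variance(method_a)
    var_b = pooled_variance(method_b)
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical triplicate results for four patient samples on each method.
present_method = [[50, 51, 49], [100, 102, 98], [200, 203, 197], [300, 304, 296]]
new_method = [[50, 52, 48], [100, 103, 97], [200, 204, 196], [300, 306, 294]]

f_value = f_ratio(present_method, new_method)
print(f"F-ratio: {f_value:.3f}")
```

The result is compared with a critical F value for the appropriate degrees of freedom. Pooling across concentrations is a simplification, since the SD often rises with concentration; comparing the variances level by level is a reasonable alternative.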
Many clinical laboratories do not perform recovery and interference studies. One reason is that the vendor has performed them and published the results in the “Directions, Accuracy, and Precision” paper included with the reagents. A second reason is that a peer-reviewed article has already been published from a clinical laboratory in which the recovery and interference data are provided for the method you are studying.
A third reason is that the materials needed to perform these studies, especially the ones for studying hormones, can be quite expensive, and the experiments themselves tedious.
We think the examples accompanying the template will illustrate the protocol to be followed. We suggest performing both recovery and interference studies with three or four replicates for each of four or five levels of the analyte the method is said to measure, as well as four or five levels of the interferent. For example, suppose you are validating a new method for glucose. You might use fructose and galactose as possible interfering sugars.
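The arithmetic behind the two studies can be sketched as follows, assuming the usual definitions: percent recovery is the measured increase divided by the amount added, and interference is the difference between the means with and without the interferent. The sketch ignores any dilution caused by spiking, and the function names and values are hypothetical.

```python
import statistics

def percent_recovery(baseline_reps, spiked_reps, amount_added):
    """Measured increase after spiking, as a percentage of the amount added."""
    recovered = statistics.mean(spiked_reps) - statistics.mean(baseline_reps)
    return 100 * recovered / amount_added

def interference_bias(control_reps, test_reps):
    """Mean result with the interferent minus the mean result without it."""
    return statistics.mean(test_reps) - statistics.mean(control_reps)

# Hypothetical glucose results, three replicates each.
baseline = [90, 91, 89]         # unspiked sample
spiked = [138, 140, 139]        # same sample after adding 50 units of glucose
recovery = percent_recovery(baseline, spiked, amount_added=50)

control = [100, 101, 99]        # sample without added fructose
with_fructose = [103, 104, 102] # same sample with fructose added
bias = interference_bias(control, with_fructose)

print(f"Recovery: {recovery:.1f}%")
print(f"Interference bias: {bias:.1f}")
```

Repeating these calculations at each of the four or five analyte and interferent levels shows whether recovery or interference changes across the range.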
David Plaut is a chemist and statistician in Plano, Texas; Natalie Lepage is a clinical biochemist and a biochemical geneticist at the Children’s Hospital of Eastern Ontario and associate professor in the department of pathology and laboratory medicine at the University of Ottawa, Ontario, Canada.
1. Plaut D, Lepage N, Przekop K. IQCP Answers, Part 1, ADVANCE for Administrators of the Laboratory. 25;18