Instrument and Measurement

Of Order

Standard Deviation

Precision

Error

Stochastic Error

Systematic Error

Resolution

Accuracy

Calibration

Calibration Drift

We use several simple concepts and phrases in our analysis of our measurements. Although they are simple, they are nevertheless difficult to understand. Well, maybe you can understand them easily, but it took us years to understand them properly. Here is an introduction to the fundamental principles and parameters of the measurement analysis we perform here in our lab. To serve as an example, we consider the development of a fictitious instrument: the Microwave-Reflection Thermometer (MRT).

We *measure* a *physical quantity* with an *instrument*. The physical quantity is something we assume has a true and absolute value. All instruments measuring the physical quantity will come up with the same value if they are perfectly accurate. But real instruments are not perfectly accurate, so their measurements are not equal to the physical quantity. A *measurement* is a value produced by an instrument when it tries to determine a physical quantity.

When we say the microwave power radiated by our MRT is "of order one hundred microwatts", we mean it could be 150 μW, it could be 50 μW, it could even be 200 μW. We are being vague. When we say *of order*, we mean not less than a third and not more than three times. You might think we should just say, "between 33 μW and 300 μW", but that would be misleading. It would suggest that the numbers 33 and 300 had some significance. They don't. The only number that's significant is our 100, and we're saying it's hardly accurate at all. Despite being hardly accurate at all, the estimate is good for a lot of things. For example, we might see that human safety regulations say that any emission less than 10,000 μW is harmless. So our of-order estimate of radiated power tells us that our MRT is quite safe, because 100 μW is a hundred times less than 10,000 μW.

The standard deviation of a set of measurements is an indication of how much the measurements vary from their average value. We can't just take the average amount they vary from their average value, because that would be zero: the positive differences would cancel the negative. We could, instead, turn all the negative differences into positive differences so they would not cancel one another. But we don't do that, because the absolute value function, which drops the negative sign, is difficult to handle in mathematics: its slope is undefined at zero. Instead, we turn the negatives into positives by squaring them. The square of a negative number is always positive. To get the *standard deviation* of a set of numbers, we square all the deviations from the mean, add them together, divide by the number of measurements, and take the square root. The standard deviation is the *root mean square* of the deviations.

Suppose we measure the temperature of a large metal table in our lab five times using our fictitious MRT instrument. We get the following results.

| Index | Result (°C) | Deviation (°C) | Square of Deviation (°C^{2}) |
|---|---|---|---|
| 1 | 28.0 | +2 | 4 |
| 2 | 25.0 | -1 | 1 |
| 3 | 26.0 | 0 | 0 |
| 4 | 27.0 | +1 | 1 |
| 5 | 24.0 | -2 | 4 |
| Average | 26.0 | 0.0 | 2 |

The *average* measurement is 26°C. The *mean square* deviation is 2 °C^{2}. The *standard deviation* is the *root mean square* deviation, which is √2 ≈ 1.4 °C.
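As a check on the arithmetic, here is a short Python sketch that computes the table's numbers from the five measurements (nothing here is assumed beyond the values in the table):

```python
import math

# The five MRT measurements from the table, in degrees C.
measurements = [28.0, 25.0, 26.0, 27.0, 24.0]

# Average measurement.
mean = sum(measurements) / len(measurements)

# Deviations from the mean, and their mean square.
deviations = [m - mean for m in measurements]
mean_square = sum(d * d for d in deviations) / len(deviations)

# Standard deviation: the root mean square deviation.
std_dev = math.sqrt(mean_square)
```

Running this gives an average of 26 °C, a mean square deviation of 2 °C^{2}, and a standard deviation of about 1.4 °C, matching the table.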

The *precision* of a measurement is its standard deviation when the physical quantity remains constant. Suppose we measure the temperature of our large metal table with a mercury thermometer we trust to 0.1 °C, and we find that the temperature remained constant to within ±0.1 °C while we took our MRT measurements. The precision of our MRT is 1.4 °C, because the standard deviation of the MRT measurements was 1.4 °C while the actual temperature was constant. The 1.4 °C variation is due to the MRT itself, not the temperature of the table.

The *error* in a measurement is the difference between the measurement and the true value of the physical quantity. The variations in our MRT measurement are *errors*, because the temperature was constant during our MRT measurements. But the errors could be larger even than these variations. Suppose our mercury thermometer measured the temperature of the table to be 23°C. The error on each of our measurements will be the measurement value minus 23°C. Our MRT measurement errors were 5, 2, 3, 4, and 1 °C.

The *stochastic error* in a measurement is the error that is random from one measurement to the next. Stochastic errors tend to be gaussian, or *normal*, in their distribution. That's because the stochastic error is most often the sum of many random errors, and when we add many random errors together, the distribution of their sum looks gaussian, as shown by the Central Limit Theorem. The standard deviation of the stochastic error is equal to the precision. The stochastic error of our MRT is 1.4°C rms.
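The Central Limit Theorem claim is easy to try numerically. In this sketch (the twelve-term sum and the sample size are arbitrary illustrative choices, not part of the MRT example) each simulated stochastic error is the sum of twelve small uniform errors:

```python
import random
import statistics

random.seed(0)

# Each simulated stochastic error is the sum of twelve independent
# errors, each uniform on [-0.5, 0.5]. One such uniform error has
# variance 1/12, so the sum has variance 1. By the Central Limit
# Theorem, the sum is approximately gaussian with zero mean and
# standard deviation 1.
errors = [sum(random.uniform(-0.5, 0.5) for _ in range(12))
          for _ in range(100_000)]

mean = statistics.fmean(errors)
std = statistics.pstdev(errors)
```

The sample mean comes out near zero and the sample standard deviation near one, as the theorem predicts; a histogram of `errors` would show the familiar bell curve.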

Because most stochastic errors are gaussian, and all have zero mean by definition, the stochastic error will be less than twice the precision 95% of the time. So if we look at a plot of MRT measurement versus time, containing hundreds of measurements made while the table temperature remained constant, we can guess the precision easily. We just figure by eye how far apart two lines would have to be, one above the mean and one below the mean, to enclose almost all the measurements. These two lines will be at roughly two standard deviations above and two standard deviations below the mean. So we divide the 95% spread in our measurements by four to get the precision.
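The divide-by-four rule can be tested in simulation. This sketch assumes gaussian measurements with the 1.4 °C precision from our example; the sample size is an arbitrary choice:

```python
import random

random.seed(1)

true_temperature = 26.0  # constant table temperature, deg C
precision = 1.4          # assumed MRT stochastic error, deg C rms

# Simulate many MRT measurements of the constant temperature.
measurements = sorted(random.gauss(true_temperature, precision)
                      for _ in range(100_000))

# Find the spread that encloses the central 95% of the measurements,
# then divide by four to estimate the precision. (This slightly
# underestimates it, since 95% of a gaussian lies within +/-1.96
# standard deviations rather than +/-2.)
low = measurements[int(0.025 * len(measurements))]
high = measurements[int(0.975 * len(measurements))]
estimated_precision = (high - low) / 4
```

The estimate lands close to the 1.4 °C we put in, which is all the by-eye method claims to deliver.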

The *systematic error* in a measurement is the error that is not random from one measurement to the next. Suppose the mercury thermometer measured the temperature of the table to be 23°C during our five MRT measurements. The average MRT measurement was 26°C. Its systematic error is 3°C.

The resolution of a measurement is how well it can distinguish between two values of the measured physical quantity. The precision of our MRT is 1.4°C. If one end of the table is 0.1°C warmer than the other, the probability of our MRT being right when it says which end is warmer will be close to 50%, because our stochastic error is much larger than the difference we are trying to observe.

When a stochastic error is gaussian, which is most of the time, there is a 68% chance that the magnitude of the error will be less than one standard deviation of the distribution. If we measure the temperature at both ends of our table, and the far end measurement is one standard deviation higher than the near end, it is 68% likely that the MRT measurement of the far end will be higher than the MRT measurement of the near end. In other words, it's 68% likely that our MRT will be correct in telling us which end is warmer, provided that the warmer end is one standard deviation warmer than the other end.
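The 68% figure itself is a property of the gaussian distribution, and we can verify it numerically. This sketch draws gaussian errors with the MRT's 1.4 °C standard deviation and counts how often the error magnitude stays within one standard deviation:

```python
import random

random.seed(2)

sigma = 1.4  # MRT precision from the example, deg C

# Draw many gaussian stochastic errors and count the fraction whose
# magnitude is less than one standard deviation.
trials = 100_000
errors = [random.gauss(0.0, sigma) for _ in range(trials)]
fraction_within = sum(abs(e) < sigma for e in errors) / trials
```

The fraction comes out close to 0.68, the familiar one-sigma probability for a gaussian.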

That's how we arrive at our definition of resolution. The *resolution* of a measurement is the amount by which the measured quantity must change for it to be 68% likely that our measurement will be correct in saying whether the change was up or down.

Because almost all stochastic errors are gaussian, the resolution of a measurement is almost always equal to its precision.

The accuracy of a measurement is an indication of the size of its errors. To determine the accuracy of a measurement, we need to compare it to another far more accurate measurement. We can determine the accuracy of our MRT using our mercury thermometer. The *accuracy* of a measurement is the root mean square of its error. The accuracy of our MRT is 3.3°C.
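The 3.3 °C figure comes straight from the five measurements and the 23 °C mercury-thermometer reading:

```python
import math

# Reference temperature from the mercury thermometer, deg C.
true_temperature = 23.0
# The five MRT measurements from the earlier table.
measurements = [28.0, 25.0, 26.0, 27.0, 24.0]

# Error of each measurement: measurement minus true value.
errors = [m - true_temperature for m in measurements]  # 5, 2, 3, 4, 1

# Accuracy: the root mean square of the errors.
accuracy = math.sqrt(sum(e * e for e in errors) / len(errors))
```

The root mean square of 5, 2, 3, 4, and 1 °C is √11 ≈ 3.3 °C.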

If our MRT has a constant systematic error of 3°C, then we can subtract this 3°C from the MRT measurement to get a better estimate of the actual temperature. To *calibrate* a measurement is to remove systematic errors by comparing the measurement to a more accurate measurement. We modify our MRT so that it subtracts 3°C from its measurement. We no longer see the original measurement at all. The systematic error is gone, and all we have left is the 1.4°C stochastic error.
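In code, the calibration amounts to subtracting the measured systematic error from every reading. A sketch using the example numbers:

```python
import math

# Reference temperature and the five MRT measurements from the example.
true_temperature = 23.0
measurements = [28.0, 25.0, 26.0, 27.0, 24.0]

# The systematic error is the average error relative to the
# reference thermometer: 26 - 23 = 3 deg C.
systematic_error = sum(measurements) / len(measurements) - true_temperature

# Calibration: subtract the systematic error from every reading.
calibrated = [m - systematic_error for m in measurements]

# What remains is the stochastic error, about 1.4 deg C rms.
residuals = [c - true_temperature for c in calibrated]
residual_rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
```

After calibration the average error is zero, and the residual rms is the √2 ≈ 1.4 °C stochastic error.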

If the systematic error in a measurement changes slowly with time, the calibration we performed to remove the systematic error from the measurement will no longer be effective. This slow change is called *calibration drift*. In order to trust a calibration, we must monitor the systematic error of our instrument, or re-calibrate the instrument.