My reason for jumping into stats was to directly compare two measurement methods… with multiple trials, on multiple ILDs (inter-landmark distances). I don’t really go for “funny name, lol” things, but when Bland and Borg are cited in the same paper on stats (which I long thought of [cluelessly/ignorantly] as boring), I can’t help it. Eponysterical.
But getting real, the issues raised by Bland and Altman sound pretty interesting: they argue that many tests of this sort may be relying on misleading information… I have tried to duplicate their methods in my own little H.T.-UGR/Inquiry Study.
When comparing a new method of measurement with a standard method, one of the things we want to know is whether the difference between the measurements by the two methods is related to the magnitude of the measurement. A plot of the difference against the standard measurement is sometimes suggested, but this will always appear to show a relationship between difference and magnitude when there is none. A plot of the difference against the average of the standard and new measurements is unlikely to mislead in this way. This is shown theoretically and illustrated by a practical example using measurements of systolic blood pressure.
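To convince myself of the claim in that abstract, here is a minimal simulation sketch. It is not from the paper. It assumes both methods measure the same true value with independent errors of equal size, so there is no real relationship between difference and magnitude. The function names (`pearson`, `simulate`), the seed, and every number in it (mean 120, spreads of 15 and 10) are made up for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n=5000, seed=0):
    """Return (r_vs_standard, r_vs_average) for simulated paired readings.

    Assumption: both methods are unbiased and have the same error SD,
    so the difference is truly unrelated to the magnitude.
    """
    rng = random.Random(seed)
    # Hypothetical "true" values, e.g. something like systolic pressure.
    true_values = [rng.gauss(120, 15) for _ in range(n)]
    standard = [t + rng.gauss(0, 10) for t in true_values]
    new = [t + rng.gauss(0, 10) for t in true_values]

    diff = [b - a for a, b in zip(standard, new)]          # new minus standard
    average = [(a + b) / 2 for a, b in zip(standard, new)]

    return pearson(diff, standard), pearson(diff, average)


r_vs_standard, r_vs_average = simulate()
print(f"difference vs standard: r = {r_vs_standard:.3f}")
print(f"difference vs average:  r = {r_vs_average:.3f}")
```

Under these assumptions the first correlation should come out clearly negative, because the standard method's own error appears in both the difference and the x-axis, while the second should sit near zero. If the two methods had very different error sizes, the average plot would not be perfectly flat either, which I take to be why the abstract says "unlikely to mislead" rather than "never".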
In earlier papers [1,2] we discussed the analysis of studies of agreement between methods of clinical measurement. We had two issues in mind: to demonstrate that the methods of analysis then in general use were incorrect and misleading, and to recommend a more appropriate method. We saw the aim of such a study as to determine whether two methods agreed sufficiently well for them to be used interchangeably. This led us to suggest that the analysis should be based on the differences between measurements on the same subject by the two methods. The mean difference would be the estimated bias, the systematic difference between methods, and the standard deviation of the differences would measure random fluctuations around this mean. We recommended 95% limits of agreement, mean difference plus or minus 2 standard deviations (or, more precisely, 1.96 standard deviations), which would tell us how far apart measurements by the two methods were likely to be for most individuals.
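The recipe in that paragraph (mean difference as the bias, plus or minus 1.96 standard deviations of the differences) can be sketched in a few lines. This is my own rough version, not code from the paper: the function name `limits_of_agreement` is hypothetical, the readings are invented, and I am assuming the sample standard deviation (n - 1 denominator) is the one intended.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower, upper) for paired measurements.

    bias  = mean of the per-subject differences (method_a - method_b)
    lower = bias - z * SD of the differences
    upper = bias + z * SD of the differences
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator), an assumption here
    return bias, bias - z * sd, bias + z * sd


# Invented readings, one pair per subject, purely for illustration.
new_method = [10.2, 11.8, 9.9, 12.5, 10.7]
standard_method = [10.0, 12.0, 10.1, 12.0, 11.0]

bias, lower, upper = limits_of_agreement(new_method, standard_method)
print(f"bias = {bias:.3f}, 95% limits of agreement = ({lower:.3f}, {upper:.3f})")
```

Whether limits that wide are acceptable is not a statistical question. It depends on how far apart two measurements can be before it matters for whatever they are being used for.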