Theory
In a set of replicate measurements of a physical or chemical quantity, one or more of the obtained values may differ considerably from the rest. In this case there is always a strong temptation to eliminate those deviant values and exclude them from any subsequent calculation (e.g. of the mean value and/or of the standard deviation). This is permitted only if the suspect values can be "legitimately" characterized as outliers.
Usually, an outlier is defined as an observation that is generated from a different model or a different distribution than was the main "body" of data. Although this definition implies that an outlier may be found anywhere within the range of observations, it is natural to suspect and examine as possible outliers only the extreme values.
The rejection of suspect observations must be based exclusively on an objective criterion and not on subjective or intuitive grounds. This can be achieved by using statistically sound tests for "the detection of outliers".
Dixon's Q-test is the simplest test of this type, and it is usually the only one described in the data-treatment chapters of Analytical Chemistry textbooks. This test allows us to examine whether one (and only one) observation from a small set of replicate observations (typically 3 to 10) can be "legitimately" rejected.
The Q-test is based on the statistical distribution of "subrange ratios" of ordered data samples drawn from the same normal population. Hence, a normal (Gaussian) distribution of data is assumed whenever this test is applied. If an outlier is detected and rejected, the Q-test must not be reapplied to the set of remaining observations.
In statistics, Dixon's Q test, or simply the Q test, is used for identification and rejection of outliers. This test should be used sparingly and never more than once in a data set. To apply a Q test for bad data, arrange the data in order of increasing values and calculate Q as:
Q = gap / range
where gap is the absolute difference between the suspect value and the value closest to it, and range is the difference between the largest and smallest values in the set. If Qcalculated > Qtable, where Qtable is the tabulated critical value for the number of observations and the chosen confidence level, then reject the questionable point.
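The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name dixon_q is hypothetical, and the choice to test whichever extreme value has the larger gap is an assumption made here for convenience.

```python
def dixon_q(values):
    """Return (suspect, Q) for the extreme value with the larger gap.

    Hypothetical helper: Q = gap / range, computed on the sorted data.
    """
    if len(values) < 3:
        raise ValueError("the Q test needs at least 3 observations")
    data = sorted(values)
    data_range = data[-1] - data[0]
    if data_range == 0:
        raise ValueError("all observations are identical; Q is undefined")
    gap_low = data[1] - data[0]      # gap between the smallest value and its neighbour
    gap_high = data[-1] - data[-2]   # gap between the largest value and its neighbour
    # Assumption: only the more extreme end is examined, since the test
    # is meant to be applied once per data set.
    if gap_low >= gap_high:
        return data[0], gap_low / data_range
    return data[-1], gap_high / data_range


suspect, q = dixon_q([0.189, 0.167, 0.187, 0.183, 0.186,
                      0.182, 0.181, 0.184, 0.181, 0.177])
print(f"suspect = {suspect}, Q = {q:.3f}")
```

The returned Q still has to be compared with the tabulated critical value for the same number of observations.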
EXAMPLE:
For the data:
0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177
Arranged in increasing order:
0.167, 0.177, 0.181, 0.181, 0.182, 0.183, 0.184, 0.186, 0.187, 0.189
The suspect value is 0.167. Calculate Q:
Q = (0.177 - 0.167) / (0.189 - 0.167) = 0.010 / 0.022 = 0.455
With 10 observations, Qcalculated (0.455) > Qtable (0.412) at 90% confidence, so the value can be rejected at that level. However, at 95% confidence, Qcalculated (0.455) < Qtable (0.466).
Therefore keep 0.167 at 95% confidence or reject it at 90% confidence.
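The same example can be checked numerically. The short sketch below uses only the ten data points and the two critical values quoted in this example (0.412 at 90% and 0.466 at 95% for 10 observations). The variable names are illustrative.

```python
data = [0.189, 0.167, 0.187, 0.183, 0.186,
        0.182, 0.181, 0.184, 0.181, 0.177]

ordered = sorted(data)
gap = ordered[1] - ordered[0]          # suspect (lowest) value vs. its nearest neighbour
data_range = ordered[-1] - ordered[0]  # largest minus smallest value
q_calc = gap / data_range

# Critical values for n = 10, as quoted in the example above.
q_table = {"90%": 0.412, "95%": 0.466}

decisions = {}
for level, q_crit in q_table.items():
    decisions[level] = "reject" if q_calc > q_crit else "keep"
    print(f"{level}: Q_calc = {q_calc:.3f}, Q_table = {q_crit:.3f} -> {decisions[level]}")
```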
TABLE
This table summarizes the critical (limit) values of the test.
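For use in a script, the critical values can be stored in a small lookup structure. The numbers below are the commonly tabulated two-sided critical values for 3 to 10 observations. They are supplied here as an assumption and are not recovered from the original table, although the n = 10 entries agree with the values used in the example above. The names Q_CRITICAL and q_critical are hypothetical.

```python
# Commonly tabulated critical Q values (assumption: not taken from the
# original table), keyed by number of observations n.
# Each tuple holds the values for 90%, 95% and 99% confidence.
Q_CRITICAL = {
    3: (0.941, 0.970, 0.994),
    4: (0.765, 0.829, 0.926),
    5: (0.642, 0.710, 0.821),
    6: (0.560, 0.625, 0.740),
    7: (0.507, 0.568, 0.680),
    8: (0.468, 0.526, 0.634),
    9: (0.437, 0.493, 0.598),
    10: (0.412, 0.466, 0.568),
}
LEVELS = (90, 95, 99)


def q_critical(n, confidence=90):
    """Look up the critical Q value for n observations at a confidence level (%)."""
    if n not in Q_CRITICAL:
        raise ValueError("critical values are listed only for 3 to 10 observations")
    if confidence not in LEVELS:
        raise ValueError("confidence must be 90, 95 or 99")
    return Q_CRITICAL[n][LEVELS.index(confidence)]


print(q_critical(10, 90), q_critical(10, 95))
```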