As we noted in the previous post, the extreme temperatures during June 2021 in western Canada and the USA were being erroneously flagged by some of the HadISD QC tests. This is likely the first of a number of posts as we delve deeper into the causes and resolutions.
Climatological Outlier Check
We go back to the station which we showed in the previous post, 711130-99999 (Agassiz, BC). The plot we showed was one of the raw diagnostic outputs from HadISD which we ran to see what was going on.
Firstly, while implementing changes for this test, we noticed that the plot was incomplete. It does correctly show the distribution from which the flagging thresholds were calculated as well as the highlighted flags in red.
However, for this test, the thresholds are determined from the data themselves, using the distribution of the anomalised observations. So that the addition of each month during the year does not affect the thresholds set by any test, the thresholds are calculated from a distribution using the complete years only. In this case, that's all data from all Junes up to the end of 2020. What was shown on that plot was the distribution (black) of all the observations from complete years only, and the fitted Gaussian (blue). The red values were the observations that were flagged, which were from data in 2021 only, but missing were data from June 2021 which were not flagged. The updated plot is below (Fig. 1), where the grey histogram includes all data from June 2021. This shows that there are other observations in June 2021 which are warmer than the average of previous years. Some are even warmer than all previous years, but not sufficiently so to be flagged by this test.
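The separation between complete years and the in-progress year can be sketched as follows. This is only an illustration, not the HadISD code: the function name, its arguments and the sample values are hypothetical, and the real suite works on anomalised observations rather than the plain numbers used here.

```python
from datetime import datetime


def split_complete_years(times, values, month, last_complete_year):
    """Split one calendar month's values into two samples.

    reference:   values from complete years (used to set the thresholds)
    in_progress: values from the current, incomplete year (only tested)
    All names here are hypothetical and chosen for illustration.
    """
    reference, in_progress = [], []
    for t, v in zip(times, values):
        if t.month != month:
            continue  # only the requested calendar month is of interest
        if t.year <= last_complete_year:
            reference.append(v)
        else:
            in_progress.append(v)
    return reference, in_progress


# Made-up observation times and anomalies
times = [
    datetime(2019, 6, 1, 12),
    datetime(2020, 6, 30, 0),
    datetime(2020, 7, 1, 0),
    datetime(2021, 6, 28, 23),
]
values = [1.0, 2.0, 3.0, 9.5]

reference, in_progress = split_complete_years(times, values, 6, 2020)
```

With this split, adding another month of 2021 data changes only the second list, so the thresholds derived from the first list stay fixed between monthly updates.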
[As a reminder, this test fits a Gaussian to the histogram, and then uses this to determine a threshold (from where the fit crosses y=0.1). Observations which are further from the peak than the threshold are treated in two ways. If they are separated by an empty bin from the main distribution, then they are flagged. However, if they are "attached" (no empty bin) then they are tentatively flagged (and could be re-instated by the buddy check).]
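A minimal sketch of the logic in this reminder is given below. It makes several assumptions that are not taken from the HadISD code: the Gaussian is estimated from the sample mean and standard deviation rather than fitted to the histogram, the bin width is 1.0, and the function names are hypothetical. The threshold is the distance d from the peak at which the scaled Gaussian falls to y=0.1, that is d = sigma * sqrt(2 ln(peak / 0.1)).

```python
import math
import statistics


def gaussian_threshold(anomalies, bin_width=1.0, y_cross=0.1):
    """Return (centre, distance) where the scaled Gaussian falls to y_cross.

    Assumption: the Gaussian is taken from the sample moments, not from a
    least-squares fit to the histogram.
    """
    mu = statistics.fmean(anomalies)
    sigma = statistics.pstdev(anomalies)
    # Peak height of a Gaussian whose area equals that of the histogram
    peak = len(anomalies) * bin_width / (sigma * math.sqrt(2 * math.pi))
    if peak <= y_cross:
        raise ValueError("too few data for the curve to rise above y_cross")
    # Solve peak * exp(-d**2 / (2 * sigma**2)) = y_cross for d
    distance = sigma * math.sqrt(2 * math.log(peak / y_cross))
    return mu, distance


def classify(anomalies, bin_width=1.0, y_cross=0.1):
    """Label each value as 'ok', 'tentative' (attached) or 'flagged' (detached)."""
    mu, distance = gaussian_threshold(anomalies, bin_width, y_cross)
    # Integer bin numbers, counted from the centre, that contain data
    occupied = {math.floor((a - mu) / bin_width) for a in anomalies}
    labels = []
    for a in anomalies:
        if abs(a - mu) <= distance:
            labels.append("ok")
            continue
        b = math.floor((a - mu) / bin_width)
        step = 1 if b > 0 else -1
        # An empty bin between the centre and this value means it is detached
        detached = any(k not in occupied for k in range(0, b, step))
        labels.append("flagged" if detached else "tentative")
    return labels


# Made-up anomalies: a compact main distribution ...
main = [-2.0] * 5 + [-1.0] * 20 + [0.0] * 50 + [1.0] * 20 + [2.0] * 5
# ... plus one isolated value, or a contiguous warm tail
isolated = classify(main + [12.0])
attached = classify(main + [3.0, 4.0, 5.0, 6.0, 7.0])
```

In the first case the value at 12.0 sits beyond several empty bins and is flagged outright. In the second, the warmest value lies beyond the threshold but is joined to the main distribution, so it only receives a tentative flag.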
This updated plot highlighted something we hadn't realised when adapting this test to work for the monthly updates. The thresholds are set on the complete-year data (up to 2020), and because these data all fall into a single distribution, this test identifies any observations further from the peak than the threshold as bad and flags them (highlighted in red). However, when including the 2021 data in the monthly update, there isn't an empty bin in the distribution. We note that had some observations in June in earlier (complete) years been very hot or very cold, they would have been correctly flagged by the test.
So, the first thing we have done is to rectify this, and ensure that in cases like this only tentative flags are set rather than full flags (Fig. 2), as these values are part of a contiguous distribution rather than being separate from it. In the case of monthly updates, the threshold for flagging (requiring that empty bin) is re-estimated, and shown by the purple line in Fig. 2. As in the case of complete years, any observation in the final year further from the peak of the distribution is only tentatively flagged as there is no empty bin (pink).
The other thing to address was to allow for a skew in the Gaussian fit, as it is clear that a symmetric function is not the best fit to these data. The result is shown in Fig. 3. This now reduces the number of observations on which a tentative flag is set to single figures. However, at the low temperature end, the threshold for the tentative flag has been reduced, so should Agassiz get a very cold June, then it's possible some values may get tentative flags set instead.
Fig. 3: Same as Fig. 2, but now using a skewed-Gaussian distribution for the fit.
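To illustrate why a skewed fit moves the two thresholds by different amounts, here is a small sketch. It assumes the standard skew-normal form for the curve and that the thresholds are still taken where the curve crosses y=0.1. Neither assumption is confirmed for HadISD, and the function names and parameter values are hypothetical.

```python
import math


def skew_gaussian(x, area, loc, scale, alpha):
    """Skew-normal curve scaled so that its integral equals `area`.

    alpha = 0 gives an ordinary Gaussian; alpha > 0 stretches the warm tail.
    """
    z = (x - loc) / scale
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(alpha * z / math.sqrt(2)))
    return area * 2 * pdf * cdf / scale


def crossing(curve, inside, outside, y_cross=0.1, iterations=60):
    """Bisect between `inside` (curve above y_cross) and `outside` (below)."""
    for _ in range(iterations):
        mid = 0.5 * (inside + outside)
        if curve(mid) > y_cross:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


def skew_thresholds(area, loc, scale, alpha, y_cross=0.1):
    """Lower and upper values at which the curve falls to y_cross."""
    def curve(x):
        return skew_gaussian(x, area, loc, scale, alpha)
    # Assumes the curve is above y_cross at loc and below it 20 scales away
    lower = crossing(curve, loc, loc - 20 * scale, y_cross)
    upper = crossing(curve, loc, loc + 20 * scale, y_cross)
    return lower, upper


# Hypothetical parameters: symmetric fit versus a warm-skewed fit
sym_lower, sym_upper = skew_thresholds(100.0, 0.0, 1.5, 0.0)
skew_lower, skew_upper = skew_thresholds(100.0, 0.0, 1.5, 3.0)
```

With alpha = 0 the two thresholds are mirror images about the centre. With a positive alpha the warm threshold moves outwards while the cold one moves inwards, which matches the behaviour described above for the low temperature end.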
The automated quality control works on data from meteorological stations around the world. For the case of Agassiz in Canada, we can be reasonably confident that all the data from June 2021 have been included in this update to HadISD, and so we could dispense with the "complete" versus "in progress" year distinction. However, for other locations, we do get data coming through in earlier months than the most recent one (e.g. data filling in during January through to May for the release that included June). In that case it is possible (though maybe unlikely) that thresholds for this test in earlier months could change from monthly release to monthly release, resulting in values being flagged or unflagged in different releases. Our approach is more stable during the monthly updates, and so we keep this distinction.
At the end of the calendar year, we run the QC on the data for the final data release of that HadISD version. That release (processed in January each year) covers a complete year, so for it alone this distinction isn't made. Therefore all the June data will go into the distribution from which the thresholds are determined. The original form of the test would have received a contiguous distribution for this station, and so would only have set tentative flags. However, by updating it to use the skew-Gaussian, in fact no observations are flagged (Fig. 4).
Fig. 4: Same as Fig. 3, but for the version of the test as would be run for the update at the end of a calendar year.
We will continue checking other QC tests, as well as running further diagnostics, before these changes are implemented in the HadISD QC suite, with a version number increment to reflect the changes. It is likely that these will not be available in time for the release in August 2021 (including data up to the end of July).