Tuesday 22 August 2017

Digitisation and reporting resolution

A couple of years ago James Goldie (UNSW) contacted me about an issue he found in HadISD relating to the reporting resolution of temperature and humidity information for stations in Australia.

In HadISD, the reporting resolution of the data varies between whole-degree, half-degree and 1/10th-degree.  Changes between these resolutions can cause some interesting striations in derived quantities.

James has written up his work, with some cool animated plots, on his blog.
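
For anyone wanting to check the reporting resolution of a station themselves, the following is a minimal sketch and not the method used in HadISD or in James's analysis.  The function name, the candidate step sizes and the tolerance are all assumptions made for illustration.

```python
def infer_resolution(values, candidates=(1.0, 0.5, 0.1), tol=1e-6):
    """Return the coarsest candidate step that every value is a multiple of.

    Hypothetical helper: `candidates` must be ordered coarsest first, and
    `tol` absorbs floating-point error in the division.
    """
    for step in candidates:
        if all(abs(v / step - round(v / step)) < tol for v in values):
            return step
    return None  # finer than any candidate, or mixed/unknown


# Example temperatures (degrees C) reported at three different resolutions
whole = [12.0, 13.0, -4.0]
half = [12.5, 13.0, 11.5]
tenth = [12.3, 13.0, 12.5]
```

Applying this to successive blocks of a station record (for example year by year) would show where the resolution changes, which is where striations in derived quantities are most likely to appear.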

Wednesday 7 June 2017

High windspeed values

Thanks to Phil Jones (UEA) and colleagues for pointing out this issue.

A number of stations report wind speeds of 88 m/s, a value which also stands out because it repeats (see Figure 1).

Fig 1. Station 151080-99999 (Ceahlau Toaca, 46.983N, 25.950E, 1898.0m) showing the wind speeds and inhomogeneities (vertical lines).  The cluster of high values between 1991 and 2001 is clear (v2.0.1.2016f).

These may be the result of a mistyped missing data code in the original data.  This station also appears to have rounding or conversion problems - we have not had the chance to investigate in detail so far.

The maximum wind speed used for the record check is 113.3 m/s (derived from a maximum gust speed - https://wmo.asu.edu/content/world-maximum-surface-wind-gust), so this check does not exclude these values.  The wind speeds are not passed through the distributional or frequent value checks as the shape of the distribution is not Gaussian and, to this point, these tests have been written assuming this shape.  Nor is the spike check applied.  Therefore, unfortunately, our QC suite is not (yet) clever enough to identify these erroneous values.
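
To illustrate why the record check alone lets these values through, here is a minimal sketch.  It is not the HadISD QC code: the function names are hypothetical, and the `floor` and `min_count` values in the screening helper are arbitrary choices for illustration, intended only for users who want to screen their own extracts.

```python
from collections import Counter

MAX_WIND_SPEED = 113.3  # m/s, the record-check limit quoted above


def record_check(speeds, limit=MAX_WIND_SPEED):
    """Return one flag per observation: True where the value is outside 0..limit."""
    return [s is not None and (s < 0 or s > limit) for s in speeds]


def repeated_high_values(speeds, floor=50.0, min_count=3):
    """Hypothetical screening helper: high values that occur suspiciously often.

    `floor` and `min_count` are illustrative assumptions, not HadISD settings.
    """
    counts = Counter(s for s in speeds if s is not None and s >= floor)
    return {value: n for value, n in counts.items() if n >= min_count}


# Made-up series (m/s): 88.0 passes the record check but repeats three times
speeds = [3.0, 88.0, 5.1, 88.0, 2.0, 88.0, 120.0, None]
record_flags = record_check(speeds)
suspects = repeated_high_values(speeds)
```

In this made-up series only the 120.0 m/s value fails the record check, while the count of repeated high values picks out 88.0 m/s.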

At the current time we do not have a solution to these issues - we would rather make folks aware than try and implement a "quick fix" which causes issues elsewhere.  We will look into this during the course of this year and hope to roll out improvements to the wind QC in the next update.

The stations which have been noted as affected by repeated high values are:
151080-99999
156150-99999
156270-99999
228370-99999

Other stations are noted to have one or a few high values.


Please do not hesitate to get in touch if you do spot any issues or would like more information on these.

Thursday 9 February 2017

HadISD v2.0.1.2016p

We have just released HadISD version 2.0.1.2016p.  All plots and files should be on the website.  Since the release of v2.0.0.2015p in September, there have been no updates to past years.  The ISD raw data were downloaded on 19th January 2017 and processed over the following days.

The station selection was re-run, so the station list has been updated and this version now contains 7877 stations.  There have also been some minor changes to the quality control tests (affecting wind measurements), outlined below.  A file indicating which stations are new to HadISD and which are no longer included compared to v2.0.0 is available.

As a result of requests from users, in this version we have passed the wind speed observations through the spike check, and also the wind direction observations through the repeated values (streak) check.

The threshold values used to activate flagging in the spike check are calculated from the properties of the data themselves, using the distribution of differences between one observation and the next.
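
As a rough illustration of the idea, the sketch below derives a threshold from the spread of first differences and flags single-point spikes.  It is a simplification under stated assumptions, not the HadISD implementation: using a multiple of the inter-quartile range, the multiplier of 6 and the function names are all choices made here for illustration.

```python
import statistics


def spike_threshold(values, multiplier=6.0):
    """Threshold from the distribution of differences between consecutive obs.

    Assumption: threshold = multiplier * inter-quartile range of the differences.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    q1, _, q3 = statistics.quantiles(diffs, n=4)
    return multiplier * (q3 - q1)


def flag_spikes(values, threshold):
    """Flag single points with a large jump in and a large jump back out."""
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        before = values[i] - values[i - 1]
        after = values[i + 1] - values[i]
        opposite = before * after < 0  # jump up then down, or down then up
        if abs(before) > threshold and abs(after) > threshold and opposite:
            flags[i] = True
    return flags


# Made-up hourly wind speeds (m/s) with one obvious spike at index 5
wind = [3.0, 3.5, 4.0, 3.5, 3.0, 40.0, 3.5, 4.0, 3.5, 3.0, 3.5, 4.0]
threshold = spike_threshold(wind)
spike_flags = flag_spikes(wind, threshold)
```

Because the threshold comes from the data, a station with naturally gusty, variable winds gets a larger threshold than one with steady winds.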

For the streak check, the parameters are calculated using the distribution of repeated values, but the calculated values are only used for flagging if they are less than the defaults used in HadISD versions 1.0.x.  We ensure that no calm periods are assessed when applying the streak check.  The default values depend on the resolution of the wind direction and are in the table below (see Table 4 in Dunn et al., 2012 for more information).

Wind Direction Streak Check

Resolution   Repeated     Repeated     Repeated   Repeated
(degrees)    Streak (h)   Streak (d)   Hours      Days
90           120          28           28         10
45           96           28           28         10
22           72           21           21         7
10           48           14           14         7
1            24           7            14         5
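
The logic described above for the straight streak can be sketched as follows.  This is a simplified illustration, not the released QC code: it uses only the "Repeated Streak (h)" column, treats that limit as a number of consecutive observations, and assumes that a calm observation (speed of zero) breaks a streak and is never flagged.  The function names are hypothetical.

```python
# Default "Repeated Streak (h)" limits from the table, keyed by resolution (degrees)
DEFAULT_STREAK_H = {90: 120, 45: 96, 22: 72, 10: 48, 1: 24}


def streak_limit(resolution, calculated=None):
    """Use the calculated limit only if it is less than the default."""
    default = DEFAULT_STREAK_H[resolution]
    return default if calculated is None else min(calculated, default)


def flag_streaks(directions, speeds, limit):
    """Flag runs of at least `limit` identical directions, skipping calm obs."""
    flags = [False] * len(directions)
    run = []  # indices of the current run of identical, non-calm directions

    def close_run():
        if len(run) >= limit:
            for j in run:
                flags[j] = True

    for i, (direction, speed) in enumerate(zip(directions, speeds)):
        calm = speed == 0
        if not calm and run and directions[run[-1]] == direction:
            run.append(i)
        else:
            close_run()
            run = [] if calm else [i]
    close_run()  # the series may end in the middle of a streak
    return flags


# Made-up example: five identical directions, one change, then a calm period
directions = [180] * 5 + [90] + [0] * 6
speeds = [3] * 5 + [2] + [0] * 6
streak_flags = flag_streaks(directions, speeds, limit=5)
```

A short limit of 5 is used here only to keep the example small; with the defaults, `streak_limit(1, calculated=30)` would give 24.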



If you find something strange, do let us know using the contact details on the HadISD website.  Please note that the stations which are known to have issues are documented on this blog and on the website.

The quality control code used in this version will be uploaded to the GitHub repository in the coming days.