
General QoD Description (OBC-SQC, ISD, SPV)

At the moment, there are three data quality mechanisms applied at WeatherXM. These mechanisms are designed to assess both the quality of the data itself and the status of the station deployments. Each mechanism follows a different approach to evaluating the reliability of a station’s data, and they are ultimately complementary, contributing to a more comprehensive evaluation of each station. Although the final result of these data quality mechanisms affects user rewards, their primary objectives are:

  • To determine which stations produce data reliable enough to be used internally by the company, as well as by third-party companies and organizations as a product.
  • To motivate users to improve their deployments, aiming for the highest possible measurement quality.
  • To ensure that the data is of high enough quality to be used for improving the weather forecasts that WeatherXM produces at station locations.

Currently, the following three data quality mechanisms are used to classify and assess stations and their data:

  • Out-of-Bounds Check (OBC) and Self Quality Check (SQC)
  • Indoors Station Detector (ISD)
  • Station Photo Verification (SPV)

Out-of-Bounds Check (OBC) and Self Quality Check (SQC)

Both OBC and SQC aim to detect suspicious data at the raw, minute, and hourly level. This means that these mechanisms can identify unusual variations in different station parameters caused by faulty deployments or malfunctioning sensors, provided that such variations occur and that the prevailing meteorological conditions allow them to be detected.

For example, a sudden temperature change caused by using a very short mast that keeps the station too close to the surface of a rooftop will be detected primarily on a calm day with alternating sunshine and cloud cover. Under such conditions, the station will record very abrupt temperature changes; otherwise, detection would not be possible.

OBC and SQC therefore flag specific raw data points that appear suspicious (they do not label an entire deployment as problematic). These two mechanisms run within a unified algorithm every day, assessing the data from the previous day. The checks performed are as follows:

A. Out-of-Bounds Check (OBC)

OBC is a simple process that identifies values falling outside the specifications set by the manufacturer for each sensor. For example, if the temperature sensor can measure within a range of -60 to 60°C, then a value of 80°C is considered faulty.
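As an illustration, this bounds check can be sketched in a few lines of Python. The bounds and parameter names below are illustrative placeholders, not WeatherXM's actual sensor specifications:

```python
# Minimal sketch of an out-of-bounds check.
# NOTE: these bounds are illustrative, not real manufacturer specifications.
SENSOR_BOUNDS = {
    "temperature_c": (-60.0, 60.0),
    "relative_humidity_pct": (0.0, 100.0),
    "wind_speed_ms": (0.0, 75.0),
}

def obc_flag(parameter: str, value: float) -> bool:
    """Return True if the value falls outside the sensor's specified range."""
    low, high = SENSOR_BOUNDS[parameter]
    return not (low <= value <= high)

print(obc_flag("temperature_c", 80.0))  # True: outside the -60..60 °C range
print(obc_flag("temperature_c", 25.0))  # False: within range
```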

B. Self Quality Check (SQC)

This mechanism uses the data from the station itself (and therefore does not rely on third-party data for comparison) to detect unexpected changes in various parameters, which may be due either to poor placement of the weather station or to a malfunction in one of its sensors.

The checks performed are:

  • Check for stagnation (lack of variability) in the parameters. For example, a constant recording of zero wind speed and an unchanging wind direction for six hours, combined with low relative humidity, is an indication either that the station is located in a place where air circulation is not as expected, or that the anemometer has frozen (although freezing can be ruled out if the temperature has been >0°C for several hours).
  • Check for suspicious spikes over short time intervals, which may occur either due to poor station placement or because of a faulty sensor. For example, a temperature change >3°C within 1 minute is considered unrealistic. If the station’s sampling interval is greater than 1 minute, the respective threshold is increased linearly, provided that the sampling interval remains below 10 minutes.
  • Check for gaps in the data. Each station model, according to its specifications, is capable of sending data packets at a specific time interval. If the number of packets received on an hourly basis is lower than the threshold set by the specifications, this indicates a problem in data transmission (e.g., poor 4G/Wi-Fi/Helium signal).
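The spike check above, with its linearly scaled threshold, can be sketched as follows. The function names are illustrative; the 3°C base threshold at a 1-minute interval comes from the text, while the exact scaling form is our reading of "increased linearly":

```python
def spike_threshold(base_threshold_c: float, sampling_interval_min: float) -> float:
    """Scale the per-step spike threshold linearly with the sampling interval.
    The base threshold is assumed to apply at a 1-minute interval; the text
    states linear scaling holds only for intervals below 10 minutes."""
    if not 1 <= sampling_interval_min < 10:
        raise ValueError("linear scaling applies only to intervals of 1 to <10 min")
    return base_threshold_c * sampling_interval_min

def is_spike(prev_temp: float, curr_temp: float, sampling_interval_min: float,
             base_threshold_c: float = 3.0) -> bool:
    """Flag a temperature jump larger than the interval-scaled threshold."""
    return abs(curr_temp - prev_temp) > spike_threshold(base_threshold_c,
                                                        sampling_interval_min)

print(is_spike(20.0, 24.0, 1))  # True: 4 °C in 1 min exceeds the 3 °C threshold
print(is_spike(20.0, 24.0, 2))  # False: at a 2-min interval the threshold is 6 °C
```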

Similar checks (for stagnation and spikes), but with different time windows (based on the guidelines of the WMO Guide to the Global Observing System, WMO-No. 488, p. 200, as well as our own data analysis), are carried out for all parameters (see full documentation).

Finally, the ratio of invalid to expected valid data is calculated on an hourly basis in order to determine the final score of successful meteorological observations.
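A minimal sketch of this hourly ratio, assuming "expected" counts every observation the station should have sent in the hour (names are illustrative, not WeatherXM's actual implementation):

```python
def hourly_quality_score(flagged: int, expected: int) -> float:
    """Fraction of expected observations in an hour that passed QC.
    'flagged' counts data points marked invalid by OBC/SQC; missing packets
    also lower the score, since they count toward 'expected' but never arrive."""
    if expected <= 0:
        raise ValueError("expected number of observations must be positive")
    valid = max(expected - flagged, 0)
    return valid / expected

# A station sampling once per minute with 3 flagged observations in an hour:
print(hourly_quality_score(flagged=3, expected=60))  # 0.95
```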

Indoors Station Detector (ISD)

The ISD is a mechanism primarily aimed at identifying stations that are active but have not been placed outdoors. However, there is a range of other classifications that can be assigned to a station’s deployment status. The ISD mechanism runs once per day and evaluates the data from the entire previous day, but for the final evaluation result it also takes into account the outcomes from the last seven days in order to provide a fairer classification of the station’s deployment. The key difference compared to OBC-SQC is that ISD evaluates the entire day as a whole, assigning a single annotation related to the station’s deployment status.

For deployment assessment, the ISD uses solar radiation analysis techniques, comparing actual measurements with theoretical radiation values for each location on the planet and for each day of the year. In parallel, various combined thresholds are applied to temperature, wind direction/speed, and relative humidity (thresholds derived from exploratory data analysis) in order to identify certain significantly problematic deployment cases. It should be noted, however, that ISD is able to detect only a small subset of problematic deployments, as its main focus is on identifying stations that are located indoors. The classifications resulting from the ISD mechanism are shown in Table 1. See full ISD documentation here.

Table 1. ISD annotations, their short description, and their impact on the quality score.
Note that the short description is statistically derived, and the actual status of the station may differ.

| ISD Annotation | Short Description | Impact on Score |
| --- | --- | --- |
| OUTDOORS | Well deployed | NO |
| INDOORS | In a house or a very closed area | YES |
| ALMOST INDOORS | In a closed area, e.g., a balcony | YES |
| BAD DEPLOYMENT | In an area surrounded by significant obstacles | YES |
| PROBABLY BAD DEPLOYMENT | In an area with some obstacles that may affect mainly wind, precipitation, and solar irradiance observations | YES |
| INCLINED POLE NORTH | On a north-tilted pole, or the light sensor needs cleaning | YES |
| INCLINED POLE EAST | On an east-tilted pole | YES |
| INCLINED POLE SOUTH | On a south-tilted pole | YES |
| INCLINED POLE WEST | On a west-tilted pole | YES |
| PROBLEMATIC LIGHT SENSOR | Unusual spikes in solar irradiance | NO |
| REPLACE ILLUMINANCE | Solar irradiance is always 0 | YES |
| ALMOST NIGHT | The station is located in a high-latitude region that experiences darkness during the winter months, with little to no sunlight | NO |
| INDOORS FROZEN | The station looks indoors due to freezing conditions | NO |
| INDOORS ALMOST FROZEN | The station looks almost indoors due to freezing conditions | NO |
| BAD DEPLOYMENT FROZEN | The station looks like a bad deployment due to freezing conditions | NO |
| PROBABLY BAD DEPLOYMENT FROZEN | The station looks like a probably bad deployment due to freezing conditions | NO |
| UNKNOWN | Less than 40% of the expected data during the part of the day with solar irradiance >0.1 W m⁻² | NO |
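The solar-radiation comparison at the heart of ISD can be illustrated with a textbook top-of-atmosphere formula. This is only a toy sketch of the idea (comparing measurements against a theoretical ceiling for the location and day of year); the function names, the 10% threshold, and the formula choice are ours, not WeatherXM's actual model:

```python
import math

def toa_irradiance(day_of_year: int, solar_zenith_deg: float) -> float:
    """Top-of-atmosphere irradiance (W/m^2) on a horizontal surface, from the
    solar constant and the Earth-Sun distance correction. A textbook formula
    used only to illustrate a 'theoretical ceiling' for measured radiation."""
    G_SC = 1361.0  # solar constant, W/m^2
    eccentricity = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)
    cos_zenith = max(math.cos(math.radians(solar_zenith_deg)), 0.0)
    return G_SC * eccentricity * cos_zenith

def looks_shaded(measured_wm2: float, day_of_year: int, solar_zenith_deg: float,
                 min_fraction: float = 0.1) -> bool:
    """Flag a daytime reading far below the theoretical ceiling, e.g. a station
    that never 'sees' the sun (the 10% fraction is a hypothetical threshold)."""
    ceiling = toa_irradiance(day_of_year, solar_zenith_deg)
    return ceiling > 0 and measured_wm2 < min_fraction * ceiling

# Around the June solstice (day 172), sun 30 degrees from zenith,
# a reading of 50 W/m^2 is suspiciously low:
print(looks_shaded(50.0, 172, 30.0))
```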

Station Photo Verification (SPV)

At WeatherXM, we aim to ensure that all station installations in our network follow, as closely as possible, the guidelines set by the World Meteorological Organization (WMO) (see also our obstacle experiment). The SPV mechanism evaluates the deployment of each station as depicted in the photos submitted by the user through the mobile app. SPV examines seven key aspects of the photos in order to assess both the overall deployment and the extent to which each parameter can be measured accurately. The mechanism assigns a score to the station each time the user submits a set of photos, which is then evaluated internally by company staff.

The 7 aspects and how they affect the SPV score are described in Table 2. Each aspect is assessed separately, based on the conditions shown in the photo under evaluation, unless an issue is identified that affects the station’s measurements regardless of orientation. Each photo receives a score for each of the 5 scorable aspects, and the average score per photo is calculated. Finally, the average score of the 4 photos (one for each cardinal direction) determines the final SPV score. If there is no photo for a specific cardinal direction, the score for that direction is considered zero. The SPV scoring rules and guidelines are comprehensively described in this document.

Table 2. The SPV aspects, their short description and the available score options.

| SPV Aspect | Short Description | Score Options |
| --- | --- | --- |
| Photo Validity | Evaluation of whether the photo meets the standards set by the company | 0, 0.5, 1 |
| Temperature/Relative Humidity | Evaluation of the validity of the temperature and relative humidity measurements | 0, 0.5, 1 |
| Wind | Evaluation of the validity of the wind measurements | 0, 0.5, 1 |
| Precipitation | Evaluation of the validity of the precipitation measurements | 0, 0.5, 1 |
| Solar Irradiance | Evaluation of the validity of the solar irradiance measurements | 0, 0.5, 1 |
| Mast Height | Evaluation of the relative mast height in relation to the objects in the surrounding area | - |
| Mast Inclination | Classification of a visibly tilted mast toward a specific direction | - |
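The averaging described above (5 scorable aspects per photo, 4 cardinal directions, a missing photo scored as zero) can be sketched as follows; the function name and input shape are illustrative:

```python
def spv_score(photo_scores: dict) -> float:
    """Combine per-aspect photo scores into a final SPV score: average the
    5 scorable aspects within each photo, then average across the 4 cardinal
    directions, counting a missing photo as zero."""
    DIRECTIONS = ("N", "E", "S", "W")
    per_photo = []
    for direction in DIRECTIONS:
        aspects = photo_scores.get(direction)
        if not aspects:
            per_photo.append(0.0)  # no photo for this direction -> zero
        else:
            per_photo.append(sum(aspects) / len(aspects))
    return sum(per_photo) / len(DIRECTIONS)

# Three perfect photos and no photo for the west direction:
print(spv_score({"N": [1, 1, 1, 1, 1],
                 "E": [1, 1, 1, 1, 1],
                 "S": [1, 1, 1, 1, 1]}))  # 0.75
```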