
General QoD Description (OBC-SQC, ISD, SPV)

At the moment, there are three data quality mechanisms applied at WeatherXM. These mechanisms are designed to assess both the quality of the data itself and the status of the station deployments. Each mechanism follows a different approach to evaluating the reliability of a station’s data, and they are ultimately complementary, contributing to a more comprehensive evaluation of each station. Although the final result of these data quality mechanisms affects user rewards, their primary objectives are:

  • To determine which stations produce data reliable enough to be used internally by the company, as well as by third-party companies and organizations as a product.
  • To motivate users to improve their deployments, aiming for the highest possible measurement quality.
  • To ensure that the data is of high enough quality to be used for improving the weather forecasts that WeatherXM produces at station locations.

Currently, the following three data quality mechanisms are used to classify and assess stations and their data:

  • Out-of-Bounds Check (OBC) and Self Quality Check (SQC)
  • Indoors Station Detector (ISD)
  • Station Photo Verification (SPV)

Out-of-Bounds Check (OBC) and Self Quality Check (SQC)

Both OBC and SQC aim to detect suspicious data at the raw, minute, and hourly level. This means that these mechanisms can identify unusual variations in different station parameters caused by faulty deployments or malfunctioning sensors, provided that such variations occur and that the prevailing meteorological conditions allow them to be detected.

For example, a sudden temperature change caused by using a very short mast that keeps the station too close to the surface of a rooftop will be detected primarily on a calm day with alternating sunshine and cloud cover. Under such conditions, the station will record very abrupt temperature changes; otherwise, detection would not be possible.

OBC and SQC therefore flag specific raw data points that appear suspicious (they do not label an entire deployment as problematic). These two mechanisms run within a unified algorithm every day, assessing the data from the previous day. The checks performed are as follows:

A. Out-of-Bounds Check (OBC)

OBC is a simple process that identifies values falling outside the specifications set by the manufacturer for each sensor. For example, if the temperature sensor can measure within a range of -60 to 60°C, then a value of 80°C is considered faulty.
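As an illustration, this bounds check can be sketched in a few lines of Python. The bounds and parameter names below are illustrative placeholders, not WeatherXM's actual sensor specifications:

```python
# Minimal sketch of an out-of-bounds check.
# NOTE: these bounds are illustrative, not real manufacturer specifications.
SENSOR_BOUNDS = {
    "temperature_c": (-60.0, 60.0),
    "relative_humidity_pct": (0.0, 100.0),
    "wind_speed_ms": (0.0, 75.0),
}

def obc_flag(parameter: str, value: float) -> bool:
    """Return True if the value falls outside the sensor's specified range."""
    low, high = SENSOR_BOUNDS[parameter]
    return not (low <= value <= high)

print(obc_flag("temperature_c", 80.0))  # True: outside the -60..60 °C range
print(obc_flag("temperature_c", 25.0))  # False: within range
```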

B. Self Quality Check (SQC)

This mechanism uses the data from the station itself (and therefore does not rely on third-party data for comparison) to detect unexpected changes in various parameters, which may be due either to poor placement of the weather station or to a malfunction in one of its sensors.

The checks performed are:

  • Check for stagnation (lack of variability) in the parameters. For example, a constant recording of zero wind speed and an unchanging wind direction for six hours, combined with low relative humidity, is an indication either that the station is located in a place where air circulation is not as expected, or that the anemometer has frozen (although freezing can be ruled out if the temperature has been >0°C for several hours).
  • Check for suspicious spikes over short time intervals, which may occur either due to poor station placement or because of a faulty sensor. For example, a temperature change >3°C within 1 minute is considered unrealistic. If the station’s sampling interval is greater than 1 minute, the respective threshold is increased linearly, provided that the sampling interval remains below 10 minutes.
  • Check for gaps in the data. Each station model, according to its specifications, is capable of sending data packets at a specific time interval. If the number of packets received on an hourly basis is lower than the threshold set by the specifications, this indicates a problem in data transmission (e.g., poor 4G/Wi-Fi/Helium signal).
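The spike check above, with its linearly scaled threshold, can be sketched as follows. The function names are illustrative; the 3°C base threshold at a 1-minute interval comes from the text, while the exact scaling form is our reading of "increased linearly":

```python
def spike_threshold(base_threshold_c: float, sampling_interval_min: float) -> float:
    """Scale the per-step spike threshold linearly with the sampling interval.
    The base threshold is assumed to apply at a 1-minute interval; the text
    states linear scaling holds only for intervals below 10 minutes."""
    if not 1 <= sampling_interval_min < 10:
        raise ValueError("linear scaling applies only to intervals of 1 to <10 min")
    return base_threshold_c * sampling_interval_min

def is_spike(prev_temp: float, curr_temp: float, sampling_interval_min: float,
             base_threshold_c: float = 3.0) -> bool:
    """Flag a temperature jump larger than the interval-scaled threshold."""
    return abs(curr_temp - prev_temp) > spike_threshold(base_threshold_c,
                                                        sampling_interval_min)

print(is_spike(20.0, 24.0, 1))  # True: 4 °C in 1 min exceeds the 3 °C threshold
print(is_spike(20.0, 24.0, 2))  # False: at a 2-min interval the threshold is 6 °C
```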

Similar checks (for stagnation and spikes), but with different time windows (based on the guidelines of the WMO Guide to the Global Observing System, WMO-No. 488, p. 200, as well as our own data analysis), are carried out for all parameters (see full documentation).

Finally, the ratio of invalid to expected valid data is calculated on an hourly basis in order to determine the final score of successful meteorological observations.
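A minimal sketch of this hourly ratio, assuming "expected" counts every observation the station should have sent in the hour (names are illustrative, not WeatherXM's actual implementation):

```python
def hourly_quality_score(flagged: int, expected: int) -> float:
    """Fraction of expected observations in an hour that passed QC.
    'flagged' counts data points marked invalid by OBC/SQC; missing packets
    also lower the score, since they count toward 'expected' but never arrive."""
    if expected <= 0:
        raise ValueError("expected number of observations must be positive")
    valid = max(expected - flagged, 0)
    return valid / expected

# A station sampling once per minute with 3 flagged observations in an hour:
print(hourly_quality_score(flagged=3, expected=60))  # 0.95
```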

Indoors Station Detector (ISD)

The ISD is a mechanism primarily aimed at identifying stations that are active but have not been placed outdoors. However, there is a range of other classifications that can be assigned to a station’s deployment status. The ISD mechanism runs once per day and evaluates the data from the entire previous day, but for the final evaluation result it also takes into account the outcomes from the last seven days in order to provide a fairer classification of the station’s deployment. The key difference compared to OBC-SQC is that ISD evaluates the entire day as a whole, assigning a single annotation related to the station’s deployment status.

For deployment assessment, the ISD uses solar radiation analysis techniques, comparing actual measurements with theoretical radiation values for each location on the planet and for each day of the year. In parallel, various combined thresholds are applied to temperature, wind direction/speed, and relative humidity (thresholds derived from exploratory data analysis) in order to identify certain significantly problematic deployment cases. It should be noted, however, that ISD is able to detect only a small subset of problematic deployments, as its main focus is on identifying stations that are located indoors. The classifications resulting from the ISD mechanism are shown in Table 1. See full ISD documentation here.

Table 1. ISD annotations, their short description, and their impact on the quality score.
Note that the short description is statistically derived, and the actual status of the station may differ.

| ISD Annotation | Short Description | Impact on Score |
| --- | --- | --- |
| OUTDOORS | Well deployed | NO |
| INDOORS | In a house or a very closed area | YES |
| ALMOST INDOORS | In a closed area, e.g., a balcony | YES |
| BAD DEPLOYMENT | In an area surrounded by significant obstacles | YES |
| PROBABLY BAD DEPLOYMENT | In an area with some obstacles that may affect mainly wind, precipitation, and solar irradiance observations | YES |
| INCLINED POLE NORTH | On a north-tilted pole, or the light sensor needs cleaning | YES |
| INCLINED POLE EAST | On an east-tilted pole | YES |
| INCLINED POLE SOUTH | On a south-tilted pole | YES |
| INCLINED POLE WEST | On a west-tilted pole | YES |
| PROBLEMATIC LIGHT SENSOR | Unusual spikes in solar irradiance | NO |
| REPLACE ILLUMINANCE | Solar irradiance is always 0 | YES |
| ALMOST NIGHT | The station is located in a high-latitude region that experiences darkness during the winter months, with little to no sunlight | NO |
| INDOORS FROZEN | The station looks indoors due to freezing conditions | NO |
| INDOORS ALMOST FROZEN | The station looks almost indoors due to freezing conditions | NO |
| BAD DEPLOYMENT FROZEN | The station looks like a bad deployment due to freezing conditions | NO |
| PROBABLY BAD DEPLOYMENT FROZEN | The station looks like a probably bad deployment due to freezing conditions | NO |
| UNKNOWN | Less than 40% of the expected data during the part of the day with solar irradiance >0.1 W m⁻² | NO |
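The solar-radiation comparison at the heart of ISD can be illustrated with a textbook top-of-atmosphere formula. This is only a toy sketch of the idea (comparing measurements against a theoretical ceiling for the location and day of year); the function names, the 10% threshold, and the formula choice are ours, not WeatherXM's actual model:

```python
import math

def toa_irradiance(day_of_year: int, solar_zenith_deg: float) -> float:
    """Top-of-atmosphere irradiance (W/m^2) on a horizontal surface, from the
    solar constant and the Earth-Sun distance correction. A textbook formula
    used only to illustrate a 'theoretical ceiling' for measured radiation."""
    G_SC = 1361.0  # solar constant, W/m^2
    eccentricity = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)
    cos_zenith = max(math.cos(math.radians(solar_zenith_deg)), 0.0)
    return G_SC * eccentricity * cos_zenith

def looks_shaded(measured_wm2: float, day_of_year: int, solar_zenith_deg: float,
                 min_fraction: float = 0.1) -> bool:
    """Flag a daytime reading far below the theoretical ceiling, e.g. a station
    that never 'sees' the sun (the 10% fraction is a hypothetical threshold)."""
    ceiling = toa_irradiance(day_of_year, solar_zenith_deg)
    return ceiling > 0 and measured_wm2 < min_fraction * ceiling

# Around the June solstice (day 172), sun 30 degrees from zenith,
# a reading of 50 W/m^2 is suspiciously low:
print(looks_shaded(50.0, 172, 30.0))
```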

Station Photo Verification (SPV)

At WeatherXM, we aim to ensure that all station installations in our network follow, as closely as possible, the guidelines set by the World Meteorological Organization (WMO) (see also our obstacle experiment). The SPV mechanism evaluates the deployment of each station as depicted in the photos submitted by the user through the mobile app. SPV examines seven key aspects of the photos in order to assess both the overall deployment and the extent to which each parameter can be measured accurately. The mechanism assigns a score to the station each time the user submits a set of photos, which is then evaluated internally by company staff.

The 7 aspects and how they affect the SPV score are described in Table 2. Each aspect is assessed separately, based on the conditions shown in the photo under evaluation, unless an issue is identified that affects the station’s measurements regardless of orientation. Each photo receives a score for each of the 5 scorable aspects, and the average score per photo is calculated. Finally, the average score of the 4 photos (one for each cardinal direction) determines the final SPV score. If there is no photo for a specific cardinal direction, the score for that direction is considered zero. The SPV scoring rules and guidelines are comprehensively described in this document.

Table 2. The SPV aspects, their short description and the available score options.

| SPV Aspect | Short Description | Score Options |
| --- | --- | --- |
| Photo Validity | Evaluation of whether the photo meets the standards set by the company | 0, 0.5, 1 |
| Temperature/Relative Humidity | Evaluation of the validity of the temperature and relative humidity measurements | 0, 0.5, 1 |
| Wind | Evaluation of the validity of the wind measurements | 0, 0.5, 1 |
| Precipitation | Evaluation of the validity of the precipitation measurements | 0, 0.5, 1 |
| Solar Irradiance | Evaluation of the validity of the solar irradiance measurements | 0, 0.5, 1 |
| Mast Height | Evaluation of the relative mast height in relation to the objects in the surrounding area | - |
| Mast Inclination | Classification of a visibly tilted mast toward a specific direction | - |
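The averaging described above (5 scorable aspects per photo, 4 cardinal directions, a missing photo scored as zero) can be sketched as follows; the function name and input shape are illustrative:

```python
def spv_score(photo_scores: dict) -> float:
    """Combine per-aspect photo scores into a final SPV score: average the
    5 scorable aspects within each photo, then average across the 4 cardinal
    directions, counting a missing photo as zero."""
    DIRECTIONS = ("N", "E", "S", "W")
    per_photo = []
    for direction in DIRECTIONS:
        aspects = photo_scores.get(direction)
        if not aspects:
            per_photo.append(0.0)  # no photo for this direction -> zero
        else:
            per_photo.append(sum(aspects) / len(aspects))
    return sum(per_photo) / len(DIRECTIONS)

# Three perfect photos and no photo for the west direction:
print(spv_score({"N": [1, 1, 1, 1, 1],
                 "E": [1, 1, 1, 1, 1],
                 "S": [1, 1, 1, 1, 1]}))  # 0.75
```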