The use of personal weather station observation for improving precipitation estimation and interpolation

The number of personal weather stations (PWS) with data available online through the internet is increasing gradually in many parts of the world. The purpose of this study is to investigate the applicability of these data for the spatial interpolation of precipitation for high intensity events of different durations. Due to unknown errors and biases of the observations rainfall amounts of the PWS network are not considered directly. Instead, only their temporal order is assumed to be correct. The crucial step is to find the stations with informative measurements. This is done in two steps, first by selecting the 5 locations using time series of indicators of high precipitation amounts. The remaining stations are checked whether they fit into the spatial pattern of the other stations. Thus, it is assumed that the percentiles of the PWS network are accurate. These percentiles are then translated to precipitation amounts using the distribution functions which were interpolated using the information from German National Weather Service (DWD) data only. The suggested procedure was tested for the State of Baden-Württemberg in Germany. A detailed cross validation of the interpolation was carried out for aggregated precipitation 10 amounts of 1, 3, 6, 12 and 24 hours. For each aggregation nearly 200 intense events were evaluated. The results show that filtering the secondary observations is necessary as the interpolation error after filtering and data transformation decreases significantly. The biggest improvement is achieved for the shortest time aggregations.


Study Area and Data
The federal state of Baden-Württemberg is located in South-West Germany and has an area of approximately 36,000 km 2 . The 60 annual precipitation varies between 600 and 2100 mm (Deutscher Wetterdienst, 2020), and the highest amounts are recorded in the higher elevations of the mountain ranges of the Black Forest. The rain gauge network of the German Weather Service (DWD) in Baden-Württemberg (referred to as primary network from here on) currently comprises 111 stations for the study period with high temporal resolution data (Fig. 1). The gauges used in this network are typically weighing gauges. In order to asses the spatial variability within a dense network of primary gauges, the precipitation data from the municipality secondary network varies over time. The time from 2015 to 2019 was considered for this study. At the end of the time period over 3000 stations from the secondary network were available. One can see that many stations have less than one year of observations. The proposed methodology cannot accommodate these stations, but in the future it is likely that a large portion of them can be considered.  It is assumed that the secondary stations may have individual measurement problems, (e.g. incorrect placement, lack of and/or wrong maintenance, data transmission problems) and due to their large number there is no possibility to check their proper placing and functioning directly. Furthermore, at many locations (especially in urban areas) there is no possibility to set up the rain gauges so that they fulfil the WMO standards. Therefore, the first goal is to filter out stations which deliver data contradicting the observations of the primary network which meet the WMO standards. Two filters are applied -the first one 90 compares the secondary time series with the closest primary series with the focus on intense precipitation. The second filter is designed to remove individual contradicting observations using a spatial comparison.

High intensity indicator based filtering
This relationship is independent of a possible station bias and is only important for high intensities, since for most hydrological applications low precipitation values play a minor role. A secondary station is useful if this relationship holds. Unfortunately 95 4 https://doi.org/10.5194/hess-2020-42 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License.
the assumption can only be checked for selected test locations. Since is not intended to use the data from the secondary stations directly, their temporal ranks which are considered as indicator series of intense precipitation are used for this purpose instead.
Observations from the primary and secondary network are available at short time steps and can be be aggregated to different ∆t durations. The usefulness of the secondary data is investigated for different time aggregations. Z ∆t (x, t) is the (partly unknown) precipitation at location x and time t integrated over the time interval ∆t. It is assumed that this precipitation 100 is measured by primary network at locations {x 1 , . . . , x N }. The measurements of the secondary network are indicated as Y ∆t (y j , t) at locations {y 1 , . . . , y M }. Note that Y is considered to be a random field, and thus methods like Co-Kriging or Kriging with an external drift are not applicable.
In order to identify stations which are likely to deliver reasonable data for high intensities, indicator correlations are used.
For a selected variable U = Z or U = Y and probability α the indicator series For any two locations corresponding to the primary network x i and x j and any α and ∆t the correlation (in time) of the indicator series is ρ Z,α,∆t (x i , x j ) and provides an information on how precipitation series vary in space. This indicator correlation usually decreases with increasing separation distance. This decrease is not at the same rate everywhere and not the same for different thresholds and aggregations. For the secondary network indicator correlations ρ Z,Y,α,∆t (x i , y j ) with the 110 series in the primary network can be calculated. This can then be compared to the indicator correlations of the primary network.
The sample size has a big influence on the variance of the indicator correlations. Therefore, to take into account the limited interval of availability of the secondary observations, indicator correlations of the primary network corresponding to the same periods for which the secondary variable is available are used for the comparison. This is done individually for each secondary site. A secondary station is flagged as suspicious if its indicator correlations with the nearest primary network points are below 115 the lowest indicator correlation corresponding to the primary network for the same time steps and at the same separation distance. This means if: then the secondary station shows weaker association to the primary than what one would expect from primary observations.
In this case it is reasonable to discard the measured time series corresponding to the secondary network at location y i . This 120 procedure can be repeated for a set of selected α values. High α-s (dependent on the aggregation interval ∆t are preferred as the goal is to improve precipitation estimation for strong precipitation events.

Precipitation amount estimation for secondary observations
After the selection of the potentially useful secondary stations the next step is to correct their observations. The distribution function of the measured precipitation values at locations x i of the primary and at locations y j of the secondary network 125 are denoted as F xi,∆t (z) and G yj ,∆t (z) respectively. The basic assumption for the suggested approach is that the measured precipitation data from the secondary network may be biased in their values but are good in their order (at least for high intensities). This means that if at times t 1 and t 2 : This means that the measured precipitation amount from the secondary network is likely to have an unknown bias, but the 130 order of values at a location is preserved. This assumption is likely to be reasonable for high precipitation intensities. Thus, the percentile of the precipitation observed at a given time at a secondary location can be used for the estimation of precipitation amounts. Since this is a percentile and not a precipitation amount it has to be converted to a precipitation amount for further use.
This can be done using the distribution function of precipitation amounts corresponding to the location y j and the aggregation ∆t. As the secondary observation could be biased their distribution G yj ,∆t cannot be used for this purpose. Thus, one needs 135 an unbiased estimation of the local distribution functions.
Distribution functions based on long observation series are available for the locations of the primary network. For locations of the secondary network they have to be estimated via interpolation. This can be done by using different geostatistical methods.
A method for interpolating distribution functions for short aggregation times is presented in Mosthaf and Bardossy (2017).
Another possibility is to interpolate the quantiles corresponding to selected non-percentiles or interpolating percentiles for 140 selected precipitation amounts. Another alternative to estimate distribution functions corresponding to arbitrary locations is to use functional kriging (Giraldo et al., 2011) to interpolate the distribution functions directly. The advantage of interpolating distribution functions is that they are strongly related to geographical locations of the selected location and to topography.
These variables are available in high spatial resolution for the whole investigation domain. Additionally, observations from different time periods and time aggregations can also be taken into account as co-variates.

145
In this paper Ordinary Kriging (OK) is used for the interpolation of the quantiles and for the percentiles to construct the distribution functions both for the locations of the secondary observations and for the whole interpolation grid. For a given aggregation ∆t, time t and target secondary location y j the observed percentile of precipitation is: For the observations of the primary network the quantiles of the precipitation distribution at the primary stations are selected.

150
The distributions at the primary stations are based on the same time steps as those which have valid observations at the target secondary station. This way a possible bias due to the short observation period at the secondary location can be avoided. The quantiles are: These quantiles are interpolated using OK to obtain an estimate of the precipitation at the target location.
Here the λ i -s are the weights calculated using the Kriging equations. Note that the precipitation amount at the target location is obtained via interpolation, but the interpolation is not using the primary observations corresponding to the same time, but instead is using the quantiles corresponding to the percentile of the target secondary station observation. Thus these values may exceed all values observed at the primary stations at time t. Note that this correction of the secondary observations is 160 non-linear. This procedure is used for all locations which were accepted after application of the temporal filter.

Event based spatial filtering
While some stations may work properly in general, due to unforeseen events (such as battery failure or transmission errors) at certain times they may deliver individual false values. In order to filter out these errors a simple geostatistical outlier detection method is used as described in Bárdossy and Kundzewicz (1990). For a given aggregation ∆t, time t and target secondary 165 location y j the precipitation amount is estimated via OK using the observations of aggregation ∆t at time t of primary stations.
This value is denoted as Z * ∆t (y j , t). If the precipitation amount at the secondary station estimated using (6) differs very much from Z * ∆t (y j , t), the secondary location is discarded for the interpolation. As limit for the difference 3 times the Kriging standard deviation was selected. Formally: 1. Unbiased : 3. Uncorrelated with the parameter value: For the primary network we assume that ε(x i , t) = 0.
The interpolation is based on the observations For any location x To minimize the estimation variance an equation system similar to the OK system has to be solved, namely: Note that OK is a special case of this procedure with the additional assumption ε(y j , t) = 0. This system leads to an increase of the weights for the primary and a decrease of the weights for the secondary network. For each time step and percentile the variances of the random error terms ε(y i , t) is estimated from the interpolation error of the distribution functions. This interpolation method is referred to as Kriging using uncertain data (KU).

4 Application and Results
The section describing the application of the methodology is divided into three parts. First the rationale of the assumptions is investigated. As a second step the methodology is applied on a large number of intense precipitation events on different time aggregations using a cross validation approach. This allows an objective judgment of the applicability of the results. Finally the results of the interpolation on a regular grid are shown and compared. was possible. Table 1 shows statistics of the three devices compared to those of the reference station. The table shows that the secondary stations overestimated precipitation amounts by about 20 %. Furthermore, on can observe that the deviation between the reference and the Netatmo gauge are not linear, hence a data correction of the secondary gauges using a linear scaling factor is not sufficient. Figure 3 shows scatter plots of hourly rainfall data and the corresponding percentiles from the three Netatmo 220 gauges and a reference station. Figure 3 shows that for high percentiles their occurrence is the same for the primary and the secondary devices. Although this is only one example with a relatively short time period it does support our assumption that the quantiles between primary and secondary stations are similar for higher precipitation intensities. However, one secondary device (N10) delivered data which deviates substantially from the other measurements. This was caused by an interrupted connection between the rain 225 sensor and the base station. In this case, the total sum of precipitation over a longer time period was transferred at once (i.e. in one single measurement interval) when the connection was established again. This leads to an extreme outlier which falsifies the results. The first filtering procedure can identify such problems effectively.
The secondary measurement devices can lead to very different biases depending on where and how they are installed. This can be seen comparing the distribution functions of hourly precipitation accumulations corresponding to a set of very close 230 primary stations with those of the secondary stations in the same area. Figure 4 shows the distribution functions of three primary and four secondary stations in the city of Reutlingen. While the distribution functions of the primary network are nearly 9 https://doi.org/10.5194/hess-2020-42 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License.  The second filter was applied for each event individually. The number of removed measurements was below 5 %. The secondary filter did not play an important role in the procedure.

Cross validation results
As there is no ground truth available the quality of the procedure had to be tested by comparing omitted observations and their estimates obtained after the application of the method.

255
The cross validation was carried out for a set of different time aggregations ∆t and a set of selected events. Only times with intense precipitation were selected, as for low intensity cases the interpolation based on the primary network is sufficiently accurate. Table 2 shows some characteristics of the selected events. For short time periods nearly all events were from the summer season, while for longer aggregation the number of winter season events increased, but their portion remained below 30 %. Note the high portion of zeros for all aggregations.

260
The improvement obtained through the use of secondary data is demonstrated using a cross validation procedure. The primary network is randomly split into 10 subsets of 10 or 11 stations each. The data of each of these subsets was removed and subsequently interpolated using two different configurations of the data used, namely a) only other primary network stations (Reference 1) and b) using the other primary and the secondary network stations (Reference 2). For the latter case, the interpolations were carried out using the primary station data and the following configurations: There is no improvement if no filter is applied -except a very slight improvement for 1 hour durations. This is mainly due to the better identification of the wet and dry areas. The use of the filters (and the subsequent transformation of the precipitation values) leads to an improvement of the estimation -the temporal filter being the most important. The spatial filter further improves the correlation while the additional consideration of the uncertainty of the corrected values at the secondary network 280 resulted in a marginal improvement. As the secondary stations are not uniformly distributed over the investigated domain the gain of using them is also not uniform. Highest improvements were achieved in and near urban areas with a high density of secondary stations, less improvement was achieved in forested areas with few secondary stations.
The measured and interpolated results were also compared for each event in space and (r) and (ρ) the observed and the interpolated spatial patterns were calculated as well. Table 4 shows the results for the different configurations C1 to C4 used 285 for the interpolation. The use of secondary stations leads to a frequent improvement of the spatial interpolation even in the unfiltered case. The reason for this is that the spatial pattern is reasonably well captured by the secondary network. With increasing time aggregation the improvement disappears as the role of the bias increases. As in the case of the temporal evaluation the first filter (and the subsequent transformation of the precipitation values) leads to the highest improvement. The spatial filter plays a marginal 290 role, and the consideration of the uncertainty leads to a slight reduction of the quality of the spatial pattern. The improvement is smaller for higher temporal aggregations.
Finally all results were compared in both space and time. Here the root mean squared error (RMSE) was calculated for all events and control stations. Table 5 shows the results for the different configurations used for the interpolation. The improvement is high for each aggregation. The temporal filter is important to improve interpolation quality. The spatial 295 filter and the consideration of the uncertainty of the secondary stations are of minor importance. The improvement is the largest for the shortest aggregation (1 hour) where the RMSE decreased by 20 % and the smallest for the 24 hours aggregation with an improvement of 4 %. Decreasing spatial variability and increasing regularity with increasing time aggregation is the reason for these differences.

300
As the cross validation results were showing improvements, the data transformations and subsequent interpolations were carried out for all selected events. As an illustration four case studies are shown and discussed here.
The first example (Fig. 6)  The panels in the bottom row of Figure 6 show d) the difference between b) and a), and e) the difference between b) and c).

310
The three images a) to c) are similar in their rough structure, but there are important differences in the details. The interpolation using the primary network leads to a relatively smooth surface. The unfiltered secondary station based interpolation is highly variable and shows distinct patterns such as small dry and wet areas. The combination after filtering and transformation is more detailed than the primary interpolation, and in some regions these differences are high. The map of the difference between the primary and the secondary station based interpolation (Fig. 6 e)) shows large regions of underestimation and overestimation by 315 the secondary network. The differences between the primary and the filtered interpolations using transformed secondary data in panel d) is much smaller but in some regions the differences are still quite large, e.g. in the north-eastern part of the study area. In both cases, negative and positive differences occur. Note that for this data the cross validation based on the primary observations showed an improvement of r from 0.36 to 0.77, of ρ from 0.55 to 0.76 and a reduction of the RMSE from 12.5 to 8.2. and c) respectively.
These pictures show a similar behaviour to those obtained for June 11 (Fig. 6).   Another interesting 24 hour event which was recorded on July 28, 2019 is shown in figure 9. The map based on the raw secondary data in panel c) shows very scattered intense rainfall. The combination of the primary and secondary observations changes the structure and the connectivity of these area with intense precipitation. The cross validation for this event showed an improvement of r from 0.32 to 0.75, of ρ from 0.42 to 0.77 and a reduction of the RMSE from 14.77 to 10.21.
340 Figure 9. Interpolated precipitation for the time period for a 24h event from 0:00 to 24:00 on July 28, 2019 (upper panel) and the differences between primary and combination and primary and secondary data based interpolations. Panel a) shows the result after applying the filtering, b) the interpolation from the primary network and c) the one from the secondary network. Panels d) and e) depict the differences between b) and a) and b) and c) respectively.
The results of the filtering algorithm for the other events show a similar behaviour. The differences between primary and combined interpolation can be both positive and negative for all temporal aggregations. In general, the secondary network provides more spatial details, which could be very important for hydrological modelling of meso-scale catchments.
In this study an approach was presented on how data from PWS can be used to improve precipitation interpolation. The 345 fundamental problem hereby is that the precipitation data from PWS are prone to various errors. An individual QC is either time consuming if a larger number of PWS is used or relies on other data sources as reference, such as precipitation estimates from weather radars which have an appropriate spatial and temporal resolution (de Vos et al., 2019). The approach presented in this study based on a combination of a reliable but spatially sparse primary network and a secondary network with numerous but also potentially biased and/or faulty observations. For all temporal resolutions, using the unfiltered secondary network 350 data substantially increased the RMSE values. Hence, a direct application of the raw secondary data leads to a deterioration of the interpolation quality. Therefore, a filtering of data from the secondary network is essential. The applied filters in this study may be conservative by rejecting more stations than absolutely needed, but this is important in order to obtain robust results. The length of times series from the current secondary network will increase and subsequently more observations which were currently discarded due to the uncertainty caused by the short time series may also become useful. Furthermore, it can 355 also be expected that the number of secondary stations will continue to increase, thus one can expect further improvements of the quality of precipitation maps on all temporal aggregations. A comparison of the spatial characteristics of the time series of primary and secondary stations can be used to filter out stations with unreliable data. Observed precipitation values at the remaining secondary stations can be transformed to become unbiased using the observed percentiles and the distributions at the primary stations as shown in Appendix A. This transformation does not require an independent ground truth of best 360 estimation of precipitation at the secondary locations. A second spatial filter can be applied to find occasional faulty values at the used secondary stations. The cross validation results of a large number of different intense precipitation events show that with the presently available secondary stations after application of the two filters and the data transformation one can improve interpolation quality significantly. The improvement is the biggest for hourly time aggregations with a reduction of the RMSE by 20 % , while for daily values the improvement is around 4 %. The spatial precipitation patterns are improved after 365 corrections with the help of secondary network observations, especially for the short time scales. In particular, the spatial extent of precipitation fields are modified by the rainfall/no-rainfall information from the dense secondary network data. Finally, we want to highlight the differences of the approach used in this study compared to precipitation estimation using weather radar, since this type is often used when rainfall fields with a high temporal and spatial resolution are required.
-Secondary stations measure precipitation on the ground whereas radar measures reflectivity in higher elevations. There-370 fore, rain measured by radar may be advected by wind.
-Secondary stations measure point precipitation, radar measures spatial aggregations over large volumes.
-Radar measurements have problems with attenuation, secondary stations do not.
-Radar resolution is relatively uniform, secondary stations form an irregular network.
These differences are not listed here to compete between the two forms of additional information, but to point out that their 375 different behaviour may be used for an effective combination. The method presented here requires a relatively dense primary network. The use of secondary stations in regions with sparse reliable networks seems to be also possible but will require further research.
As precipitation uncertainty is possibly the most important factor for the uncertainty in rainfall/runoff modelling the improvement of precipitation interpolation achieved by this paper may contribute to a reduction of the uncertainty of hydrological 380 modelling. Furthermore, the real time availability of the data of secondary networks may help to improve the quality of flood forecasts. Moreover, this study can help to improve gridded precipitation products for shorter time scales. Other procedures for the efficient use of secondary data may also be considered. Specifically, the interpolation of precipitation amounts using Quantile Kriging (Lebrenz and Bárdossy, 2019) may lead to better results. Due to the large number of zeros occurring for short aggregation intervals however, this procedure has to be modified, for example by combining it with the approach developed by 385 Bárdossy (2011). Traditional geostatistical interpolation methods use values measured at the same time interval only. However, for shorter temporal aggregations where advection can play a role, values measured at previous time steps may also be relevant.
The shorter the time aggregation ∆t, the more important the temporal aspect becomes. This aspect is not treated here in detail and requires additional research.
Data availability. The precipitation data was obtained from the Climate Data Center of the German Weather Service (https://opendata.