Precipitation accumulation analysis – assimilation of radar-gauge measurements and validation of different methods

We investigate the appropriateness of four different methods to produce precipitation accumulation fields using radar data alone or combined with precipitation gauge data. These methods were validated for high-latitude weather conditions of Finland. The reference method uses radar reflectivity only, while three assimilation methods are used to blend radar and surface observations together, namely the linear analysis regression, the Barnes objective analysis and a new method based on a combination of the regression and Barnes techniques (RandB). The Local Analysis and Prediction System (LAPS) is used as a platform to calculate the four different hourly accumulation products over a 6-month period covering summer 2011. The performance of each method is verified against both dependent and independent observations (i.e. observations that are or are not included, respectively, into the precipitation accumulation analysis). The newly developed RandB method performs best according to our results. Applying the regression or Barnes assimilation analysis separately still yields better results for the accumulation products compared to precipitation accumulation derived from radar data alone.


Introduction
The concept of precipitation accumulation is of great importance for various applications in meteorology and hydrology. Climate projections under possible climate change scenarios point to a likely higher frequency of storms, with intensified precipitation over Europe. This will most probably have a significant effect on the surface water balance, and therefore a large impact on society and its economy. Hydrological models, which are based on analyzed precipitation accumulation, need highly accurate estimates of the precipitated water amount in order to issue warnings, e.g. for sudden flooding. Fire and weather warnings are another example of products where end-users require high-quality precipitation accumulation data during the summer period.
Radar-derived precipitation products are generated at high spatial resolution but embed measurement uncertainties. On the other hand, surface precipitation observations, such as standard gauge observations and road-weather measurements, usually have higher accuracy and are essential for correcting radar-based precipitation accumulation fields, but have limited spatial representativeness. The literature provides many studies on the benefits one can gain from combining radar measurements and surface observations to derive the final accumulated precipitation product (Goudenhoofdt and Delobbe, 2009). Radar reflectivity generates a good first guess for the accumulated precipitation, with the advantage of high spatial resolution, though there are certain inherent inaccuracies when deriving this product from radars (Koistinen and Michelson, 2002). Measurements of precipitation at ground level are performed at point locations and the errors associated with the observations are well characterized (Steiner et al., 1999). Different, more or less sophisticated assimilation methods exist, whereby surface point observations are blended together with radar data in order to establish a corrected precipitation accumulation, e.g. co-kriging (Sun et al., 2000), the statistical objective analysis method (Pereira et al., 1998), the combined bias-adjustment method (Overeem et al., 2009) and bias adjustments using the Kalman filter (Chumchean et al., 2006; Anagnostou and Krajewski, 1999). A summary of the methods and operational usage in different countries is compiled in the COST-717 report (Gjertsen et al., 2003). Problems linked to radar-gauge bias correction methods have been discussed in, e.g. Seo and Breidenbach (2002).
Published by Copernicus Publications on behalf of the European Geosciences Union.
In this study, we use the Local Analysis and Prediction System, LAPS (McGinley et al., 1991, 1992), as a platform for testing and validating four different precipitation accumulation analyses: radar only (hereafter LAPS_radar) and three assimilation methods, namely the linear analysis regression, the Barnes objective analysis and a combination of those two methods (hereafter Regression, Barnes and RandB, respectively). RandB is a new method, while the three others are more widely used. Geostatistical methods have shown good results in other studies for daily accumulation sums (e.g. Goudenhoofdt and Delobbe, 2009). However, they are sensitive to network density, and the density of stations measuring hourly precipitation in Finland is very low. Therefore, in this paper we concentrate on further development of methods already used in LAPS, such as Regression and Barnes. LAPS is applicable for operational usage (Albers et al., 1996; Amy 2003), which is of critical interest for end-users who demand products as close to real time as possible.
According to the classic Köppen classification, the climate of southern coastal Finland belongs to class Dfb and the rest of the country to Dfc, i.e. a cool and moist continental, subarctic climate of cold and snowy winters and precipitation throughout the year. Summer is warm, not hot, and in the north it is also short (Jylhä et al., 2010). The only mountains are in northern Finland and do not exceed 1350 m, while Finland is bordered on two sides by two gulfs of the Baltic Sea (Gulf of Finland and Bothnian Bay).
The aim of this article is to test and validate our new RandB method against three conventional methods for typical high-latitude summer weather conditions encountered in Finland (extending between 60 and 70° N) and to provide some guidance in the use of these methods. Section 2 introduces the LAPS model (Sect. 2.1), the radar data (Sect. 2.2) and the gauge network data (Sect. 2.3). The different analysis methods for estimating precipitation accumulation are introduced in Sect. 3. The results are presented and analysed in Sect. 4, while Sect. 5 provides some conclusions and outlooks.

Methods and material
We describe here the model and data used to determine the gridded background fields involved in the estimation of the precipitation accumulation.

The Local Analysis and Prediction System (LAPS)
The Finnish Meteorological Institute (FMI) operates the Local Analysis and Prediction System (LAPS) for production of 3-D analysis fields of different weather parameters (Albers et al., 1996). LAPS uses a data fusion method, in which a high-resolution spatial analysis, using statistical methods, is performed on top of the coarser resolution background fields. Observations are fitted to the coarser first-guess analysis mainly by the successive correction method, while high-resolution topographical data sets are taken into account when creating the final high-resolution analysis fields. Those analysis products are mainly used for nowcasting purposes, i.e. what is currently happening and what will happen in the next few hours.
The coarser background first-guess field is the latest available forecast from the European Centre for Medium-Range Weather Forecasts (ECMWF) model, with a current horizontal grid spacing of approximately 16 km (ECMWF, 2011). The following ECMWF parameters are used at 16 vertical pressure levels: vertical velocity, specific humidity, temperature, geopotential, vectorized winds, surface geopotential, surface pressure, pressure at mean sea level, 2 m temperature and dew-point temperature, vectorized wind at 10 m, sea surface temperature, skin temperature and land-sea mask.
The FMI LAPS setup uses a pressure coordinate system, including 44 vertical levels distributed with a higher resolution (e.g. 10 hPa) at lower altitudes and decreasing with height. The horizontal resolution is 3 km and the domain used in this article covers the entirety of Finland and some parts of the neighbouring countries (see Fig. 1a).
The fine-scale structures in the resulting 3-D analysis are extracted from the observations. Therefore, LAPS relies heavily on the existence of an observational network with high spatial and temporal resolution, and especially on remote sensing data. At present, the LAPS suite implemented at FMI is able to process several types of in situ and remotely sensed observations such as radar reflectivity, weighing gauges, road-weather observations, radar radial winds, soundings, Synop, Metar, air traffic observations, lidars and Meteosat-9 satellite data. The first three of these measurements are used for calculating the precipitation accumulation within LAPS. The Finnish radar volume scans are read into LAPS as NetCDF files; thereafter the data are remapped to the LAPS internal Cartesian grid and the mosaic process combines data from the different radar stations (Albers et al., 1996). In LAPS the rain rates are calculated from the lowest levels of the LAPS 3-D radar mosaic data via the standard Z-R relation (Marshall and Palmer, 1948). The rain rate is then used for precipitation accumulation calculations, either as radar-only accumulation (see Sect. 3.1) or merged with gauge observations (see Sects. 3.2, 3.3 and 3.4).

The radar network
FMI operates eight C-band Doppler radars, which cover nearly the whole country. In southern Finland, the distance between radars is 140-200 km and measurements are made in bins that are 500 m long and 1° wide, up to 250 km in range. Thus, data from two or three radars are available over most of the study area. The location of the radars and their coverage is shown in Fig. 1a. As Finland has no high mountains, the horizon of all the radars is near 0° elevation with no major beam blockage, and, in general, the radar coverage is excellent up to 68° N latitude.
The effective radar reflectivity factor $Z_e$ (usually called reflectivity) is derived from the expression in Eq. (1), where $P_r$ is the average received microwave power, $r$ is the measurement range, $L$ is the two-way attenuation in the propagation path (antenna-scatterers-antenna), $C$ is a radar constant (including parameters of the radar hardware) and $K$ is the dielectric factor (depending on the relative fraction of ice and water in the hydrometeors). The reflectivity uses dBZ as a unit, which is expressed as
$$\mathrm{dBZ} = 10 \log_{10} Z_e. \qquad (2)$$
The uncertainty factors affecting radar reflectivity are electronic miscalibration, beam blocking, and attenuation due to both precipitation (Battan, 1973) and a wet radome (Germann, 1999). At mid-latitudes, the main source of uncertainty of radar-based rainfall estimates is the vertical profile of reflectivity (VPR), which causes a range-dependent error (Zawadzki, 1984). At large distances, the radar probes the upper parts of the cloud, where reflectivity is weaker. In FMI's general radar processing chain, this is compensated with the VPR correction, which also compensates for overestimation in a melting layer when appropriate (Koistinen et al., 2003). The radar ingest to the LAPS system used in this study processes original 3-D volumes and therefore no VPR correction is needed. Before the radar volume data are ingested into LAPS, clutter is removed with Doppler filtering and any residual clutter with a post-processing procedure based on fuzzy logic (Peura, 2002).
The output of a weather radar is reflectivity, Z, which depends on the sum of the sixth powers of the drop diameters. When converting reflectivity to precipitation intensity, one has to assume the size of the measured drops. The real drop size distribution is highly variable depending on the type of precipitation, but because it is usually unknown, a default drop size distribution is used (Battan, 1973). This leads to errors when the drop sizes differ from average values. It has been noted, both in the literature and in our experiments, that during small-droplet precipitation (drizzle), the gauges usually give larger values than the radar, with a gauge-to-radar ratio often exceeding 30. In contrast, in large-drop situations, typically related to heavy precipitation cases (rain showers with embedded cumulonimbus clouds), the observed gauge-to-radar ratio often falls below 0.25. This discrepancy is related to the use of the standard Z-R relation (Marshall and Palmer, 1948) for all liquid precipitation cases, even though we know that drop size distributions vary from one precipitation case to another. Another well-known factor causing differences between gauge and radar measurements is the radar beam overshooting in shallow drizzle events. These circumstances could have a substantial impact on the analysis and therefore the gauge-to-radar ratio has to be controlled carefully (see Sects. 3.2 and 3.3).
When comparing radars and gauges, an additional challenge arises from the different sampling sizes of the instruments. The radar measurement volume can be several kilometres wide and thick (a one-degree beam is ca. 4 km wide at a distance of 250 km from the antenna), while the measurement area of a gauge is 400 cm² (weighing gauges) or 100 cm³ (optical instruments). The measurements in the FMI network have been designed to use the radar composite in a Cartesian grid of 1 km × 1 km. Details of the FMI radar network and processing routines are described in Saltikoff et al. (2010).
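As a rough check of the beam width quoted above, the cross-beam extent can be approximated by the arc length subtended by the beam at a given range. The sketch below is illustrative only: the function name is hypothetical, and beam broadening by refraction or antenna side lobes is ignored.

```python
import math


def beam_width_km(range_km: float, beamwidth_deg: float = 1.0) -> float:
    """Approximate cross-beam width as the arc length at the given range."""
    return range_km * math.radians(beamwidth_deg)


# Width of a 1-degree beam at the maximum measurement range of 250 km.
width_at_250_km = beam_width_km(250.0)
print(f"{width_at_250_km:.1f} km")
```

With these assumptions the width at 250 km comes out at about 4.4 km, which agrees with the figure of ca. 4 km given in the text.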
In this study, the radar data were used as volume measurements, repeated every 5 min and consisting of 5 elevation angles, typically between 0.4 and 45°. LAPS processes the radar data directly onto its own gridded coordinate system, which has a resolution of 3 km × 3 km.

Surface observations
For this study, a total of 447 rain gauges, both weighing gauges and optical sensors, provide detailed point information, which is used to correct the radar first-guess field (introduced in Sect. 2.2). The verification period ranges from 11 April to 14 October 2011, i.e. by and large the non-winter season (no snow-phase precipitation).
The surface precipitation observations are from standard weighing gauges and optical sensors mounted on road-weather masts. Weighing gauges are subject to different sources of random errors such as mechanical malfunction, wind drift (Hanna, 1995) and icing, which all affect the accuracy of measurements. FMI manages 77 stations instrumented with the Vaisala VRG101 weighing gauge. Measurements with this instrument have high cumulative accuracy (0.2 mm) provided that the precipitation event exceeds 0.5 mm. Depending on the station, the gauges measure the accumulated precipitation in intervals of 10 to 60 min. Summing these measurements over a 60 min period yields 1 h accumulation data.
The Finnish Transport Agency (FTA) runs 370 road-weather stations with optical sensor measurements (Vaisala Present Weather Detectors, models PWD11 and PWD22), which have a precipitation detection sensitivity of 0.05 mm h⁻¹ or less, within 10 min. The precipitation intensity is measured in intervals ranging between 10 s and 5 min and finally summed up to 1 h precipitation accumulation. A performance study comparing the PWD22 sensor and VRG weighing gauges against Geonor weighing gauges has been done by Wong (2012). The study shows that the PWD22 has a larger negative mean error (underestimation) and a standard error more than four times larger than that of the VRG. The Finnish road-weather station sites have not been selected for best meteorological quality or representativeness. Hence they may have additional uncertainties connected to their location in the immediate vicinity of roads with heavy traffic, where splash effects and wind eddies generated by large vehicles occasionally affect the resulting accumulation. Such effects would be hard to quantify, and as the FTA mainly needs qualitative information on precipitation, it has not published accuracy estimates of these measurements.
Another source of uncertainty in surface accumulation observations results from the limited spatial representativeness of many stations with respect to their surroundings, due to the insufficient density of measuring stations in certain areas (Cherubini et al., 2002). Note that if measurements consistently indicate poor data quality, those stations are blacklisted within LAPS and do not contribute to the precipitation accumulation analysis. Hereafter in this article, the weighing gauges and road-weather measurements are collectively called gauges and their distribution in Finland is shown in Fig. 1b.

Description of the four analysis methods
Thanks to their high-resolution reflectivity pattern, weather radar data provide the best first guess for calculating precipitation accumulation. The radar-based accumulation is calculated in the LAPS routine with the standard Z-R relation (Marshall and Palmer, 1948). On the other hand, gauges usually measure the accumulation with higher quality and are consequently used to correct the radar field. In this study, three different assimilation methods have been tested in the LAPS routines as to their capacity to perform the best radar-gauge correction: the Regression, the Barnes and the new RandB methods. These methods use the quotient between gauge and radar (hereafter G/R) for their corrections.

LAPS_radar-based accumulation
The reflectivity Z measured by the radar is converted to precipitation intensity R (mm h⁻¹) within the LAPS accumulation process (see Sect. 2.1), using a pre-selected Z-R relation (Marshall and Palmer, 1948) of the type
$$Z = A R^{b}, \qquad (3)$$
where A and b are empirical factors describing the shape and size distribution of the hydrometeors. In FMI's implementation of LAPS we used A = 315 and b = 1.5 for liquid precipitation, which is relevant in this study carried out during the summer period. This is a gross simplification since the drop size and particle shapes vary according to the weather situation (drizzle/convective, wet snow/snow grain), as described in Sect. 2.2. Problematic situations include both convective showers with heavy rainfall and the opposite case of drizzle with little precipitation. Although such situations contribute only a fraction of the annual precipitation amount, they might be important during, e.g. flooding events. On the other hand, the same factors have been used for many years in FMI's other operational radar products, and looking at long-term averages, the radar accumulation data match the gauge accumulation values within reasonable accuracy (Aaltonen et al., 2008). After correcting for the vertical profile of reflectivity (Sect. 2.2), random errors of 2-3 dB remain, mainly due to major sampling differences between the two sensors; this is a typical, reasonably accurate figure in operational radar measurements (Koistinen et al., 2003; Collier, 1986).
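To illustrate the conversion described above, the following minimal sketch inverts a power-law Z-R relation, assuming the form Z = A R^b with Z in linear units (mm⁶ m⁻³) obtained from dBZ. The function name is hypothetical and this is not the LAPS implementation; only the coefficients A = 315 and b = 1.5 are taken from the text.

```python
import math


def dbz_to_rain_rate(dbz: float, a: float = 315.0, b: float = 1.5) -> float:
    """Convert reflectivity in dBZ to rain rate in mm/h, assuming Z = a * R**b."""
    z_linear = 10.0 ** (dbz / 10.0)       # dBZ -> linear reflectivity
    return (z_linear / a) ** (1.0 / b)    # invert the power law for R


# Reflectivity at which the assumed relation gives exactly 1 mm/h.
dbz_for_unit_rate = 10.0 * math.log10(315.0)

for dbz in (10.0, dbz_for_unit_rate, 40.0):
    print(f"{dbz:5.1f} dBZ -> {dbz_to_rain_rate(dbz):6.2f} mm/h")
```

Because the coefficients are fixed, the same reflectivity always maps to the same rain rate, which is the simplification discussed above for drizzle and strongly convective situations.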
In LAPS the intensity field (R in Eq. 3) is calculated every 5 min and the 1 h accumulation is thereafter obtained by summing up over the 5 min intervals.
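The summation can be sketched as follows. This is a simplified illustration under the assumption that each 5 min intensity (in mm h⁻¹) is treated as constant over its interval, so that it contributes intensity × 5/60 mm to the hourly sum; the function name is hypothetical and the actual LAPS routine may weight the intervals differently.

```python
def hourly_accumulation(rates_mm_per_h, step_min: float = 5.0) -> float:
    """Sum interval intensities (mm/h) into an accumulation (mm).

    Assumes each intensity is constant over its step_min-long interval.
    """
    return sum(rate * step_min / 60.0 for rate in rates_mm_per_h)


# Twelve 5 min intensity values covering one hour at a single grid point.
rates = [0.0, 0.5, 2.0, 6.0, 12.0, 8.0, 3.0, 1.0, 0.5, 0.0, 0.0, 0.0]
print(f"{hourly_accumulation(rates):.2f} mm")
```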

The linear regression analysis method
As described above, in addition to sampling differences, accumulation estimates based only on radar data can differ from gauge observation values either due to radar errors (see Sect. 2.2) or problems with the gauges (Sect. 2.3). This is why various statistical methods have been used to address and reduce these differences; for example, a model using a regression method is described in Sokol (2003). In the linear regression analysis method (hereafter Regression method) used in this article, as a first step, the gauge-radar pairs from a given grid point undergo a quality check to exclude dubious differences between gauge and radar values. The aim is to avoid comparisons involving uncertain radar measurements and spurious surface observations. The selection is performed by discarding gauge-radar pairs exceeding specific thresholds based on the G/R quotient. The thresholds are based on approximately 2 times the standard deviation, STDEV (R/G), from the LAPS_radar dependent data set (see Table 1). The thresholds used in the Regression method within the LAPS routine are as follows:
- if G/R > 2.0 then the gauge-radar pair is discarded;
- if G/R < 0.5 then the gauge-radar pair is discarded.
The first threshold handles surface observations that are suspected to be false. The second criterion attempts to avoid cases where the radar gives too high a reflectivity, for example in strong convective precipitation (including hail). Once these criteria are enforced, the remaining data form a data set of representative gauge-radar pairs from which a linear regression can be established, calculated with the least squares method, which minimizes the errors between the measurement pairs. The outcome is values for k and c in the linear equation
$$Y = kX + c. \qquad (4)$$
The next step is to calculate the newly corrected radar estimate using Eq. (4). Here, Y is the corrected radar estimate, X is the first-guess accumulation from radar, and k (the slope) and c (the intercept with the y axis) are the regression coefficients derived from the regression analysis.
The Regression method has the limitation of requiring a large number of valid gauge-radar pairs in order to carry out the least squares calculation and thereby create a sufficient linear fit between the gauge network and radar observations. If there are not enough valid pairs, or if the criteria for a linear dependency are not fulfilled, then the Regression method will not be used and the analysis will fall back to the original LAPS_radar-based initial precipitation accumulation field. The behaviour of the linear curve has to be constrained since the shape of the curve is strongly influenced by the number of gauge-radar pairs. Criteria for this have been set so as to constrain k values between 0.2 and 5.0, and c values between −5 and +5 mm, in Eq. (4). These constraints are based on average vertical profile adjustments of reflectivity and relate to ranges of up to 200 km from the radar station during the summer period (Koistinen et al., 2003). The linear function is applied to the whole radar accumulation field, i.e. it corresponds to a regional-scale correction.
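The steps of the Regression method can be sketched as follows. This is a minimal illustration, not the LAPS code: the function names and the minimum number of pairs (`min_pairs`) are hypothetical, and it is assumed here that coefficients outside the allowed ranges lead to a fallback to the radar field (the text does not state whether they are rejected or clipped). Clipping negative corrected values to zero is likewise an assumption.

```python
def fit_regression(gauge, radar, min_pairs=10):
    """Fit Y = k*X + c to quality-checked gauge-radar pairs.

    Returns (k, c), or None when the fallback to radar-only applies.
    min_pairs is a hypothetical threshold, not a value from the paper.
    """
    # Quality check: discard pairs with G/R > 2.0 or G/R < 0.5.
    pairs = [(r, g) for g, r in zip(gauge, radar)
             if r > 0.0 and 0.5 <= g / r <= 2.0]
    n = len(pairs)
    if n < min_pairs:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0.0:
        return None  # all radar values identical: no slope can be fitted
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    k = sxy / sxx                 # least squares slope
    c = mean_y - k * mean_x       # least squares intercept
    # Constraints on the coefficients (assumed here to trigger the fallback).
    if not (0.2 <= k <= 5.0 and -5.0 <= c <= 5.0):
        return None
    return k, c


def apply_regression(radar_field, coeffs):
    """Apply the regional-scale correction, or return the radar field as is."""
    if coeffs is None:
        return list(radar_field)
    k, c = coeffs
    # Negative accumulations are clipped to zero (an assumption).
    return [max(0.0, k * x + c) for x in radar_field]


radar_at_gauges = [float(i) for i in range(1, 13)]
gauge_values = [1.2 * r + 0.1 for r in radar_at_gauges]
coeffs = fit_regression(gauge_values, radar_at_gauges)
corrected = apply_regression([0.0, 2.0, 10.0], coeffs)
```

Returning `None` from the fit keeps the fallback explicit: the caller receives the unmodified radar field whenever the fit is not considered trustworthy.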

Barnes objective analysis method
The Barnes interpolation forces the radar field to converge towards gauge accumulation measurements, using an objective multi-pass telescoping strategy (Barnes, 1964; Hiemstra et al., 2006) in the LAPS routine. The G/R quotient is used to interpolate the first-guess radar field closer to the observation value and, in order to optimize the result, several iteration steps are performed within the Barnes analysis at successively finer scales. For grid points far from any G/R observations, the G/R field tends smoothly towards a value of 1.
E. Gregow et al.: Precipitation accumulation analysis - assimilation of radar-gauge measurements
Depending on the precipitation pattern, this method can potentially result in a highly overestimated or underestimated reflectivity field being spread to the surroundings. For example, if there is one ground station situated at the border of a convective rain shower (cumulonimbus cloud), where only light precipitation occurs, the G/R quotient would probably exceed the value of 30 in this case, as described in Sect. 3. For the station point itself, this quotient gives an adequate correction, but spreading this large quotient to the surrounding precipitation pattern could potentially give very large overestimates of the accumulation within, for example, the nearby core of a rain shower with heavy precipitation. Quality checks and thresholds have been set to avoid situations where such over- or underestimations of nearby precipitation areas are likely. If the G/R quotient gives very large (more than 30) or very small (less than 0.25) values, this might still give a signal of an adequate trend, even though the signal is overamplified. This trend has to be maintained and adapted but is given less weight in the resulting accumulation. Consequently, the chosen criteria must incorporate these aspects. The thresholds for the Barnes G/R quotient are based on approximately 2 times the standard deviation, STDEV (R/G), from the LAPS_radar dependent data set (see Table 1). The following thresholds were used:
- if 0.25 < G/R < 2.0 then allow the derived quotient;
The modified Barnes scheme allows weighting ($w_0$) with distance ($d$) from the gauge station point with respect to the radius of influence ($r$), normalized by the instrument error ($\mathrm{err}_0$), which is here set to be 1.5 in Eq. (5). The G/R increment gives the initial increment ($p_0$) at the first iteration step, and the background weight ($w_b$), set to 0.02, adjusts the output to be closer to the radar value further away from the observation point in Eq. (6).
After the first iteration step, the $p_{ij}$ output becomes the new G/R increment ($p_0$) for the next iteration step in Eq. (6). The iterations continue with successively decreasing values of $r$, by a factor of 2 for each iteration, in Eq. (5) until the observation increments have been diminished to a preset value in LAPS, in this case RMSE = 0.13 mm, or alternatively after 10 iteration steps in order to minimize the calculation time.
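The multi-pass idea can be sketched as follows. This is only an illustration under stated assumptions, not the scheme implemented in LAPS: a Gaussian distance weight exp(−d²/r²) divided by err₀² is assumed for the weighting, the analysed quantity is the G/R correction factor with a background value of 1, and the stopping tolerance is applied to the dimensionless G/R residuals rather than to accumulation in mm. All function and parameter names are hypothetical, and the input quotients are assumed to have passed the quality checks already.

```python
import math


def _weighted_increment(point, stations, residuals, radius, err0, w_b):
    """Weighted mean of station residuals, damped by the background weight."""
    num = 0.0
    den = w_b  # background weight pulls the increment towards zero far away
    for (sx, sy), resid in zip(stations, residuals):
        d2 = (point[0] - sx) ** 2 + (point[1] - sy) ** 2
        w = math.exp(-d2 / radius ** 2) / err0 ** 2  # assumed Gaussian weight
        num += w * resid
        den += w
    return num / den


def barnes_ratio_field(grid_points, stations, ratios, radius_km,
                       err0=1.5, w_b=0.02, max_passes=10, tol=0.13):
    """Analyse a G/R correction field with successive Barnes passes."""
    field = [1.0] * len(grid_points)      # background: no correction
    if not stations:
        return field
    at_stations = [1.0] * len(stations)   # analysis value at each station
    radius = radius_km
    for _ in range(max_passes):
        residuals = [q - a for q, a in zip(ratios, at_stations)]
        rmse = math.sqrt(sum(e * e for e in residuals) / len(residuals))
        if rmse < tol:
            break
        field = [f + _weighted_increment(p, stations, residuals,
                                         radius, err0, w_b)
                 for f, p in zip(field, grid_points)]
        at_stations = [a + _weighted_increment(s, stations, residuals,
                                               radius, err0, w_b)
                       for a, s in zip(at_stations, stations)]
        radius /= 2.0                     # telescope to finer scales
    return field


# One station with G/R = 1.5; grid points at the station and 1000 km away.
factors = barnes_ratio_field([(0.0, 0.0), (1000.0, 0.0)],
                             [(0.0, 0.0)], [1.5], radius_km=50.0)
```

In this sketch the corrected accumulation at a grid point would be the radar first guess multiplied by the analysed factor, so that points far from any gauge keep the radar value.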

New method, combination of Regression and Barnes methods
This new method combines the above-described Regression and Barnes analyses. First, the Regression method is used to correct the overall radar estimate, i.e. a regional-scale correction. The resulting accumulation field is thereafter used as a new first guess, initializing the Barnes analysis, which rectifies the radar field on local scales. Assuming that the new first-guess field from the Regression analysis is closer to the real precipitation accumulation, the Barnes correction method will not need to be too aggressive in its correction, thus minimizing the risk of exaggerating the surrounding precipitation with G/R quotients that are too low or too high.

Results and verification
The performance of the different methods has been verified against surface gauge observations of precipitation accumulation. The verification period spans from 11 April to 14 October 2011, so precipitation is assumed to be in the form of liquid water, and the time sampling interval is one hour. The observations have been divided into two subsets: (i) one set including observations from all stations except seven and (ii) a group of seven Synop stations (excluded from the former set) used as an objective data set for verification. Accordingly, in the calculation of the 1 h precipitation accumulation, the analysis depends on the station information from the first subset (i), hereafter called "dependent" stations, while the accumulation analysis is independent of the seven stations in the second subset (ii), hereafter called "independent" stations. As the total number of gauge stations in Finland is low compared to the number of radar pixels, and the experiment was run using the operational system (i.e. results are used in end-user applications), we could not set more stations aside without risking the quality of the end product. The seven independent stations were selected subjectively from different physiographical areas (coastline, inland, lake district), also considering their proximity to each other. On average, within a radius of 50 km from an independent station, there are 11 dependent stations and the average distance to the nearest dependent station is 9.8 km.
The statistical quantification of the validation of the different analysis methods is based on the root mean square error (RMSE, Eq. 7) and the mean absolute error (MAE, Eq. 8), calculated with these data sets. RMSE is a quadratic scoring rule, which measures the average magnitude of the error. Since the errors are squared before they are averaged, RMSE gives a relatively high weight to large errors. MAE measures the average magnitude of the errors in a set of analyses, without considering their direction. It measures the accuracy for continuous variables. MAE is a linear score, which means that all the individual differences are weighted equally in the average. MAE and RMSE can be used together to diagnose the variation in the errors in a set of analyses. RMSE will always be larger than or equal to MAE. The greater the difference between them (RMSE − MAE), the greater the variance in the individual errors in the sample (see Tables 1 and 2). If RMSE = MAE, then all the errors are of the same magnitude.
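The two scores can be computed as in the following minimal sketch, which uses the standard definitions of RMSE and MAE; the function names and the sample values are hypothetical and serve only to illustrate the properties described above.

```python
import math


def rmse(analysed, observed):
    """Root mean square error between analysed and observed values."""
    if len(analysed) != len(observed) or not analysed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - o) ** 2 for a, o in zip(analysed, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(analysed, observed):
    """Mean absolute error between analysed and observed values."""
    if len(analysed) != len(observed) or not analysed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - o) for a, o in zip(analysed, observed)) / len(analysed)


# One large error among otherwise perfect values: RMSE exceeds MAE.
uneven = ([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
# Errors of equal magnitude: RMSE equals MAE.
even = ([2.0, 3.0], [1.0, 2.0])

print(rmse(*uneven), mae(*uneven))
print(rmse(*even), mae(*even))
```

The first pair of values shows how a single large error widens the gap between RMSE and MAE, while the second pair shows the limiting case where both scores coincide.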
Results are shown as density plots with logarithmic scales, where data points less than 0.3 mm h⁻¹ are discarded in order to avoid artificial effects due to the different detection sensitivities of the different instruments (criteria applied in Figs. 2-5). In Fig. 2 we show, separately for the four different methods, the relationship between the analyzed accumulation data at the LAPS grid point closest to a gauge station and the corresponding gauge observations for the dependent stations. The correlation calculated from the data sets and the statistics of the comparisons are compiled in Table 1. It appears from these comparisons that the new RandB method yields the best agreement between analyzed precipitation accumulation and gauge observations, though the Barnes method also provides reasonable results. On the other hand, the Regression method alone is not very successful but still improves the accumulation analysis to some extent. The LAPS_radar method, which is based on radar information only, gives the poorest results in our study.
In order to investigate the error dependencies between radar and gauges, we use an indicator that describes the hydrological aspects of the errors (Szturc et al., 2011), namely, the absolute difference between observed and analyzed precipitation accumulation as a function of the magnitude of the observed value (i.e. gauge data). Figure 3 shows that the slope of the linear fit becomes smaller as one passes from the LAPS_radar to the Regression, Barnes and RandB analysis methods. This shows that the departure between analyzed and observed values decreases, and again the RandB analysis performs best of the different methods.
We next investigate the agreement between the analyzed precipitation accumulation values and observations (gauge values) for the independent stations (Table 2). Note that for independent stations, there is much less data available. The independent stations are used to verify that the methods also work in areas where no observing stations are available, and thus that no over- or underamplified accumulation patterns arise, especially from the Barnes method (see Sect. 3.3) but also from the Regression method. The density plots (Fig. 4) indicate less scatter and slightly better agreement, i.e. smaller RMSE and MAE and a higher correlation coefficient, compared to the dependent stations analysis (Fig. 2). The linearly fitted curves in Fig. 4 are strongly influenced by the small number of observation points: because the data are not normally distributed, the few high accumulation values (i.e. over 10 mm h⁻¹) have a large impact on the fitted curve. The comparison between the linearly fitted curves in Fig. 4a-d gives a clear indication of how the different methods compare to each other. We also plotted the absolute difference between analyzed precipitation accumulation and observation as a function of gauge observations for the independent stations (Fig. 5). The same trend is observed as with the dependent station data: the Barnes and RandB methods show less dependence on the observed value than the LAPS_radar and Regression methods.
In Sects. 2.2 and 3.1 we gave an explanation for the errors that are attributed to radar measurements, such as the range-dependent error and Z-R inaccuracies. These errors are related to the prevailing weather situation (e.g. thunderstorms or warm fronts) and, hence, the type of precipitating hydrometeors occurring at that time. Such influence was further investigated by dividing the different weather situations into two categories describing their air-mass stability: strong convection (hereafter convective) and light-moderate convection (hereafter non-convective), which relate to thunderstorms and warm fronts, respectively. Each category includes 10 cases of a full 24 h day, also selected from the period 11 April to 14 October 2011. The convective cases were determined by using FMI's lightning location system (Tuomi and Mäkelä, 2008) together with the FMI radar archive, while the non-convective (warm front) cases were selected from analyzed frontal passages over southern Finland as tagged by the duty forecaster at FMI.
The data set representing the convective weather situations has fewer data values than that for the warm front cases (see # values in Fig. 6). This is expected, since convective precipitation is less likely to hit a gauge and generally lasts for a shorter time, while large-scale precipitation events occurring during warm fronts have a much higher probability of passing over a gauge station and have larger temporal and spatial dimensions. The results (Fig. 6) clearly show that the convective cases give larger RMSE and MAE values than the non-convective cases. This is also expected, as convective precipitation displays more spatial heterogeneity and thus a stronger decoupling from the gauge observations. This categorisation also indicates that the RandB method performs best out of the four methods, though only slightly better than the Barnes method.

Discussions and conclusions
In this article we compare the results from four different analysis methods for calculating the hourly precipitation accumulation: LAPS_radar, Regression, Barnes and a newly developed method, RandB (a combination of Regression and Barnes). LAPS_radar serves as the reference method and, since it is based on the common Z-R formula, it is also similar to what is used at many meteorological services. LAPS_radar is further used as the first-guess field when merging gauge data into the analysis routine of the three other methods. As described in Sect. 3.2, the Regression method benefits from having many gauge-radar pairs, since it will then create a more robust statistical relationship between the measurements. In cases with no valid pairs, or if the criteria for a linear dependency are not fulfilled, the analysis becomes the same as the original LAPS_radar-based accumulation field. The Barnes method will in the same way fall back to the original LAPS_radar-based accumulation field if there are no observations available, or if the radar-gauge pairs do not fulfil the thresholds stipulated for the G/R quotient. The new RandB method encounters the same restrictions, since it is a combination of the Regression and Barnes methods. In order to be meaningful for operational purposes, the studied merging methods should therefore show at least as good a result as the LAPS_radar precipitation accumulation analysis. Figures 2, 4 and 6 confirm that applying an assimilation method improves the overall results. In Figs. 3b-d and 5b-d one can see that the density values congregate closer to the zero value along the x axis, indicating a better match between analyzed and observed values. The calculated statistics for the dependent, independent, convective and non-convective data sets also show that agreement is improved by applying a merging method. The RMSE and MAE decrease compared to the LAPS_radar values; for the RandB method with the dependent data set, the corresponding reductions in RMSE and MAE are 29 and 47 %, respectively. The correlation for the RandB dependent data set increases accordingly (by 41 %), and the variance (RMSE-MAE) decreases when applying the different assimilation methods. Similar results are seen in the independent, convective and non-convective data sets.
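The fall-back behaviour described above can be illustrated with a short sketch. It is not the LAPS implementation: the function name `regression_adjust`, the thresholds `min_pairs` and `min_corr`, and the choice to leave radar-dry pixels at zero are all hypothetical and serve only to show how an analysis reverts to the radar first guess when a linear gauge-radar relationship cannot be established.

```python
import math

def regression_adjust(first_guess, radar_at_gauges, gauge_values,
                      min_pairs=5, min_corr=0.5):
    """Adjust a radar first-guess field with a linear gauge-radar fit.

    min_pairs and min_corr are hypothetical thresholds, not values from the
    paper. An unchanged copy of first_guess is returned whenever the fit
    cannot be trusted, mimicking the fall-back to the radar-only field.
    """
    pairs = list(zip(radar_at_gauges, gauge_values))
    n = len(pairs)
    if n < min_pairs:
        return list(first_guess)  # too few gauge-radar pairs

    mean_r = sum(r for r, _ in pairs) / n
    mean_g = sum(g for _, g in pairs) / n
    cov = sum((r - mean_r) * (g - mean_g) for r, g in pairs)
    var_r = sum((r - mean_r) ** 2 for r, _ in pairs)
    var_g = sum((g - mean_g) ** 2 for _, g in pairs)
    if var_r == 0.0 or var_g == 0.0:
        return list(first_guess)  # no spread, so no usable fit

    corr = cov / math.sqrt(var_r * var_g)
    if corr < min_corr:
        return list(first_guess)  # linear dependency criterion not met

    # Ordinary least-squares fit: gauge = slope * radar + intercept.
    slope = cov / var_r
    intercept = mean_g - slope * mean_r
    # Assumption: pixels that are dry in the radar field stay dry.
    return [max(0.0, slope * v + intercept) if v > 0.0 else 0.0
            for v in first_guess]
```

With this structure the merged product can never be built from an unreliable fit, which is consistent with the requirement that the merging methods perform at least as well as the radar-only analysis.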
When comparing the results for the two air-mass stability situations, i.e. convective and non-convective, the main finding is that the RMSE and MAE are considerably higher in convective cases. This indicates that the four accumulation methods adopted in this study are more sensitive to convective situations. We interpret this as being related to the larger spatial variability of convective precipitation as well as to different drop size distributions. In convective situations, the real intensity varies within each radar measurement bin (typically representing several cubic kilometres), and it is a random process that is only partly captured at a single gauge (orifice diameter of 22.6 cm). In addition, the Z-R equation used in Finland has been optimized for total rainfall, which in areas of extra-tropical cyclones consists largely of frontal precipitation, e.g. from warm fronts. As a consequence, when the discrepancy between radar and gauge observations is significant (i.e. large G/R quotients), as it is in the convective cases, the thresholds (see Sects. 3.2 and 3.3) are more frequently exceeded within the Regression, Barnes and RandB analyses. This leads to fewer corrections being made from the gauge measurements, and the resulting accumulation analysis is worse for convective weather situations than for non-convective cases.
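The effect of the G/R thresholds on the number of usable corrections can be sketched as below. The limits `q_min`, `q_max` and `min_amount` are hypothetical placeholders, not the thresholds of Sects. 3.2 and 3.3, and `screen_pairs` is a name chosen for this example only.

```python
def screen_pairs(gauge, radar, q_min=0.25, q_max=4.0, min_amount=0.1):
    """Keep only gauge-radar pairs whose G/R quotient lies within the limits.

    q_min, q_max and min_amount (mm) are hypothetical thresholds chosen for
    illustration. Returns a list of (gauge, radar, quotient) tuples.
    """
    kept = []
    for g, r in zip(gauge, radar):
        # Skip pairs where either value is too small for a stable quotient.
        if g < min_amount or r < min_amount:
            continue
        q = g / r
        if q_min <= q <= q_max:
            kept.append((g, r, q))
    return kept

# Example: the second pair (G/R = 8) is rejected and the third is too small,
# so only two of the four pairs remain available for correcting the radar field.
remaining = screen_pairs([1.0, 8.0, 0.05, 2.0], [1.0, 1.0, 1.0, 4.0])
```

When radar and gauge disagree strongly, as is more common in convective cases, more pairs fall outside the accepted range and the analysis stays closer to the uncorrected radar field.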
On the other hand, optimising the Z-R equation for specific types of precipitation should lead to a more faithful merging, which should be reflected in the agreement between analysed and observed precipitation. If such an approach were carried out on a much larger data set, the RMSE and MAE for specific precipitation types should be lower than without any differentiation between precipitation types, and could thus be used as a test.
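To illustrate why the choice of Z-R coefficients matters, the sketch below inverts the power law Z = a R^b for two coefficient sets that are widely quoted in the radar literature: a = 200, b = 1.6 (Marshall-Palmer, often applied to stratiform rain) and a = 300, b = 1.4 (often applied to convective rain). These are generic examples and an assumption of this sketch; they are not the coefficients used at FMI or in LAPS.

```python
def rain_rate_from_dbz(dbz, a, b):
    """Invert Z = a * R**b to obtain the rain rate R (mm/h) from reflectivity.

    dbz is the reflectivity in dBZ; a and b are the Z-R coefficients.
    """
    z_linear = 10.0 ** (dbz / 10.0)  # convert dBZ to linear Z (mm^6 m^-3)
    return (z_linear / a) ** (1.0 / b)

# Generic literature coefficients, used here only as illustrative assumptions.
STRATIFORM = (200.0, 1.6)  # Marshall-Palmer
CONVECTIVE = (300.0, 1.4)

# The same 40 dBZ echo yields different rain rates under the two relations.
r_stratiform = rain_rate_from_dbz(40.0, *STRATIFORM)
r_convective = rain_rate_from_dbz(40.0, *CONVECTIVE)
```

The two relations diverge increasingly at high reflectivity, so a single relation tuned to frontal rain can misestimate intense convective cells, which is the motivation for the precipitation-type-specific optimisation suggested above.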
The main conclusion from this study is that the newly developed RandB method, i.e. the combination of the Regression and Barnes analysis methods, generates the best estimate of 1 h precipitation accumulation. In addition, applying either the Barnes or the Regression method separately still yields a better result than using radar accumulation alone, i.e. the LAPS_radar method.

Fig. 1. (a) The rectangular frame of the map depicts the LAPS analysis domain. The red dots represent the 8 Finnish radar stations and the thick, black curved lines display their coverage. The thin circles surrounding each radar represent the areas where measurements are performed below 2 km height. (b) The Finnish surface gauge network (dots on the map) used to measure precipitation accumulation. The red dots indicate the position of the seven "independent" stations used for the verification.

Fig. 2. Density plots of analyzed precipitation accumulation (y axis) against observed rain-gauge values (x axis) for the dependent stations: (a) LAPS_radar; (b) Regression; (c) Barnes, and (d) RandB. The continuous line is a linear fit to the data set and the dashed line represents the perfect 1 : 1 fit in the plots.

Fig. 4. Density plots of analyzed precipitation accumulation (y axis) plotted against rain-gauge values (x axis), for the seven independent stations: (a) LAPS_radar; (b) Regression; (c) Barnes, and (d) RandB. The continuous line is a linear fit to the data set and the dashed line represents the perfect 1 : 1 fit in the plots.

Fig. 5. Absolute value of the difference between observed and analyzed precipitation accumulation (y axis) plotted against rain-gauge values (x axis), for the seven independent stations: (a) |LAPS_radar − Gauge|; (b) |Regression − Gauge|; (c) |Barnes − Gauge|, and (d) |RandB − Gauge|. The continuous line is a linear fit to the data set.

Fig. 6. Statistical verification results for the four different accumulation methods split into two different air-mass stability situations; left panel: convective cases (i.e. thunderstorms) and right panel: non-convective cases (i.e. warm fronts). The symbol # indicates the number of observations used in the calculations. The mean precipitation for each case, calculated from rain-gauge values, is included as a dashed stack.

Table 1. Statistical verification results of the different methods for the dependent stations data set.

Table 2. Statistical verification results of the different methods for the independent stations data set.