Extracting statistical parameters of extreme precipitation from a NWP model

Precipitation simulations on an 8 ×8 km grid using the PSU/NCAR Mesoscale Model MM5 are used to estimate the M5 and Ci statistical parameters in order to support the M5 map used for flood estimates by Icelandic engineers. It is known a priori that especially wind anomalies occur on a considerably smaller scale than 8 km. The simulation period used is 1962–2005 and 73 meteorological stations have records long enough in this period to provide a validation data set. Of these only one station is in the central highlands, so the highland values of the existing M5 map are estimates. A comparison between the simulated values and values based on station observations set shows an M5 average difference (observed-simulated) of −5 mm/24 h with a standard deviation of 17 mm, 3 outliers excluded. This is within expected limits, computational and observational errors considered. A suggested correction procedure brings these values down to 4 mm and 11 mm, respectively.


Introduction
In this paper the statistical parameters M5 and C i (Eliasson, 2000) for annual precipitation extremes in Iceland are estimated.The estimates are based on a NWP model: The fifthgeneration Pennsylvania State University-NCAR Mesoscale Model-MM5 (Grell et al., 1995).It has been widely used in forecasting and usually found reliable.(Anders et al., 2007) found good agreement between gauge precipitation and cumulative MM5 precipitation simulations for all seasons in Correspondence to: J. Eliasson (jonase@hi.is)their investigation of the small-scale spatial gradients in climatological precipitation on the Olympic peninsula, a geographical region even more mountainous than Iceland.The sum of the 10 largest simulated events compared well with the precipitation gauges, although some of the individual events are significantly over-or undersimulated.In this paper we follow a similar methodology, extract statistical parameters from MM5 computed annual extreme rainfalls, without considering discrepancies in the time histories of computed and observed values, and then compare the results with available statistical parameters based on observations.
Great care has to be taken in selecting the parameterization scheme used in MM5 precipitation simulations.Convective precipitation is one of the most difficult.Here the Grell cumulus parameterization scheme (CPS) and the Reis-ner1 microphysics scheme (Reisner et al., 1998) is used as recommended by (Chien and Jou, 2004).Other combinations were found to lead to a general underforecast.However, some investigations have shown that all microphysical schemes produce a similar precipitation field and none of them perform significantly better than the others (Serafin and Ferretti, 2007).CPS will be discussed in more detail in the next section.
Major precipitation errors for individual storms seem to exist even in model runs with excellent overall performance.(Minder et al., 2008) found MM5 very good in simulating small-scale pattern of precipitation at seasonal time-scales while major errors exist for individual storms.Other analyses clearly show a tendency to form local precipitation maxima in the lee of individual mountain ridges (Zangl et al., 2008) while yet other research indicates exactly the opposite (Rögnvaldsson et al., 2007a).The purpose of this analysis is to review an M5 map presently used by Icelandic engineering hydrologists to estimate peak runoff.The M5 -annual extreme 24 h rainfall with 5 years return period - (Eliasson, 2000) is used as an index variable in these estimations hence a good M5 map is needed.The basic data for the M5 is the uncorrected annual maximum 24 h precipitation.Various correction methods do exist (Crochet, 2007) but these can be applied to the values on the map by the users as the corrections apply to varying wind speeds in the range 0-6.5 m/s but annual maximum precipitation events in Iceland usually occur in storms with wind speeds larger than 6.5 m/s, but above this wind speed the correction factors depend on rain intensity only.The reliability of the correction factors is also an open question in rain intensities larger than 60-80 mm/24 h.
Another parameter is needed for quantile estimation, the C i parameter.Together these two replace the mean value and the standard deviation in the Gumbel distribution, but this distribution is found valid for the Icelandic data (Eliasson, 1997).The map is also used for PMP (Probable Maximum Precipitation) estimation (Eliasson, 1994) so the map is used for a wide range of quantile estimates in engineering design.
The North Atlantic experienced increased cyclonic activity with increased storminess from the early 1960s until the mid nineties after a relatively quiescent period from about 1930 (Hanna et al., 2008).The climatic stability and therefore the justification for using an index parameter extracted from the last 100 years of observations is an open question.It is necessary to bear in mind the complex composition of precipitation extremes and how individual precipitation components in Iceland do differ from those of central Europe.The main difference in extreme precipitation climatology is that orographically enhanced precipitation is the dominating component in Iceland rather than convective precipitation (Hanna et al., 2004).

The MM5 model simulation for 1961 to 2006
An MM5 simulation for the period January 1961 to July 2006 was completed in 2006 based on ERA40 initial and boundary data from the European Centre for Medium-range Weather Forecasts (ECMWF).General results are discussed by (Rögnvaldsson et al., 2009).Prior to this, atmospheric flow over Iceland had been simulated for the period September 1987 through June 2003, using an older version of the PSU/NCAR MM5 mesoscale model driven by initial and boundary data from ECMWF (Rögnvaldsson et al., 2007b).Furthermore, an investigation of the seasonal and interannual variability of the precipitation simulations revealed a negative trend in winter precipitation in W-Iceland, a positive trend in the ratio of lowland precipitation to mountain precipitation in E-Iceland and a substantial inter-annual variability in the ratio of lowland precipitation to precipitation in the mountains.It was found that the mountains contribute to a total increase of precipitation in Iceland of the order of 40%.Because of the good experience with this preliminary run it was decided to extend the simulation period and make a statistical analysis of the precipitation extremes.The calculations were done on an 8×8 km net shown in Fig. 1.
If Fig. 1 is compared to a topographic map of Iceland it reveals that the computational net is rather coarse compared to many landscape features that may be expected to have an effect on the atmospheric flow.This can influence the results significantly.Figure 2, computed in a 1 km grid, shows the simulation results of a storm on 16 June 2008 (simulated with the AR-WRF model 2).Here, local wind speed extremes and high spatial gradients can clearly be seen on the south side of the landmass, which is the westward pointing peninsula at approximately 65 • N in Fig. 1.Increasing the grid size to 3 km made the local features completely disappear.Calculation in a 9 km grid showed even less gradients than the 3 km grid but the difference was greatest between the 1 km and 3 km grid results.These grid-size dependent discrepancies cannot be mended by parameterization, but wrong parameterization can make them considerably worse.Therefore it is possible that spatial gradients in the 8 km MM5 grid are much too small to rely on the results in small-catchment hydrological simulations.Nevertheless, local results that do not depend upon a short time history (like statistical estimates based on annual extremes) can be accurate enough for many applications.
Forecast skills of numerical weather prediction (NWP) models have improved considerably for many variables (e.g.geopotential height and temperature) over the past years and decades but precipitation has remained somewhat elusive (Bozart, 2003).One reason for this is that the physics governing the formation of precipitation are highly complicated and only partly understood, so parameterization is difficult.Another reason is that the distribution of precipitation (particularly solid precipitation) over complex topography, as simulated by NWP models, is very sensitive to the Hydrol.Earth Syst.Sci., 13,[2233][2234][2235][2236][2237][2238][2239][2240]2009 www.hydrol-earth-syst-sci.net/13/2233/2009/ dynamic and thermal characteristics of the impinging winds (e.g.Chiao et al., 2004).
The output files of the simulation were now transformed as follows in Table 2.

Estimation of M5 and Ci
The procedure for estimating M5 and C i is described by (Eliasson, 2000).The stability of the M5 estimate is of great concern.The M5 estimates cannot be taken as scatter free but must be assigned an uncertainty value, just as the model values must be.The common practice is not to use M5 estimates with fewer than 20 annual extremes behind them.One reason for this are the previously mentioned long term fluctuations in the climate.Another reason is statistical uncertainty due to the limited length of the time-series.The influence of the effects of this on the M5 estimate may be clearly seen in Fig. 3.
In the Fig. 3 example it is clearly seen that the number of station years behind an M5 estimate should preferably be greater than 40 in order to achieve reasonable stability.Only 32 meteorological stations have more than 40 station years and of those only 11 have more than 60 years.The station observations considered here are in all cases directly gauged values without wind corrections as previously explained.The MM5 model simulated M5 values used in this study are based on 44 calculated annual extreme values at each grid point and should therefore be reasonably stable.
Another way of assessing the stability in estimated M5 values is to study the differences between a short and a longer period in many points.Figure 4 shows the differences in meteorological M5 station values, between the observation period up to 1990 (Eliasson, 1997) and values covering the period up to 2006.There is a minimum of 20 years behind each M5 value so the data sets of each station overlap by an amount of years that depends upon the period of operation of that station.The difference in M5 is within 10 mm but depends strongly on the number of station years.Above 60 station years this difference seems to be within 5 mm.The average value of the difference is 1 mm but the standard deviation is 3.6 mm.It therefore seems appropriate to assume that the M5 values estimated at the meteorological stations are within ±4 mm for each location.This indicates that the stability of the M5 estimates is good enough so observed and simulated values can be compared, even though the observation periods of the individual stations do not cover exactly the same 44 year period as the simulation does.The 4 mm value may then be taken as an estimate of the uncertainty of the M5 estimate based on station observations caused by the difference in observation periods from the simulation period.On top of this there are instrumental errors and effects of spatial variability that will increase this uncertainty.It must therefore be kept in mind, that the simulation period is the 44 years between 1962-2005 in all grid points, but the observation period for individual meteorological stations is normally different.
The statistical distribution of pooled normalized annual maximum precipitation data in Iceland follows a Gumbel probability distribution rather well (Eliasson, 1997).This distribution was therefore used to estimate the M5 and C i values from the mean and the standard deviation of the station values used in the normalization.
The stability of the C i estimate is also an issue, but the effect of scatter in this parameter is much more limited than the effect of scatter in M5.Most station values of C i in Iceland are below 0.2.The effect of a variability in C i on a quantile estimate can be seen from the following equation (Eliasson, 2000): MT=24 h annual precipitation maximum with return period T years y=Gumbel's parameter=−ln(−ln(1−1/T)) The largest y value used in engineering design is around 7 (T=1000).This will produce the greatest impact of a scatter in C i , but a deviation of 10% in the C i will only produce a 5% deviation in the MT estimate for y=7.For lower T values this effect is smaller and it disappears altogether around the 5 year y value.This relatively little importance of the C i value in practical quantile estimates is the main reason for replacing the mean value and the standard deviation in the Gumbel probability distribution function with M5 and C i .

Comparison with earlier results
Only 1650 of the 11 468 grid-cells are on land.This is a great improvement over the M5 estimates based on station observations, as only 73 stations exist that can be compared to this simulation result.There is no doubt that a substantial improvement can be gained in the model results by using a finer grid and a shorter time step.Such simulations will undoubtedly be produced in the future.
For a qualitative examination it is instructive to study the map in Fig. 5 which clearly shows the strong orographic effect on the precipitation.Areas with M5>120 are seen to be on the glaciers, they are the highest parts of the country, 1000-2000 m a.s.l., while the highland plateau around them is around 600 m a.s.l.The figure shows that the largest precipitation amounts are not found in the lee zones as found by (Zangl et al., 2008).In fact they are located directly on the mountain tops.
The qualitative comparison with the earlier M5 map compiled from precipitation observations until 1990 is shown in Fig. 6.The reader is asked to note, that detailed examination of the maps can be made by zooming the pdf published on the journal's website until the text on the M5 map is clearly readable.Figure 6a is a reproduction of the original map on the referred website, the Icelandic text has no significance to the contents of this paper.
The two isoline maps are not identical, but much closer than might have been expected, especially in the ungauged regions (punctuated lines on the M5 map).The main differences can be qualitatively described as follows: The valley of low values between the high M5 values in the south and the lower values in the north is 60-80 mm/24 h in the earlier map while the simulated values are 40-60 mm/24 h.The line through the high points is along the main water divide between the north and the south parts of the country.
The 120 mm line reaches in between the two glaciers of the south in the earlier map but not in the simulated results.
The low value areas in the north are larger according to the new model.Existing M5 map (http://www2.verk.hi.is/vhi/ vatnaverkfrstofa/Kort/1M5_Yfirlit.pdf)punctuated lines estimated values above, compared to MM5 model M5 (below).Contour lines for each 20 mm/24 h on both maps.(For reading the figures text: Zoom in the picture.) The largest areas based on gauged values in the earlier map (solid lines in the map) are very similar in the simulations.
The qualitative result of this comparison is that the model produces similar M5 values as found from meteorological measurements where they are available, in the ungauged regions the estimated values in the earlier map are higher than the simulated values and this difference is of the order of magnitude 10-20 mm or 20-30%.
The results for the C i coefficients are very much along the same lines.
The computed C i values range from 0.12-0.23,this is the same range as found from data from the meteorological stations.It is impossible to compile an areal distribution comparable to Fig. 7 from the 73 observations because while the punctuated lines in the M5 map could be estimated from reliable M5-AAR (annual average rainfall) relations, no such relation seems to exist for the C i .It was therefore recommended to use the value C i =0.19 with the M5 map, or the value from the closest meteorological station.Figure 7 shows that the MM5 simulations justify this recommendation.The average C i value is closer to 0.17, but a recommended value to be used in practical applications should be a little higher than the average to prevent underdesign.
The simulated annual maxima for individual years show a great scatter when compared with observations as is done in Fig. 8.Besides this scatter in the numerical values, observed and simulated maxima do not usually occur on the same day.It is not anticipated that simulations in a finer grid will mend this scatter.
In the quantitative comparison between the meteorological stations and the simulation the closest gridpoint (NP0) is used together with the 8 neighbor points to NP0 (NP1-NP8).This is because simulated M5's are cell averages while the observations are point values so no gridpoints correspond exactly to the stations.The cluster NP1-NP4 is the closest 4 points (N-S and E-W), NP1-NP8 the cluster of the closest 8 points in the grid.These two clusters were studied in an attempt to find explanations to the larger differences between gauge M5 and NP0.The distance MS-NP0 can be up to 5.7 km and the differences in M5 values between the NP points will show the spatial variation in the computational grid and this variation can explain a part of the gauge-NP0 difference in the M5 values, when the spatial variation in the NP1-NP8 cluster is regular and the distance from the gauge to NP0 is a few kilometers.Various schemes to interpolate and estimate the "best computed value" at the meteorological station in order to compare that value to the observation M5 value may be used.
On top of this "regular" spatial variation there is the precipitation effect due to landscape forms on a scale <8 km that are flattened out by the grid but felt by the meteorological stations.
Figure 9 shows a direct comparison between simulated M5 values at the NP0 points and the M5 based on precipitation measurements from the meteorological stations, again with no corrections applied.The RMS difference of the station and simulated M5 values in Fig. 9 is 17 mm and the average difference is −5 mm (model values higher than the gauges), the correlation coefficient is R=0.78 (black line).If the three red outliers are excluded (see below), the correlation improves somewhat (R=0.9;blue line).Of the 73 gauges 57 are in the range 40-80 mm and 80% of these points (63% of the total) are within 10 mm which is the outer range for the scatter in Fig. 4. Differences between the station and simulated M5 values are given in Tables 3 and 4.
The magnitude of the measurement error depends on the wind-speed and the under-catch is more pronounced for solid (especially snow) than liquid precipitation (Førland et al., 1996).The values of the differences in Table 3 and the estimated underlying causes of the differences are listed in Table 4.The differences marked A and B in Table 3   Course grid effect (0-50% in 4% of points) rms 10 mm a Meteorological Station except that the outliers indicated by the red symbols in Fig. 9 and noted in line B2 of Table 3 need closer examination.
In Fig. 10 we examine the three red outliers in Fig. 9, together with the point directly above them.In all these points the gauge value is approximately the same, (103-106) so this value is represented by a thick green line in Fig. 10.The large cluster NP1-NP2 is used.
In Fig. 10 we see that the "normal" station 615 (yellow columns) has an average deviation within the 17 mm mark, but the spatial variation around the NP0 value is greater than in the other points.The same large spatial variation is seen in the results from station 620, but here the simulated M5 value is only 60% of the gauge value.The two other points are less than 50% of the gauge value and the spatial variation is small with the exception of NP9 for station 234 (red column).This shows that a small spatial variation in the NP values may not imply an accurate result.It is believed that hills in the landscape around stations 103 and 234, that are flattened out in the grid, cause the large deviations at these stations and this effect could also affect the low simulation results at station 620.This cannot be verified except by simulations in a finer grid that have so far not been carried out.Nevertheless, this opens up the possibility that several grid points in the simulation, possibly anywhere in the grid, can be rather inaccurate.Carefully interpolated values to the stations locations in Fig. 8, statistical analysis of the differences and subsequent correction of all of the 1650 cell values does only have a minor chance of improving the simulation results.

Discussion
The simulation has provided M5 results for around 1500 locations in Iceland where no information was available before.
Hydrol.Earth Syst.Sci., 13,[2233][2234][2235][2236][2237][2238][2239][2240]2009 www.hydrol-earth-syst-sci.net/13/2233/2009/Where we have station information, the largest single group (63% of the total gauge values) NP0 and gauge values fall within 10 mm/24 h (Table 3).Of these about 4 mm may be due to different estimation periods (Table 4).Effects of wind and distance between station locations and NP0 can explain differences up to 10 mm (Table 4).The rest of the values (37%) show greater scatter.These discrepancies are presumably due to a combination of all errors listed in point 4 and errors in the precipitation measurements.Due to the strong orographic effect in the precipitation, local landscape features on length-scales <8 km can be felt by the gauges without having any effect in the simulations.Three outliers may show a large effect of this type.There the simulated MM5 precipitation value is only 50% of the gauge value so the total difference is 40-60 mm instead of the maximum 35 mm at the other points.There may be an unknown number of such points in the simulated data set; they can only be identified by more accurate simulations.
The least squares line is M5sim=4+1.05M5MS (outliers excluded), but using the relation M5=(M5sim-4)/1.05 to produce a new M5 map has very little effect and does not mend the real problems.The result of this discussion is therefore, that a general trend function that can be applied to the new simulated M5 values for use in ungauged regions cannot be seen.The simulated values are already so good that differences between gauge values and simulated results falls within the range to be expected when the model grid inaccuracy and the accuracy of the estimation of the gauge M5's on one hand, and the general MM5 model inaccuracy on the other hand are combined.Such differences are generally not randomly distributed, as least square lines assume.

Conclusions
This paper describes the M5 parameter, as computed from the annual precipitation maxima, simulated by the MM5 atmospheric model.The simulated M5 values were compared to all meteorological stations where estimates of observed M5 values were available.The results can be summarized as follows.
-The observed values show sufficient stationarity so the comparison does not have to be restricted to observations within the simulation period 1962-2005.
-The comparision reveals a few outliers (∼4%) where the difference between simulated and observed values is large and of uncertain origin.
-The difference between simulated and observed values is within 10 mm (for ∼2/3 of the values) in the range 20-160 mm/24 h -There are no systematic deviations that can be mended by a trend function.Other areas (30 meteorological station points available): Correction by expert opinion.
For ungauged regions the following procedure is recommended All regions where the original map and the M5sim value is <60 mm: No correction.
Other regions, original map value up to 80 mm: Correction 0-20, linearly increasing.
In regions with original map value >80: Add 20 to the simulated values.
The suggested procedure is believed to be more consistent than the flat trendline.It brings the overall differences down to the average −4 and rms 11 instead of the −5 and 17 in Table 3.
Future research on M5 and the basis of flood estimation in Iceland will be concentrated in three main areas: 1. Checking the probability distribution function of the annual precipitation maxima region for region in order to find if there are discrepancies in the a priori assumption that they follow the 2-parameter General Extreme Value distribution as previously found (Eliasson, 1997).
2. Searching for statistically significant M5-AAR (Average annual rainfall) and C i -AAR relations.
3. Working towards a new simulation 1961-2007 in as fine a grid as possible.

Fig. 1 .
Fig. 1.Elevation data of the MM5 simulation area (color scale in meters), geographical longitude (Degrees East) on horizontal axis, latitude (Degrees North) on vertical axis.

Fig. 2 .Fig. 3 .
Fig. 2. Local wind anomalies (small blue spots in the lee zone) in Snaefellsnes, only found in a 1 km grid, not 9 or 3 km.Red figures in Squares: Meteorological station names and wind speed in m/s.Colour scale: Computed wind speed in m/s.

Fig. 4 .
Fig. 4. The difference between M5 data in the 1990 and 2006 data sets.

Fig. 5 .
Fig. 5. Surface map of the MM5 model values for M5 showing the orographic effect.Colour bar and z scale: M5 in mm/24 h.

Fig. 7 .Fig. 8 .
Fig. 7. Surface map of the C i values from the MM5 run, smaller orographic effect than in Fig. 5. Colour bar and z scale: C i , dimensionless value.

Fig. 9 .
Fig. 9. Simulated NP0 point values for M5 (vertical axis) compared to observed values at the 73 meteorological stations (horizontal axis).Trend lines are for all points (black) and outliers excluded (blue).

Fig. 10 .
Fig. 10.Outlier points in Fig. 9 (red).Meteorological stations number 103, 234 and 620 compared to "normal" difference station 615.Numbers on the horizontal axis are the NP point numbers.Vertical axis: M5 values in mm/24 h
i values computed 11 464 have a cause marked A and B in Table 4.They speak for themselves

Table 3 .
Differences between meteorological stations and NP0 values.

Table 4 .
Order of magnitude values of possible causes of the differences in Table3.