On the uncertainties associated with using gridded rainfall data as a proxy for observed

Gridded rainfall datasets are used in many hydrological and climatological studies, in Australia and elsewhere, including for hydroclimatic forecasting, climate attribution studies and climate model performance assessments. The attraction of the spatial coverage provided by gridded data is clear, particularly in Australia where the spatial and temporal resolution of the rainfall gauge network is sparse. However, the question that must be asked is whether it is suitable to use gridded data as a proxy for observed point data, given that gridded data is inherently "smoothed" and may not necessarily capture the temporal and spatial variability of Australian rainfall which leads to hydroclimatic extremes (i.e. droughts, floods). This study investigates this question through a statistical analysis of three monthly gridded Australian rainfall datasets – the Bureau of Meteorology (BOM) dataset, the Australian Water Availability Project (AWAP) and the SILO dataset. The results of the monthly, seasonal and annual comparisons show that not only are the three gridded datasets different relative to each other, there are also marked differences between the gridded rainfall data and the rainfall observed at gauges within the corresponding grids – particularly for extremely wet or extremely dry conditions. Also important is that the differences observed appear to be non-systematic. To demonstrate the hydrological implications of using gridded data as a proxy for gauged data, a rainfall-runoff model is applied to one catchment in South Australia initially using gauged data as the source of rainfall input and then gridded rainfall data. The results indicate a markedly different runoff response associated with each of the different sources of rainfall data. It should be noted that this study does not seek to identify which gridded dataset is the "best" for Australia, as each gridded data source has its pros and cons, as does gauged data. 
Rather, the intention is to quantify differences between various gridded data sources and how they compare with gauged data so that these differences can be considered and accounted for in studies that utilise these gridded datasets. Ultimately, if key decisions are going to be based on the outputs of models that use gridded data, an estimate (or at least an understanding) of the uncertainties relating to the assumptions made in the development of gridded data and how that gridded data compares with reality should be made.


Introduction
Rainfall data is a crucial component in many engineering applications. It is required, for example, to carry out rainfall/runoff modelling to estimate inflows into a reservoir, determine the size of rainwater tanks for water sensitive urban design, or calculate the size of levees for flood mitigation strategies. Similarly, those in the climate community use rainfall data to develop and test seasonal forecasting schemes, perform climate attribution studies and verify climate model outputs. However, it is often the case, particularly in Australia due to low population densities and the relatively short history of observational recordings (especially away from the eastern seaboard), that observational rainfall data does not exist at the specific location of interest for such hydrological or climatological investigations. The sparseness of the rainfall observation network means that the gauge closest to the point of interest may be several kilometres away and therefore not representative of the climate patterns at the required location (Jeffrey et al., 2001).
Published by Copernicus Publications on behalf of the European Geosciences Union.
In order to overcome this problem, and also due to the increasing development and popularity of Geographical Information Systems (GIS) software and grid based climate and hydrological models, significant efforts have been made into spatially interpolating data so as to fill the "gaps" in the observational network (e.g. Jeffrey et al., 2001; Hapuarachchi et al., 2008; Kiem et al., 2008; Jones et al., 2009). There are currently two Australia-wide monthly gridded rainfall datasets available. These are the Australian Water Availability Project (AWAP) dataset (http://www.eoc.csiro.au/awap/) and the SILO dataset (www.longpaddock.qld.gov.au). The AWAP dataset superseded the Bureau of Meteorology's former operational gridded rainfall dataset (referred to as the BOM dataset henceforth) in early 2010. While the BOM, AWAP and SILO gridded datasets were developed with the same objective in mind (i.e. complete spatial coverage of rainfall data for Australia), the methods used to produce the various gridded datasets differ in many aspects (refer to Sect. 2.1 for details). All three datasets have been used in recent hydrological and climatological studies. For example, the BOM gridded rainfall data was used in recent studies undertaken by Verdon-Kidd and Kiem (2009, 2010) and Evans et al. (2009) and the SILO dataset was used in hydrological modelling for the Murray Darling Basin Sustainable Yields and Tasmania Sustainable Yields projects (see Chiew et al., 2008; Viney et al., 2009) and is widely used in industry (e.g. environmental consulting, government agencies, water authorities). AWAP data was used in the modelling undertaken for the Climate Futures for Tasmania project (Grose et al., 2010) and also in several projects being undertaken as part of the South Eastern Australian Climate Initiative: Phase 2 (www.seaci.org).
As discussed, due to the spatially and temporally incomplete nature of the observational network in Australia (and many other places in the world), continuous century (or greater) long, monthly and daily rainfall data that covers the whole of Australia is immensely attractive, as demonstrated by the widespread use of various sources of gridded data. However, it must be remembered that gridded data is in essence "virtual data" and that numerous assumptions underlie the spatial interpolation techniques used to produce the gridded data and that these assumptions differ across the various gridded data products. Given the "virtual" nature of gridded data, and the different techniques used to produce it, differences between (a) gridded data and observed gauge data and (b) the different gridded datasets will exist. The Bureau of Meteorology (2011) acknowledges that "data smoothing" occurs in the production of gridded data such that the gridded values will likely differ from the rainfall recorded at the contributing gauges. Whilst Beesley et al. (2009) have reviewed the AWAP and SILO error statistics (on a daily scale) and Jones et al. (2009) and Fawcett et al. (2010) have compared AWAP and BOM error statistics, comparisons of the three existing gridded datasets with individual gauges have not been made.
The aim of this study is therefore to identify where and when the differences between gridded and gauged monthly, seasonal and annual data occur and to quantify the magnitude of the disagreements (Sect. 4). South Australia (SA) is chosen as the case study as it is a region with limited gauged data and its water resources have also recently become a research focus with the establishment of the Goyder Institute (http://www.goyderinstitute.org/index.php). Studies into SA's water resources require rainfall data but limited work has been done on analysing the pros and cons of various sources of rainfall data for SA. Of particular interest are the potential implications of using gridded rainfall data as a proxy for gauged data in hydrological modelling in SA (Sect. 5). It should be noted that this study does not seek to identify which gridded dataset is the "best"; it is unlikely this is even possible given that all data sources, including observed gauge data (e.g. Lavery et al., 1997; Jeffrey et al., 2001), have their strengths and weaknesses. Rather, the intention here is to quantify differences between various gridded data sources, and how they each compare with observed point data, such that these differences can be considered and accounted for in the increasing number of studies that utilise gridded data.

Gridded rainfall data
The BOM, AWAP and SILO Australia-wide gridded datasets provide spatially interpolated monthly (and daily, in the case of AWAP and SILO) rainfall grids at a resolution of 0.05° × 0.05° (i.e. approximately 25 km²). The original BOM gridded dataset was produced using the Barnes successive correction technique (Jones and Weymouth, 1997). In this technique, grid values are derived from nearby observation gauges whose influence on a grid cell is determined based on the distance between the two points. Several iterations are performed to decrease the difference between the grid cells and the observed data until a high resolution grid is produced (Jones and Weymouth, 1997). In this case, grids at a 0.25° longitude-latitude resolution were produced (Jones and Weymouth, 1997; Fawcett et al., 2010).
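The successive correction idea described above can be illustrated with a minimal sketch. It is not the Bureau of Meteorology's operational implementation: the Gaussian distance weighting, the length scale, the number of passes and the shrink factor `gamma` are all assumptions chosen for illustration, and the function name `barnes_analysis` is hypothetical.

```python
import numpy as np

def barnes_analysis(gauge_xy, gauge_values, grid_xy,
                    length_scale=1.0, n_passes=3, gamma=0.5):
    """Interpolate gauge values to grid points by successive correction.

    Assumptions (not taken from the paper): Gaussian distance weights,
    and a length scale that is tightened by sqrt(gamma) after every pass.
    """
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    gauge_values = np.asarray(gauge_values, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)

    def weights(targets, scale):
        # Squared distance from every target point to every gauge.
        d2 = ((targets[:, None, :] - gauge_xy[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-d2 / scale ** 2)
        # Normalise so the weights for each target sum to one.
        return w / w.sum(axis=1, keepdims=True)

    grid_est = np.zeros(len(grid_xy))
    gauge_est = np.zeros(len(gauge_xy))
    scale = length_scale
    for _ in range(n_passes):
        # Residual between the observations and the current analysis
        # evaluated at the gauge locations.
        residual = gauge_values - gauge_est
        # Spread the residuals to the grid and back to the gauges.
        grid_est = grid_est + weights(grid_xy, scale) @ residual
        gauge_est = gauge_est + weights(gauge_xy, scale) @ residual
        scale *= np.sqrt(gamma)
    return grid_est
```

With a single pass the result is a distance-weighted average of the gauges; each further pass pulls the analysis closer to the gauge values, which is the "decrease the difference" step in the text.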
The BOM gridded dataset used in this analysis supersedes the original BOM dataset described above and is a pre-release of the AWAP dataset (Bureau of Meteorology National Climate Centre, personal communication, 4 April 2012). The BOM gridded dataset used here is produced using the Barnes successive correction technique (Jones and Weymouth, 1997) but has a grid resolution of 0.05° (compared with the original 0.25  (Fawcett et al., 2010), and therefore all analyses are restricted to this period. Again, it is important to note that the dataset referred to in this paper as the BOM dataset is not the same as the original BOM gridded rainfall dataset (Jones and Weymouth, 1997) which had a 0.25° longitude-latitude resolution and did not employ spline interpolation.
The AWAP dataset is produced as part of the Australian Water Availability Project, a joint initiative of the BOM and the Commonwealth Scientific and Industrial Research Organisation (CSIRO). In the daily/monthly AWAP dataset the observed daily/monthly rainfall from gauges within the BOM gauging network (i.e. up to approximately 7500 gauges, both open and closed) is decomposed into a monthly average and associated anomaly (Jones et al., 2009). Anomalies are used as they tend to be weakly related to topography, which is important given that the current BOM gauging network does not adequately resolve high elevation areas in Australia (Jones et al., 2009). The daily/monthly anomalies are interpolated using the Barnes successive correction technique, described above, and the monthly climatological averages are interpolated using three dimensional smoothing splines (Jones et al., 2009). The rainfall grids are produced by multiplying the monthly climate average grids and daily/monthly anomaly grids. An unexplained microscale variance term is used in AWAP to allow for observational or measurement error, such that exact reproduction of gauged values at each gauge location is not expected (Jones et al., 2009). AWAP rainfall grids are freely available from 1900 onwards at http://www.bom.gov.au/jsp/awap/. It is noted that the AWAP product is undergoing constant improvement and development; in this study AWAP Version 3 daily interpolated and monthly interpolated datasets were used (CSIRO March 2010 reformat of the Bureau of Meteorology AWAP version 3 monthly rainfall surfaces).
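The decomposition and recombination step can be sketched as follows. Because the grids are recombined by multiplication, the anomaly is treated here as a ratio of observed rainfall to the monthly climatological average; that reading, the function names, and the choice to set the ratio to zero where the climatology is zero are assumptions made for illustration only. The interpolation of each component (Barnes analysis and smoothing splines) is not shown.

```python
import numpy as np

def anomaly_ratio(observed, climatology):
    """Ratio of observed rainfall to the monthly climatological average.

    Assumption: where the climatology is zero the ratio is set to zero,
    so that the recombined rainfall is also zero.
    """
    observed = np.asarray(observed, dtype=float)
    climatology = np.asarray(climatology, dtype=float)
    return np.divide(observed, climatology,
                     out=np.zeros_like(observed),
                     where=climatology > 0)

def recombine(climatology_grid, anomaly_grid):
    """Rainfall grid as the product of the climatology and anomaly grids."""
    return np.asarray(climatology_grid, dtype=float) * np.asarray(anomaly_grid, dtype=float)

# Hypothetical monthly totals (mm) at three gauges and their long-term means.
observed = [30.0, 0.0, 45.0]
climatology = [20.0, 10.0, 30.0]
ratios = anomaly_ratio(observed, climatology)
rebuilt = recombine(climatology, ratios)
```

In this toy case no interpolation takes place, so `rebuilt` simply returns the observed totals; in a gridded product the two components are interpolated separately before being multiplied.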
The SILO dataset is produced by the Queensland Department of Environment and Resource Management. Three SILO products are available: interpolated data grids, the data drill product (gridded data extracted at any chosen location in Australia) and the patched point data product (gauged data with missing values infilled with gridded data). In this analysis the daily and monthly data drill product and data grids, available from 1890 onwards from www.longpaddock.qld.gov.au/silo/, were used. To generate the monthly gridded SILO rainfall dataset, the observed data from over 5000 BOM gauges is normalised and then interpolated using ordinary kriging. The observed data is cross validated and gauges with high residuals are removed. The updated dataset is reinterpolated using ordinary kriging and the monthly rainfall surfaces are generated by reversing the normalisation (Jeffrey et al., 2001; Jeffrey, 2006). It is important to note that the process used to create the SILO datasets is set to accurately reproduce the observed data (i.e. exact interpolation).

Observed rainfall data
The gridded datasets described in the previous section are all based on interpolations of rainfall observed at BOM gauges. The BOM rainfall gauge network in SA is comprehensive, with close to 1600 gauges situated across the state. Many gauges offer over 100 yr of daily rainfall data and 65 of the BOM gauges are considered "high quality", as defined by Lavery et al. (1997), who developed a list of 379 BOM rainfall gauges across Australia that did not show inhomogeneities or spurious trends in their data records (Lavery et al., 1997; Gallant and Karoly, 2010). BOM gauged monthly rainfall totals were extracted from http://www.bom.gov.au/climate/data/ for 16 BOM gauges across SA (see Fig. 1) that encompass a range of elevations and locations (coastal and inland) and cover various timeframes. BOM gauge details are provided in Table 1. Five of the 16 gauges are "high quality" and have records greater than 70 yr. The additional 11 gauges, five "long record" (>70 yr) and six "short record" (<50 yr), although not rated as "high quality" based on the Lavery et al. (1997) definition, are on average 91 % complete and therefore suitable for this analysis.
Despite being "dependent" data (i.e. used in the development of the three gridded datasets), the BOM gauged data allows an insight into how the gridded datasets compare to gauges with long records and records for remote gauges. However, to achieve the objective of quantifying differences between various gridded data sources, and how they compare with observed point data, an "independent" reference source is needed. Hence, gauged data not used in the development of the three gridded datasets (i.e. independent gauged data) was provided by SA's Department for Water (DFW). The DFW manages 80 rainfall gauges in South Australia, with the majority situated in the south-eastern portion of the state (see Fig. 1). Around half of the gauges have less than 10 yr of rainfall data and many gauges have missing data. After an assessment of gauge locations, data record lengths and quality, 10 DFW gauges were selected for the gauged and gridded data analysis (see Sect. 3 for a description of analyses undertaken). The selected gauge record lengths range from 21 to 31 yr, with most gauge records commencing in the 1980s. The gauge records are on average 92 % complete. Figure 1 indicates all DFW gauges, as well as those selected for the analysis. Selected gauge details are also provided in Table 1. While the DFW gauges do not provide a complete spatial and temporal picture of rainfall in South Australia they do allow independent assessment to be made and, importantly, allow performance assessments to be made in locations not covered by BOM gauges. 2) and mean monthly areal potential evapotranspiration (from maps provided at http://www.bom.gov.au/climate/averages/climatology/evapotrans/) were used to calibrate the hydrological model (see Sect. 5 for details).

Intercomparison of gridded rainfall datasets
Given that the three gridded datasets (BOM, AWAP, and SILO) are produced using different methods, some differences between them are to be expected. However, the gridded datasets are all intended to represent the same situation (i.e. reality) and therefore it is hoped that these differences are minimal. As a first step in understanding how the three gridded datasets compare, the percentage differences in annual averages for the 1900-2008 period between SILO and AWAP gridded datasets and BOM and AWAP gridded datasets for the whole of SA, as well as the differences in annual totals for the years 1900, 1930, 1960 and 1990, were determined. The comparison of annual averages was also undertaken at five randomly selected ungauged point locations within SA (see Fig. 1 for point locations). Note that AWAP was used as the reference point here as this dataset is widely used by Government agencies and the general public due to its free availability on the BOM website.
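As an illustration of this first comparison, the sketch below computes the percentage difference in annual average rainfall between a gridded dataset and the AWAP reference, cell by cell. The array layout (years, rows, columns), the function name and the sample values are assumptions, and the sketch presumes the reference average is non-zero in every cell.

```python
import numpy as np

def percent_difference_in_annual_average(dataset_annual, reference_annual):
    """Percentage difference of a dataset's annual average relative to a reference.

    Both inputs are arrays of annual totals with shape (years, rows, cols).
    A negative value means the dataset is drier than the reference.
    Assumption: the reference average is non-zero in every grid cell.
    """
    dataset_mean = np.asarray(dataset_annual, dtype=float).mean(axis=0)
    reference_mean = np.asarray(reference_annual, dtype=float).mean(axis=0)
    return 100.0 * (dataset_mean - reference_mean) / reference_mean

# Hypothetical two-year record on a 1 x 1 grid (annual totals in mm).
silo = np.array([[[180.0]], [[220.0]]])
awap = np.array([[[240.0]], [[260.0]]])
difference = percent_difference_in_annual_average(silo, awap)
```

Passing a single year for each dataset gives the percentage difference in annual totals for that year instead of the long-term average.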

2. Comparison of BOM, AWAP, and SILO grid cells that correspond with each BOM and DFW gauged site (from Table 1) with the actual gauged data on a monthly, seasonal and annual basis using the Nash Sutcliffe Efficiency (NSE) (Eq. 2). The NSE gives an indication of the agreement between the observed and gridded data (see Eq. 2) with a NSE value of 1 indicating that the gridded data exactly matches the observed data (Chiew, 2006; Peel et al., 2000). Note only seasonal NSE results are shown in Sect. 4.
3. Comparison of the number of months with less than 1 mm (i.e. "no rain" months) recorded by each data product and comparison of the number of months (for each data product) that are greater than the gauged 99th percentile rainfall (for both BOM and DFW gauges). Note that months that have less than 1 mm of rainfall are recorded as "no rain" months (Bureau of Meteorology Climate Services, personal communication, 9 December 2011). The results of these analyses are shown in Sect. 4.
4. Comparison of the total rainfall (in mm) for each data product that corresponds to the number of BOM and DFW gauged "no rain" months and comparison of the total rainfall (in mm) for each data product that corresponds to the number of gauged months greater than the gauged 99th percentile rainfall. The results of these analyses are shown in Sect. 4.
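The comparison statistics listed above can be sketched in a few lines. The formulas used are the standard textbook definitions, which are assumed rather than confirmed to match the paper's Eqs. (1) and (2): RMSE is the square root of the mean squared difference between gridded and gauged values, and NSE is one minus the ratio of the sum of squared differences to the sum of squared deviations of the gauged values from their mean. Function names are hypothetical, and the series are assumed to be aligned in time with no missing values.

```python
import numpy as np

def rmse_percent_of_gauge_mean(gauged, gridded):
    """RMSE of the gridded series, as a percentage of the gauged mean."""
    gauged = np.asarray(gauged, dtype=float)
    gridded = np.asarray(gridded, dtype=float)
    rmse = np.sqrt(np.mean((gridded - gauged) ** 2))
    return 100.0 * rmse / gauged.mean()

def nash_sutcliffe_efficiency(gauged, gridded):
    """NSE; a value of 1 means the gridded series matches the gauge exactly."""
    gauged = np.asarray(gauged, dtype=float)
    gridded = np.asarray(gridded, dtype=float)
    squared_error = np.sum((gauged - gridded) ** 2)
    gauged_variation = np.sum((gauged - gauged.mean()) ** 2)
    return 1.0 - squared_error / gauged_variation

def count_no_rain_months(monthly_totals, threshold_mm=1.0):
    """Number of months with less than the threshold (1 mm in the text)."""
    return int(np.sum(np.asarray(monthly_totals, dtype=float) < threshold_mm))

def count_months_above_gauged_p99(gauged_monthly, product_monthly):
    """Number of months in a data product above the gauged 99th percentile."""
    p99 = np.percentile(np.asarray(gauged_monthly, dtype=float), 99)
    return int(np.sum(np.asarray(product_monthly, dtype=float) > p99))
```

Note that the 99th percentile is always taken from the gauged series, so the same threshold is applied to the gauge and to each gridded product.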
The above analyses compare the rainfall recorded at a single gauge with the rainfall produced, using either the BOM, AWAP or SILO process, for the grid cell within which the gauge sits. Some grid cells, however, encompass several rainfall gauges that have recorded data over the same period. This situation presented an opportunity to determine how the rainfall recorded at each gauge compares with the gridded rainfall produced, particularly given that the rainfall recorded at one gauge is likely to be different to the rainfall recorded at another gauge within the grid cell. To investigate this, three grid cells in SA which contain multiple BOM rainfall gauges and one grid cell that contains multiple DFW gauges (with overlapping records) were selected (see Fig. 3). Box and whisker plots of the annual rainfall for each gridded dataset and each gauge within the selected grid cells were then produced and assessed.

Hydrological modelling implications
To investigate the hydrological modelling implications of using gridded rainfall data as a surrogate for gauged data, a simple rainfall runoff model was developed. This modelling exercise aimed to highlight the sensitivities of hydrological modelling to changes in the rainfall input and therefore the potential issues associated with using gridded data for such an application. A daily rainfall-runoff model was developed using SIMHYD (i.e. the SIMple HYDrology model) for the upper Finniss River catchment in SA (Fig. 2). The catchment has an area of 192 km² and fits within 13 grid cells. SIMHYD is a daily rainfall-runoff model which uses daily rainfall and areal potential evapotranspiration data to estimate daily stream flow. The model estimates runoff generation from three sources: infiltration excess runoff, interflow (and saturation excess runoff) and baseflow through the optimisation of nine model parameters (refer to Peel et al., 2000, for further information on SIMHYD). The model was initially calibrated for the period 1970 to 1986 and verified from 1987 to 2002 using gauged daily rainfall data from BOM gauge 23808. A reasonable calibration was obtained, noting that natural streamflow data is particularly difficult to obtain for SA due to diversions, extractions and interbasin transfers (B. Murdoch, personal communication, 22 September 2010); the monthly NSE values achieved for the calibration and verification periods were 0.89 and 0.85 respectively. Following calibration and verification of the model and the resulting simulation of flow from 1970 to 2002 (i.e. the extent of the gauged daily rainfall data at gauge 23808) using gauged rainfall data, flow was also simulated for the period 1970 to 2009 using (i) AWAP daily gridded rainfall extracted at the gauge location, and (ii) SILO daily gridded rainfall extracted at the gauge location. The BOM gridded dataset was excluded from this analysis as daily data was not available. The SIMHYD model was then recalibrated using the AWAP daily gridded rainfall data and the resulting simulated flow compared with flows simulated using the gauged calibrated model. Monthly NSE values of 0.90 and 0.87 respectively were achieved for the calibration and verification of this model. The calibration process was repeated using SILO daily gridded rainfall, with resulting NSE values of 0.92 for the calibration period and 0.88 for the verification period.

Intercomparison of gridded rainfall datasets
To give an indication of the rainfall range across South Australia, Fig. 4a shows the annual average rainfall for the period 1900-2008 for the AWAP dataset. Figure 4a also provides context for Fig. 4b and c which show the percentage differences in annual average rainfall in SA for the period 1900 to 2008 between the SILO and AWAP and BOM and AWAP gridded datasets respectively. Figure 4b shows that the SILO dataset tends to have a lower annual average compared to AWAP for most of the State for the period 1900 to 2008 with differences generally in the order of −5 % to −20 %. The SILO dataset is noticeably drier than AWAP in the northern portion of the state, particularly the north-west. Comparing this result to the BOM gauge distribution seen in Fig. 1, it is evident that the areas where the differences between the SILO and AWAP datasets are greater coincide with areas of low gauge density. The BOM and AWAP datasets appear to be more similar with most of the differences ranging between 0 and −10 %, with the BOM dataset tending to be slightly drier for most of the state except for small regions in the north, east and west. Again, the greater differences appear to be in areas of low gauge density. Overall, Fig. 4b and c confirm that the three gridded datasets are indeed different for SA.
Fig. 4. (d) Percentage difference in annual rainfall totals between SILO and AWAP for 1900, 1930, 1960 and 1990. (e) Percentage difference in annual rainfall totals between BOM and AWAP for 1900, 1930, 1960 and 1990.
To explore the time variability of the results, the percentage differences in annual rainfall totals between SILO and AWAP (Fig. 4d) and BOM and AWAP (Fig. 4e) at four points in time (i.e. 1900, 1930, 1960 and 1990) were determined. These points in time were chosen as they are representative of different periods within the evolution of gauge density and distribution, illustrated by Fig. 5a, which indicates the spatial distribution of the BOM rainfall gauges in 1900, 1930, 1960 and 1990, and Fig. 5b, which shows changes in the number of BOM rainfall gauges in SA from 1900 to 2009.
The results clearly show that the differences between SILO and AWAP (Fig. 4d) and BOM and AWAP (Fig. 4e) vary across the four years. Reviewing the differences between SILO and AWAP (Fig. 4d) in relation to Fig. 5a and b, there does not appear to be an obvious reduction over time in the differences in areas where there has been an increase in the number of gauges (e.g. the southern third of the state). Similarly, Fig. 4d shows large differences that vary markedly over time between the SILO and AWAP datasets in the northern half of the state where there have been minimal changes to gauge density. This suggests that the differences between the AWAP and SILO datasets are mainly due to differences in the methodologies used to create the two datasets, rather than changes in gauge density and distribution. Nevertheless, it is still probable that changes to gauge quality and density contribute in some way to the observed differences between SILO and AWAP, and further investigation is required to definitively quantify this.
With regard to BOM and AWAP, it appears that the differences between the two datasets are greater in the northern half of the state (with differences ranging from less than −50 % to greater than 50 %) compared with the southern half of the state (differences between −10 % and 5 %), where the majority of gauges are located. Reviewing the differences between BOM and AWAP (Fig. 4e) in relation to Fig. 5a and b, it is clear that increases in gauge density are related to a decrease in the differences between the two datasets. This result suggests that the differences between the BOM and AWAP gridded datasets are related to both differences in the methodology used to develop the datasets and gauge density and distribution.
Figure 6a and b compare the three gridded datasets at five randomly selected ungauged points in SA (see Fig. 1 for point locations). Figure 6a shows the difference (in mm) between annual totals for each gridded dataset extracted at each ungauged location and Fig. 6b shows the differences between annual rainfall totals as a percentage of AWAP annual average rainfall for SILO and AWAP and BOM and AWAP. From Fig. 6 it can be seen that the differences across the three gridded datasets vary. The percentage differences between the three datasets at point 3 are relatively small whereas at other points the differences in annual rainfall are quite large, ranging between −60 % and 75 %. Referring to the locations of the five random points shown in Fig. 1, it is evident that point 3 is closer to more BOM gauges relative to the other four ungauged locations. This may be a reason for the lower differences between the three gridded datasets at this point. It should be noted that there is no way of telling which of the gridded datasets is most representative of the real rainfall data at each of these points, since the random points were deliberately chosen so as not to overlap with an observation gauge. It is clear that for the selected locations, the three gridded datasets (BOM, SILO, AWAP) rarely agree (i.e. the lines indicating the level of difference in Fig. 6a and b rarely coincide with zero). Importantly, there does not seem to be any systematic pattern to the disagreement (i.e. the differences appear to be random), though as mentioned gauge density could play a role. This raises the questions: what is the true rainfall timeseries at the chosen point since 1900? Which (if any) of the gridded datasets is a suitable representation of the observed data, which itself is an approximation of the actual climate conditions?

Annual rainfall totals
The RMSE for each gridded data product as a percentage of annual gauged data as well as the annual average rainfall for each gauge are presented as a percentage of the annual gauge mean in Table 2. Note that the analyses have been undertaken for the period over which the gauged data commences and ceases (indicated in Table 1 for each gauge) or up to 2008 (when the BOM gridded data ceases).
It is evident from Table 2 that for the BOM gauges, the RMSE values determined for SILO tend to be lower than those calculated for AWAP and BOM. For fourteen of the sixteen BOM gauges, the SILO dataset records the lowest annual RMSE. Indeed, ten of the sixteen BOM gauges have RMSE values of less than 5 % for SILO. Importantly, this result is not repeated for the DFW gauges, where either the AWAP or BOM dataset was found to have lower annual RMSE values than SILO at nine of the ten DFW gauges. Therefore, the differences between SILO and BOM gauged data are low (as expected because SILO is fitted to BOM gauged data; see Sect. 2.1); however, this does not appear to be the case at non-BOM gauges (i.e. gauges that SILO is not fitted to).

Annual rainfall in grid cells with multiple gauges
Figure 3 shows the location of the four SA grid cells and the gauges within each grid cell that were investigated. These grid cells were selected as they contain several rainfall gauges with overlapping records. Note that each grid cell was analysed over a different time period (indicated in Fig. 7) with the periods selected to maximise the number of overlapping gauged records. Figure 7 shows box and whisker plots of the annual rainfall recorded at each gauge and the corresponding gridded rainfall data for the same time period for the grid cell within which the gauges sit. Note that each box represents 50 % of the data and the median value is indicated by the line through the box. The lines that extend from the box represent the minimum and maximum values within the dataset that fall within an acceptable range (typically 1.5 times the width of the box). Circles (only seen in grid cell 4 in Fig. 7) represent values outside the acceptable range (i.e. outliers in the dataset). The large range in annual rainfall that occurs in a single grid cell is evident in Fig. 7, particularly in grid cell 1, where the maximum annual totals for the gauges range from 770 mm at gauge 23014 to 950 mm at gauge 23746. This highlights the short range spatial correlation of rainfall (e.g. Jeffrey et al., 2001) and the flaw in assuming that gridded rainfall is representative of rain everywhere within a given grid cell. Conversely, it could also suggest that gauged rainfall cannot adequately represent rainfall over a grid cell.
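The box and whisker conventions described above can be made concrete with a short sketch. It assumes the common convention in which the box spans the 25th to 75th percentiles and the acceptable range extends 1.5 box widths beyond each end of the box; the function name and the sample annual totals are hypothetical.

```python
import numpy as np

def box_whisker_summary(annual_totals, whisker=1.5):
    """Box, whisker and outlier values for a series of annual rainfall totals.

    Assumption: the box spans the 25th to 75th percentiles and whiskers
    reach the most extreme values within `whisker` box widths of the box.
    """
    values = np.sort(np.asarray(annual_totals, dtype=float))
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    box_width = q3 - q1
    lower_fence = q1 - whisker * box_width
    upper_fence = q3 + whisker * box_width
    within = values[(values >= lower_fence) & (values <= upper_fence)]
    outliers = values[(values < lower_fence) | (values > upper_fence)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": within.min(),
        "whisker_high": within.max(),
        "outliers": outliers.tolist(),
    }

# Hypothetical annual totals (mm) containing one unusually wet year.
summary = box_whisker_summary([300, 320, 340, 360, 380, 400, 420, 900])
```

In this example the 900 mm year lies beyond the upper end of the acceptable range and would be drawn as a circle, while the upper whisker stops at 420 mm.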
It is evident that the rainfall range of the gridded datasets does not fully encompass the gauged range (for both the BOM and DFW gauges) and therefore does not capture the highs and lows of the gauged rainfall data. It would appear, particularly for Grids 2 and 4, that the gridded datasets tend to capture the midpoint of the gauged data. This is perhaps no surprise with regard to the BOM and AWAP datasets given that the methods used to create these datasets aim to capture the areal average (Jones and Weymouth, 1997). Of particular interest is that although the SILO interpolation method is set to exactly interpolate the gauged data (Jeffrey et al., 2001), as evident from Fig. 7, it is obviously not possible for the SILO method to match the data exactly at all locations simultaneously, especially non-BOM gauges. Ultimately the results of this analysis indicate that the gridded datasets do not accurately capture the spatial variability within a grid cell or, in most cases, the gauged wet and dry extremes. This is further analysed in the following sections.

Seasonal rainfall totals
The seasonal NSE values calculated in the comparison of the gridded and gauged datasets at each BOM and DFW gauge selected are presented in Table 3. As with the annual analyses (discussed in Sect. 4.2.1), the seasonal analyses have been undertaken for the period over which the gauged data commences and ceases (indicated in Table 1 for each gauge) or up to 2008 (when the BOM gridded data ceases).
The results for the BOM gauges show that at the seasonal timescale SILO is a better match to gauged data compared to AWAP and BOM (a result consistent with the RMSE analysis of the BOM gauged annual rainfall totals presented in Sect. 4.2.1), with NSE values generally close, if not equal, to 1. This is generally expected given that SILO data is produced through exact interpolation of the observed data (Jeffrey et al., 2001). It is evident, however, that although the SILO data is "fitted" to the observed data, it is not an exact match. This is shown in the annual RMSE results as well as in the seasonal NSE results, particularly at the high elevation gauge, 23736. Another interesting result is that AWAP records higher NSE values than BOM during spring at most gauges, yet during summer higher NSE values are obtained for the BOM dataset. Interestingly, AWAP and BOM both perform very poorly in autumn and winter for gauges 17125 and 20050 respectively; however, there is no clear reason for the very low NSE values calculated.

In contrast to these results, the seasonal NSE values calculated in the comparison of the gridded datasets and DFW gauges are much more scattered, which is made obvious by the scattering of green cells (i.e. cells that indicate the highest NSE value determined for that gauge). It is evident that none of the gridded datasets consistently matches (i.e. records a high NSE value for) the DFW gauged data. Of interest is that for all seasons, the NSE values recorded for the SILO dataset for DFW gauges range from −0.30 to 0.96, whereas for the BOM gauges this range is 0.73 to 1.00, with the majority of NSE values around 1.00. This result supports the findings of Sect. 4.2.1 and suggests that the performance of SILO in mimicking observed data at non-BOM gauges is variable (i.e. the performance of SILO depends on which gauged data it is fitted to).
The under-representation of high elevation areas in gauging networks is a known cause of interpolation errors (Jeffrey et al., 2001; Beesley et al., 2009) and may explain the poor annual RMSE and seasonal NSE results obtained at high elevation gauge 23736. To investigate the performance of gridded data in relation to elevation, seasonal NSE values calculated for each gauge for each gridded dataset were plotted against gauge elevation. The results for each season are shown in Fig. 8, with the seasonal NSEs calculated for BOM gauges in the left column and NSEs determined for the DFW gauges in the right column. Initially referring only to the BOM gauges, it is evident that the values determined for SILO for all seasons congregate around an NSE value of 1, whereas the values determined for AWAP and BOM tend to be much more scattered. There is no obvious trend in NSE value and elevation; however, there is a clear outlier, which corresponds again to the highest gauge, 23736 (elevation 727 mAHD). All three datasets perform relatively poorly at this location in all seasons. However, it is unclear from this analysis whether the high elevation, and associated low gauge density, is responsible for the low NSE values at this site, or the fact that the gauge record ceases in 1956. In any case, it is clear that issues associated with temporal and spatial completeness (i.e. gauge density, e.g. Jones and Trewin, 2000) and the impact they have on the accuracy of gridded datasets need further investigation and quantification.
The increase in the range of NSE values obtained for SILO for the DFW gauges relative to the BOM gauge results (mentioned earlier) is clearly evident in Fig. 8. There is no obvious pattern in seasonal NSE value and elevation (e.g. an increase in elevation does not necessarily result in a decrease in NSE value as one might expect), with NSE values for all gridded datasets for the DFW gauges very scattered in all seasons. Of note, however, are the relatively high NSE values recorded for all seasons and all gridded datasets for the highest elevation DFW gauge (A5040552, 686 mAHD). This result may give weight to the suggestion that the low NSE values recorded for the highest elevation BOM gauge (23736) may be due to the fact that the gauge record ceases in 1956 rather than the high elevation. Although this is an interesting result, further assessment of high elevation gauges (of which SA has very few) is required to substantiate this observation. Nevertheless, the key result from this analysis is that there is no obvious seasonal pattern to the differences between the gridded and gauged datasets.

Monthly rainfall extremes
An important aspect required of gridded data is the ability to capture high and low rainfall extremes, since it is often the extremes that are of interest in hydrological, climatological, and agricultural studies. To explore this issue further, Table 4 indicates both the number of months with less than 1 mm (i.e. "no rainfall" months) recorded by each data product and the number of months (for each data product) that are greater than the gauged 99th percentile rainfall. Note that the gauged 99th percentile rainfall is provided in brackets in Table 4. To complement this information, Table 5 indicates the total accumulated rainfall (in mm) for each dataset that corresponds to the number of gauged "no rain" months, and the total rainfall (in mm) for each dataset that corresponds to the number of gauged months that are greater than the gauged 99th percentile rainfall. For example, Table 4 indicates that there are 187 recorded "no rain" months at BOM gauge 16031. The gauged rainfall accumulation of these 187 months is 13.7 mm (Table 5), whereas for the same months, the AWAP, BOM and SILO accumulations are 98.7 mm, 85.0 mm and 14.1 mm respectively. Note that where there is more than one green cell highlighted in Tables 4 and 5, it indicates that more than one of the gridded datasets was closest to the gauged result.
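The counts and accumulations described for Tables 4 and 5 can be sketched as follows. This is an illustrative outline only: the function names, the linear-interpolation percentile convention and the sample monthly values are assumptions, since the percentile method used in the study is not stated in this section.

```python
def percentile(values, q):
    """Percentile by linear interpolation between order statistics (assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def extremes_summary(gauged, gridded, dry_threshold=1.0, wet_q=99.0):
    """Compare dry and wet monthly extremes of a gridded series against a gauged series.

    Counts are made per product (as in Table 4); accumulations are summed over the
    months that are extreme in the *gauged* record (as in Table 5).
    """
    gauged_p99 = percentile(gauged, wet_q)
    dry_months = [i for i, g in enumerate(gauged) if g < dry_threshold]
    wet_months = [i for i, g in enumerate(gauged) if g > gauged_p99]
    return {
        "gauged_p99": gauged_p99,
        "n_dry_gauged": len(dry_months),
        "n_dry_gridded": sum(1 for v in gridded if v < dry_threshold),
        "n_wet_gauged": len(wet_months),
        "n_wet_gridded": sum(1 for v in gridded if v > gauged_p99),
        "dry_accum_gauged": sum(gauged[i] for i in dry_months),
        "dry_accum_gridded": sum(gridded[i] for i in dry_months),
        "wet_accum_gauged": sum(gauged[i] for i in wet_months),
        "wet_accum_gridded": sum(gridded[i] for i in wet_months),
    }


# Hypothetical monthly totals (mm) for a gauge and its grid cell.
gauged_monthly = [0.0, 0.5, 10.0, 20.0, 100.0]
gridded_monthly = [2.0, 0.8, 12.0, 18.0, 70.0]
summary = extremes_summary(gauged_monthly, gridded_monthly)
```

In this toy case the gridded series records fewer "no rain" months than the gauge, accumulates more rainfall in the gauged dry months and less in the gauged wet month, which is the qualitative pattern the tables are designed to expose.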
With regard to the "no rain" months, if the gridded dataset underestimates the number of gauged "no rain" months, it suggests that the interpolation process is not capturing the gauged dry periods. This appears to be the case for 13 of the 16 BOM gauges, where the number of "no rain" months recorded for the BOM and AWAP gridded datasets underestimates the number of gauged "no rain" months. SILO more closely matches the number of gauged "no rain" months, but there are still some marked differences. The implications of this are clear from Table 5, where it is shown that all gridded datasets (except for SILO at gauges 19001 and 23808) overestimate the level of rainfall recorded at each location during months observed to have "no rain". In some cases, the overestimation is substantial. For example, for gauge 16083, SILO overestimates the gauged rainfall recorded in "no rain" months by approximately 15 times, and BOM and AWAP overestimate by 49 and 56 times respectively. For gauge 17052 the overestimation is even higher for BOM and AWAP. BOM and AWAP record similar accumulations for all gauges except 18146, where BOM overestimates the gauged accumulations by approximately 100 times and AWAP by a relatively low 20 times. There is no obvious reason for this discrepancy.
The results for the DFW gauges again appear a lot more variable. At 7 of the 10 DFW gauges, the three gridded datasets reasonably match the gauged accumulation. However, for the remaining three gauges (i.e. A4260639, A5100516 and A5130505), the overestimation is substantial. For example, for DFW gauge A5130505, BOM overestimates the gauged "no rain" accumulation by approximately 400 times, and AWAP and SILO overestimate by approximately 435 and 470 times respectively. Overall, these results, particularly the BOM gauge results, indicate that the gridded datasets tend to overestimate the dry gauged periods and there does not appear to be any systematic pattern to the overestimations.
On the other end of the scale, if the gridded dataset underestimates the number of months greater than the gauged 99th percentile rainfall, it suggests that the interpolation method underestimates the wet periods. In general, the number of months greater than the gauged 99th percentile recorded for each gridded dataset is close to the number of gauged months (for both BOM and DFW gauges) greater than the gauged 99th percentile rainfall. However, when looking at the accumulated rainfall in these months, some key differences emerge. Table 5 indicates that for months where the BOM gauged rainfall was greater than the gauged 99th percentile, the SILO and BOM gridded datasets underestimate the gauged accumulated rainfall at 14 of the 16 gauges, and AWAP at 12 of the 16 gauges. Importantly, when compared to the annual average rainfall at each of the BOM gauges (Table 2), the underestimations are substantial.
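The "times overestimated" figures quoted for the "no rain" months are simple ratios of gridded to gauged accumulation. As a worked example, the sketch below applies that ratio to the accumulations quoted earlier for BOM gauge 16031 (13.7 mm gauged; 98.7 mm, 85.0 mm and 14.1 mm for AWAP, BOM and SILO). The function name is an illustrative assumption.

```python
def overestimation_factor(gridded_mm, gauged_mm):
    """Ratio of gridded to gauged accumulation over the same set of months."""
    if gauged_mm <= 0:
        raise ValueError("ratio is undefined when the gauged accumulation is zero")
    return gridded_mm / gauged_mm


# Accumulations over the 187 gauged "no rain" months at BOM gauge 16031 (Table 5).
gauged_accumulation = 13.7
gridded_accumulation = {"AWAP": 98.7, "BOM": 85.0, "SILO": 14.1}

factors = {
    name: round(overestimation_factor(value, gauged_accumulation), 1)
    for name, value in gridded_accumulation.items()
}
```

For this gauge the ratios are about 7.2 (AWAP), 6.2 (BOM) and 1.0 (SILO). Note that the ratio becomes very large, or undefined, when the gauged accumulation is close to zero, so the absolute difference in mm is worth reporting alongside it.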
Again the results for the DFW gauges are more variable, with all gridded datasets underestimating the gauged accumulation at 6 of the 10 gauges, but not necessarily the same gauges.For the remaining 4 gauges, the gridded datasets overestimate the gauged accumulations.
Ultimately, these results demonstrate that in addition to overestimating the amount of rainfall that occurs during dry (i.e. <1 mm monthly rainfall) conditions, the gridded datasets also tend to underestimate the amount of rainfall that occurs when it is extremely wet.
These results are in line with the findings of Beesley et al. (2009) who, in their review of daily AWAP and SILO error statistics, found that there is a negative/positive bias in the gridded datasets for higher/lower rainfall areas. Similar results were found by Silva et al. (2007) and Ensor and Robeson (2008) in their comparisons of gridded and gauged data in Brazil and Midwestern USA respectively, which indicates that the "smoothing out" of extreme rainfall events is a common issue with gridded datasets.

Implications of using gridded rainfall data as a surrogate for observed in hydrological modelling
Figure 9 shows a comparison of gauged flow (aggregated to an annual timestep) with the flow simulated using the SIMHYD model calibrated to BOM gauged rainfall (gauge 23808) and inputs from gauged rainfall, AWAP rainfall and SILO rainfall. Note that because the gauged rainfall ceases in December 2002, the flow simulated using gauged rainfall also ceases at the end of 2002 (whereas the flow simulated using AWAP and SILO as inputs ceases in 2009). It is evident from Fig. 9 that the flow simulated using the AWAP gridded rainfall data for the model calibrated to gauged (blue line) tends not to reach the high flow extremes of the observed gauged flow (purple line) or the flow simulated using gauged rainfall data (red line). This is particularly obvious during the period 1985 to 1990. On the other end of the scale, the low gauged flows tend to be overestimated by the AWAP simulated flow. The flow simulated using SILO data (green line) was a much closer match to observed gauged flow, which was to be expected given the results from Sect. 4, which show a closer agreement between SILO gridded rainfall data and BOM gauged rainfall. It should also be noted that from 1997 to 1999 the flow simulated using AWAP rainfall data for the model calibrated to gauged (blue line) is closer to the gauged flow than the flow simulated using gauged rainfall. This is a curious result that we cannot explain other than to speculate that this period, which coincided with extreme drought conditions across south-east Australia (e.g. Verdon-Kidd and Kiem, 2009; Kiem and Verdon-Kidd, 2010), may have been associated with diversions or extractions within the catchment that were not properly accounted for in the "naturalised" flow record or were not adequately represented or parameterised in the calibration period.
Also included in Fig. 9 is the flow simulated using AWAP rainfall and the SIMHYD model that was calibrated using AWAP rainfall (dashed orange line). The fact that this flow simulation is so different to the flow simulated with the same input data but a model calibrated on gauged data (blue line) reinforces the point made in Sect. 4 that gridded rainfall data is different to gauged data. The implications of this are stark given the large differences in flow simulations (blue line versus dashed orange line) that are totally dependent on whether the hydrological model is calibrated using gauged or gridded data, again highlighting the point that gridded data is sometimes quite different to gauged data. Similar conclusions can be made if AWAP is replaced with SILO, as shown in Fig. 9.
Annual, seasonal and monthly NSE statistics for (1) AWAP rainfall data compared with gauged rainfall data (BOM gauge number 23808), (2) flow simulated using AWAP data and the gauged flow data (A4260504) (for the model calibrated to gauged rainfall) and (3) flow simulated using AWAP data and gauged flow data (for the model calibrated to AWAP rainfall) for the period January 1970 to December 2002 are presented in Table 6. The results clearly demonstrate that although the NSE values for rainfall are relatively high, they markedly decrease in the flow comparison. Table 6 also shows that similar conclusions are obtained when the analysis is repeated to compare flow simulated using SILO data and the gauged flow data (A4260504) (for the model calibrated to gauged rainfall) and flow simulated using SILO data and gauged flow data (for the model calibrated to SILO). The results of the analysis shown in Fig. 9 and Table 6 align with suggestions that a change (or error) in rainfall will lead to a greater change (or error) in streamflow (Chiew, 2006). SIMHYD, the model used in this analysis, is one of the most commonly used hydrological models in Australia (Chiew and Siriwardena, 2005). Although it has been used with gauged rainfall data in the past (e.g. Chiew and McMahon, 1994; Chiew et al., 1996), more recently it appears to be common practice to input catchment average rainfall into the model (e.g. Peel et al., 2000; Chiew et al., 2008; Viney et al., 2009). Indeed, there are many questions about the representativeness of point observations to estimate the mean areal rainfall and the spatial variability of the rainfall field (i.e. the variables of interest for hydrological modelling) and therefore the appropriateness of using gauged data as an input to hydrological models in Australia (Seed et al., 2000; Jordan and Seed, 2002). However, there are also suggestions that catchment average rainfall may not be the best option for streamflow prediction (Thyer et al., 2007). As an alternative, Thyer et al. (2007) suggest that inputting gauged rainfall from the most productive runoff generating area within the catchment of interest will produce better streamflow predictions than catchment average rainfall. This suggestion fits with the results shown in Sect. 4.2.2, where it is evident that the processes used to create the gridded datasets (i.e. areal averages) do not capture the large rainfall range present within grids that contain multiple rainfall gauges.
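The suggestion that a change (or error) in rainfall leads to a proportionally greater change (or error) in streamflow (Chiew, 2006) can be illustrated with a deliberately simple toy model. The sketch below is not SIMHYD and is not the modelling used in this study: `threshold_runoff` is a hypothetical single-threshold loss model, and the 400 mm loss and the rainfall values are assumed purely for illustration.

```python
def threshold_runoff(rain_mm, loss_mm=400.0):
    """Toy annual runoff: rainfall in excess of a fixed loss (hypothetical, not SIMHYD)."""
    return max(0.0, rain_mm - loss_mm)


def percent_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return 100.0 * (new - old) / old


# Assumed annual rainfall (mm) from two sources that differ by 10 %.
base_rain = 500.0
perturbed_rain = 550.0

rain_change = percent_change(perturbed_rain, base_rain)
flow_change = percent_change(
    threshold_runoff(perturbed_rain), threshold_runoff(base_rain)
)
```

In this toy case a 10 % difference in rainfall produces a 50 % difference in runoff, because the loss term absorbs a fixed amount regardless of the input. Real rainfall-runoff models are far more complex, but the same nonlinearity is one reason why relatively high rainfall NSE values can coincide with much lower flow NSE values.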
Ultimately, the point of the investigation presented here is to illustrate the differences between gridded and gauged data and to test the assumption that gridded data can be used as a proxy for observed point data. If it were the case that gauged data was well represented by gridded data then simulated flow should also be similar, regardless of which data source was used for calibration. Clearly this is not the case, hence implying that currently available gridded rainfall datasets may not be a suitable proxy for gauged rainfall data if used for hydrological modelling in SA unless the differences outlined in this study are accounted for.

Discussion
The results of this study have shown that the SILO, AWAP and BOM gridded datasets vary, sometimes significantly, from gauged rainfall datasets, and importantly often do not capture gauged extreme events. Dependent (BOM) and independent (DFW) gauges at different elevations and spatial scales were tested at different temporal scales (monthly, annual and seasonal), and the differences between the gridded datasets and between each gridded dataset and gauged observations do not appear to be systematic. SILO is a much better fit to the BOM gauged data, but this is to be expected as the method used to develop the SILO database involves a step that directly "fits" the gridded data to the gauged observations. On the other hand, the methods used to create the AWAP and BOM datasets aim to produce an accurate picture of the area average rainfall, not necessarily rainfall at individual points (Jones and Weymouth, 1997). Thus it is no surprise that the AWAP and BOM gridded datasets do not match the BOM gauged data as closely as the SILO dataset. Nor is it a surprise that SILO performance at non-BOM gauge locations is relatively poor. Importantly, even though the procedure used to create the SILO dataset is set to enforce exact interpolation of the observed data (Jeffrey et al., 2001), this does not mean that a SILO grid value can be identical to the rainfall at all sites within that grid cell, as shown by the results presented here (Fig. 7). There are also suggestions that exact interpolation can be misleading as it falsely assumes that there is no error in the observed data (e.g. error that results from measurement) (Hutchinson, 1993; Jeffrey et al., 2001). Furthermore, there is no simple way of assessing how accurately any gridded dataset captures the rainfall in ungauged areas.
Another point to consider is how the processes used to produce the gridded rainfall datasets account for a changing rainfall observation network and the potential biases this may introduce, as it has been suggested that the strongest control on interpolation analysis accuracy is gauge density (Jones and Trewin, 2000). In their error analysis of the AWAP dataset, Jones et al. (2009) note that there is little difference in the quality of the AWAP and BOM gridded datasets prior to the 1950s; however, after the 1950s, AWAP shows improvement, which coincides with an increase in rainfall observation gauges across Australia. More specifically, Fawcett et al. (2010) show that the BOM gridded rainfall dataset is subject to network driven inhomogeneities in Tasmania (i.e. the BOM grids show artificial changes in rainfall as the number of gauges providing data each month/year changes) but that the AWAP gridded dataset substantially reduces the inhomogeneities. Although the focus of Fawcett et al. (2010) was on western Tasmania, it is likely that similar issues apply in areas where there are few rainfall gauges or where significant changes in the gauge network occur over time. This is the case for SA where, as illustrated in Fig. 5, the gauge network has changed markedly over time and there are large ungauged areas. Future analysis should be focused on assessing inhomogeneities in the gridded data resulting from these network issues and quantifying the uncertainties that emerge.
The point of this investigation, however, is not necessarily to compare the performance of the gridded datasets in how they mimic gauged data or to compare them to each other, but rather to highlight that gridded data is interpolated gauged data and should be considered as such (and this should be made explicitly clear in studies that use this data), and that the implications of this should be considered (i.e. gridded data is not observed data and is not necessarily indicative of the real situation, particularly with respect to extremely wet and dry conditions). The results of the hydrological modelling (Sect. 5) demonstrate one implication of the issues associated with using the gridded data as a surrogate for observed (i.e. that a change, or error, in rainfall will lead to a greater change, or error, in streamflow; Chiew, 2006).
Of major concern is that gridded data is being used as a proxy for observed data in studies aiming to downscale and/or bias-correct climate model outputs with the expectation that this brings the climate model outputs closer to "reality". However, this is not the case at all; rather, such exercises just force the climate model outputs to more closely match the gridded data which, as demonstrated here, is sometimes significantly different to the observations (particularly with respect to extremes), which themselves may or may not be "real". Another issue is the use of gridded rainfall data in studies seeking to attribute patterns or trends to physical climate mechanisms: the attribution studies will be flawed if the data the patterns or trends are based on do not accurately reflect reality, especially in relation to climatic extremes. Ultimately, if key decisions are going to be based on the outputs of models that use interpolated data, an estimate, or at least an understanding, of the uncertainties relating to the assumptions made in the development of gridded data and how that gridded data compares with reality should be made (Jeffrey et al., 2001). There should be (a) error analysis between observed and gridded data undertaken as a matter of course, and (b) ensembles of gridded data surfaces with associated stochastic uncertainty quantification in both space (e.g. due to limited gauged information at the grid cell location) and time (e.g. due to variable quality and completeness of observed records in the past and the potential non-stationarity in the relationships between rainfall gauges within a grid cell).
www.hydrol-earth-syst-sci.net/16/1481/2012/ Hydrol. Earth Syst. Sci., 16, 1481-1499, 2012

Conclusions
The attractiveness of spatially and temporally complete data coverage provided by gridded data (such as BOM, SILO, AWAP) for use in hydroclimatological research, modelling and analysis is obvious. However, it must be remembered that despite the fact that the various gridded datasets are "based" on observed data, the spatial interpolation methods employed to produce the gridded data (a) will always introduce some artificiality and (b) make it difficult to verify the "realness" of the gridded data in areas or epochs with no or sparse observation gauges. This is not to say that gridded data should not be used; rather, the fact that gridded data will not always accurately represent "real" spatial and temporal variability should be acknowledged and the uncertainties associated with this should be quantified and accounted for in any study that uses "virtual" data (e.g. BOM, AWAP, SILO).
There is no such thing as bad data, just poor uncertainty quantification.

Fig. 1. Indication of all BOM rainfall gauges in SA (light blue dots), BOM gauges selected for analysis (large dark blue dots), DFW rainfall gauges in SA (pink squares), DFW gauges selected for analysis (large dark pink squares), and the random ungauged points (green star).

Fig. 5. (a) Indication of the number of rainfall gauges open in South Australia in 1900, 1930, 1960 and 1990. (b) Evolution of rainfall gauges in South Australia from 1900 to 2009.

Fig. 6. (a) Difference in annual rainfall totals between SILO and AWAP and BOM and AWAP datasets at five random locations (marked 1, 2, 3, 4 and 5) in SA. (b) Difference between annual rainfall totals as a percentage of AWAP annual average rainfall for SILO and AWAP and BOM and AWAP datasets at five random locations in SA.

Fig. 7. A comparison of annual BOM (red boxes), SILO (green boxes) and AWAP (blue boxes) gridded rainfall data and gauged (grey boxes) annual rainfall data in four grid cells in SA. Note that grid cells 1, 2, and 3 feature only BOM gauges and grid cell 4 includes only DFW gauges. The time period over which the analysis was undertaken is given in the figure.

Fig. 8. Comparison of seasonal NSE values and elevation for (a) Bureau of Meteorology (BOM) gauges and (b) Department for Water (DFW) gauges.

Figure 9 shows a comparison of gauged flow (aggregated to an annual timestep) with the flow simulated using the SIMHYD model calibrated to BOM gauged rainfall (gauge 23808) and inputs from gauged rainfall, AWAP rainfall and SILO rainfall. Note that because the gauged rainfall ceases in December 2002, the flow simulated using gauged rainfall also ceases at the end of 2002 (whereas the flow simulated

Fig. 9. Comparison of the gauged flow (A4260504) (purple line), flow simulated using gauged rainfall (23808) calibrated to gauged rainfall (red line), flow simulated using AWAP rainfall calibrated to gauged rainfall (blue line), flow simulated using SILO rainfall calibrated to gauged rainfall (green line), flow simulated using AWAP rainfall calibrated to AWAP rainfall (dashed orange line), flow simulated using SILO rainfall calibrated to SILO rainfall (grey dashed line).
• ) that is produced via spline interpolation analysis. Only open rainfall gauges were used in producing the BOM gridded dataset, which limited the amount of data used (Bureau of Meteorology National Climate Centre, personal communication, 4 April 2012).

Table 2. Annual average rainfall for each gauge and the corresponding annual Root Mean Square Error (RMSE) for each gridded data product in mm and (in brackets) as a percentage of annual gauged data. Gray highlighted cells indicate the closest match to gauged.

Table 4. Number of months with <1 mm rainfall ("no rain" months) and the number of months greater than the gauged 99th percentile rainfall (with the gauged 99th percentile rainfall given in brackets). Gray highlighted cells indicate the closest match to gauged.

Table 5. Rainfall accumulation during gauged "no rain" months and gauged months greater than the gauged 99th percentile. Gray highlighted cells indicate the closest match to gauged.

Table 6. NSE values for the comparison of AWAP rainfall (extracted at 23808) and gauged rainfall (23808) data, AWAP simulated flow (gauged calibration) and gauged flow (A4260504), AWAP simulated flow (AWAP calibration) and gauged flow (A4260504) for the period 1970 to 2002. The bottom three rows show results of the same comparisons but using SILO instead of AWAP.