Downscaling of surface moisture flux and precipitation in the Ebro Valley (Spain) using analogues and analogues followed by random forests and multiple linear regression

Ibarra-Berastegi, G.; Saénz, J.; Ezcurra, A.; Elías, A.; Diaz Argandoña, J.; Errasti, I.

doi:https://doi.org/10.5194/hess-15-1895-2011

Articles | Volume 15, issue 6


© Author(s) 2011. This work is distributed under
the Creative Commons Attribution 3.0 License.


Research article | 21 Jun 2011



Abstract. In this paper, reanalysis fields from the ECMWF have been statistically downscaled to predict surface moisture flux and daily precipitation from large-scale atmospheric fields at two observatories (Zaragoza and Tortosa, Ebro Valley, Spain) during the 1961–2001 period. Three types of downscaling models have been built: (i) analogues, (ii) analogues followed by random forests and (iii) analogues followed by multiple linear regression. The inputs consist of data (predictor fields) taken from the ERA-40 reanalysis. The predicted fields are precipitation and surface moisture flux as measured at the two observatories. To reduce the dimensionality of the problem, the ERA-40 fields have been decomposed using empirical orthogonal functions. The available daily data have been divided into two parts: a training period (1961–1996), used to find a group of about 300 analogues with which to build the downscaling model, and a test period (1997–2001), in which the models' performance has been assessed using independent data. In the case of surface moisture flux, the models based on analogues followed by random forests do not clearly outperform those built on analogues plus multiple linear regression, while simple averages calculated from the nearest analogues found in the training period yielded only slightly worse results. In the case of precipitation, the three types of model performed equally. These results suggest that most of the models' downscaling capabilities can be attributed to the analogues-calculation stage.
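The analogue-averaging stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each day's predictor fields have already been projected onto a few leading empirical orthogonal functions, so that every day is a short vector of principal-component scores, and that similarity is measured by Euclidean distance (the abstract does not state the metric). The function names `nearest_analogues` and `analogue_mean` and the toy data are hypothetical; the default of 300 analogues follows the "about 300" quoted in the abstract.

```python
import math


def nearest_analogues(train_pcs, target_pc, k):
    """Return the indices of the k training days closest to target_pc.

    Assumption: closeness is Euclidean distance in principal-component space.
    """
    distances = [
        (math.dist(pc, target_pc), idx) for idx, pc in enumerate(train_pcs)
    ]
    distances.sort()  # ascending distance; ties are broken by day index
    return [idx for _, idx in distances[:k]]


def analogue_mean(train_pcs, train_obs, target_pc, k=300):
    """Estimate the local variable as the mean observation on the analogue days."""
    idx = nearest_analogues(train_pcs, target_pc, k)
    return sum(train_obs[i] for i in idx) / len(idx)


# Toy, made-up data: four training days described by two PC scores each,
# with one observed value (e.g. daily precipitation) per day.
toy_pcs = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 1.0)]
toy_obs = [1.0, 2.0, 10.0, 3.0]

# Downscale one test day using its three nearest analogues.
toy_estimate = analogue_mean(toy_pcs, toy_obs, (0.1, 0.1), k=3)
```

In the two hybrid models named in the abstract, the observations on the analogue days would instead be used to fit a random forest or a multiple linear regression. The plain average shown here corresponds to the simplest of the three model types.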

Received: 28 Jan 2011 – Discussion started: 21 Feb 2011 – Revised: 09 Jun 2011 – Accepted: 10 Jun 2011 – Published: 21 Jun 2011