<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" specific-use="SMUR" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">HESSD</journal-id>
<journal-title-group>
<journal-title>Hydrology and Earth System Sciences Discussions</journal-title>
<abbrev-journal-title abbrev-type="publisher">HESSD</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">Hydrol. Earth Syst. Sci. Discuss.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">1812-2116</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/hess-2022-348</article-id>
<title-group>
<article-title>Machine Learning and Committee Models for Improving ECMWF Subseasonal to Seasonal (S2S) Precipitation Forecast</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Elbasheer</surname>
<given-names>Mohamed Elneel Elshaikh Eltayeb</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Corzo</surname>
<given-names>Gerald Augusto</given-names>
<ext-link>https://orcid.org/0000-0002-2773-7817</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Solomatine</surname>
<given-names>Dimitri</given-names>
<ext-link>https://orcid.org/0000-0003-2031-9871</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Varouchakis</surname>
<given-names>Emmanouil</given-names>
<ext-link>https://orcid.org/0000-0002-0023-3598</ext-link>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Hydroinformatics department, IHE Delft Institute for Water Education, Delft, Netherlands</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Technical University of Crete, Crete, Greece</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>These authors contributed equally to this work.</addr-line>
</aff>
<pub-date pub-type="epub">
<day>03</day>
<month>11</month>
<year>2022</year>
</pub-date>
<volume>2022</volume>
<fpage>1</fpage>
<lpage>37</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2022 Mohamed Elneel Elshaikh Eltayeb Elbasheer et al.</copyright-statement>
<copyright-year>2022</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://hess.copernicus.org/preprints/hess-2022-348/">This article is available from https://hess.copernicus.org/preprints/hess-2022-348/</self-uri>
<self-uri xlink:href="https://hess.copernicus.org/preprints/hess-2022-348/hess-2022-348.pdf">The full text article is available as a PDF file from https://hess.copernicus.org/preprints/hess-2022-348/hess-2022-348.pdf</self-uri>
<abstract>
<p>The European Centre for Medium-Range Weather Forecasts (ECMWF) provides subseasonal to seasonal (S2S) precipitation forecasts, which extend from two weeks to two months ahead. However, the accuracy of S2S precipitation forecasts remains limited, and research efforts and competitions have been launched to study how machine learning (ML) can be used to improve forecast performance. This research explores the use of ML techniques to improve the ECMWF S2S precipitation forecast, following the AI competition guidelines proposed by the S2S project and the World Meteorological Organisation (WMO). A baseline analysis of the ECMWF S2S precipitation hindcasts (2000&#x2013;2019) targeting three categories (above normal, near normal and below normal) was performed using the ranked probability skill score (RPSS) and the receiver operating characteristic (ROC) curve. A regional time series analysis was carried out to group similar (correlated) hydrometeorological time series variables, and three regions were selected based on their spatial and temporal correlations. The methodology first replicated the performance of the available ECMWF forecast data and used it as a reference for the experiments (baseline analysis). Two approaches were followed to build categorical classification correction models: (1) using ML and (2) using a committee model. The aim of both was to correct the categorical classifications (above normal, near normal and below normal) of the ECMWF S2S precipitation forecast. In the first approach, the ensemble mean was used as the input, and five ML techniques were trained and compared: k-nearest neighbours (k-NN), logistic regression (LR), artificial neural network multilayer perceptron (ANN-MLP), random forest (RF) and long short-term memory (LSTM).
We propose a gridded spatial and temporal correlation analysis (autocorrelation, cross-correlation and semivariogram) for input variable selection, which allows neighbours&#x2019; time series and their lags to be explored as inputs. The results of this analysis determined the final data sets used for training and validating the ML models. Total precipitation (tp) and two-metre temperature (t2m) time series at a resolution of 1.5 by 1.5 degrees were the main variables used; both were taken from the global ECMWF S2S real-time forecasts, the ECMWF S2S reforecasts/hindcasts and observation data from the National Oceanic and Atmospheric Administration (Climate Prediction Centre, CPC). The forecasting skill of the ML models was compared against the reference (ECMWF S2S precipitation hindcasts and climatology) using RPSS. The results from the first approach showed that LR and MLP were the best ML models in terms of RPSS values, and MLP achieved a positive RPSS value with respect to climatology. LSTM models performed similarly to MLP but had slightly lower scores overall. In the second approach, a committee model (CM) was used: instead of using one ECMWF hindcast (the ensemble mean), the problem is divided among many ANN-MLP models (one trained independently on each ensemble member), which are then combined in a smart ensemble model (trained with LR). Cross-validation and testing of the CMs showed positive RPSS values with respect to climatology, which can be interpreted as an improvement over the ECMWF forecast in the three climatological regions. In conclusion, the ML models of the first approach yield very little improvement, if any, whereas with a CM the RPSS values are all better than the reference forecast. This study was carried out only on random samples over three global regions; a more comprehensive study should be performed to explore the whole range of possibilities.</p>
</abstract>
<counts><page-count count="37"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>