Evaluation of Asian summer precipitation in different configurations of a high-resolution GCM at a range of decision-relevant spatial scales

High-resolution general circulation models (GCMs) can provide new insights into the simulated distribution of global precipitation. We evaluate how summer precipitation is represented over Asia in global simulations with a grid length of 14 km. Three simulations were performed: one with a convection parametrization, one with convection represented explicitly by the model’s dynamics, and a hybrid simulation with only shallow and mid-level convection parametrized. We evaluate the mean simulated precipitation and the diurnal cycle of the amount, frequency and intensity of the precipitation against satellite observations of precipitation from the Climate Prediction Center morphing method (CMORPH). We also compare the high-resolution simulations with coarser simulations that use parametrized convection. The simulated and observed precipitation is averaged over spatial scales defined by the hydrological catchment basins; these provide a natural spatial scale for performing decision-relevant analysis that is tied to the underlying regional physical geography. By selecting basins of different sizes, we evaluate the simulations as a function of the spatial scale. A new BAsin-Scale Model Assessment ToolkIt (BASMATI) is described, which facilitates this analysis. We find that there are strong wet biases (locally up to 72 mm day−1 at small spatial scales) in the mean precipitation over mountainous regions such as the Himalayas. The explicit convection simulation worsens existing wet and dry biases compared to the parametrized convection simulation. When the analysis is performed at different basin scales, the precipitation bias decreases as the spatial scales increase for all simulations; the lowest-resolution simulation has the smallest root mean squared error compared to CMORPH.
In the simulations, a positive mean precipitation bias over China is primarily found to be due to too frequent precipitation for the parametrized convection simulation, and too intense precipitation for the explicit convection simulation. The simulated diurnal cycle of precipitation is strongly affected by the representation of convection: parametrized convection produces a peak in precipitation too close to midday over land, whereas explicit convection produces a peak that is closer to the late afternoon peak seen in observations. At increasing spatial scale, the representation of the diurnal cycle in the explicit and hybrid convection simulations improves when compared to CMORPH; this is not true for any of the parametrized simulations. Some of the strengths and weaknesses of simulated precipitation in a high-resolution GCM are found: the diurnal cycle is improved at all spatial scales with convection parametrization disabled; the interaction of the flow with orography exacerbates

2019). Scale interactions between large-scale atmospheric modes of variability can be modelled without artificial boundaries introduced when a large-scale GCM provides boundary conditions for a regional climate model.
The aim of this study is to evaluate the precipitation produced by different configurations of a high-resolution GCM over Asia, by comparing simulations against satellite observations. We do this by varying the resolution and representation of convection in order to learn about how the precipitation is affected by these, and how the 14 km simulations perform. We assess both the mean seasonal precipitation and diurnal cycle. Using BASMATI, we carry out the analysis over a hierarchy of spatial scales defined by catchment basins. Our results evaluate the new capabilities offered by high-resolution models in representing precipitation and its diurnal cycle over Asia at different spatial scales. Additionally, our results should inform land surface and hydrological modelling, as the spatial distribution of precipitation and its diurnal cycle are key drivers of these, although evaluating such coupled setups is beyond the scope of this study.

CMORPH
To evaluate the representation of precipitation in the simulations, we use the Climate Prediction Center morphing method (CMORPH; Joyce et al., 2004) observational dataset. This dataset uses high-quality precipitation estimates from microwave satellite data, which are modified (morphed) by the information from geostationary infrared satellites by using a time-weighted linear interpolation to provide a higher temporal resolution. The morphing also allows the construction of a spatially and temporally complete precipitation dataset.
Due to the high spatial and temporal resolution (8 km and 30 min, respectively), CMORPH is well suited to analysis of the diurnal cycle of precipitation. For example, Dai et al. (2007) found that the use of infrared data to improve sampling does not significantly affect the mean precipitation amount, frequency or intensity. Furthermore, satellite products such as CMORPH that include infrared data have smaller biases in the phase of the diurnal cycle. They note that CMORPH has a wet bias over warm season land areas compared with the Global Precipitation Climatology Project (GPCP; Adler et al., 2003). They summarize by saying that the satellite products they evaluate: "capture much of the sub-daily variations in precipitation amount, frequency, and intensity, although quantitative differences in the diurnal phase and amplitude exist among the different products and with surface observations". They also note that the diurnal cycle in these products is biased towards convective precipitation instead of detecting total precipitation, which is due to the microwave frequency detecting larger hydrometeors and the infrared frequency picking up cold cloud tops. Our simulations do not distinguish between convective and total precipitation, so this bias must be borne in mind when we compare to CMORPH.
We use CMORPH to produce a 21 y climatology of JJA precipitation over Asia from 1998-2018. We produce climatologies of the mean precipitation and of the diurnal cycle of precipitation, against which we evaluate the representation of precipitation in the simulations. To ease comparison with the high-resolution simulations, we upscale the spatial resolution of CMORPH to match their resolution, using area-weighted interpolation. We note that, although we use the full available timeseries for CMORPH, a shorter duration of 4 y (2006-2009) produces very similar values of amount, frequency and intensity (Sect. 2.3) to the full timeseries (Figs. S1 and S2 in the Supplement). We therefore infer that our results are robust with respect to this choice of analysis period, because there is little difference between the mean precipitation in these two durations and we are interested in the average precipitation or its diurnal cycle over many seasons.

APHRODITE
Some results shown in the Supplement [...] 1988-2015. We describe this dataset here for completeness. As with CMORPH, we do not expect that the results are sensitive to the exact choice of the multi-year analysis period.

With increasing computing power, it is possible to run global simulations that have grid lengths of O(10 km), comparable to those of regional weather simulations run two decades ago (Golding, 2000) or regional climate models from one decade ago (Hurkmans et al., 2010). These high-resolution simulations are listed in Table 1 Williams et al., 2018). At these resolutions, it is insightful to run simulations both with and without parametrized convection (N1280-PC and N1280-EC respectively, see Table 1), as previous studies have shown that the convection scheme can have a drastic effect on the diurnal cycle, and the spatial and temporal variability of the precipitation (Klingaman et al., 2017; Martin et al., 2017). Additionally, a simulation with a hybrid representation of convection (N1280-HC) was run. These global simulations are performed for 4 y, allowing us to sample some interannual variability and to assess their representation of the mean precipitation and the diurnal cycle of precipitation over Asia. As shown in Figs. S1 and S2 in the Supplement, a 4 y timeseries of CMORPH observations is very similar to the full 21 y timeseries. From this, we infer that our 4 y simulations are long enough to provide a representative duration that we can compare against observations.

The N1280-PC simulation with parametrized convection is based on the standard Global Atmosphere (GA) 7.1 settings (Walters et al., 2019), which is the atmospheric component of HadGEM3-GC3.1 with some modifications to allow it to be used at a higher resolution. It broadly follows the HighResMIP protocol (Haarsma et al., 2016), and in particular uses the the different resolutions. Note that, even though it is coarser than the high-resolution simulations described above, the N512-PC simulation still has a resolution similar to what is currently considered "high resolution" for climate simulations (Haarsma et al., 2016).

Amount, frequency and intensity analysis
Following many other studies, we partition the precipitation into three measures: amount (A), frequency (F), and intensity (I).
This requires the use of a threshold, for which we use 0.1 mm h−1 in line with previous studies (e.g., Li et al., 2018). A is the thresholded precipitation over a particular period. It is very close to the mean over a particular period, although precipitation rates lower than the threshold do not contribute to it. F is the percentage of time for which the threshold is exceeded. I is a measure of precipitation intensity for events above the threshold. The three measures are related by:

$$A = F \times I,$$

with F expressed as a fraction of time.

Each measure of precipitation can be calculated over a sub-daily window, e.g. for each hour of all days during JJA as is done here for the simulations. Thus, the diurnal cycle of each measure can be computed at every grid cell, where the diurnal cycle is calculated by using the mean value of the measure over a particular time period. For example, for the simulations, the diurnal cycle for A is calculated as the mean amount over each 1 h window across the 24 h day. Furthermore, the diurnal cycle can be summarized by two quantities: the phase of the peak and the amplitude of the cycle. We use harmonic analysis to compute this information, as in e.g. Dai and Wang (1999), where these are given by the phase and amplitude of the first harmonic respectively. The phase information is converted to Local Solar Time (LST) using the longitude of each grid cell.
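The amount, frequency and intensity partition and the first-harmonic summary described above can be sketched as follows. This is a minimal illustration only, not the code used in the study: the function names, the choice of expressing F as a fraction, the strict ">" comparison against the threshold and the assumption that sample k of the diurnal cycle is valid at hour k are all ours.

```python
import cmath
import math

THRESHOLD = 0.1  # mm h-1, the threshold quoted in the text


def afi(rates, threshold=THRESHOLD):
    """Return (amount, frequency, intensity) for a sequence of rates in mm h-1.

    amount: mean rate with sub-threshold samples contributing zero
    frequency: fraction (0-1) of samples exceeding the threshold
    intensity: mean rate over the samples exceeding the threshold
    """
    wet = [r for r in rates if r > threshold]
    n = len(rates)
    amount = sum(wet) / n
    frequency = len(wet) / n
    intensity = sum(wet) / len(wet) if wet else 0.0
    return amount, frequency, intensity


def first_harmonic(cycle):
    """Amplitude and peak time (hours) of the first harmonic of a 24 h cycle.

    Assumes equally spaced samples, with sample k valid at hour 24 * k / n.
    """
    n = len(cycle)
    coeff = sum(v * cmath.exp(-2j * math.pi * k / n)
                for k, v in enumerate(cycle)) / n
    amplitude = 2 * abs(coeff)
    peak_hour = (-cmath.phase(coeff) * 24 / (2 * math.pi)) % 24
    return amplitude, peak_hour


def to_local_solar_time(hour_utc, lon_deg):
    """Shift a UTC hour to Local Solar Time using 15 degrees per hour."""
    return (hour_utc + lon_deg / 15.0) % 24
```

With F held as a fraction, the values returned by `afi` satisfy A = F × I exactly, and a cycle of the form `2 + cos(2π(k − 15)/24)` yields a first-harmonic amplitude of 1 and a peak at hour 15.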
In the diurnal cycle figures below (Figs. 6, 7 and 10), we partition the amplitude of the diurnal cycle into three categories: strong (top third), medium (middle third) and weak (bottom third). Each of these is defined relative to the amplitude of the diurnal cycle separately for each dataset, over the complete Asia analysis domain shown in Fig. 2.
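The three-way partition of amplitudes could be implemented along the following lines. This is a sketch under our own assumptions: the function name is hypothetical, and the tercile boundaries are taken from the sorted amplitudes of a single dataset, with ties at a boundary assigned to the stronger category.

```python
def classify_strength(amplitudes):
    """Label each amplitude 'weak', 'medium' or 'strong' by tercile.

    The boundaries are computed from the amplitudes passed in, so each
    dataset is classified relative to its own distribution.
    """
    ranked = sorted(amplitudes)
    n = len(ranked)
    lower = ranked[n // 3]        # first value of the middle third
    upper = ranked[(2 * n) // 3]  # first value of the top third
    labels = []
    for amplitude in amplitudes:
        if amplitude >= upper:
            labels.append("strong")
        elif amplitude >= lower:
            labels.append("medium")
        else:
            labels.append("weak")
    return labels
```

Because the boundaries are dataset-specific, a "strong" label in one dataset does not imply the same absolute amplitude as a "strong" label in another.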

HydroBASINS dataset
We use the HydroBASINS catchment basin dataset (Lehner, 2014), which is a subset of the Hydrological data and maps based on SHuttle Elevation Derivatives (HydroSHEDS) dataset (Lehner and Grill, 2013). HydroBASINS is based on a high-resolution (15 arc-second) digital elevation model (DEM), which is stored in a raster (gridded) data format. The basins are generated from the DEM using the scheme of Verdin and Verdin (1999), which works by calculating the steepest-descent direction from the DEM, and then calculating how many tributaries drain into a given grid cell. An area threshold of 100 km² is applied to delineate the basin boundaries. The resulting raster basins are then vectorized and stored in a vector format.
The basins are represented by 12 different levels, with the lowest levels representing the largest-scale features (Lehner and Grill, 2013), and with levels 1, 2 and 3 assigned manually. Level 1 distinguishes continents: there are 9 of these. We only use the Asian basins, which are denoted by a 4 at the top level using the Pfafstetter coding system (Verdin and Verdin, 1999). Level 2 splits each continent into nine large sub-units; level 3 splits these into the largest river basins. Beyond that, it follows the traditional Pfafstetter coding system, with minor modifications for islands, endorheic basins, coastal basins and sub-basin size consistency. There is, however, no guarantee that basins at the same level will have similar sizes.

BASMATI
To facilitate the basin-scale analysis, and to make similar analysis easier for other studies in the future, we developed the BAsin-Scale Model Assessment ToolkIt (BASMATI; available from https://github.com/markmuetz/basmati). BASMATI is written in Python 3 and uses some key libraries to interact with the underlying data: pandas, geopandas and rasterio.
BASMATI simplifies downloading and interacting with the HydroBASINS dataset (Sect. 2.4.1), which provides the underlying data about catchment basins.
BASMATI adds some key capabilities to the HydroBASINS dataset. As already stated (Sect. 2.4.1), the dataset is split into different levels, where the largest basins are at the lowest levels (e.g. the top level is level 1). However, basins at a given level are not all the same size. For example, basins at level 4 range in size from 4 to 1000000 km². As we want to compare basins that are of similar sizes, it is necessary to select basins from different levels that fall within a given size range. To that end, we implemented a simple area selection algorithm. This selects basins within a given size range, e.g. 2000-20000 km². It works by starting at the top level of basin size, and if the basin is larger than the upper size limit, splitting the basin into its sub-basins. It does this iteratively until all the basins are below the upper size limit. It then removes basins that are smaller than the lower limit, meaning that the top-level HydroBASINS region is not completely covered (see e.g. Fig. 4). However, most of the area is covered (at least 92 % of the total area); the variation in area between the different basin scales is small (Table 2).

Figure 1. Basin weights for the median-sized basin at each of the basin scales in Table 2 (rows) for each of the simulation resolutions used in this study (columns). The small, medium and large basins shown have areas of 5040, 54600 and 553000 km² respectively.
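The area selection algorithm described above can be sketched as a simple iterative descent through the basin hierarchy. This is an illustration under stated assumptions rather than BASMATI's implementation: the `Basin` class and `select_basins` function are hypothetical names, and we assume that a basin that is still too large at the deepest level is simply dropped, a case the text does not discuss.

```python
from dataclasses import dataclass, field


@dataclass
class Basin:
    """Hypothetical stand-in for a HydroBASINS record and its sub-basins."""
    name: str
    area_km2: float
    sub_basins: list = field(default_factory=list)


def select_basins(top_level, min_area, max_area):
    """Select basins with min_area <= area <= max_area across levels."""
    pending = list(top_level)
    below_upper = []
    while pending:
        basin = pending.pop()
        if basin.area_km2 > max_area:
            # Too large: replace the basin by its sub-basins (if any).
            pending.extend(basin.sub_basins)
        else:
            below_upper.append(basin)
    # Finally, discard basins smaller than the lower limit; this is
    # what leaves gaps in the spatial coverage.
    return [b for b in below_upper if b.area_km2 >= min_area]
```

The fraction of the top-level area that remains covered can then be estimated as the sum of the selected areas divided by the sum of the top-level areas.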
The HydroBASINS catchment basins are stored in the Environmental Systems Research Institute (ESRI) shapefiles (ESRI, 1998) vector format. The output from a GCM or other atmospheric model is typically on a latitude/longitude grid. To convert from a vector format to a gridded format, the basins must be rasterized. This produces a gridded field of weights for each grid cell that can be used to produce e.g. the mean precipitation over a given basin at any resolution. We do this by rasterizing the vector data onto a grid that is 10 times finer in both the latitudinal and longitudinal direction than the resolution for which we want to produce a weighted raster, and using this to produce weights (accurate to 1 %) for each resolution. The weights for a given basin are zero outside the basin and one inside, with a fractional value on the boundary. These weights are shown in Fig. 1 for all resolutions for the median-sized basin at each of the basin scales in Table 2.
It is clear that, at lower resolutions, the smallest basins cover substantially less than one grid cell (e.g. Fig. 1a). We choose to perform the analyses at these scales for the coarse resolutions because, even though each basin is poorly represented by any one grid cell, the statistical picture that is created by aggregating over many such basins should still be accurate. This choice is somewhat validated by the evaluations between CMORPH and the simulations shown in Figs. 5 and 8, which show that the different resolution simulations exhibit plausible relationships, even at the finest basin scale.
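The idea of deriving fractional weights from a raster that is 10 times finer in each direction can be sketched as below. This is not the BASMATI code: the function name is hypothetical, and the `inside(lat, lon)` predicate is an assumed stand-in for rasterizing the basin polygon (which would be done with a GIS library in practice). Each coarse cell is sampled at the centres of its 10 × 10 sub-cells, so the weights are multiples of 0.01, in line with the 1 % accuracy quoted above.

```python
def basin_weights(inside, lat_edges, lon_edges, factor=10):
    """Fraction of each grid cell covered by a basin.

    inside: callable (lat, lon) -> bool, True inside the basin
            (assumed stand-in for a polygon rasterization)
    lat_edges, lon_edges: cell boundaries of the target grid
    factor: number of sub-cells per cell in each direction
    """
    weights = []
    for i in range(len(lat_edges) - 1):
        dlat = (lat_edges[i + 1] - lat_edges[i]) / factor
        row = []
        for j in range(len(lon_edges) - 1):
            dlon = (lon_edges[j + 1] - lon_edges[j]) / factor
            hits = 0
            for a in range(factor):
                for b in range(factor):
                    # Sample at the centre of each fine sub-cell.
                    lat = lat_edges[i] + (a + 0.5) * dlat
                    lon = lon_edges[j] + (b + 0.5) * dlon
                    hits += bool(inside(lat, lon))
            row.append(hits / factor ** 2)
        weights.append(row)
    return weights
```

Sub-cells are counted equally here, which ignores the small variation of cell area with latitude inside a single coarse cell.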

Spatial averages over basins
We complete the analysis by computing spatial means over each of the basins. For JJA mean precipitation, we use an area-weighted mean over each basin, also using the basin weights:

$$P^{i} = \frac{\sum_{j=1}^{N} W_{\mathrm{area},j}\, W^{i}_{\mathrm{basin},j}\, P_{j}}{\sum_{j=1}^{N} W_{\mathrm{area},j}\, W^{i}_{\mathrm{basin},j}}.$$

Here, $P_{j}$ is the precipitation in the $j$th cell, and $P^{i}$ is the mean precipitation in the $i$th basin. The summation is over all $N$ grid cells, and the area weights are the same for all cells at a given latitude. The area weights for each grid cell, $W_{\mathrm{area},j}$, are calculated to ensure that grid cells further north, which will be smaller on a latitude-longitude grid, contribute less to the mean.
The basin weights for each basin $i$ and each grid cell $j$, $W^{i}_{\mathrm{basin},j}$, are as described above (Sect. 2.4.2). Producing the basin-weighted diurnal cycle requires producing a composite diurnal cycle over the weighted grid cells that comprise each basin. We do this by taking the spatial mean of the diurnal cycle in each grid cell, weighted by the basin weights and an area weighting as above. For example, for the basin-mean diurnal cycle of amount, $A^{i}(t)$, at each time $t$ over the day,

$$A^{i}(t) = \frac{\sum_{j=1}^{N} W_{\mathrm{area},j}\, W^{i}_{\mathrm{basin},j}\, A_{j}(t)}{\sum_{j=1}^{N} W_{\mathrm{area},j}\, W^{i}_{\mathrm{basin},j}},$$

where $A_{j}(t)$ is the amount at time $t$ in grid cell $j$. The diurnal cycles of frequency and intensity have the same equation, replacing A with F or I. From the diurnal cycle of A over a basin, the phase of the peak and the amplitude of the diurnal cycle are calculated using harmonic analysis as above (Sect. 2.3).
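The basin-weighted means can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, and we assume that the area weight of a cell on a regular latitude-longitude grid is proportional to the cosine of its latitude, which the text does not state explicitly.

```python
import math


def area_weights(lats_deg):
    """Relative cell areas on a regular lat-lon grid (assumed ~ cos(lat))."""
    return [math.cos(math.radians(lat)) for lat in lats_deg]


def basin_mean(values, basin_w, lats_deg):
    """Mean of values[i][j] weighted by area weight and basin weight.

    values, basin_w: nested lists indexed [lat index][lon index]
    lats_deg: cell-centre latitude for each row
    """
    w_area = area_weights(lats_deg)
    numerator = 0.0
    denominator = 0.0
    for i, row in enumerate(values):
        for j, value in enumerate(row):
            weight = w_area[i] * basin_w[i][j]
            numerator += weight * value
            denominator += weight
    return numerator / denominator


def basin_diurnal_cycle(cycles, basin_w, lats_deg):
    """Apply basin_mean at each time of day; cycles is indexed [t][i][j]."""
    return [basin_mean(values, basin_w, lats_deg) for values in cycles]
```

The same `basin_diurnal_cycle` call applies to amount, frequency or intensity, since only the gridded input changes.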

Basin-scale error statistics
To compare the simulations with the CMORPH observations, we use basin-level error metrics. For example, we use the root mean squared error (RMSE) to compare the mean JJA precipitation between CMORPH and each of the simulations, as defined by:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{\mathrm{basin}}} \sum_{i=1}^{N_{\mathrm{basin}}} \left(P^{i}_{\mathrm{sim}} - P^{i}_{\mathrm{obs}}\right)^{2}}.$$

Here, there are $N_{\mathrm{basin}}$ basins, and so the summation is over all basins. $P^{i}_{\mathrm{obs}}$ is the mean precipitation in basin $i$ in the observations, and $P^{i}_{\mathrm{sim}}$ is the mean precipitation in basin $i$ for the simulation. When comparing the phase $\phi$ of the peak of A, F or I between simulations and observations, it is necessary to take into account the fact that this is a circular quantity. That is, a phase in LST of 2300 should be 2 h away from 0100, not 22 h away. This is achieved by applying a base-24 circular difference operation, $\mathrm{circular\_diff}_{24}(x)$, to the difference between observations and simulations to calculate a circular RMSE:

$$\mathrm{RMSE}_{\mathrm{circ}} = \sqrt{\frac{1}{N_{\mathrm{basin}}} \sum_{i=1}^{N_{\mathrm{basin}}} \left[\mathrm{circular\_diff}_{24}\left(\phi^{i}_{\mathrm{sim}} - \phi^{i}_{\mathrm{obs}}\right)\right]^{2}}.$$

Basin can also be distinguished, where the diurnal cycle of precipitation exhibits a clear phase propagation (Li et al., 2020).
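The two basin-scale error metrics can be sketched as follows. The function names are ours, and the particular form of the base-24 circular difference (wrapping a difference in hours onto the interval from −12 to 12) is an assumption consistent with the 2300/0100 example above, not a definition taken from the text.

```python
import math


def rmse(sim, obs):
    """Root mean squared error over paired basin values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def circular_diff_24(x):
    """Wrap a difference in hours onto [-12, 12) (assumed definition)."""
    return (x + 12) % 24 - 12


def circular_rmse(phase_sim, phase_obs):
    """RMSE of phases in hours, accounting for the 24 h wrap-around."""
    squared = [circular_diff_24(s - o) ** 2
               for s, o in zip(phase_sim, phase_obs)]
    return math.sqrt(sum(squared) / len(squared))
```

For a single basin with a simulated peak at 2300 LST and an observed peak at 0100 LST, `circular_rmse([23], [1])` gives 2 h rather than 22 h.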

Mean precipitation
In Fig. 3, the mean JJA precipitation over Asia from observations and simulations is shown. In the following, we mainly discuss the features over land. From the observations, several prominent features are visible. There is a strong band of precipitation off the western coast of India, which is related to the ISM flow. Precipitation falls at a rate of 2-4 mm day−1 over southern India east of the Western Ghats mountains, with substantially higher rates over northern India. Both of these are consistent with other studies of the ISM (e.g., Mitra et al., 2013). Along the Himalayas, there are very high local maxima of precipitation of 12 mm day−1. This is due to the ISM flow interacting with the high orography and associated steep gradients of the Tibetan Plateau and the Himalayas (Fig. 2). Some scattered points with high precipitation rates over the body of the Tibetan Plateau are an artefact of the satellite retrieval method and due to the snow cover in this region (Joyce et al., 2004).

Over south-eastern China, high precipitation rates of 6-12 mm day−1 are seen. This is associated with the EASM, and the propagation of the Meiyu front, both of which are active during JJA. Further inland, there is a small secondary maximum at around 28°N, 102°E, which is related to the orography of the Tibetan Plateau on the north-western boundary of the Sichuan Basin.
mum in seasonal JJA precipitation is seen over and near to the Korean Peninsula. Over the Gobi and Taklamakan deserts, less than 1 mm day−1 occurs in JJA.
Over the ocean, the western tongue of the intertropical convergence zone (ITCZ) is clearly visible. Precipitation rates are high to the west of the Philippines, off the southern coast of Japan, and over a large area of the Bay of Bengal.
The simulations broadly reproduce the observed Asian summer precipitation distribution. The low precipitation rate over the deserts is well matched. The magnitude and extent of precipitation in north-eastern China is accurately reproduced, although N1280-EC produces some localized areas of higher precipitation and N1280-PC fails to produce the maxima over the Korean Peninsula. However, at lower latitudes there are substantial differences. Over China, the simulations produce too much precipitation, with maximum rain rates of over 12 mm day−1 over much larger areas than in the observations. Precipitation rates over India are much lower in the simulations than the observations, a known bias of the UM (Bush et al., 2015). N1280-EC in particular has very low rain rates over India (particularly north-eastern India), with no signal of the higher rates seen at 20°N in observations. An intriguing possibility is that this is due to a lack of Bay of Bengal depressions forming in N1280-EC, although we have not investigated this here. Clearly the lack of a convection parametrization affects the simulation to a large degree over this region. There are signs of a large, spurious maximum of precipitation over the Indian Ocean in all simulations, which has been linked to the dry bias over India in the UM (Bush et al., 2015) and also in multiple other GCMs (e.g., Bollasina and Ming, 2013; Levine et al., 2013). Over land, N1280-EC and N1280-HC closely resemble each other.
All simulations produce high rates of precipitation over the Himalayas, although N1280-PC produces a band of precipitation which is too wide. Both N1280-EC and N1280-HC produce maximum precipitation rates which are too high, particularly close to 100°E. This is consistent with Willetts et al. (2017), who found that shorter-duration explicit simulations with 12 km grid lengths produced excessive precipitation over the Himalayas.

Over the western Pacific, the N1280-PC simulation produces far too heavy precipitation over far too large an area, whereas the opposite is true for N1280-EC. N1280-HC has smaller precipitation biases against observations. All simulations produce heavier precipitation west of the Philippines, consistent with observations. All simulations produce less precipitation than observed over the Bay of Bengal.
Table 2), resembles Fig. 3a. As noted in Sect. 2.4.2, the basin-selection algorithm cannot pick basins in a given scale range that completely cover the Asian land area, hence there are gaps at each basin scale. As is clear from Figs. 4a-c, averaging over larger basin scales reduces the maximum precipitation rates at that scale, since high precipitation rates are averaged with lower rates. Note that, as N1280-EC and N1280-HC produce similar JJA precipitation patterns over land (Figs. 3 and S1 in the Supplement), N1280-HC is not shown here, but the minor differences are discussed below.
The small basin scale column provides some more details of what was seen in Fig. 3. Both simulations produce too little precipitation over India; this is particularly the case for N1280-EC at around 20°N. Both simulations produce too much precipitation on the Himalayas, with N1280-EC producing up to 72 mm day−1 more precipitation at 95°E than CMORPH. The basins with the maximum precipitation rates in N1280-PC and N1280-EC have rates that are respectively 7 and 11 times higher than CMORPH. These biases dominate the differences between the simulations and observations; the increased precipitation over south-eastern China in the simulations is present but is smaller in magnitude. Similarly, the differences in precipitation rates between the simulations and CMORPH in north-eastern China are barely visible, as the magnitude of the precipitation in this region is typically low to begin with in both observations and simulations (Figs. 3 and 4a, d, g and j). There are few differences between N1280-EC and N1280-HC (see Fig. S1 in the Supplement). Both simulations produce a dry bias over the Indochina Peninsula, although the spatial extent of this is larger for N1280-EC. Likewise, the dry bias over the Sichuan Basin in N1280-EC is larger than that in N1280-HC.
At the small basin scale, the HadGEM3-GC3.1 simulations at coarser resolutions (N512-PC, N216-PC and N96-PC) produce mean JJA precipitation over Asia that most closely resembles N1280-PC (not shown). The highest-resolution of these simulations, N512-PC, is most similar to N1280-PC, although it has slightly enhanced precipitation (1 mm day−1) over India, and slightly reduced precipitation (1-4 mm day−1) over China. N216-PC is less similar to N1280-PC than N512-PC. N216-PC produces less precipitation than N512-PC and N1280-PC over the Western Ghats on the west coast of India, presumably due to under-resolved orography, and produces more precipitation than N512-PC and N1280-PC over north-eastern China. N96-PC is least similar to N1280-PC. Again, there is less precipitation compared to N1280-PC over the Western Ghats, presumably
For medium and large basin scales, there are smaller discrepancies between the simulations and observations as precipitation is averaged over larger basin scales. This is partly because averaging over a larger area smooths out the signal of localized maxima in precipitation, as mentioned above. We also expect to extract useful information about how the model represents precipitation at different spatial scales by calculating statistics about the agreement between the simulations and observations. In (Table 1). For all simulations, the RMSE decreases as the basin scale increases.
We note that the N96-PC simulation performs best by these metrics. This is because N96-PC produces much lower precipitation maxima, and therefore is penalized less than the other five simulations for producing too much precipitation over e.g. the Himalayas. Indeed, the error increases with resolution for all spatial scales. The spread between the three N1280-PC ensemble members is typically smaller than the difference between that simulation and the other simulations. When the three high-resolution simulations are compared, N1280-EC and N1280-HC perform worse than N1280-PC, with N1280-HC performing slightly worse than N1280-EC. Again, this is due to excessive precipitation in the simulations with explicit deep convection.
This shows that disabling the convection parametrization has an important bearing on performance for climatological JJA precipitation.
We have performed an identical analysis comparing the simulations to APHRODITE (Sect. 2.1.2). The equivalent figure to the difference for this metric between the simulations and observations is larger than the differences between the observational products.
Figure 6 shows the diurnal cycle over Asia in the CMORPH observations. As also seen below in Sect. 4.2 for south-eastern China, the amount and intensity of precipitation are highly similar. For both of these, there is a marked difference between land and ocean, both in the phase and amplitude of the diurnal cycle: over land the amplitude is larger and the phase is later.
The phase over the ocean tends to be either early morning (0300-0700 LST) over the ITCZ and off the coast of Japan, or close to midday off the coast of China. There is an interesting phase delay off the east coast of India over the Bay of Bengal, as the phase of the diurnal cycle goes from 0800 to 1700 LST. This feature has been noted in previous studies (e.g., Yang and Slingo, 2001); it is thought to be related to the coupling between gravity waves and convection, leading to long-lived mesoscale convective systems (Houze Jr., 2004).
interesting phase delay from north to south, with a late-night peak in precipitation close to the Plateau top progressing to an early morning peak (0800 LST) further south.
The diurnal cycle field for intensity of precipitation is noisier. In general, late-night peaks in amount, frequency and intensity are co-located. We speculate that this could be due to the activity of mesoscale convective systems, which are associated with both convective and stratiform precipitation. The convective precipitation would likely affect the intensity of precipitation, whereas the combined convective and stratiform precipitation would affect the amount, frequency and intensity of precipitation.
For intensity, some areas show broadly similar signals as for amount and frequency, such as the phase delay over the Bay of Bengal, the pattern over the ocean and the phase delay on the southern flank of the Tibetan Plateau. However, some regions show substantial differences. At higher latitudes over land, the peak of intensity is 4-6 h later than the peaks of amount and frequency. This potentially indicates that convective precipitation is dominant later during the day in these regions.
We can compare the simulations to the observations (Fig. 6), noting again that the observations span 21 years, and the simulations span four years. This leads to the simulations being noisier than the observations. Before presenting a detailed analysis, we note that N1280-EC produces a more-realistic diurnal cycle in most regards than either N1280-PC or N1280-HC.

In N1280-EC, over land the phase of the diurnal cycle better matches the observations for both amount and frequency, and is marginally better for intensity. The land-sea contrast is better represented, for example off the coast of south-eastern China.
The similarity of the diurnal cycles of amount and frequency in N1280-EC is closer to that of the observations. Thus, in terms of producing a realistic diurnal cycle, the lack of parametrized convection is clearly an advantage.
Perhaps the most striking differences between the simulations and observations are for N1280-PC frequency and amount.
For frequency, this simulation shows a peak over almost all of Asia that is far too uniform, and too early, being close to local midday. It is well known that convection parametrization schemes respond to the peak in insolation forcing by producing convective precipitation (e.g., Yang and Slingo, 2001; Stirling and Stratton, 2012; Bechtold et al., 2014), which is certainly one of the reasons for this signal. Evidence for this is also seen in Bougeault and Geleyn (1989), who found that no precipitation was detected before local midday in a simulation with explicit convection, whereas in parametrized simulations too much rainfall is produced before midday. For intensity, the peak over almost all of Asia is close to local midnight. For N1280-PC, the peak in precipitation amount is also clearly biased to be too early, although it does show greater spatial variation than the frequency or intensity over this region.
Additionally, there are differences between amount and frequency, even though this is not seen in the observations. There are few differences between the different N1280-PC ensemble members (not shown). The largest difference is for the phase of precipitation amount over eastern India, although in general the phase and amplitude are very similar among ensemble members.

Figure 6. The diurnal cycle of the amount, frequency and intensity of precipitation over Asia for CMORPH and the three high-resolution simulations, showing the phase in LST in colour, and the amplitude by the opacity of the colour. The diurnal cycle is considered strong if its amplitude is in the top third of the Asia-wide amplitudes over the whole domain (including oceans), weak if it is in the bottom third and medium otherwise (Sect. 2.3). Note, the strength of the diurnal cycle is calculated separately for each dataset. Thus, it is possible to say that all datasets have a weak diurnal cycle for the amount of precipitation over the north-western region of the domain, but not that the magnitude of the amount is similar where the diurnal cycle is considered strong. An absolute comparison is done in Fig. 8.
For N1280-EC over land, the phase of the peaks for amount and frequency broadly agree, although this simulation appears to produce too little late night precipitation. Additionally, the phases for all three precipitation measures over the ocean match the observed phases quite closely, although they are noisier for the simulation. Likewise, the amount and frequency are fairly similar for this simulation, although the similarity is not as strong as it is for the observations. N1280-EC does capture some aspects of the phase of the intensity field in the observations, such as the late-night peaks over India and the Indochina Peninsula.
However, there are pronounced differences between N1280-EC and the observations. From the peak in amount of precipitation over India, the amplitude of the diurnal cycle is too weak. This is probably due to the dry bias in this region (Fig. 3c).
N1280-EC produces a peak in intensity of precipitation that is too noisy, which is probably related to the shorter duration of the simulation. The peak in intensity is also weak, particularly over the Tibetan Plateau. However, this is a region with known

Over the ocean, the N1280-PC simulation does not match the observations particularly closely, whereas the N1280-EC simulation performs better, with N1280-HC falling in between. All simulations show hints of a phase delay of the peak in amount and frequency of precipitation over the Bay of Bengal, indicating that the simulations might capture the important aspects of the coupling between convection and gravity waves. However, all show significant biases: N1280-PC is too early close to the coast of India, and N1280-EC and N1280-HC are too weak and too late. N1280-PC produces diurnal cycles which have substantial biases in phase and amplitude over all other areas of the ocean, for all three of amount, frequency and intensity. N1280-EC in general produces more realistic phases, and produces slightly more realistic peaks in amount and frequency over the western Pacific, and between the Philippines and the Indochina Peninsula. However, it is noisier than the observations, which is probably due to its shorter duration. The phase of the intensity of precipitation in N1280-EC does not match observations particularly closely, being stronger, too uniform, and in general occurring later in N1280-EC than the observations. N1280-HC is again closer to N1280-EC in amount, and N1280-PC in frequency and intensity. It has a particularly strong diurnal cycle near midnight for frequency over the western Pacific that is not evident in the observations, which is perhaps related to its wet bias at that location (Fig. 3).

Figure 7. Diurnal cycle of amount of precipitation over Asia averaged over different basin scales (columns) for CMORPH, N1280-PC, N1280-HC and N1280-EC (rows). As described in Sect. 2.3 and as in Fig. 10, a visual representation of the amplitude is given by its strength, dependent on whether it is strong, medium or weak. This is calculated separately for each dataset and scale (i.e. for each panel).

Diurnal cycle over catchment basins
As with the mean precipitation (Sect. 3.1), the phase and amplitude of the diurnal cycle can be averaged over catchment basins of different spatial scales (see Sect. 2.4.2 for details). Figure 7 shows this over Asia for CMORPH, N1280-PC, N1280-HC and N1280-EC over small, medium and large basin scales for the amount of precipitation. Figure 7a, the CMORPH observations, is similar to Fig. 6a, as the basins are small, and likewise for the simulations (Figs. 7d, g and j).

For the observations, averaging the diurnal cycle over larger basin scales yields useful information about the phase and amplitude of the diurnal cycle at different scales. This information is similar in spirit to that presented in Covey et al. (2016), although the method they used relied on vector averaging, whereas the method in this study uses direct averaging of the diurnal cycle at each grid point (Sect. 2.4.2). They recommend using their method to compare CMIP5 simulations with observations, to give a sense of how well the diurnal cycle is represented. However, whereas they average over all land and ocean grid points, we use a much finer-grained approach of averaging over catchment basins over land. This allows us to distinguish between the phase and amplitude of the diurnal cycle over specific regions, and allows us to compare simulations and observations as a function of spatial scale.
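The distinction between vector averaging and direct averaging can be sketched as follows. This is an illustration under stated assumptions, not the method of Sect. 2.4.2 or of Covey et al. (2016): each grid point is assumed to have a 24-value hourly composite in LST, vector averaging is taken to mean averaging the complex first (24 h) harmonics, direct averaging is taken to mean averaging the composites themselves and reading the phase as the hour of the maximum, and all grid points are weighted equally. The function names are hypothetical.

```python
import numpy as np

HOURS = np.arange(24)  # assumed: hourly composite bins in local solar time

def first_harmonic(cycles):
    """Complex 24 h harmonic per grid point; cycles has shape (points, 24)."""
    return 2.0 / 24.0 * np.sum(cycles * np.exp(2j * np.pi * HOURS / 24.0), axis=-1)

def vector_average_phase(cycles):
    """Average the harmonic vectors, then read off peak hour and amplitude."""
    mean_vector = first_harmonic(cycles).mean()
    peak_hour = (np.angle(mean_vector) * 24.0 / (2.0 * np.pi)) % 24.0
    return peak_hour, abs(mean_vector)

def direct_average_phase(cycles):
    """Average the composites themselves, then take the hour of the maximum."""
    mean_cycle = cycles.mean(axis=0)
    return float(HOURS[np.argmax(mean_cycle)]), mean_cycle

# Two grid points with sinusoidal cycles peaking at 1500 and 2100 LST.
cycles = np.stack([
    1.0 + np.cos(2 * np.pi * (HOURS - 15) / 24.0),
    1.0 + np.cos(2 * np.pi * (HOURS - 21) / 24.0),
])
phase_vector, amplitude_vector = vector_average_phase(cycles)
phase_direct, _ = direct_average_phase(cycles)
```

For purely sinusoidal cycles such as these, the two estimates of the phase coincide (1800 LST here), and the averaged amplitude is reduced because the two peaks are out of phase. The estimates can differ when the composite has a sharp, non-sinusoidal peak, since direct averaging retains the full shape of the cycle.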
For the observations, as the diurnal cycle is averaged over larger scales, certain information comes to the fore and at the same time location-specific detail is lost. For example, over northern India and Nepal, from Fig. 7a there is fine-scale detail in terms of the phase delay on the southern flank of the Tibetan Plateau, whereas this is no longer detectable over medium-sized basins, where it averages to around 0600 LST. Furthermore, comparing Figs. 7b and c, the detail of the late-night phase over southern India is clearly lost at the large scales. However, at the large scale, a continent-wide pattern emerges that is hard to discern at the finer scales: that over southern India on average the diurnal cycle is strong and occurs at 1900 LST. Thus, the large basin scale yields information about the overall behaviour of the diurnal cycle, which can be compared against its behaviour in simulations, similar to Covey et al. (2016). Additionally, averaging over larger spatial scales will reduce the noise in the shorter-duration simulations, and so should produce a fairer comparison between the simulations and the observations.

of N1280-EC (not shown). Some differences are evident at the smallest basin scales. The phase of the diurnal cycle is earlier over coastal south-eastern China, which means that N1280-HC matches CMORPH more closely. However, the phase is also earlier over the Indochina Peninsula, which means that it matches CMORPH less closely.
The N96-PC, N216-PC and N512-PC simulations all strongly resemble N1280-PC (not shown), suggesting that the convection parametrization scheme, and not model resolution, is responsible for producing the diurnal cycle in the simulations.

For the phase, the clear signal is that for amount, frequency and intensity N1280-EC performs best, with N1280-HC performing almost as well. Indeed, for amount of precipitation, all the simulations with parametrized convection perform worse as spatial scale increases to the large basin scale; only N1280-EC and N1280-HC improve as spatial scale increases. For frequency and intensity, all simulations improve as spatial scale increases, although again N1280-EC shows the most improvement (particularly for frequency). It is remarkable that the parametrized convection simulations span a range of resolutions from 14 to 180 km, yet their error statistics are very similar across all three measures of precipitation (particularly so for frequency).
Where there is some resolution sensitivity, the performance improves as resolution increases, e.g. for amount at all spatial scales.

The main difference between N1280-EC and N1280-HC is that the latter performs better for the amplitude of the intensity.
For the amount of precipitation, all simulations perform better at larger spatial scales. N1280-EC performs worst at the finer scales. The same is broadly true for intensity, although the improvement at larger scales is less pronounced. For frequency, all the parametrized simulations perform poorly.

In this section, we focus on south-eastern China (Fig. 2, black rectangle) for straightforward comparison with Li et al. (2018).
In Sect. 4.1, we analyse the amount, frequency and intensity of precipitation, and in Sect. 4.2 we analyse their diurnal cycles.

Amount, frequency and intensity of precipitation
The amount, frequency and intensity of precipitation are shown in Fig. 9 for JJA in south-eastern China. The threshold for the analysis is 0.1 mm h−1. The amount, averaged over one day, is very similar to the mean precipitation, although the thresholding means that the values are not identical. Thus, CMORPH amount, Fig. 9a, is effectively the same as Fig. 3a but for China instead of Asia, and likewise for the N1280-PC and the N1280-EC amount. The results shown here are directly comparable to those in Li et al. (2018), who analyse precipitation in two regional UM simulations against gauge-based observations. Their simulations

For the CMORPH amount, there are localized maxima near south-facing coasts, indicating that the moist EASM flow in JJA produces precipitation when it passes over land. These are linked to higher intensity precipitation near the coast (Fig. 9c).

There is a maximum near 23°N, 104°E, which appears to be related to particularly frequent precipitation. For amount, there is generally a decreasing gradient in precipitation going further inland, with local inland maxima and minima typically related to the orography.
Amount, frequency and intensity in CMORPH over south-eastern China are similar to Zhou et al. (2008), Figs. 2c, f and i, which show the Tropical Rainfall Measuring Mission (TRMM) satellite product. This is despite the fact that they use a different satellite product, a threshold of 0.2 mm h−1 (which will affect frequency and intensity), and a shorter time period of 2000-2004. The similarity indicates three things: that amount, frequency and intensity in CMORPH are broadly similar to those in TRMM; that a shorter time period is able to represent the mean of these precipitation measures; and that the quantitative values are affected by the threshold, but the qualitative conclusions are broadly the same for different thresholds.
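The effect of the threshold on the three measures can be made concrete with a minimal sketch. It assumes a commonly used convention: frequency is the fraction of hours with a rate above the threshold, intensity is the mean rate over those wet hours, and amount is their product. These definitions and the function name `precipitation_measures` are assumptions for illustration; the definitions actually used are those given in the methods section.

```python
import numpy as np

def precipitation_measures(hourly_rate, threshold=0.1):
    """Amount, frequency and intensity from hourly rates in mm/h.

    hourly_rate has shape (time, points). A 'wet' hour is assumed to be
    one with a rate strictly above the threshold.
    """
    rate = np.asarray(hourly_rate, dtype=float)
    wet = rate > threshold
    n_wet = wet.sum(axis=0)
    frequency = n_wet / rate.shape[0]            # fraction of wet hours
    wet_total = np.where(wet, rate, 0.0).sum(axis=0)
    # Mean rate over wet hours only; zero where no hour exceeds the threshold.
    intensity = np.divide(wet_total, n_wet,
                          out=np.zeros_like(wet_total), where=n_wet > 0)
    amount = frequency * intensity               # mean rate from wet hours, mm/h
    return amount, frequency, intensity

# Four hours at two grid points (values in mm/h, purely illustrative).
rates = np.array([[0.0, 0.05], [0.2, 0.05], [0.4, 1.0], [0.05, 0.0]])
amount, frequency, intensity = precipitation_measures(rates, threshold=0.1)
```

In this example the amount at the first grid point (0.15 mm h−1) is slightly lower than its unthresholded mean (0.1625 mm h−1), because the hour below the threshold is discarded. Raising the threshold to 0.2 mm h−1 lowers the frequency and raises the intensity at that point, while the amount changes less.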
From N1280-PC and N1280-EC amount, both simulations produce too much precipitation over the majority of south-eastern China. In the Sichuan Basin, both simulations produce amounts of precipitation that are too low, and this is pronounced in N1280-EC. This is due to less frequent precipitation in this region (Figs. 9e and h). The overestimation of precipitation in N1280-PC is due to too frequent precipitation, as can be seen by comparing Figs. 9b and e. However, in N1280-EC the frequency matches CMORPH more closely east of 104°E (notwithstanding the bias in the Sichuan Basin), and the bias in precipitation rates is due more to the intensity bias (Figs. 9c and i). West of 104°E, the wet bias in N1280-EC appears to be related to both frequency and intensity being too high. N1280-EC produces very intense precipitation over the sea, and there is a marked land-sea contrast that is not present in either CMORPH or N1280-PC.

and correspondingly the intensity is less in this study. This could be due to the method they used to turn the point gauge observations into a continuous field, as both studies use the same precipitation threshold. The similarity between the explicit and parametrized simulations in both studies is striking, despite the difference in resolution between the explicit simulation in this study and that in Li et al. (2018). In both studies, both the parametrized and explicit simulations clearly overestimate

Li et al. (2018) very closely. This is despite the differences in simulation design and duration, which demonstrates that this is a robust bias of the UM.

Figure 10. Diurnal cycle of amount, frequency and intensity of precipitation over south-eastern China. Layout as in Fig. 9. A visual representation of the amplitude of the diurnal cycle is given by the opacity. This is calculated separately for each dataset over the entire Asian domain (Sect. 2.3), not for the subregion shown here.

Diurnal cycle
As in Sect. 4.1, we focus on south-eastern China (Fig. 10) to investigate in detail the diurnal cycle of precipitation. In the observations, the phases of peak amount and frequency are again very similar, as in e.g. Zhou et al. (2008) and Li et al. (2018). The coastal region shows a peak in precipitation in the early evening for amount, frequency and intensity. The region of early-evening peak precipitation covers a larger area for amount and frequency. A phase delay is evident going south-west to north-east across the Sichuan Basin, with a peak in amount and frequency at 2300 LST in the south-west of the basin shifting to 1000 LST in the north-east. This has been observed in other studies (e.g., Li et al., 2018, 2020). Potential mechanisms include the interaction between the mean wind and the orography, and the steering-level winds at 700 hPa affecting the propagation of mesoscale convective systems. As in Yu et al. (2007), their Fig. 3, and Chen et al. (2010), their Fig. 2a, a phase delay is also evident along 28°N, from 100-120°E for the diurnal cycle of the amount of precipitation. At the western end, the phase is in the late evening to midnight and the diurnal cycle is strong. Further east, between 107-113°E, the amplitude weakens and the phase is around 0600-0900 LST. At the eastern end, the phase is around 1800 LST and the diurnal cycle is strong. There is a clear divide between the diurnal cycle over land and over ocean for all of amount, frequency and intensity, with oceanic precipitation peaking much closer to midday.
As in Sect. 3.3, N1280-PC produces a diurnal cycle of precipitation that is poorly matched with the observations across all three precipitation fields over both land and ocean. The frequency and intensity are both too uniform, and generally have the wrong phase. For amount, there is more spatial variation but the phase rarely matches that observed, being too close to midday.
A phase delay at 28°N is difficult to discern.
N1280-EC bears a stronger resemblance to the observations for amount and frequency, although it produces too much late-night precipitation near the coast. The contrast between land and ocean is closer to the observed contrast than it is for N1280-PC. There are some signs of a phase delay going south-west to north-east across the Sichuan Basin, particularly in the frequency field, but it is not as clear as in the observations. This could be because N1280-EC produces too little precipitation in the Sichuan Basin (Fig. 9). Again, there is little clear sign of a phase delay at 28°N.
We can again compare directly with Li et al. (2018), their Fig. 3. We note that they do not show the strength of the diurnal cycles. Comparing the observations, their diurnal cycles of amount and frequency of precipitation are similar, and so are those in this study. However, the phase of the peak is later over coastal China using CMORPH. This is consistent with Dai et al. (2007), who found that CMORPH had a delayed peak compared to gauge observations. Furthermore, the general patterns of amount, frequency and intensity are similar for both studies, although their fields are noisier (particularly frequency), which is to be expected given the shorter duration of their analysis.
Comparing the simulations to those of Li et al. (2018), the parametrized simulations produce diurnal cycles of amount, frequency and intensity that are very similar to each other (Figs. 10d, e and f here; their Figs. 3c, f and i). They see a slightly later peak in intensity in coastal China; however, this could be due to the particular year they have analysed or the boundary conditions provided by their coarser driving model. For the explicit simulations, more differences between this study and Li et al. (2018) are evident. Although both studies show a strong similarity between amount and frequency, the peak of the diurnal cycle of these often occurs later in this study over coastal regions than in Li et al. (2018). A similar comparison holds for intensity: there is a later peak in this study than in Li et al. (2018). Thus, even though both explicit simulations produce similar results for amount, frequency and intensity of precipitation (Sect. 4.1), this does not hold as strongly for their diurnal cycles. This could be due to the different resolutions, the shorter simulations or the boundary conditions they imposed on their regional model.
Extending the simulation length would permit further analysis. Extremes of precipitation are difficult to analyse over only four summer seasons; with a longer duration, more robust statistics on these could be generated, as in Schiemann et al. (2018).

A longer simulation would also allow a better characterization of the climatology by sampling more interannual variability.
However, previous work suggests that a longer simulation might not change the biases that we have identified, as the systematic errors which affect climate simulations develop after only a few days (Martin et al., 2010). The computational cost of such simulations would be high; however, the benefits would include an improved understanding of how various processes improve with increasing resolution, and could also lay the foundations of the next generation of climate models.

Summary and conclusions
We have compared the precipitation produced by new high-resolution GCM simulations against the observed precipitation from the CMORPH satellite dataset. The simulations were performed using the HadGEM3-GC3.1 Met Office Unified Model

Mean summer precipitation over Asia
We compared how the mean JJA precipitation over Asia was represented in the observations and simulations. We found that the simulations broadly reproduce the observed Asian summer precipitation distribution. N1280-PC exhibited substantial biases compared with CMORPH, producing too much precipitation over the Indian Ocean, too little precipitation over India, and too much precipitation over south-eastern China. This is similar to biases seen in the UM at coarser resolutions; for example, Bush et al. (2015) found similar biases over India at N96 resolution. The N1280-PC simulation produced a band of precipitation on the Himalayas, indicating that it was representing some aspects of the interaction between the monsoon flow and orography, although the band was too wide and the precipitation rates were too high. The N1280-EC simulation worsened the existing biases in the UM, producing very little precipitation over India, and maximum precipitation rates over the east end of the Himalayas that were far in excess of observations. The N1280-HC simulation performed similarly to N1280-EC over land.

Using the newly developed BAsin-Scale Model Assessment ToolkIt (BASMATI), we averaged the precipitation field over hydrological catchment basins. The basins were chosen so that they were within a given size range, ranging from 2000-20 000 km² to 200 000-2 000 000 km² for the small and large basin scales respectively. This allowed for the mean summer precipitation in the simulations to be compared against the observed precipitation from CMORPH as a function of spatial scale. We found that all simulations improved as the spatial scale of analysis was increased, and that the lowest resolution simulation (N96-PC) produced the smallest error statistics, due mainly to its lack of very high precipitation rates.
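The basin-scale comparison can be illustrated with a minimal sketch. This is not the BASMATI interface: the function names, the integer basin-label raster (with negative values marking cells outside any selected basin), the area weighting and the unweighted root mean squared error across basins are all assumptions made for illustration. Both fields are assumed to be on the same grid.

```python
import numpy as np

def basin_means(field, basin_ids, cell_area):
    """Area-weighted mean of a 2D field over each labelled basin.

    basin_ids holds one integer label per grid cell; cells with a
    negative label are treated as outside every selected basin.
    """
    means = {}
    for basin in np.unique(basin_ids[basin_ids >= 0]):
        mask = basin_ids == basin
        means[int(basin)] = float(
            np.average(field[mask], weights=cell_area[mask]))
    return means

def basin_rmse(sim, obs, basin_ids, cell_area):
    """RMSE between simulated and observed basin means at one basin scale."""
    sim_means = basin_means(sim, basin_ids, cell_area)
    obs_means = basin_means(obs, basin_ids, cell_area)
    diffs = np.array([sim_means[b] - obs_means[b] for b in sim_means])
    return float(np.sqrt(np.mean(diffs ** 2)))

# Tiny illustrative grid: two basins and one unassigned cell.
basin_ids = np.array([[0, 0], [1, -1]])
cell_area = np.ones((2, 2))
sim = np.array([[1.0, 3.0], [5.0, 9.0]])
obs = np.array([[1.0, 1.0], [2.0, 9.0]])
rmse = basin_rmse(sim, obs, basin_ids, cell_area)
```

Repeating the calculation with label rasters built from basins in progressively larger size ranges would give the error as a function of spatial scale. Averaging over larger basins smooths out localized extremes, which is one way to see why errors tend to decrease as the scale increases.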

Diurnal cycle of summer precipitation over Asia
Diurnal cycles of amount, frequency and intensity of precipitation were produced. We found that, for the summer diurnal cycle of precipitation over Asia, there were substantial differences between the observations and the simulations, with N1280-EC and N1280-HC generally performing far better than N1280-PC. For N1280-PC, the representation of the diurnal cycle is poor for all three precipitation measures. Over land across Asia, the peak in amount is too early at close to local midday, while the peak in frequency is too early at close to local midday and too uniform. The intensity is too late at local midnight and too uniform. For N1280-EC, the phase of the peak is more realistic for all three precipitation measures. However, the amplitude of the diurnal cycle is too weak over India, which we attributed to dry biases in this region. N1280-HC produced diurnal cycles that resemble both of the other simulations, yielding useful information about which aspect of the representation of convection is responsible for each of the diurnal cycles of amount, frequency and intensity. For the diurnal cycle of amount, N1280-HC closely resembles N1280-EC, indicating that the explicit deep convection is primarily responsible for producing this signal.
For frequency, N1280-HC resembles N1280-PC more closely, indicating that the shallow and mid-level parametrizations of convection are responsible for this signal. However, in N1280-HC there is a clear difference between the diurnal cycles of amount and frequency, whereas in observations these fields are very similar.

Using BASMATI, diurnal cycles of amount, frequency and intensity were compared against CMORPH as a function of scale. For phase, N1280-EC and N1280-HC perform best for all three precipitation measures. These are the only simulations that perform better at larger spatial scale for the amount of precipitation, which we attributed to their lack of convection parametrization scheme. For the other simulations, higher resolution slightly improves the phase of the amount of precipitation.
For the amplitude of amount, N1280-EC performs worst at small scales, but improves more rapidly than the others as scale is increased, so that at the largest scales all simulations perform similarly well. N1280-EC performs worst for intensity at all scales. The simulations which use a convection parametrization scheme at resolutions between N96 and N1280 (180 and 14 km grid length at 30°N respectively) perform almost identically and similarly poorly compared to CMORPH for frequency. This is consistent with the overestimation of precipitation frequency in parametrized simulations, as shown in Fig. 9 and e.g. Martin et al. (2017).

Summer precipitation over south-eastern China
Focusing on south-eastern China, the three precipitation measures in CMORPH matched a similar analysis using gauge data in Li et al. (2018). The similarity was greatest for amount, which is least sensitive to the choice of threshold, whereas the frequency and intensity were generally lower and higher than that in Li et al. (2018), respectively. Li et al. (2018) used a regional version of the UM with both parametrized and explicit convection, and the explicit convection simulation had a finer resolution than the resolution used here. Similar to Li et al. (2018), N1280-PC and N1280-EC overestimated the amount of precipitation, which we attributed to the overestimation of frequency for N1280-PC, and to the overestimation of intensity for N1280-EC, consistent with Li et al. (2018). The similarities are apparent despite the differences in setup between this study and Li et al. (2018), which indicates that these are robust biases of the UM.
In N1280-PC, the land-sea contrast of the phase in maximum precipitation is unrealistic, whereas it is more realistic in N1280-EC. N1280-EC produces phase and amplitude for diurnal cycles of amount and frequency that are closer to the observed values, although the intensity is too weak.