Coalescence of bacterial groups originating from urban runoffs and artificial infiltration systems among aquifer microbiomes

The invasion of aquifer microbial communities by aboveground microorganisms, a phenomenon known as community coalescence, is likely to be exacerbated in groundwaters fed by stormwater infiltration systems (SISs). Here, the incidence of this increased connectivity with upslope soils and impermeabilized surfaces was assessed through a metaanalysis of 16S rRNA gene libraries. Specifically, DNA sequences encoding 16S rRNA V5-V6 regions from free-living and attached aquifer bacteria (i.e., water and biofilm samples) were analysed upstream and downstream of a SIS and compared with those from bacterial communities from watershed runoffs and surface sediments from the SIS detention and infiltration basins. Significant bacterial transfers were inferred by the SourceTracker Bayesian approach, with 23 % to 57 % of the aquifer bacterial biofilms being composed of taxa from aboveground sediments and urban runoffs. Sediments from the detention basin were found more significant contributors of taxa involved in the buildup of these biofilms than soils from the infiltration basin. Inferred taxa among the coalesced biofilm community were predicted to be high in hydrocarbon degraders such as Sphingobium and Nocardia. The 16S rRNA-based bacterial community structure of the downstream-SIS aquifer waters showed lower coalescence with aboveground taxa (8 % to 38 %) than those of biofilms and higher numbers of taxa predicted to be involved in the N and S cycles. A DNA marker named tpm enabled the tracking of bacterial species from 24 genera including Pseudomonas, Aeromonas and Xanthomonas, among these communities. Several tpm sequence types were found to be shared between the aboveground and aquifer samples. Reads related to Pseudomonas were allocated to 50 species, of which 16 were found in the aquifer samples. Several of these aquifer species were found to be involved in denitrification but also hydrocarbon degradation (P. aeruginosa, P. putida and P. fluorescens). 
Some tpm sequence types allocated to P. umsongensis and P. chengduensis were found to be enriched among the tpm-harbouring bacteria of the aquifer biofilms and waters, respectively. Reads related to Aeromonas were allocated to 11 species, but only those from A. caviae were recovered both aboveground and in the aquifer samples. Some tpm sequence types of the X. axonopodis phytopathogen were recorded in higher proportions among the tpm-harbouring bacteria of the aquifer waters than in the aboveground samples. A significant coalescence of microbial communities from an urban watershed with those of an aquifer was thus observed, and recent aquifer biofilms were found to be significantly colonized by runoff-opportunistic taxa able to use urban C sources from aboveground compartments.

Published by Copernicus Publications on behalf of the European Geosciences Union. Y. Colin et al.: Urban runoff bacteria among recharge aquifers


1 Introduction
Urbanization exerts multiple pressures on natural habitats and particularly on aquatic environments (Konrad and Booth, 2005;McGrane, 2016;Mejía and Moglen, 2009). The densification of urban areas, combined with the conversion of agricultural and natural lands into urban land use, led to the replacement of vegetation and open fields by impervious urban structures (i.e., roads, rooftops, sidewalks and parking lots) (Barnes et al., 2001). These impervious structures reduce the infiltration capacity of soils. They also exacerbate the speed and volume of stormwater runoff that favour soil erosion and flooding events and affect adversely natural groundwater recharge processes (Booth, 1991;Shuster et al., 2005). Due to these consequences, stormwater infiltration systems (SISs) or managed aquifer recharge (MAR) systems have been developed over the last decades and are gaining more interest in developed countries (Pitt et al., 1999). Such practices reduce direct stormwater discharges to surface waters and alleviate water shortages (Barba et al., 2019;Dillon et al., 2008;Marsalek and Chocat, 2002). However, stormwater represents a major source of nonpoint pollution, and its infiltration into the ground may have adverse ecological and sanitary impacts (Chong et al., 2013;Pitt et al., 1999;Vezzaro and Mikkelsen, 2012).
The vadose zone of a SIS can act as a natural filter capturing pollutants (hydrocarbons and heavy metals) and microorganisms washed off by runoffs (e.g., Murphy and Ginn, 2000;Tedoldi et al., 2016). Nevertheless, the effectiveness of a SIS in preventing the migration of contaminants towards aquifers is not always optimal (Borchardt et al., 2007;Lapworth et al., 2012;Arnaud et al., 2015;Voisin et al., 2018). The filtering properties of a SIS are influenced by various abiotic factors such as the nature of the media (rocks, sand and other soil elements), the physical properties (e.g., granulometry, hydrophobicity index and organization) and the runoff water flow velocity (Lassabatere et al., 2006;Winiarski et al., 2013). These constraints will impact not only the water transit time from the top layers to the aquifer but also the biological properties of these systems including the plant cover, root systems, worm population and composition of microbiota (Barba et al., 2019;Bedell et al., 2013;Crites, 1985;Pigneret et al., 2016). The thickness of the vadose zone was found to be one of the key parameters explaining chemical transfers such as phosphate and organic-carbon sources (Voisin et al., 2018). The situation is much less clear regarding the microbiological communities that flow through these systems (e.g., Barba et al., 2019;Voisin et al., 2018).
According to the concept of microbial community coalescence conceptualized by Tikhonov (2016) and adapted to riverine networks by Mansour et al. (2018), urban aquifers fed by a SIS should harbour microbiota reflecting the coalescence (community assemblages and selective sorting) of aboveground microbial communities with those of the aquifer. Indeed, during rain events, microbial communities will be resuspended through runoff-driven surface erosion processes, favouring detachment of microorganisms from plant litter, waste, soil and other particles (Mansour et al., 2018). These resuspended communities will merge and generate novel assemblages. The resulting community will initially match the relative contributions of the various subwatersheds to the overall microbiological complexity of the assemblages (Mansour et al., 2018). The prevailing ecological constraints among the downward systems will then gradually drive this coalescence towards the most fit community structures. These resulting communities might be highly efficient at degrading urban pollutants trapped among a SIS but could also disturb the ecological equilibria of the connected and more sensitive systems like those of deep aquifers as suggested by Voisin et al. (2018).
This study explores the impact of a SIS with a thick vadose zone (>10 m) on the coalescence of urban runoff microbial communities in a connected aquifer. Two hypotheses were tested: (1) highly specialized taxa (often termed K-strategists; e.g., Vadstein et al., 2018) of an aquifer outcompete the intrusive community members of aboveground taxa, and (2) nutrient inputs from runoffs and pollutants drive changes among communities and favour environmental opportunists (often termed r-strategists; e.g., Vadstein et al., 2018). The targeted SIS is part of a long-term experimental site (http://www.graie.org/portail/dispositifsderecherche/othu/, last access: May 2020) that records both physicochemical and biological properties. This SIS is connected to the eastern aquifer of Lyon (France), which is fed by three low hydraulic conductivity corridors (10⁻⁵ to 10⁻⁸ m s⁻¹) separated by moraine hills (Foulquier et al., 2010). The average vadose-zone thickness of the SIS is 15 m, and the delay between a rainfall event and the impact on the aquifer waters was estimated at 86 ± 11 h (Voisin et al., 2018). A 16S rRNA gene metabarcoding dataset was assembled for this site to investigate bacterial community coalescence from the top compartments into the connected aquifer waters but also the biofilm communities developing on inert surfaces. This investigation was also built on the hypothesis that a less significant microbial community coalescence is expected in aquifer waters than in biofilms. This is supported by previous reports which suggested the occurrence of transient free-living bacteria among aquifers acting as a travelling seed bank (Griebler et al., 2014).
For these monitoring approaches, water grab samples were previously found to give access to snapshots of the diversity found within an aquifer (Voisin et al., 2018), whereas aquifer biofilms developing on artificial surfaces (clay beads) were shown to be more integrative and informative of the groundwater microbiological quality. Clay bead biofilms were found to capture the most abundant aquifer taxa and taxa that could not be detected from grab samples. However, some bacterial taxa were still not detectable by this approach because of their poor ability to colonize clay beads over short time periods. A field-based investigation was thus performed to further explore the relative contributions of a set of sources such as runoffs and urban soils to the observed biofilm assemblages recovered from an aquifer. A Bayesian methodology, named SourceTracker, was used to investigate community coalescence from 16S rRNA gene-based DNA metabarcoding datasets. Complementary datasets were then assembled from an additional DNA marker named tpm (encoding EC:2.1.1.67, which catalyzes the methylation of thiopurine drugs) (Favre-Bonté et al., 2005). This genetic marker enables finer taxonomic allocations down to the species level to explore the coalescence of a set of bacterial species and subspecies, including plant and human pathogens, within the aquifer microbial community.
2 Material and methods

2.1 Experimental site
The Chassieu urban catchment is located in the suburbs of Lyon (France). It has a surface area of 185 ha and hosts mainly industrial and commercial activities (i.e., wholesaling, recovery and waste management, metal surface treatment, car wash, and repair services). The imperviousness coefficient of the catchment area is approximately 75 %. No significant modifications impacting the urban watershed were recorded during the investigation. Stormwater and dry-weather flows from industrial activities are drained by a network separated from the sewer. This network transfers waters into the Django-R (Django Reinhardt) SIS, which is a part of the OTHU (Observatoire de Terrain en Hydrologie Urbaine) long-term experimental observatory dedicated to urban waters (http://www.graie.org/othu/, last access: May 2020). This SIS contains an open and dry detention basin (DB) (32 000 m³) built on a concrete slab, with edges impermeabilized by a thick plastic lining. This DB allows for a settling of coarse- and medium-size particles, resulting in sedimentary deposits which favour plant cover development. The DB water content is delivered within 24 h into an infiltration basin (IB) (61 000 m³), which favours the recharge of the connected aquifer (AQ). This infiltration basin had a vadose zone of about 11 m during the investigation, and its geology, hydrology, ecology and pollution levels had been previously investigated (e.g., Barraud et al., 2002; Le Coustumer and Barraud, 2007; El-Mufleh et al., 2014).
The Chassieu watershed, the Django-R SIS and the Lyon aquifer were investigated in this study (Fig. 1 and Table S1). Watershed (WS) runoff waters were collected from sampling points spread over the catchment (21 subwatersheds over three sampling periods, n = 64 samples). Sediments from the detention basin (hereafter DB) were recovered from a 50 cm² area covering the full sediment column down to the concrete slab (n = 20 samples). These sediments (or urban soils) often had an herbaceous plant cover and were sampled in four areas defined according to the hydrological forces prevailing in the basin (e.g., Marti et al., 2017; Sébastian et al., 2014). Infiltration basin (hereafter IB) soil samples were collected from three zones (the area receiving the inflow waters, the bottom area of the basin and an upper zone of the basin exposed to inflow waters only during heavy-rain events) (n = 5 samples per zone) at a 0-10 cm depth covering a surface of 50 cm². The aquifer samples were recovered from piezometers located upstream (up; in a zone of the aquifer not influenced by water recharge) and downstream (dw; in a zone of the aquifer influenced by water recharge) of the SIS of the Django-R site at a depth of 2 m below the water table (e.g., Barraud et al., 2002; Voisin et al., 2018) (Fig. 1). Groundwater samplings (n = 6; named AQ_wat) were performed with an immersed pump (PP36 inox, SDEC, Reignac-sur-Indre, France), used at a pumping rate of 6-8 L min⁻¹ and previously cleaned with 70 % ethanol. The first 50 L were used to rinse the sampling equipment and were subsequently discarded. The following 6 L were used for the microbiological analyses. The biofilm samples (AQ_bio) from the aquifer were recovered from clay beads incubated in the aquifer over 10 d using the piezometers described above (n = 6 samples). Clay beads were used as physical matrices to sample groundwater biofilms according to Voisin et al. (2016).
2.2 DNA extractions, 16S rRNA gene qPCR (quantitative polymerase chain reaction) assays and PCR product DNA sequencing
Approximately 600 mg of sediments or soils or up to 5 L of aquifer or runoff water samples filtered using 0.22 µm polycarbonate filters was used per DNA extraction. Total DNA was extracted from soils and sediments or filters using the FastDNA SPIN® Kit for Soil (MP Biomedicals, Illkirch, France). For clay bead biofilms, microbial cells were detached by shaking at 2500 rpm for 2 min in 10 mL of 0.8 % NaCl. These suspensions were then filtered, and their DNA contents were extracted as indicated above. Blank extractions were performed for both the soils and sediments and the filtered cells. DNA was quantified using a NanoDrop UV-Vis (ultraviolet-visible) spectrophotometer. Blank DNA extracts showed values below the detection limit. DNA extracts were visualized after electrophoresis at 6 V cm⁻¹ using a TBE buffer (89 mM tris, 89 mM boric acid and 2 mM ethylenediaminetetraacetic acid, EDTA; pH 8.0) through a 0.8 % (w/v; weight by volume) agarose gel and DNA staining with 0.4 mg mL⁻¹ ethidium bromide. A Gel Doc XR+ System (Bio-Rad, France) was used to observe the stained DNA and confirm their relative quantities (between 20 and 120 ng µL⁻¹; median value around 40 ng µL⁻¹) and qualities. DNA was kept at −80 °C and shipped on ice within 24 h to the DNA sequencing services when appropriate. Quantitative PCR assays were performed on the DNA extracts to estimate their relative content in 16S rRNA gene copies. These assays were performed on a Bio-Rad CFX96 real-time (RT) PCR instrument with Bio-Rad CFX Manager software, version 3.0 (Marnes-la-Coquette, France). The 16S rRNA gene primers 338F and 518R described by Park and Crowley (2006) were used, together with the Brilliant II qRT-PCR SYBR Green Low ROX Master Mix for SYBR Green qRT-PCR. The melting temperature was 60 °C.
Linearized plasmid DNA containing a 16S rRNA gene was used as a standard and obtained from Marti et al. (2017). Presence of inhibitors in the DNA extracts was checked by spiking a known amount of plasmids harbouring int2 (10⁷ copies of plasmid per microlitre) in the PCR mix. The number of cycles needed to get a PCR signal was compared with that of wells where only plasmid DNA harbouring int2 was added to the qPCR mix. When a high number of cycles was needed to observe a signal, a 5- or 10-fold dilution of the DNA extract was done, and another round of tests was performed to confirm the absence of PCR inhibitions. Each assay was triplicated on distinct DNA extracts, and technical triplicates were performed. The 16S rRNA gene qPCR datasets are presented in Fig. S1 in the Supplement. These assays confirmed the high number of bacterial cells per compartment (Fig. S1 and Table S2): (1) soils from the infiltration basin (IB) had a median content of 1.32 × 10¹¹ 16S rRNA gene copies per gram dry weight; (2) sediments from the detention basin (DB) had 1.83 × 10¹¹ 16S rRNA gene copies per gram dry weight; (3) the runoff waters (WS) had a median content of 4.75 × 10⁸ 16S rRNA gene copies per millilitre; (4) the aquifer waters (AQ_wat) had 3.10 × 10⁶ 16S rRNA gene copies per millilitre; and (5) the aquifer clay bead biofilms showed 1.35 × 10⁷ 16S rRNA gene copies per square centimetre.
Sequencing of V5-V6 16S rRNA gene (rrs) PCR products was performed by MR DNA sequencing services (Shallowater, Texas, USA) on an Illumina MiSeq v3. The PCR products were generated using DNA primers 799F (barcode + ACCMGGATTAGATACCCKG) and 1193R (CRTCCMCACCTTCCTC) reported by Beckers et al. (2016). PCR amplifications were performed using the HotStarTaq Plus Master Mix Kit (QIAGEN, USA) using the following temperature cycles: 94 °C for 3 min, followed by 28 cycles of 94 °C for 30 s, 53 °C for 40 s and 72 °C for 1 min, with a final elongation step at 72 °C for 5 min. PCR products and blank control samples were verified using a 2 % agarose gel and following the electrophoretic procedure described above. PCR products obtained from field samples showed sizes around 430 bp (base pair), but blanks did not show detectable and quantifiable PCR products. Dual-index adapters were ligated to the PCR fragments using the TruSeq® DNA Library Prep Kit, which also involved quality controls of the ligation step (Illumina, Paris, France). Illumina MiSeq DNA sequencings of the PCR products were paired-end and set up to obtain around 40 000 reads per sample.
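The 16S rRNA gene primers quoted above contain IUPAC ambiguity codes (M, K, R), so each one stands for a small pool of exact oligonucleotides. The following is a minimal sketch of how such a degenerate primer can be expanded. The function names are illustrative and not part of the study's pipeline, and the 1193R sequence is written without the hyphen, on the assumption that it is a line-break artifact.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of exact sequences encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n

def expand_primer(primer):
    """List every exact sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as quoted in the text (barcode omitted).
PRIMER_799F = "ACCMGGATTAGATACCCKG"
PRIMER_1193R = "CRTCCMCACCTTCCTC"  # hyphen assumed to be an extraction artifact

if __name__ == "__main__":
    for name, seq in (("799F", PRIMER_799F), ("1193R", PRIMER_1193R)):
        print(name, degeneracy(seq), expand_primer(seq))
```

Each of the two primers carries two twofold-degenerate positions and therefore expands to four exact sequences.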
The tpm DNA libraries were also sequenced by the Illumina MiSeq v3 technology but by the Biofidal DNA sequencing services (Vaulx-en-Velin, France). PCR products were generated using the following mix of degenerated PCR primers: ILMN-PTCF2 (5'-P5 adapter tag + universal primer + GTGCCGYTRTGYGGCAAGA-3') and ILMN-PTCR2m (5'-P7 adapter tag + universal primer + ATGAGBGCTGCCCTGTCRTA-3') targeting conserved regions defined by Favre-Bonté et al. (2005). The universal primer was 5'-AGATGTGTATAAGAGACAG-3'. The P5 adapter tag was 5'-TCGTCGGCAGCGTC-3'. The P7 adapter tag was 5'-GTCTCGTGGGCTCGG-3'. PCR reactions were performed using the 5X Hot BIOAmp® master mix (Biofidal, France) containing 12.5 mM MgCl₂ and 10 % DMSO (dimethyl sulfoxide), with 50 ng of sample DNA (final concentrations). PCR cycles were as follows: (1) a hot start at 94 °C for 5 min, (2) 35 cycles consisting of 94 °C for 30 s, 58 °C for 30 s and 72 °C for 30 s, and (3) a final extension of 5 min at 72 °C. The mix contained two optimized enzymes, the HOT FIREPol® DNA polymerase and a proofreading polymerase. This enzyme blend has both 5' → 3' exonuclease and 3' → 5' proofreading activities. This mix exhibits an increased fidelity (up to 5-fold) compared to a regular Taq polymerase. PCR products and blank control samples were verified using a 2 % agarose gel and following the electrophoretic procedure described above. PCR products obtained from field samples showed sizes around 320 bp, but blanks did not show detectable and quantifiable PCR products. Index and Illumina P5 or P7 DNA sequences were added by Biofidal through a PCR procedure using the same Hot BIOAmp® master mix and the above temperatures but limited to 15 PCR cycles. Indexed P5- and P7-tagged PCR products were purified using the SPRIselect procedure (Beckman Coulter, Roissy, France).
PCR products and blank control samples were verified using the QIAxcel DNA kit (QIAGEN, France), and band sizes around 400 bp were observed but not in the blank samples. Quantification of PCR products by the PicoGreen approach using the QuantiFluor dsDNA (double-stranded DNA) kit (Promega, France) and a Qubit® 2.0 Fluorometer (Thermo Fisher Scientific, France) was performed and showed low values among the blanks, which were at the limit of detection (around 0.07 ng µL⁻¹). Still, with tpm-harbouring bacteria being in low number among a bacterial community (about 2 %-3 %), these controls were run during the MiSeq DNA sequencing of the PCR products. Illumina MiSeq DNA sequencings of the tpm PCR products were paired-end and set up to obtain around 40 000 reads per sample. Blank samples generated low numbers of tpm reads (blank 1 = 24 reads; blank 2 = 3 reads, blank 4 = 1028 reads and blank 5 = 1 read), and these have been listed in Table S3. These reads mainly belonged to unknown species (86 %). However, reads from Pseudomonas fluorescens (from operational taxonomic units, OTUs, not found in the field samples), P. xanthomarina (17 reads over all blanks) and P. fragi (n = 3 reads over all blanks) were recovered but did not have any impact on the coalescence analysis. The 16S rRNA and tpm gene sequences reported in this work are available at the European Nucleotide Archive (https://www.ebi.ac.uk/ena, last access: May 2020).

2.3 Bioinformatic analyses
All paired-end MiSeq reads were processed using Mothur 1.40.4 by following a standard operating procedure (SOP) for MiSeq-based microbial community analysis (Schloss et al., 2009; Kozich et al., 2013), the so-called MiSeq SOP available at http://www.mothur.org/wiki/MiSeq_SOP (last access: May 2020). Due to the large number of sequences to be processed, the "cluster.split" command was used to assign sequences to OTUs. For the 16S rRNA (rrs) gene sequences, reads were filtered for length (>300 bp), quality score (mean ≥ 25), number of ambiguous bases (= 0) and length of homopolymer runs (<8) using the "trim.seqs" command in Mothur, and singletons were discarded. The 16S rRNA gene sequences passing these quality criteria were aligned to the SILVA reference alignment template (release 128; https://www.arb-silva.de, last access: May 2020). Unaligned sequences were removed. Chimeric sequences were identified using the "chimera.uchime" command and removed. Variability in the number of cleaned reads per sample was observed but not correlated with variations in the number of 16S rRNA gene sequences (Table S2). These variations were thus considered to be due to the DNA sequencing process. Therefore, a subsampled dataset (20 624 reads per sample, with the exclusion of samples with total reads below this threshold) was used to mitigate the artifact of sample library sizes. Operational taxonomic units (OTUs) were defined using a 97 % identity cutoff as recommended by several authors in order to collapse sequences into groups that reduce the incidence of sequence errors on the datasets (e.g., Eren et al., 2013; Johnson et al., 2019). It is to be noted that amplicon sequence variants (ASVs) could also be used to build contingency tables (e.g., Callahan et al., 2016; Karstens et al., 2019).
However, exact sequence variants can generate uncertainties when using 16S rRNA gene sequences because of variations among species and strains due to the presence of multiple copies per genome (Johnson et al., 2019). Figure S2 shows the OTU rarefaction curves for the full and the subsampled datasets. The subsampled dataset was used for all downstream analyses except those of the SourceTracker Bayesian approach (see below). OTUs were affiliated to taxonomic groups by comparison with the SILVA reference alignment template if a bootstrap value over 80 % was obtained. FAPROTAX (Functional Annotation of Prokaryotic Taxa; Louca et al., 2016) functional inferences were performed on the MACADAM (MetAboliC pAthway DAtabase for complex Microbial community function analysis) Explore website (http://macadam.toulouse.inra.fr/, last access: May 2020) according to Le Boulch et al. (2019). For the tpm gene sequences, chimeric sequences, primers and barcodes were removed, and the dataset was limited to sequences of a minimum length of 210 bp (average length = 215 bp). These reads were aligned against the tpm database (BD_TPM_Mar18_v1.unique_770seq). Unaligned sequences were removed. The number of sequences per sample was then subsampled (4636 sequences per sample, with the exclusion of samples with total reads below this threshold). Operational taxonomic units (OTUs) were defined at a 100 % identity cutoff. The BD_TPM_Mar18_v1.unique_770seq database (http://www.graie.org/othu/donnees, last access: May 2020) was used to classify the sequences using the Wang text-based Bayesian classifier (Wang et al., 2007) and a bootstrap value above 80 %. Local BLAST analyses were performed on the BD_TPM_Mar18_v1.unique_770seq database using the NCBI (National Center for Biotechnology Information) BLAST (Basic Local Alignment Search Tool, version X) program to check the quality of the taxonomic affiliations.
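The read-quality criteria listed for the 16S rRNA gene reads (length >300 bp, mean quality score ≥ 25, no ambiguous base, homopolymer runs <8) can be illustrated with a small stand-alone filter. This is only a sketch of the thresholds as stated in the text, not the Mothur "trim.seqs" implementation. The function names and the per-base quality list are assumptions made for the example.

```python
def longest_homopolymer(seq):
    """Length of the longest run of identical consecutive bases."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest

def passes_filters(seq, quals, min_len=300, min_mean_q=25,
                   max_ambig=0, homop_limit=8):
    """Apply the four criteria quoted in the text to one read.

    seq   : read sequence (string of A/C/G/T/N)
    quals : per-base Phred quality scores (list of numbers, same length)
    """
    seq = seq.upper()
    if len(seq) <= min_len:                       # length must exceed 300 bp
        return False
    if sum(quals) / len(quals) < min_mean_q:      # mean quality >= 25
        return False
    if seq.count("N") > max_ambig:                # no ambiguous base
        return False
    if longest_homopolymer(seq) >= homop_limit:   # homopolymer runs < 8
        return False
    return True

if __name__ == "__main__":
    good = "ACGT" * 80                            # 320 bp, no long runs
    print(passes_filters(good, [30] * len(good)))
```

Whether a boundary value (for example a homopolymer of exactly 8 bases) is accepted depends on the tool's own convention; the sketch follows the inequalities as written in the text.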

2.4 Statistical analyses
All statistical analyses were performed in R (v3.5.1). For the 16S rRNA gene sequences, alpha-diversity estimates were computed using the function "rarefy" from the "vegan" package (Oksanen et al., 2015). Richness (S_obs) was computed as the number of observed OTUs in each sample. The diversity within each individual sample was estimated using the nonparametric Shannon index. To estimate whether the origin of the samples influenced the alpha diversity, an ANOVA test with Tukey post hoc tests was performed. Shared and unique OTUs were depicted in Venn diagrams with the "limma" package (Ritchie et al., 2015). Concerning the beta-diversity analyses, a neighbour-joining tree was constructed with a maximum-likelihood approximation method using FastTree (Price et al., 2009). Weighted UniFrac distances were calculated for all pairwise OTU patterns according to Lozupone et al. (2011) and used in a principal coordinates analysis (PCoA) (Anderson and Willis, 2003). Permutation tests of distances (PERMANOVAs) (Anderson, 2001) were performed using the vegan package (Oksanen et al., 2015) to establish the significance of the observed groupings.
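To make the alpha-diversity quantities concrete, the sketch below subsamples an OTU count table to a fixed depth and computes the observed richness and a Shannon index. It is an illustration in Python, not the R/vegan code used in the study. It computes the plain plug-in Shannon index H' = −Σ p_i ln p_i rather than the nonparametric estimator mentioned in the text, and the function names and toy counts are assumptions.

```python
import math
import random

def subsample(counts, depth, seed=0):
    """Draw `depth` reads without replacement from an OTU -> count dict."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    drawn = random.Random(seed).sample(pool, depth)
    out = {}
    for otu in drawn:
        out[otu] = out.get(otu, 0) + 1
    return out

def richness(counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)

def shannon(counts):
    """Plug-in Shannon index H' = -sum(p_i * ln p_i)."""
    total = sum(counts.values())
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h

if __name__ == "__main__":
    toy = {"otu1": 500, "otu2": 300, "otu3": 150, "otu4": 50}  # hypothetical
    sub = subsample(toy, depth=200)
    print(richness(sub), round(shannon(sub), 3))
```

Subsampling every sample to the same depth before computing these indices is what allows richness values from libraries of different sizes to be compared.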

2.5 Analyses of bacterial community coalescence
The SourceTracker computer package was used to investigate community coalescence. SourceTracker is a Bayesian approach built to estimate the most probable proportion of user-defined "source" DNA reads in a given "sink" community. In the present analysis, various scenarios of community coalescence were investigated such as the coalescence of bacterial taxa from the watershed runoff waters and sediments from the detention and infiltration basins with those of the downstream-SIS aquifer water samples or of recent biofilms developing on clay beads incubated in the aquifer. SourceTracker was run with the default parameters (rarefaction depth = 1000 reads from the original cleaned dataset of 16S rRNA gene reads; Fig. S2a; burn-in: 100, restart: 10) to identify sources explaining the OTU patterns observed among the aquifer samples (waters and clay bead biofilms, n = 12). Alpha values were tuned using cross-validation (alpha1 = 0.001 and alpha2 = 1). The relative standard deviation (RSD) based on three runs was used as a gauge to evaluate confidence on the computed values (Henry et al., 2016; McCarthy et al., 2017).
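The RSD criterion can be illustrated with a short calculation: for each source, the standard deviation of the proportion estimated over the three runs is divided by its mean. The sketch below assumes the sample (n − 1) standard deviation, and the run values are hypothetical numbers chosen for the example, not results from this study.

```python
from statistics import mean, stdev

def relative_sd(values):
    """Relative standard deviation (%) = 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        return float("nan")  # RSD is undefined when the mean is zero
    return 100.0 * stdev(values) / m

if __name__ == "__main__":
    # Hypothetical source proportions (%) from three independent runs.
    runs = {
        "upstream aquifer": [40.1, 39.5, 40.6],
        "detention basin": [8.2, 7.6, 8.4],
    }
    for source, props in runs.items():
        print(source, round(relative_sd(props), 1))
```

A low RSD indicates that the source proportion is stable from run to run; the cutoff above which an estimate is considered unreliable is a choice left to the analyst.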

3.1 16S rRNA V5-V6 gene sequence distribution biases and profiling
The analysis of the 16S rRNA V5-V6 gene libraries yielded 2 124 272 high-quality sequences distributed across 103 samples as described in Table S2. Subsampling-based normalization was applied (20 624 reads per sample), and sequences were distributed into 10 231 16S rRNA gene OTUs (with >97 % identity between sequences of an OTU). The rarefaction curves are shown in Fig. S2. At all sampling sites, bacterial communities were dominated by Proteobacteria, Bacteroidetes and Actinobacteria (WS = 95 % of total reads, DB = 84 %, IB = 71 %, AQ_bio = 99 %; AQ_wat = 59 %), but 10 other phyla with relative abundances greater than 0.5 % were also detected (Fig. 2a and Table S4). Alpha-diversity estimates showed that aquifer samples harboured a microbiome with a significantly lower richness (AQ_bio: S_obs = 278 ± 106 OTUs; AQ_wat: S_obs = 490 ± 333 OTUs) and a less diverse bacterial community (AQ_bio: H' = 2.9 ± 0.3 and AQ_wat: H' = 4.3 ± 0.7) than the ones of the upper compartments (WS: S_obs = 1288 ± 232 OTUs; DB: S_obs = 1566 ± 245 OTUs; IB: S_obs = 1503 ± 177 OTUs; and WS: H' = 5.0 ± 0.5; DB: H' = 5.4 ± 0.5; IB: H' = 5.7 ± 0.4) (ANOVA, p < 0.001) (Fig. 2b and Table S5). Among the surface samples, a greater diversity was observed among the soil samples from the infiltration basin than among samples of watershed runoff waters and sediments recovered from the detention basin (ANOVA, p < 0.05). In the aquifer, water grab samples were more diverse and showed higher 16S rRNA gene OTU contents than biofilms recovered from the clay beads incubated over a 10 d period (ANOVA, p < 0.05) (Fig. 2b and Table S5). The structure of bacterial communities inferred from V5-V6 16S rRNA gene sequences changed markedly along the watershed. A PCoA ordination of the OTU profiles based on weighted UniFrac distances showed samples to be clustered according to their compartment of origin (i.e., WS, DB, IB, AQ_bio and AQ_wat) (Fig. 3).
These changes in community structures between compartments were supported by PERMANOVA statistical tests (F = 20.7, p < 0.001). Bacterial communities per compartment were found to contain core and flexible (defined as not conserved between all sampling periods) bacterial taxa. Within the same compartment, similarities between bacterial community profiles ranged from 65 % (AQ_wat) to 82 % (IB), whereas similarities across compartments ranged from 48 % (DB vs. AQ_bio) to 66 % (DB vs. IB) (Fig. S3). Bacterial community profiles of the aquifer waters were found closer to the ones of the detention basin deposits (57 %) and soils of the infiltration basin (61 %) than those of the aquifer biofilms (48 % and 49 %, respectively). However, more than 89 % of the 16S rRNA gene OTUs (n = 8284) identified above the aquifer (WS, DB and IB) were not detected in groundwater samples (AQ_bio and AQ_wat) (Fig. S4). This large group of OTUs was made of minor taxa which accounted for 37 %, 44 % and 47 % of the total reads recovered from the WS, DB and IB samples, respectively.

3.2 Coalescence of surface and aquifer bacterial communities
3.2.1 Source-tracking analyses from the 16S rRNA gene sequences
A SourceTracker analysis was performed to estimate the coalescence of bacterial taxa inferred from V5-V6 16S rRNA gene reads from the watershed and SIS down into the aquifer waters and biofilm bacterial communities. This analysis indicated significant coalescence between the bacterial communities of the runoffs, the soils and sediments of the SIS, and the aquifer samples. The aquifer water microbial community upstream of the SIS was found to explain about 40 % of the downstream water microbial community (Table 1), while 16S rRNA gene reads from the runoff waters were found to explain about 5 %, and those of the DB were found to be around 8 % of the observed patterns (Table 1). The infiltration basin explained about 7 % of the observed diversity among the SIS-impacted aquifer water community. The aquifer biofilm bacterial communities were also found to be assemblages of communities from the surface environments. The origin of more than 94 % of the SIS-impacted aquifer biofilms could be explained by SourceTracker. Main sources of taxa were inferred to be the upstream aquifer waters (59 %), the sediments of the detention basin (22 %) and the runoff waters (8.5 %) (Table 1). Soils from the infiltration basin did not appear to have contributed substantially to the taxa recovered from these aquifer biofilms (<4 %) (Table 1). Aquifer biofilms recovered upstream of the SIS showed a high proportion of taxa related to those observed among the runoff waters (44 %) and the aquifer waters (49 %). This was not considered surprising because runoff infiltration can occur in several sites upstream of the SIS (although no direct relation with other SISs could be made so far).

16S rRNA gene-inferred bacterial taxa undergoing coalescence in the aquifer
To identify the bacterial taxa involved in the coalescence process, OTUs of the 16S rRNA gene dataset were allocated to taxonomic groups using the SILVA reference alignment template. These taxonomic allocations indicated that (1) 14 genera were only recorded in the aquifer samples, (2) 421 genera were only recorded in the upper-surface compartments of the watershed and (3) 219 were recorded among aboveground and aquifer compartments (Table S6). The following bacterial genera were exclusively associated with the aquifer bacterial communities: Turicella, Fritschea, Metachlamydia, Macrococcus, Anaerococcus, Finegoldia, Abiotrophia, Dialister, Leptospirillum, Omnitrophus, Campylobacter, Sulfurimonas, Haemophilus and Nitratireductor. These bacterial genera were recovered from all water samples, and five were also detected in biofilms (Table S6). These genera were associated with 926 16S rRNA gene OTUs that accounted for, respectively, 48 % and 1.8 % of the total reads recovered from aquifer waters and aquifer biofilms developing on clay beads. FAPROTAX functional inferences indicated some of these genera to be host-associated such as Fritschea, Metachlamydia, Finegoldia, Campylobacter and Haemophilus, with the latter two being well known to contain potential pathogens.
Campylobacter and Sulfurimonas cells have also been associated with nitrogen and sulfur respiration processes, and Leptospirillum has been associated with nitrification.
Regarding the bacterial taxa of the aboveground communities matching those of the aquifer samples, a total of 1021 16S rRNA gene OTUs were found to be shared between these compartments (Table 2 and Fig. S4). These OTUs consisted of abundant taxa, as they accounted for 9.7 %-39.4 % of the total reads for the samples recovered from the surface compartments and for 33.6 %-83.4 % and 95.0 %-99.4 % of the total reads of the water and biofilm aquifer samples, respectively (Table 2). The Beta- and Gammaproteobacteria dominated this group. It is noteworthy that aquifer samples collected upstream of the SIS shared fewer OTUs with the surface compartments (125 ± 41 OTUs) than samples under the influence of the infiltration system (332 ± 85 OTUs) (Table 2 and Fig. S4). The shared OTUs between aquifer samples and the upper compartments represented a higher fraction of bacterial communities in samples recovered downstream of the SIS (81.3 % ± 22.8 of total reads) compared to those collected upstream (68.9 % ± 30.9 of total reads) (Table 2). Reads from Pseudomonas, Nitrospira, Neisseria, Streptococcus and Flavobacterium were the most abundant (>1 %) of the shared OTUs recovered in the aquifer water samples, whereas those allocated to Pseudomonas, Duganella, Massilia, Nocardia, Flavobacterium, Aquabacterium, Novosphingobium, Sphingobium, Perlucidibaca and Meganema were the most abundant (>1 %) among the aquifer biofilms (Table S6). Most of these aquifer water taxa (except Streptococcus) were found to be involved in denitrification or nitrification as inferred from FAPROTAX. The biofilm taxa were most often associated with hydrocarbon degradation (Novosphingobium, Sphingobium and Nocardia) by FAPROTAX. Several of these biofilm bacterial genera were also found to contain potential human pathogens (Duganella, Massilia, Nocardia and Aquabacterium) according to FAPROTAX and published clinical records.
A set of 14 potentially hazardous bacterial genera was selected from Table S6 and used to illustrate the coalescence of bacterial taxa among the aquifer samples in Fig. 4. The 16S rRNA gene reads from Flavobacterium prevailed in all upper compartments (WS = 6.9 % of total reads, DB = 13.4 % and IB = 8.3 %) and made up a notable share of the reads in the connected aquifer (AQ_wat = 1.1 % and AQ_bio = 3.1 %) (Fig. 4b and Table S6). Pseudomonas 16S rRNA gene reads were in relatively lower numbers in the upper compartments (WS = 0.4 % of total reads, DB = 0.4 % and IB <0.05 %) than in the aquifer (AQ_wat = 8.4 % and AQ_bio = 35.5 %) (Fig. 4b and Table S6). Similar trends were observed for Nocardia and Neisseria OTUs (Fig. 4b). Notably, OTUs exclusively recovered from the upper compartments were mainly allocated to Gemmatimonas (0.2 %-1.6 % of total reads), Geodermatophilus (0.1 %-1.8 %) and Roseomonas (0.1 %-1.0 %) (Table S6).

Coalescence of Pseudomonas and other tpm-harbouring bacterial species
DNA sequences from tpm PCR products generated according to Favre-Bonté et al. (2005) allowed for a further exploration of the bacterial species undergoing a coalescence with the aquifer microbiome. A total of 19 129 tpm OTUs were recorded among the samples (from datasets resampled to reach 4636 reads per sample). As expected, these tpm reads were mainly assigned to Proteobacteria (WS = 92 % of total reads, DB = 87 %, IB = 76 %, AQ_wat = 83 % and AQ_bio = 85 %), but some reads could also be attributed to Bacteroidetes, Nitrospirae and Cyanobacteria (Table S7). These taxonomic allocations allowed for the identification of 24 bacterial genera and 91 species, whose distributions are summarized in Tables S7 and S8.

Table 1 (notes). (1) Reads from WS, DB, IB and aquifer waters from upstream of the SIS were considered as the sources of taxa for the aquifer samples downstream of the SIS; (2) reads from WS and the aquifer waters upstream of the SIS were considered as the sources of taxa for the aquifer biofilms recovered upstream of the SIS. SourceTracker was run three times using the 16S rRNA gene OTU contingency table and the default parameters. Relative contributions of the sources were averaged. Relative standard deviations (%RSD) are indicated and used as confidence values. RSD >100 % indicates low confidence of the estimated value. WS: watershed runoff waters; DB: detention basin sediments; IB: infiltration basin sediments; AQ_wat: aquifer waters. Sequences that could not be attributed to one of the tested sources were grouped under the term unknown.

Table 2 (notes; beginning of caption missing). [...] Fig. 1 for the sampling design), after a resampling of the reads set at 20 624 per sample. AQ_wat: aquifer waters; AQ_bio: aquifer clay beads biofilms; up: upstream of the SIS; dw: downstream of the SIS.
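The datasets were resampled to a fixed number of reads per sample (4636 for tpm, 20 624 for the 16S rRNA gene). The source does not state which tool or procedure was used, so the following is only a minimal sketch of one common approach, random subsampling of reads without replacement. The function name, the seed and the toy read counts are assumptions; the OTU labels are borrowed from the text purely as keys.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample a sample's reads to a fixed depth, without replacement.

    counts maps otu_id -> read_count for one sample (hypothetical layout).
    Returns a Counter of otu_id -> read_count whose values sum to depth.
    """
    total = sum(counts.values())
    if total < depth:
        raise ValueError("sample has fewer reads than the target depth")
    # Expand to one entry per read, then draw `depth` reads at random.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


# Toy read counts, invented for the example only.
sample = {"Otu00005": 3000, "Otu00024": 2000, "Otu00119": 1000, "Otu00066": 12}
rarefied = rarefy(sample, depth=4636)
print(sum(rarefied.values()))
```

Rare OTUs such as the 12-read entry above can drop out entirely after subsampling, which is one reason minor taxa may go undetected at a fixed depth.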
The tpm sequences were mainly allocated to Pseudomonas (WS = 36 % of the reads, DB = 27 %, IB = 7 %, AQ_wat = 51 % and AQ_bio = 48 %), Aeromonas (WS = 1 % of the reads, DB = 3 %, IB <0.05 %, AQ_wat = 0.07 % and AQ_bio <0.05 %), Xanthomonas (WS = 4 % of the reads, DB <0.05 %, IB = 1 %, AQ_wat = 8 % and AQ_bio <0.05 %), Herbaspirillum (WS = 11 % of the reads) and Nitrosomonas (DB = 4 % of the reads and IB = 0.2 %) (Table S8). Reads related to Pseudomonas were allocated to 50 species, including pollutant degraders (P. pseudoalcaligenes, P. aeruginosa, P. fragi, P. alcaligenes, P. putida and P. fluorescens), phytopathogens (P. syringae, P. viridiflava, P. stutzeri and P. marginalis) and human-opportunistic pathogens (P. aeruginosa, P. putida, P. stutzeri, P. mendocina and Stenotrophomonas acidaminiphila) (Table S9). It is to be noted that blank samples sequenced during the tpm metabarcoding assay revealed 23 Pseudomonas OTUs coming from the DNA extraction kit or generated during the PCR product Illumina sequencing process (Table S3). Only OTU00573 was found to be high in number (867 reads), but this contaminant did not have an impact on the coalescence analysis because of its absence in the underground datasets (Table S10). Other contaminant OTUs did not represent more than 10 times the ones observed in the field samples for identical OTUs, a criterion used to distinguish significant contaminants (Lukasik et al., 2017). In fact, only seven OTUs found among the blanks matched OTUs recovered from the environmental samples, and only two of these could be related to well-defined species, i.e., P. xanthomarina (17 reads among all blanks) and P. fragi (3 reads among all blanks). These reads matched a single OTU out of 11 allocated to P. xanthomarina in the environmental samples and 1 OTU out of 52 for P. fragi (Table S10). Reads related to Aeromonas were attributed to 11 species, but only those allocated to A. caviae could be recovered from the aquifer and aboveground compartments (Table S9). Reads related to Xanthomonas were allocated to nine species, but only those allocated to the X. axonopodis-X. campestris complex and to X. cannabis were recovered from the aquifer and upper compartments (Table S9). Regarding Pseudomonas, tpm reads allocated to P. jessenii, P. chlororaphis and P. resinovorans were restricted to the aquifer samples. Reads allocated to P. aeruginosa, P. anguilliseptica, P. chengduensis, P. extremaustralis, P. fluorescens, P. fragi, P. gessardii, P. koreensis, P. pseudoalcaligenes, P. putida, P. stutzeri, P. umsongensis and P. viridiflava were recovered from the aquifer and upper compartments (Table S9). FAPROTAX analysis indicated that a significant number of the species detected in the aquifer can be involved not only in denitrification (P. aeruginosa, P. fluorescens, P. putida, P. stutzeri, S. acidaminiphila, X. autotrophicus and P. chlororaphis) or nitrification (Nitrospira defluvii and Nitrosomonas oligotropha) but also in hydrocarbon degradation (P. aeruginosa, P. fluorescens and P. putida). Some of these species were also suggested by FAPROTAX to be human pathogens or invertebrate parasites (e.g., P. chlororaphis and P. aeruginosa).
The tpm OTUs (representative of infra-specific complexes) shared between the upper compartments and the aquifer were allocated to 14 species and 5 genera (Table 3 and Table S10). OTUs allocated to four of these species showed higher relative numbers of reads in the aquifer samples, in the following decreasing order: P. umsongensis (Otu00005), P. chengduensis (Otu00024), X. axonopodis-X. campestris (Otu00019 and Otu00878) and P. stutzeri (Otu00119 and Otu10066). These co-occurrences of OTUs between aboveground and aquifer samples support the hypothesis of significant coalescence between these bacterial communities. The other OTUs showed a higher number of reads among the top compartments. The OTU allocated to X. cannabis showed the highest relative number of reads of this group among runoff waters. The distribution pattern of this OTU suggested a relative decline when moving down the aquifer. The P. aeruginosa Otu00066 was recovered in the runoff waters and in biofilms developing on clay beads incubated in the aquifer.

Table 3. Relative distribution of tpm reads per OTU (mean ± SD) shared between the upper compartments and the aquifer that were allocated to well-defined species (1). [Table body not recovered; surviving row fragment: 3.74 ± 9.47 nd nd nd + + +] (1) All reads from tpm OTUs shared between the upper compartments and the aquifer were used to compute the relative abundances. (2) tpm sequences of the OTUs are shown in Table S8. WS: watershed runoff waters; DB: detention basin deposits; IB: soil of the infiltration basin; AQ_wat: aquifer waters; AQ_bio: aquifer biofilms. +: OTUs with a relative abundance <0.05 %. nd: not detected.

Discussion
The coalescence of bacterial taxa from runoff and stormwater infiltration systems (SISs) with those of a connected aquifer was investigated. Taxonomic and functional inferences were performed using 16S rRNA gene libraries. In addition, a genetic marker named tpm was used to track species and particular sequence types of Pseudomonas, Aeromonas and Xanthomonas (and a few other genera) from runoffs down into the SIS-impacted aquifer. Estimation of alpha-diversity indices from the 16S rRNA bacterial-community profiling measures indicated that groundwater samples (i.e., waters and biofilms) harboured a less diverse microbiome than those of the top compartments (i.e., WS, DB and IB). A 2- to 5-fold reduction in bacterial richness was observed from the surface compartments down into the aquifer. This result suggested that a high proportion of bacterial taxa carried by stormwater runoffs or thriving in the detention and infiltration basins were retained and/or eliminated by the vadose-zone filtration process. In line with this result, the estimation of the copy number of the bacterial 16S rRNA gene by qPCR revealed that bacterial biomass was much lower in aquifer than in runoff samples. In fact, more than 89 % of the 16S rRNA gene OTUs in the top compartments were not detected in the underground samples. This is in agreement with previous works which have shown that immobilization of microorganisms through porous media is high in the top soil layers and triggered by mechanical straining, sedimentation and adsorption (Kristian Stevik et al., 2004; Krone et al., 1958). Moreover, particles that accumulate as water passes through the soil can form a mat that enhances this straining process (Krone et al., 1958). Despite this filtering effect, infiltration induces significant changes in the diversity of groundwater bacterial communities. Both water and biofilm aquifer samples recovered downstream of the SIS had higher bacterial richness than those collected upstream.
It is to be noted that soils of the infiltration basin showed higher bacterial diversity than the sediments of the detention basin and the runoffs. This is most likely related to a development of plant-associated bacteria in this compartment. Indeed, the infiltration basin was covered by several plant species of Magnoliophyta like Rumex sp. which can disseminate rapidly through rhizomes and generate multiple ecological niches for bacteria. The SourceTracker Bayesian probabilistic approach based on 16S rRNA gene metabarcoding datasets was applied to refine our understanding of the coalescence of microbial communities from aboveground environments down into an aquifer. These inferences revealed variable levels of coalescence in the SIS recharge aquifer depending upon the investigated sink, i.e., waters or biofilms developing on clay beads incubated in the aquifer. Bacterial-community structures of the groundwater samples (upstream and downstream of the SIS) were significantly built from aboveground communities (e.g., those from runoff waters). However, the origin of a high proportion of the diversity observed among the aquifer waters downstream of the SIS remained undefined. This is likely related to the emergence of novel biomes among the vadose zone of a SIS fed with urban waters and pollutants. These biomes would have emerged from the buildup of novel biotopes during the construction and functioning of the SIS (see Winiarski, 2014, for review). The prevailing environmental constraints and pollutants would then have favoured minor taxa (not detectable by DNA metabarcoding approaches) from the aboveground compartments. In fact, chemical pollutants have been shown to be significantly washed off or transported with particles during rain events (El-Mufleh et al., 2014), and some of these were found to reach aquifers fed by SISs (Pinasseau et al., 2019). Among these pollutants, Bernardin-Souibgui et al. (2018) reported that urban sediments found in the detention basin of the experimental site were heavily polluted by polycyclic aromatic hydrocarbons (PAHs). Their contents were estimated at 197 ± 36 ng g−1 dw (dry weight) for light PAHs and at 955 ± 192 ng g−1 dw for heavy PAHs. PCBs (polychlorinated biphenyls) were also recorded for the seven congeners of the European norm for a total of 0.2 to 2.1 mg kg−1 dw (Sebastian et al., 2014). Metallic trace elements (MTEs) were recorded in significant amounts, with Cu being found at about 280 mg kg−1 dw, Pb at about 200 mg kg−1 dw, Zn at about 1600 mg kg−1 dw and Cd at about 5 mg kg−1 dw (Sebastian et al., 2014). MTEs, PCBs and PAHs were also recorded in the soils of the infiltration basin at similar concentrations, e.g., on average, at 0.26 mg PCBs kg−1 dw and at more than 940 ng g−1 dw for PAHs (Winiarski, 2014; Winiarski et al., 2006). These sediments and soils were also found to be contaminated by dioxins at about 36 ng g−1 dw (Winiarski, 2014) and by 4-nonylphenols and bisphenol A, at concentrations varying from 6 to 3400 ng g−1 dw. However, MTEs and nonpolar PAHs found among SISs are unlikely to reach groundwaters. To illustrate, Pb and Cd were not recorded at depths below 1.5 m in the non-saturated zone of SISs (Winiarski et al., 2006). In contrast, polar organic pollutants were found to be transferred into aquifers as shown for some pesticides and pharmaceuticals (Pinasseau et al., 2019). These chemical contaminants represent potential energy and carbon sources for microorganisms and can also be detrimental to the growth of some organisms. They can thus have significant impacts on the biology of the contaminated soils and sediments.
Functional inferences from the knowledge on bacterial genera suggested an occurrence of several aquifer taxa involved not only in the nitrogen and sulfur cycles but also in hydrocarbon degradation. Campylobacter, Flavobacterium, Pseudomonas and Sulfurimonas cells have been associated with nitrogen and sulfur respiration processes, and Nitrospira and Leptospirillum have been associated with nitrification. The oligotrophic nature of the aquifer waters (concentrations of biodegradable dissolved organic carbon <0.5 mg L−1; Mermillod-Blondin et al., 2015) is thus likely to have induced a significant selective sorting of microbial taxa among the merged community. Most abundant aboveground taxa often require high energy (organic carbon) and nutrient levels to proliferate (Cho and Kim, 2000; Griebler and Lueders, 2009). Dissolved organic carbon concentrations were about twice as high in aquifer waters of the experimental site recovered downstream of the SIS (1.93 ± 0.77 mg L−1) as in those recovered upstream (0.88 ± 0.27 mg L−1) (Mermillod-Blondin et al., 2015), and this effect was confirmed for other SISs (Mermillod-Blondin et al., 2015; Winiarski, 2014). Similarly, a large part of the bacterial taxa identified from aquifer biofilms was attributed to aboveground sources by the SourceTracker approach. Indeed, watershed runoff waters and detention basin deposits were found to have significantly contributed to the buildup of the observed biofilm community structures. These biofilms showed a high content of 16S rRNA gene sequences belonging to the Beta- and Gammaproteobacteria. According to the ecological concept of r- and K-selection, these Proteobacteria are often considered as r-strategists, able to respond quickly to environmental fluctuations and to colonize newly exposed surfaces more efficiently than other groups of bacteria (Araya et al., 2003; Fierer et al., 2007; Lladó and Baldrian, 2017; Manz et al., 1999; Pohlon et al., 2010).
Moreover, because they tend to concentrate nutrients (Flemming et al., 2016), biofilms are likely to favour the survival of such opportunistic bacterial cells capable of exploiting spatially and temporally variable carbon and nutrient sources. Here, taxa recovered from aquifer biofilms were previously recorded to have the ability to use hydrocarbons as carbon and energy sources, e.g., Nocardia, Pseudomonas, Sphingobium and Novosphingobium. As indicated above, SISs and urban runoffs are well known to be highly polluted by such molecules (e.g., Winiarski, 2014; Marti et al., 2017). The r- and K-selection ecological concept thus seems to apply to the biofilm community assemblages observed in this work. Taxonomic allocations of the 16S rRNA OTUs suggested that the aquifer waters and biofilms likely harboured opportunistic human, plant and animal pathogens of the genera Finegoldia, Campylobacter, Haemophilus, Duganella, Massilia, Nocardia, Aquabacterium, Flavobacterium, Pseudomonas, Streptococcus and Aeromonas. A striking observation was the enrichment of 16S rRNA gene reads allocated to Nocardia (about 4 % of the total reads) and Pseudomonas (about 35 % of the total reads) in the biofilms recovered from clay beads incubated downstream of the SIS. Nocardia and Pseudomonas 16S rRNA gene sequences were in much lower relative proportions in the aboveground compartments. The genus Pseudomonas was previously found to be abundant under low-flow conditions and was often associated with biofilm formation (Douterelo et al., 2013). Moreover, Pseudomonas species are well known for their ability to use hydrocarbons as energy and C sources. Regarding the Nocardia cells, little is known of their ecology, but a few reports indicated a tropism for hydrocarbon-polluted urban soils and sediments (e.g., Bernardin-Souibgui et al., 2018; Sebastian et al., 2014).
No additional approach was available to further investigate the molecular ecology of Nocardia cells found among the investigated urban watershed. However, a tpm metabarcoding analytical scheme could be applied to DNA extracts to further explore the taxonomic allocations of Pseudomonas and some other tpm-harbouring genera. The applied tpm metabarcoding approach allowed an investigation of the coalescence of about 90 species among the investigated watershed including 50 species of Pseudomonas; 11 species allocated to Aeromonas; and some additional species allocated to Nitrospira, Nitrosomonas, Stenotrophomonas, Xanthobacter and Xanthomonas. A single Aeromonas species, A. caviae, was recorded among the above- and underground environments. More than 10 Pseudomonas species thriving in the recharge aquifer were detected among the aboveground compartments. P. umsongensis and P. chengduensis tpm OTUs were detected aboveground and represented a significant fraction of the tpm-harbouring bacteria retrieved from the aquifer samples. These two species were initially isolated from farm soil and landfill leachates (Kwon et al., 2003; Tao et al., 2014), further supporting the hypothesis that such soil-associated bacteria can migrate through runoff infiltration processes down to natural hydrosystems and can merge with aquifer communities. Regarding the Pseudomonas species that may pose health threats to humans, a tpm OTU affiliated to P. aeruginosa was found to be shared between the surface compartments and the biofilm tpm community developing on clay beads incubated downstream of the SIS. P. aeruginosa thus had the properties allowing for an opportunistic development among the aquifer. This species is known for its metabolic versatility and ability to thrive on hydrocarbons. This is an example of a bacterial r-strategist being able to get established opportunistically in aquifer biofilm communities impacted by urban pollutants. Apart from P. aeruginosa, the species P. putida and P. stutzeri, frequently detected in soils and wastewater treatment plants (e.g., Igbinosa et al., 2012; Luczkiewicz et al., 2015; Miyahara et al., 2010), were also recovered along the watershed and the aquifer. Although these two species were identified in human infections (Fernández et al., 2015; Noble and Overman, 1994), information about their virulence remains scarce. These species are therefore considered to be of less concern than P. aeruginosa and A. caviae, another opportunistic infectious agent (Antonelli et al., 2016) found in the aquifer. P. putida isolates have been shown to be involved in hydrocarbon degradation, and P. stutzeri can play a part in the N cycle either through denitrification or nitrogen fixation.

Conclusions
The knowledge gained from the present study demonstrated that coalescence of microbial communities from an urban watershed with those of an aquifer can occur and yield novel assemblages. Specialized bacterial communities of aquifer waters were slightly reshuffled by the aboveground communities. However, the assemblages observed among recent aquifer biofilms were found to be largely colonized by opportunistic r-strategists coming from aboveground compartments and often associated with the ability to degrade hydrocarbons, e.g., Pseudomonas, Nocardia and Novosphingobium cells. An urban aquifer was found, for the first time, to be specifically colonized not only by species like P. chengduensis, P. umsongensis, P. jessenii, P. chlororaphis and P. resinovorans but also by undesirable human-opportunistic pathogens such as P. aeruginosa and A. caviae. Artificial clay beads incubated in the aquifer through piezometers appeared to be highly efficient trapping systems (termed "germcatchers") for evaluating the ability of a SIS to prevent the transfer of undesirable r-strategists to an aquifer. Nevertheless, the long-term incidence of allochthonous bacteria on the integrity of aquifer microbiota remains to be investigated.
Free-living aquifer bacterial communities are not likely to be much impaired by exogenous cells. However, microbial communities developing as biofilms on inert surfaces might be significantly reshuffled through selective sorting likely induced, in part, by aboveground chemical pollutants. Microbial biofilms are key structures in the transformation processes of several chemicals and nutrients. They often display much higher cell densities than free-living populations (Crump and Baross, 1996;Crump et al., 1998;van Loosdrecht et al., 1990). Here, we have demonstrated that runoff and SIS bacterial taxa can colonize solid matrices of a deep aquifer. These modified communities could (i) alter geochemical processes which can indirectly impact other groundwater inhabitants, e.g., the amphipod Niphargus rhenorhodanensis and other taxa presented in Foulquier et al. (2011), or (ii) directly impact these inhabitants by inducing a modification of their microbial contents and potentially of their behaviour. The stygofauna feeds on bacteria and is well known to be significantly colonized by bacteria (e.g., Smith et al., 2016). The next step in these studies will be to investigate whether native aquifer biofilm communities can resist repeated invasions by opportunistic r-strategists and if these allochthonous bacteria will impact the ecological health of the stygofauna.
Author contributions. BC coordinated and designed the experiments. YC, VRN, TW, FMB, RB, LM, RM, FV, EB, DB, JV and BC performed the experiments and contributed to the analysis of the datasets. YC and BC prepared the paper with contributions from all co-authors.
Competing interests. The authors declare that they have no conflict of interest.