Genome‐wide scans detect adaptation to aridity in a widespread forest tree species

Patterns of adaptive variation within plant species are best studied through common garden experiments, but these are costly and time‐consuming, especially for trees that have long generation times. We explored whether genome‐wide scanning technology combined with outlier marker detection could be used to detect adaptation to climate and provide an alternative to common garden experiments. As a case study, we sampled nine provenances of the widespread forest tree species, Eucalyptus tricarpa, across an aridity gradient in southeastern Australia. Using a Bayesian analysis, we identified a suite of 94 putatively adaptive (outlying) sequence‐tagged markers across the genome. Population‐level allele frequencies of these outlier markers were strongly correlated with temperature and moisture availability at the site of origin, and with population differences in functional traits measured in two common gardens. Using the output from a canonical analysis of principal coordinates, we devised a metric that provides a holistic measure of genomic adaptation to aridity that could be used to guide assisted migration or genetic augmentation.


Introduction
Recent climate change projections suggest that 2°C global average warming by the end of this century is inevitable, and an increase of at least 4°C is possible (Peters et al. 2013). Changes in precipitation and more frequent extreme climatic events, such as droughts and heatwaves, are also predicted (Ganguly et al. 2009). Syntheses clearly demonstrate that global warming is already impacting on ecosystems and biodiversity around the world (Parmesan 2006;Chen et al. 2011). Thus, new approaches are needed to predict the climate resilience of species that are vital to ecosystem stability and to enhance ecological management and restoration. Widespread tree species are important targets when investigating resilience to future climate change because of their foundational role in maintaining ecological function (Lunt et al. 2013) and because they may contain genetic material of unrealized adaptive potential.
A ubiquitous practice towards managing and restoring terrestrial ecosystems is to re-establish indigenous or other desirable plant species (e.g. Vallejo et al. 2012;Breed et al. 2013). This can include extensive plantings to revegetate degraded landscapes and restore biodiversity, sometimes in conjunction with production uses, such as perennial cropping systems, agroforestry or carbon sequestration (Smith et al. 2013). However, investments in revegetation activities currently take little account of climate change (Vallejo et al. 2012;Breed et al. 2013) and typically focus on maintaining local patterns of genetic variation, assuming that local populations are maximally adapted to local conditions and aiming to avoid any risk of outbreeding depression (Broadhurst et al. 2008;Hereford 2009;Byrne et al. 2011). Under changing environmental conditions, local germplasm may not be optimally adapted to future conditions and a static view of 'local is best' may no longer be relevant (Breed et al. 2013). Consequently, a new genetic framework is needed to promote climate-resilient revegetation, based on a predictive understanding of patterns of adaptive variation in plant species and future climates (Booth 2013).
Historically, patterns of adaptive variation within plant species have been studied through physiological, morphometric and fitness comparisons in common gardens across climatic gradients (Alberto et al. 2013;Kremer et al. 2013;Savolainen et al. 2013). However, this approach can be expensive and may take years to yield results, especially with organisms, such as trees, that have long generation times (Neale & Kremer 2011). Recent developments in genomic technologies allow greater insights into patterns of adaptive variability (Neale & Kremer 2011;Strasburg et al. 2012;Savolainen et al. 2013) and have the potential to form the basis of tools determining patterns of adaptive diversity in plant species (Allendorf et al. 2010;Neale & Kremer 2011;Alberto et al. 2013). Indeed, the development of predictive models of adaptation based on genome-wide scanswhich do not need individual loci to be identifiedhave been envisaged, analogous to genomic selection methods used in breeding (Alberto et al. 2013).
To date, most genomic studies of local adaptation in trees have used either a candidate gene approach (Eckert et al. 2009;Strasburg et al. 2012) or a large number of anonymous, mainly neutral markers, for example AFLPs (Jump et al. 2006;Strasburg et al. 2012) and SSRs (Strasburg et al. 2012). In the candidate gene approach, discovery is usually limited by the finite number of genes being screened (except in some model organisms, e.g. Arabidopsis; Hancock et al. 2011). At the other end of the spectrum, large numbers of anonymous markers have been used to produce evidence of local adaptation (see Strasburg et al. 2012), but by their very nature, these lack the potential to identify the underlying genomic regions that are under selection. With the increasing accessibility of next-generation sequencing, new methods [e.g. RAD sequencing (Davey & Blaxter 2010), DArTseq (Sansaloni et al. 2011)] have been developed that provide nonmodel organisms with genome-wide sets of sequence-tagged markers that can be linked to a genomic database, thereby providing scope to identify regions of a genome that may be linked to local adaptation (Stapley et al. 2010).
Eucalypts are dominant tree species in many ecosystems and are planted worldwide. Thus, an understanding of the genetic basis for adaptation to climate is critical for climate-resilient forest management and reafforestation. Eucalyptus tricarpa (red ironbark) is an ecologically important widespread species that grows across a rainfall gradient in agricultural and forest landscapes of southeastern Australia. The previous establishment (Stackpole & Harwood 2001) of a range of provenances of E. tricarpa in two common gardens located near each end of the gradient provides an excellent study system to investigate local adaptation and phenotypic plasticity in a widespread eucalypt. A recent study by McLean et al. (2014) found evidence of local adaptation in functional traits in E. tricarpa, leading to a hypothesis of genetic-based climate adaptation. Our aim of this study was to test this hypothesis using an indirect approach, where we use genomic scans and outlier marker detection to detect signals of adaptation to aridity and cross-reference our results with direct evidence from common garden field trials. We propose that it might be feasible to use such an approach to bypass the need for common garden field trials for the assessments of local adaptation in widespread nonmodel species. This is particularly relevant when time and resources are limited, and we here explore a framework to implement such an approach.

Materials and methods
Study species, field sites and common gardens Eucalyptus tricarpa (red ironbark) is an important component of Victoria's threatened box-ironbark forests (Kelly & Mercer 2005) and is used for revegetation. The species varies in form across a moisture gradient, growing as dry woodlands in the northwest [mean annual precipitation (MAP) of 450 mm] and as tall, dense, wet sclerophyll forests in the southeast (MAP of 1200 mm). We selected three provenances from the 'dry' end of the aridity gradient (1 Tarnagulla, 2 Mt Bealiba and 3 Craigie); three provenances from the 'wet' end of the gradient (7 Mt Nowa Nowa, 8 Tuckerbox and 9 Martins Creek); and three provenances between the two extremes (4 Heathcote, 5 Heyfield and 6 Christmas Hills). Provenances were numbered from 1 (driest) to 9 (wettest) according to the MAP at the site of origin (Fig. 1, Table 1).
Two common garden field trials of E. tricarpa were established in 2000 using individual tree, open-pollinated seedlots (families) collected from multiple native provenances from throughout the natural range in southeastern Australia (Stackpole & Harwood 2001). The trials were planted towards each end of the rainfall gradient ( Fig. 1) and included the nine provenances that were sampled in the wild. The trials have been thinned to 60% of the original planting density of each family, on the basis of tree size (but not form) (see McLean et al. 2014).

Analysis of functional traits
A study of physiological and morphometric traits (hereafter 'functional traits') across the aridity gradient, based on measurements taken from eight to 10 trees in each of the two field trials and 10 trees from each wild population, was conducted in parallel with the genomic study. Traits are listed in Table 2. The results of the functional trait study are reported elsewhere (McLean et al. 2014).
Analysis of variance (ANOVA), using the PROC GLM procedure of SAS (SAS Institute; Version 9.1) with a onefactor fixed effects model, was used to test whether there was significant variation in functional traits among provenances in the wild and/or in the common gardens. To account for multiple testing, for each class of response variable (i.e. within 'wild provenances', and each trial), probabilities were corrected for a 5% false discovery rate (FDR) using a conservative 'dependent' FDR method that allows for correlation between tests (DFDR; Benjamini & Yekateuli 2001), using PROC MULTTEST of SAS with the DFDR option. Analysis of covariance, fitting site, a covariate (CAP1) and site 9 CAP1 interaction were also undertaken using PROC GLM.

Genomic methods
Foliage samples were collected from the nine wild provenances across the rainfall gradient ( Fig. 1, Table 1). Thirty adult trees were sampled at each locality, except in provenance 2 Mount Bealiba and provenance 6 Christmas Hills where 35 and 29 adult trees, respectively, were sampled. To avoid sampling close relatives within a provenance, samples were taken from trees that were separated by a minimum distance of 100 m (at least two tree heights) (Skabo et al. 1998). The location of each tree was recorded using a Garmin GPS-MAP 60CSx (accuracy ranging from 4 to 10 m).  Fig. 1 Locations of the nine study populations of Eucalyptus tricarpa (numbers 1-9) and the two common garden provenance trials at Huntly (open star) and Lake Tyers (solid star) in southeastern Australia. Population numbering corresponds to that in Table 1. DNA was extracted from leaf samples by the Australian Genome Research Facility (Adelaide, Australia), using an in-house CTAB protocol. DNA from each tree was standardized to approximately 50 ng/ll and sent to Diversity Arrays Technology Pty. Ltd. (Canberra, Australia) for genotyping using DArTseq technology (Sansaloni et al. 2011). DArTseq markers are expected to have many of the same qualities as DArT markers: dominant in nature, random dispersal across a genome and a large proportion coming from coding regions (Petroli et al. 2012). To ensure that all DArTseq markers were highly reproducible, only those with a 'Q' value > 2.5 ('signal-to-noise' measure comprising the ratio of the average number of counts for a sequence among the samples and their standard deviation) and a call rate ≥90% (percentage of valid scores made across all samples) were included in the final data set (about 50% of the markers were discarded). Table 2 Proportions of significant associations (P < 0.001) found in marker-trait association tests comparing results from 94 'outlier' markers to 3590 'neutral' markers in Eucalyptus tricarpa. Provenance-level allele frequencies of DArTseq markers were regressed against climatic variables and soil traits at each site, as well as functional traits measured in the wild and in two field trials. The number of marker-trait associations reflects the number of tests carried out. Chi-squared probability values in bold are those for which expected values of significant associations in the outlier data were adjusted upwards to 5 (from values <5). This adjustment made the tests more conservative and therefore did not bias the test in favour of high significance  Table S1 (Supporting information). † Soil variables: texture, soil particle size (clay, coarse sand, sand, fine sand, silt), chemistry (NH 4 , NO 3 , P, K, S, organic C, conductivity, pH in CaCl 2 , pH in water) (McLean et al. 2014). ‡ Functional traits measured in wild populations: leaf size, leaf thickness, leaf density, specific leaf area, circumference of main stem, cellulose 13 C, leaf 13 C, leaf 15 N, C mass, C/N ratio, leaf N area, leaf N mass, number of stems, total stem cross-sectional area (at 1.3 m). § Functional traits measured in common garden trials: leaf size, leaf thickness, leaf density, specific leaf area, circumference of main stem, cellulose 13 C, leaf 13 C, leaf 15 N, C mass, C/N ratio, leaf N area, leaf N mass, total cross-sectional area, height of main stem. ¶ Plasticity (relative trait range (Valladares et al. 2006) of: leaf size, leaf thickness, cellulose 13 C, leaf density, height of main stem, total stem cross-sectional area (at 1.3 m), specific leaf area (SLA).
Outlier loci representative of diversifying selection (i.e. loci whose allele frequencies differ more among the nine provenances than would be expected through drift alone) were identified using BAYESCAN v. 2.1 (Foll & Gaggiotti 2008). Although the island model of allele frequency correlation is an underlying assumption of this software, various simulation studies have indicated that BAYESCAN is one of the most robust differentiation methods for outlier detection (Perez-Figueroa et al. 2010;Narum & Hess 2011;Vilas et al. 2012;Savolainen et al. 2013), even when this assumption is violated, for example, in a population with hierarchical structure (De Mita et al. 2013). Markers with a low '1-ratio' (i.e. 'allele' or 'band' frequency <0.10 or >0.90) were excluded from the BAYESCAN analysis, as recommended by the software manual, leaving 3684 markers. Default parameters were used for the BAYESCAN searches, prior odds for the neutral model were set to 100, and the F IS prior was set to 'uniform between 0.0 and 0.3' in accordance with typical values of this inbreeding statistic for eucalypts (Byrne 2008). A 'q value' maximum threshold of 0.05 was used to identify outlier loci (this yielded a FDR of approximately 5%), which were used to define an 'adaptively enriched genetic space'. The full DArTseq data set and the data set comprising 94 outlier loci were each analysed in a population genetics framework using GENALEX 6.41 (Peakall & Smouse 2006) to generate matrices of pairwise genetic distances between individuals (Huff et al. 1993) and provenances (the widely used Nei (1972) genetic distance ;Hedrick 2000;Nei 1972). The matrix of genetic distance calculated using outlier markers and all individuals is called the 'adaptive genetic distance matrix'. GENALEX 6.41 was also used to perform analyses of molecular variances (AMOVAs) and Mantel tests for correlation between provenance-level genetic distance (Nei's genetic distance;Hedrick 2000;Nei 1972) and geographic distance.
Thirty-five climatic variables (Table S1, Supporting information) for each tree position (based on GPS coordinates) were derived from climatic surfaces in the AN-UCLIM 6.1 software package (Xu & Hutchinson 2011). ANUCLIM has a resolution of about 1 km and incorporates altitude data into its climate estimates (such that estimates within a 1 km grid can vary depending on altitude). In addition, a 'moisture index' (ratio of precipitation to potential evaporation) from the Atlas of Living Australia (ALA) (http://www.ala.org.au/) was also used. Climatic variables were normalized using Primer-E (Clarke & Gorley 2006).
For each marker in both 'outlier' and 'neutral' data sets, linear regression was used to identify significant associations between the population-level allele frequencies and environmental variables as well as 14 functional traits (Table 2). A chi-square goodness-of-fit test was used to determine whether there were significant differences between the rate of association discovery in adaptive markers and neutral markers for classes of climatic variables (rainfall, radiation, temperature, moisture indices), soil variables or functional traits.

Aligning the adaptively enriched genetic space with environment
The matrices of pairwise genetic distances between individuals that were calculated in GENALEX 6.41 were analysed using PERMANOVA software (Anderson et al. 2008).
Primer-E (Clarke & Gorley 2006) was used to identify sets of highly correlated climatic variables (Table S2, Supporting information). Of 35 climatic variables from ANUCLIM, a subset of 15 least intercorrelated, representing four classes, was used for canonical analyses of principal coordinates (CAPs): (i) temperature (mean annual temperature, mean diurnal range, maximum temperature of the warmest period, minimum temperature of the coldest period, mean temperature of the driest quarter), (ii) rainfall (mean annual precipitation, precipitation of the wettest period, precipitation of the coldest quarter), (iii) solar radiation (mean annual radiation, radiation seasonality, radiation of the driest quarter, radiation of the coldest quarter) and (iv) moisture indices (annual mean moisture index, highest period moisture index, mean moisture index of highest quarter).
Principal coordinate analyses (PCoAs) and canonical analyses of principal coordinates (CAPs; Anderson & Robinson 2003;Anderson & Willis 2003) were undertaken with PERMANOVA (Anderson et al. 2008). CAP is a classical canonical correlation analysis between a subset of the principal coordinates that summarizes a resemblance matrix and a set of environmental variables associated with each object. The purpose of the CAP was to find axes through the multivariate cloud of points (corresponding to the adaptively enriched genetic space of E. tricarpa) that had the strongest correlation with another set of continuous variables, in this case, climatic variables (above) and soil variables (describing texture, chemistry -NH 4 , NO 3 , P, K, S, organic C, conductivity, pH in CaCl 2 and pH in waterand soil particle sizeclay, coarse sand, sand, fine sand and silt) (McLean et al. 2014). The first axis of each of these analyses, CAP1, represented the direction of molecular change most closely associated with change in environmental variables. This strategy for alignment of a molecular and environmental space is conceptually similar to the approach undertaken by Parisod & Christin (2008). CAP1 provided what we hereafter term an 'adaptive genetic index' that was used as a quantitative predictor to model genetic adaptation to either aridity or soil. As the soils showed no correlation with the adaptively enriched genetic variation (see Results), we henceforth refer to the adaptive genetic index (CAP1) with regard to climate.
As the similarity in environmental variables among trees from the same provenance (arising from the 1 km precision of the ANUCLIM climate surface) may result in pseudoreplication, the strength of the CAP1 association with environmental variables was confirmed by linear regression analyses at the provenance level (n = 9). We tested all climatic variables at this stage to determine whether there were variables other than those used in the CAP that were also closely associated with CAP1. In each test, the dependent variable was CAP1 and the independent variable was the environmental variable. The environmental variables were divided into two classes: (i) 35 climatic variables (Table S1, Supporting information) plus a 'moisture index' from ALA and (ii) 15 soil variables (Table 2; Table S3, Supporting information). Linear regression was performed at the provenance level (n = 9), using the PROC REG procedure of SAS. DFDR corrections for multiple testing were made within each class of environmental variables, as previously detailed. Similarly, we used linear regression to test whether the adaptive genetic index (CAP1) was associated with any functional traits, using provenance means for functional trait measurements from three data sets (wild provenances, Lake Tyers Trial and Huntly Trial), and controlling for multiple testing using a 5% DFDR within each of the three data sets.

The genetic architecture of adaptation
Each DArTseq marker is tagged by approximately 60 bp of DNA sequence data. The DArTseq analytical service included a BLAST analysis of each marker against the E. grandis complete genome sequence (JGI ver. 1.1), providing information about the position of each marker on a linkage group (chromosome). Most of the markers gave a single alignment, and some could not be located at all on the E. grandis genome. When more than one alignment was obtained, the alignment with the best score was used to assign a position to the marker. To test whether outlier markers were clustered or distributed randomly across the genome, we: Matching provenances to sites to which they are best adapted To determine whether the results of the CAP could be used for predictive purposes, we used the weightings (canonical eigenvectors) derived from CAP of the 15 climatic variables to calculate a genetic-based Aridity Index (AI) for each site of interest, using the following formula (adapted from equation 5.14 in PERMANOVA manual): where a = normalized climatic variable x and b = the weighting of the climate variable on the canonical eigenvector aligned with CAP.
The AI for each of the collection sites and the two common gardens was calculated to predict (i) provenances that would perform well in the common gardens and (ii) patterns of adaptation to aridity across the geographic range. The AIs of the common garden sites were usedthrough site matchingto predict the provenances that should perform best in each garden, based on the age 13 growth data for the trees from which functional traits were measured (McLean et al. 2014).

Spatial analysis
Geographic locations of all records of E. tricarpa were downloaded from ALA (accessed 18 September 2013), and outlying samples likely to be errors were removed. The distribution area was defined by drawing a polygon encompassing a 15 km buffer around each retained record. Ordinary kriging (Schabenberger & Gotway 2004) was used in ARCGIS 10.1 (ESRI, 2012) to interpolate the AIs calculated for each ALA record and each provenance study site and to create a continuous surface of AI for the distribution area. A spherical semivariogram, with variable search radius and considering the values of 12 neighbouring records, was used for the spatial calculations (Schabenberger & Gotway 2004).

Provenance differentiation
AMOVA of the full E. tricarpa DArTseq data set (274 individual trees by 6544 markers; 4.6% missing data) revealed significant molecular differentiation between provenances (P < 0.001) accounting for 7% of the total molecular variation. The genetic distances between provenances were strongly positively correlated with geographic distances (r 2 = 0.72; P = 0.001), with distinct spatial structuring of genetic variation from east to west, along the prevailing aridity gradient (Fig. 2A).
Defining an adaptively enriched genetic space BAYESCAN 2.1 identified 94 loci (2.6% of the BAYESCAN input) that had outlier F st values indicative of disruptive selection (none of the outlier loci had negative alpha values indicative of stabilizing selection; see Table  S4 (Supporting information) for F st distribution data). A greater proportion of the variation in the PCoA was accounted for by the first dimension (49%) when using the outlier data set (Fig. 2B) compared with the PCoA when using the full data set (7.3%; Fig. 2A). AMOVA of the outlier data set showed that 36% of the variation between trees could be accounted for by differences among provenances, a fivefold increase over that observed in the full data set. There was still a strong correlation between provenance-level geographic and genetic distances (r 2 = 0.81; P < 0.001; n = 9), but this was expected from the northwest-southeast alignment of the aridity gradient. Regression analyses using provenance-level data showed a significant (v 2 1 = 3214; P < 0.001) eightfold increase in the proportion of outlier loci exhibiting a significant association with provenance variation in the climatic variables compared with the remaining putatively 'neutral' markers (Table 2). Testing the subset of 15 climatic variables used for the CAP (below), the enrichment was similar, at sevenfold (v 2 1 = 1392; P < 0.001). The enrichment of climate-marker associations in the outlier data set was highest for ANUCLIM moisture indices, followed by radiation variables. No significant enrichment was observed for variables describing the physical and chemical soil properties (Table 2), indicating that the main signal of selection on these outlier markers is not associated with soil properties. Further evidence that the outlier markers show signals of climate adaptation was provided by their  Table S1, Supporting information). Note that the axes of (C) are not scaled by their respective eigenvalues (Anderson et al. 2008).
association with genetic-based provenance variation in functional traits ( Table 2). Most of the functional traits assessed from trees growing in the common-environment field trials exhibited significant (P < 0.05) differences among provenances (Table S5, Supporting information), demonstrating significant quantitative genetic differentiation across the species' range. While only 1.5% of provenance-level regressions between the outlier markers and trial traits were significant (P < 0.001), this was a significant (v 2 1 = 68.2; P < 0.001) threefold increase over that observed with the putatively neutral loci (Table 2).

A climate-aligned adaptive genetic index
The climate-based adaptive genetic variation among trees was summarized by the CAP, undertaken using the adaptive genetic distance matrix and the set of 15 representative climatic variables for the 274 individual trees. Eighty-three per cent of the variance in the adaptive genetic distance matrix for individuals was explained by the optimum number of PCo axes for this analysis [m = 13; see Anderson et al. (2008)]. The integrated molecular change defined by the main CAP1 axis was strongly correlated with the climatic data set (squared canonical correlation, d 2 = 0.98; P = 0.001) and accounted for 50% of the variation in the adaptive genetic distance matrix. This compares with only an additional 7% being explained by CAP2 (d 2 = 0.79; P = 0.001). Molecular variation along CAP1 reflected the aridity ranking of the provenances, with trees from provenances from wetter regions having more negative CAP1 values and those from the drier regions having more positive values (Fig. 2C). Thus, CAP1 represents a 'climate-aligned adaptive genetic index', with increasing values along CAP1 aligning closely with vectors for climatic variables that represent high temperature and low rainfall (Fig. 2C).
At the provenance level, linear regressions (Table S6, Supporting information) indicated that 23 (66%) of the original 35 climatic variables were significantly associated with change in CAP1 (after DFDR adjustment; P < 0.05). For 13 (37%) of the variables, the associations were highly significant (P < 0.001). Increases in CAP1 at the provenance level were most closely associated with an increase in the maximum temperature in the warmest month (TMXWM, R 2 = 0.98) and decreases in the mean rainfall in the driest periods (RDRYQ, R 2 = 0.96; RDRYM, R 2 = 0.96), the lowest period moisture index (MIL, R 2 = 0.97) and the mean moisture index of the lowest quarter (MIMLQ, R 2 = 0.97).
While variation along CAP1 was not associated with significant phenotypic differences among provenances in traits measured from the wild (Table S7, Supporting information), it was associated with genetically based differences in functional traits measured in the common gardens (Table S8, Supporting information). At the dry common garden (Huntly), the only significant associations found were greater leaf thickness [differences among provenances F 8,80 = 8.5, P < 0.001; association with CAP1, R 2 = 0.82, P (DFDR) < 0.05] and lower SLA (F 8,80 = 7.5, P < 0.001; R 2 = 0.79, P (DFDR) < 0.05) with increasing values of CAP1 (i.e. increasing molecular adaptation to aridity). At the wet common garden (Lake Tyers), the only significant association found was between CAP1 and leaf size, where provenances with higher CAP1 values tended to have genetically smaller leaves [F 8,76 = 6.1, P < 0.001; R 2 = 0.81, P (DFDR) < 0.05] (Fig. 3). The trial-specific nature of these associations was confirmed by analysis of covariance (Table 3) that showed differences in the slopes of the provenance-level relationships between CAP1 and these functional traits in the two common gardens. Thus, the provenance-level changes in the outlier loci, as described by CAP1, were strongly associated with climate variation and quantitative genetic-based changes in functional traits, arguing that CAP1 describes molecular change associated with adaptation of the provenances to increasing aridity. Further evidence that provenance variation in CAP1 is adaptive comes from the relative growth of the provenances at the two common garden sites. Analysis of covariance (Table 3) indicated that the slope of the provenance-level relationships differed significantly between the wet (Lake Tyers) and dry (Huntly) sites for CAP1 and growth, as measured by either tree cross-sectional area (trial by CAP1 interaction F 1,14 = 10.2, P < 0.01) or tree height (F 1,14 = 9.8, P < 0.01). This result indicates significant genotype-by-environment interactions for growth traits and provides independent evidence that variation in CAP1 has fitness consequences. At the Lake Tyers (wet) common garden, there were significant differences in performance among provenances (height, F 8,76 = 4.6, P < 0.001; total cross-sectional area, F 8,76 = 7.2, P < 0.001), with those originating from wetter sites showing better performance than those from drier sites. For example, provenance-level regression of total cross-sectional area (R 2 = 0.48, P = 0.04) and tree height (R 2 = 0.59, P = 0.02) against CAP1 values yielded significant negative correlations, indicating that the wet-adapted provenances (negative values of CAP1) grew taller and had a greater total cross-sectional area than dry-adapted provenances when grown at the wet trial site. Similarly, at the dry site (Huntly), there were significant, or near-significant, differences among provenances in performance (height, F 8,80 = 2.8, P < 0.01; cross-sectional area, F 8,80 = 1.8, P < 0.1). For example, the two best-performing provenances were those originating from the two most arid sites (1 Tarnagulla and 2 Bealiba), and CAP1 values were positively linearly associated with cross-sectional area (R 2 = 0.44, P < 0.05). Thus, a provenance's CAP1 valueits climate-aligned adaptive genetic indexappears to predict provenance growth and performance to some degree, with more arid-adapted provenances tending to perform relatively better on more arid sites and more wet-adapted provenances performing better on wetter sites.

Genetic architecture of adaptation to aridity
At the provenance level, all 94 outlier loci were correlated strongly with either climate or functional traits or both (Table S9, Supporting information). Twenty-nine per cent of the outlier markers showed strong, significant correlations with CAP1 (P < 0.001), in contrast to <1% of neutral DArTseq markers (i.e. >1000-fold enrichment in the outlier loci; Table 2). Strong associations were found between 82% of the outlier markers and genetically based differences among provenance functional traits (78% in the dry trial, 77% in the wet trial). Seventy markers (75%) were associated with both climate and genetic traits (Table S9, Supporting information).
Of the 6544 markers used in this study, 4134 could be located on the E. grandis genome sequence, and 3808 of these mapped to the 11 main eucalypt chromosome scaffolds ( Fig. 4; Table S9, Supporting information). Of the 94 outlier loci, 48 mapped to these 11 chromosomes, four mapped to other contigs (those that have yet to be assigned to one of the 11 chromosomes), and 42 could not be located on the E. grandis genome sequence (possibly, in part, due to differences in sequence between E. tricarpa and E. grandis). The neutral markers provided reasonably uniform coverage of the genome (Fig. 4). Outlier markers associated with adaptation to aridity occurred on all chromosomes, and while the proportion per chromosome varied tenfold (from 0.27% on chromosome 2 to 2.7% on chromosome 4), these differences were not statistically significant (GLIMMIX F 10, Table 3 Analysis of covariance (ANCOVA) fitting Eucalyptus tricarpa provenance-level traits to site (i.e. common garden), CAP1 (i.e. adaptive molecular index) and CAP1 9 site interaction fragments of the chromosomes shown in Fig. 4 indicated that an observed 3-Mb fragment with five outlier markers was highly unlikely through chance alone. This conclusion was reached regardless of whether randomization was undertaken across (P = 0.0013) or within (P = 0.0028) chromosomes. In contrast, the frequency of 3-Mb fragments observed with three outlier markers was not significantly greater than expected by chance (P > 0.05). Evidence of nonrandom distribution of the outlier markers along the genome is further support for selection impacting regions of the genome in which these outlying markers occur.
The statistically significant cluster of adaptive markers was at the end of chromosome 8 and encompassed a block of 6 markers (Fig. 4). Four of these outlier markers lay within 403 kb of each other, and another two markers were within 2.9 Mb of these four (Table S9, Supporting information). Comparison with the E. grandis annotated genome sequence (Myburg et al. 2014) showed that these markers lie in the vicinity of a histidine kinase, a putative microRNA (potentially involved in leaf longevity and leaf morphogenesis) and a cytochrome P450 gene, all of which could have adaptive significance across an aridity gradient. Nevertheless, while there may be localized hot spots for adaptation, the widespread distribution of the outlier markers across the eleven chromosomes argues that adaptation to aridity is a genome-wide phenomenon and is likely to involve multiple traits and genes.

Matching provenances to sites to which they are best adapted
To extrapolate the identified adaptive change associated with CAP1 to the whole geographic range of E. tricarpa, we calculated the AI for all known locations of E. tricarpa, using the 15 normalized climatic variables used in the original CAP. Values of AI were initially calculated for the sites of origin of each provenance and for the common gardens. As expected, the trend in growth patterns described for CAP1 is reflected by the AI; in general, the better-performing provenances in each trial were from sites with AI most similar to those of the trial sites (Fig. 5). Calculating the AI for other known locations of the species allows visualization of predicted climate adaptation across the E. tricarpa distribution;   Fig. 5 Association of performance (mean total cross-sectional area of stems) of Eucalyptus tricarpa provenance material in the field trials with the CAP-derived Aridity Index (AI) at the site of origin of each provenance. Provenances from wetter areas (more negative AI) grew better in the wetter field trial (Lake Tyers, grey squares and regression line), while provenances from drier areas (more positive AI) grew better in the drier field trial (Huntly, black diamonds and regression line). The AI at each trial site is indicated by an arrow (Lake Tyers, grey arrow; Huntly, black arrow). Figure 6 shows a clear trend from highly positive AIs (i.e. dry-adapted provenances) in the northwest of the range through to highly negative values (i.e. wetadapted) in the east of the range. There are patches of higher/lower rainfall in the middle of the range that do not fit with the general west-to-east trend.
The AI exhibited an 11-fold enrichment in marker associations with outlier markers compared to neutral markers (Table 2). While the AI calculated here was derived from 15 selected climatic variables, single climatic variables, such as the lowest period moisture index (MIL) or maximum temperature of the warmest period (TMXWM), could provide good surrogates for practical application in this case. For example, MIL was strongly correlated (P < 0.001) at the provenance level with CAP1 (R 2 = 0.97) and with 30% of the outlier markers. TMXWM was also strongly correlated with CAP1 (R 2 = 0.96) and 21% of the outlier markers.

Discussion
Many authors have signalled the potential of genomic techniques to inform management decisions regarding conservation, restoration or assisted migration (Allendorf et al. 2010;Funk et al. 2012). Maximizing survival in plantings for ecological restoration or climate change mitigation relies in part on choosing the most suitable germplasm. Species and populations that are more plastic are likely to adjust to climatic changes in situ, while in those showing strong local adaptation, assisted migration is likely to be beneficial (Hof et al. 2011;Richter et al. 2012;Aitken & Whitlock 2013;Lunt et al. 2013). Knowledge of the patterns of local adaptation and plasticity in a species is, therefore, critical for making sound seed-sourcing decisions for revegetation and restoration.
Information underpinning such decisions should come from multiple sources, including ecological similarity, phenotypic similarity (including plasticity), genomewide similarity at neutral markers and genetic similarity at adaptive loci (Allendorf et al. 2010).
The best method of detecting local genetic adaptation within a species is to establish multiple common garden experiments across a range of environments (Matyas 1996;Aitken & Whitlock 2013;Kremer et al. 2013), but this requires significant resources for establishment and analysis, particularly when involving long-lived organisms such as trees (Neale & Kremer 2011). One motivation for this research was to determine whether genome-wide scanning technology, combined with outlier marker detection, could be used to bypass time-consuming and expensive common garden experiments. The study demonstrated the feasibility of a relatively simple approach in which plants are sampled from across the full range of ecological variants, genotyped using a high-throughput genome-wide approach and assessed for population structure and outlier loci. If outlying markers are detected, linear regression can be used to screen them for associations with environmental variables, such as climate or soil. CAP can be used to identify the vector describing the change in the environmental variables most strongly correlated with the adaptive molecular change variation (e.g. the AI) to allow the predictions of contemporary and future adaptive surfaces for species.
Our genome-wide scanning approach provides a means of modelling adaptation and has the potential to match overall adaptation of provenances to sites without identifying particular loci (Alberto et al. 2013), nor necessarily knowing the underlying functional traits (e.g. frost or drought resistance, heat tolerance) Fig. 6 Predicted surface of climate adaptation across the entire range of Eucalyptus tricarpa. The map is derived from AI values calculated for all reliable records of E. tricarpa in the Atlas of Living Australia. Provenances used in this study are represented by numbers in white circles. Common garden trials are indicated by squares (HL -Huntly (dry) trial; LT -Lake Tyers (wet) trial). Inset shows the distribution of E. tricarpa in relation to the whole of Australia. involved in adaptation. Although we have shown that there are strong population-level associations between outlier loci in E. tricarpa and various climatic variables and some functional traits, there are numerous ecophysiological traits that we did not assess, but for which we may, unknowingly, have detected adaptive loci. The genome-wide scanning approach to adaptation detection allows for the identification of a wide range of adaptive loci, without the a priori constraints imposed by a targeted candidate gene survey (Franks & Hoffmann 2012).
Further validation of the methodology will involve confirmation that allele frequencies of putatively adaptive loci are correlated with climate and/or adaptive traits in provenances of E. tricarpa other than those that have been tested here. Provenances of particular interest include those that grow in zones where the rainfall does not align with the general west-east trend of the rainfall gradient. Our approach will also be useful for identifying changes in population-level adaptive allele frequencies over time, both within and between generations (see Jump et al. 2006), as theyor the Aridity Indices derived from themcould provide a means of monitoring adaptation to environmental change, as well as hybridization (e.g. where differentiated germplasm has been planted) and genetic drift.
Many studies of adaptation focus on individual mutations in selected candidate genes. However, focusing on particular genes could result in other important sources of adaptive variation being overlooked (Allendorf et al. 2010). The genome-wide distribution of adaptive markers found in E. tricarpa is concordant with theoretical predictions (LeCorre & Kremer 2003) and other experimental results (e.g. Eckert et al. 2010) that have suggested that adaptive genetic variation is a genome-wide phenomenon (Neale & Kremer 2011). Adaptation is generally thought to involve multiple loci of small effect that are spread across the genome, combinations of which may produce an 'adaptive phenotype' (Pritchard & Di Rienzo 2010;Berg & Coop 2013). 'Hot spots' of adaptive differentiation, such as the cluster of six adaptive markers identified on chromosome 8, may represent co-adapted gene complexes that are under divergent selection (Nosil et al. 2009). Alternatively, they may be a consequence of genetic hitchhiking (Barton 2000;Schlotterer 2003), where selectively neutral markers are linked to a single gene that is under positive selection.
In summary, we have shown that a genome-wide scan and outlier analysis can detect evidence of adaptation and its environmental drivers in wild populations of nonmodel species. This approach will be useful for determining climatic adaptation in species identified for environmental plantings so that selection of seed sources accounting for climate adaptation ('climate-adjusted provenancing') confers greater resilience to future climates.

Data accessibility
DArTseq data, climate data and functional trait data may be accessed via Dryad: doi:10.5061/dryad.qq20s.

Supporting information
Additional supporting information may be found in the online version of this article.
Table S1 ANUCLIM climatic variables and abbreviations.

Table S9
Outlier DArTseq markers from Eucalyptus tricarpa, associations with climate, CAP1, Aridity Index and functional traits in the two provenance trials; number of locations that each marker could be placed on the E. grandis reference genome; the scaffold number (scaffolds 1-11 correspond to the 11 chromosomes of E. grandis) to which the marker was mapped (with best score), position on the linkage group or scaffold and the first 60 bp of the DNA sequences of the DArTseq marker.