Research Article |
|
Corresponding author: Ghennie Tatiana Rodríguez-Rey ( ghennie.rodriguez@ucaldas.edu.co ) Academic editor: Myriam Heuertz
© 2026 Anneth Díaz-Reyes, Natalia Aguirre-Acosta, Carolina Feuillet-Hurtado, Ghennie Tatiana Rodríguez-Rey.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Díaz-Reyes A, Aguirre-Acosta N, Feuillet-Hurtado C, Rodríguez-Rey GT (2026) Population genomics of Ulex europaeus in the Northern Andes: insights into an invasive species in high-mountain ecosystems. Plant Ecology and Evolution 159(1): 106-122. https://doi.org/10.5091/plecevo.165188
|
Background and aims – Ulex europaeus is an invasive allopolyploid (hexaploid) plant considered a global threat that has successfully colonized various regions, including the Northern Andes. This study aimed to assess its diversity and genetic structure in six sampling sites from the Central and Eastern Andes of the Northern Andes of Colombia, located in conservation-priority highland landscapes, using single nucleotide polymorphisms.
Material and methods – Given the complex inheritance patterns of polyploids, both diploid and hexaploid datasets were analysed to estimate genetic diversity (e.g. Ho, He, FIS, and private alleles) and population structure (using e.g. FSTp, AMOVA, STRUCTURE, PCA, DAPC analyses). Additionally, the introduction history of the species in the Northern Andes, particularly the introduction from the Eastern to the Central Andes, was investigated using ABC-RF.
Key results – Diploid and hexaploid datasets showed consistent clustering patterns, supporting predominantly disomic inheritance. Populations exhibited high heterozygosity (diploid: Ho = 0.302, He = 0.252; hexaploid: Ho = 0.483, He = 0.222). Genetic structure analyses showed moderate differentiation (FSTp = 0.118 for diploid; FSTp = 0.074 for hexaploid) and significant isolation by distance (diploid: r = 0.479, p value = 0.022; hexaploid: r = 0.507, p value= 0.012), but without a clearly defined spatial pattern, suggesting restricted gene flow influenced by external factors. ABC-RF analyses indicated at least two independent introduction events in the Eastern Andes, followed by multiple dispersal events into the Central Andes.
Conclusion – Ulex europaeus populations in the Northern Andes maintain high heterozygosity and restricted gene flow. Polyploidy likely contributes to preserving genetic diversity, while multiple introduction events and human-mediated dispersal shape population structure, underscoring the complex invasion dynamics of this species in high-mountain ecosystems.
gene flow barriers, genetic structure, invasion history, SNPs, Andean paramo, polyploidy
In recent decades, the number of introduction events and the successful establishment of invasive species have increased, primarily driven by climate change and human activities (
Assessing genetic diversity and structure within invaded regions allows us to understand how introduced species respond to local environmental filters, landscape fragmentation, and stochastic demographic processes (
Genetic studies on invasive species have revealed contrasting patterns of diversity depending on introduction history, reproductive strategies, and life history traits (
Reductions in genetic diversity often occur during the establishment stage, particularly when propagules originate from a single introduction event or a limited source population, potentially constraining adaptive potential in the novel environment (
Among the mechanisms that enhance invasion success, polyploidy stands out as a key factor, particularly in novel environments (
Ulex europaeus L. (gorse) is a widespread invasive shrub and an ideal model to study the influence of polyploidy and genetic variability on invasion success. It is an allopolyploid that originated from hybridization between two genetically distinct ancient lineages within the Ulex genus with wide climatic ranges. These lineages separated around 5 million years ago (Mya), and after a prolonged period of independent evolution, they gave rise to the hexaploid allopolyploid U. europaeus between ca 0.7 and 2 Mya (
Previous genetic studies on U. europaeus have been conducted both in its native range (i.e. Spain, France, and Scotland) and in invaded regions (i.e. Chile, Réunion, New Zealand, and the United States).
Furthermore, given the hybrid origin of U. europaeus from divergent lineages, the species may exhibit a predominantly disomic inheritance pattern, in which only homologous chromosomes pair and recombine during meiosis, producing gametes with allelic combinations similar to those of diploid species (
Here, we evaluated the population genomics of U. europaeus in high mountain ecosystems of the Central and Eastern Andes using single nucleotide polymorphisms (SNPs). Specifically, we addressed the following questions: (1) How does the genetic diversity of U. europaeus vary across the Northern Andes? (2) Is gene flow restricted among sampling sites in this region? (3) What was the likely introduction history of U. europaeus from the Eastern to the Central Andes? We hypothesize that polyploidy may contribute to high heterozygosity in these populations, which could influence the species’ genetic diversity. Additionally, we expect that biogeographic barriers limit genetic connectivity and promote population differentiation due to unsuitable conditions for U. europaeus in adjacent ecosystems.
For the population genomic analysis, samples were collected in six sampling sites located in the Central and Eastern Andes, within the Northern Andes of Colombia, where the species has established naturalized, non-planted populations and with individuals in reproductive stage (Fig.
In the Central Andes, three sampling sites were selected. One site is located in the Lagunilla sector of Murillo, Tolima Department (Central-TOL1), within the buffer zone of Los Nevados National Natural Park and characterized by a subpáramo ecosystem. The other two sites are situated in high Andean forests: the Alaska sector of Murillo, Tolima (Central-TOL2), and San Felix, Caldas Department (Central-CAL). Both sites are subject to disturbance from livestock activity.
In the Eastern Andes, the selected sampling sites include Bogotá D.C. (East-DC), the capital of Colombia and the area closest to the Eastern Hills, where U. europaeus was first recorded in the Northern Andes; Chía, Cundinamarca Department (East-CUN); and Ventaquemada, Boyacá Department (East-BOY). All three are located within high Andean Forest ecosystems and are embedded in a landscape matrix that has been modified and fragmented by human activities.
At each sampling site, mature leaves were collected from 30 adult individuals (N = 180), each separated by at least five meters to reduce the likelihood of sampling clonal individuals. Samples were labelled and preserved in silica gel. Subsequently, leaf tissue was macerated with liquid nitrogen and stored at -20°C until DNA extraction.
Genomic DNA (gDNA) was extracted using the CTAB II protocol (
Sequencing data were demultiplexed using Illumina bcl2fastq v2.20, allowing one or two mismatches (or N) in barcode recognition, depending on barcode uniqueness. Adapter sequences were removed and reads not matching with restriction sites at the 5’ end were discarded. Reads were trimmed with Trimmomatic v.0.39 (
Due to the absence of a reference genome for U. europaeus, a de novo clustering approach was used. Combined reads from all individuals were clustered using CD-HIT-EST v.4.6.1 (
Given that U. europaeus may exhibit predominantly disomic inheritance with potential segmental components, parallel analyses were performed under both inheritance assumptions to capture its genomic complexity. Because many population genomic tools have limited compatibility with polyploid data, analyses were conducted using diploid and hexaploid datasets when possible (ploidy = 2 and ploidy = 6, respectively). In cases where polyploid data could not be processed, only the diploid dataset was used. Comparing both approaches allowed us to assess the robustness of genetic patterns inferred under different inheritance models.
Variant calling and SNP genotyping for both datasets were performed using Freebayes v.1.3.6 (
Genetic analyses were conducted using a multi-locus, genome-wide approach that enabled a comprehensive assessment of genetic diversity within sampling sites and of the population genetic structure. This approach provides robust insights into the evolutionary history of populations by including loci with different genealogies, thereby increasing statistical power, reducing false discovery rates, and improving the precision of population genetic parameter estimates (
Observed (Ho) and expected (He) heterozygosity were estimated at the overall level for both diploid and hexaploid datasets using the poppr v.2.9.4 and vcfR v.1.14.0 packages in R v.4.2.1 (
To evaluate genetic differentiation, Wright’s fixation indices (FST) were estimated pairwise among sampling sites for both diploid and hexaploid datasets using 10,000 permutations in the StAMPP v.1.6.3 package in R (
The number of genetic clusters (K), representing the number of inferred populations to which individuals could be assigned, was estimated using the Structure v.2.3.4 tool (
To explore patterns of genetic variation among individuals and populations, visualize population structure, and assess the degree of differentiation between sampling sites in the Northern Andes, principal component analysis (PCA) and discriminant analysis of principal components (DAPC) were also performed for both datasets using the adegenet v.2.0.1 package in R (
The relationship between geographic distance and genetic differentiation among sampling sites was evaluated for both datasets (diploid and hexaploid) using a Mantel test (
To infer the introduction history of U. europaeus from the Eastern to the Central Andes, an Approximate Bayesian Computation analysis based on the diploid dataset was performed using the Random Forest algorithm (ABC-RF), implemented in DIYABC-RF v.1.2.1 (
In a first step, three introduction scenarios were tested for the Eastern Andes sites East-CUN and East-DC, located near the Eastern Hills of Bogotá, where the earliest records of the species in Colombia are concentrated. The scenarios evaluated were: (a) independent introductions of East-CUN and East-DC from a shared ancestral population; (b) introduction to East-DC from the ancestral population, followed by introduction to East-CUN from East-DC; and (c) introduction to East-CUN from the ancestral population, and a subsequent introduction to East-DC from East-CUN (Suppl. material
Based on the most likely scenario identified in the first step and on genetic clustering patterns revealed in previous analyses, four alternative introduction scenarios were subsequently modelled for the Central Andes sampling sites (Central-TOL2, Central-TOL1, and Central-CAL). These scenarios evaluated whether the sites were established through three independent introduction events (scenarios 1 and 2), or through two introduction events: one involving the Tolima sites (Central-TOL2 and Central-TOL1), and another involving the Caldas site (Central-CAL; scenarios 3 and 4; Suppl. material
In each modelling step, 30,000 and 80,000 simulated datasets were generated under a uniform prior, respectively. Effective population size priors ranged from 10 to 10,000 for sampling sites and from 10 to 100,000 for the unsampled ghost ancestral population. Divergence time priors ranged between 10 and 1,000 generations, with the constraint that the most recent divergence time was smaller than the older ones (t1 < t2 for the first step, and t1 < t2 < t3 < t4 for the second step). Population size changes following the introductions were also included. The data were summarized using summary statistics for SNPs and linear discriminant analysis (LDA). The best scenario was selected using the Random Forest algorithm with 10 replicates of 1,000 trees. Posterior probabilities and prior error rates were estimated to evaluate model performance (
In summary, both diploid and hexaploid datasets were used to estimate overall heterozygosity indices (Ho and He), FST, Bayesian clustering, PCA, DAPC, Mantel test, and clustering trees. In contrast, only the diploid dataset was used for AMOVA, per-site estimates of Ho, He, Ap, FIS, PHWE, and PE-Het, and ABC-RF analyses (Suppl. material
A total of 472.5 million raw paired-end reads were obtained from the 180 individuals, averaging approximately 2.5 million reads per sample. The reference cluster assembly consisted of 359,271 contigs. Variant calling generated 394,021 SNPs for the diploid dataset and 82,388 SNPs for the hexaploid dataset. After quality control and filtering, 3,891 SNPs were retained in the diploid dataset (with 5.6% missing data) and 2,994 SNPs in the hexaploid dataset (with 6.1% missing data). These variants were distributed across 1,867 contigs in the diploid dataset and 1,582 contigs in the hexaploid dataset. Both datasets included all 180 individuals, and quality control confirmed high sequencing quality across all samples (Suppl. material
At the overall level, observed heterozygosity (Ho) exceeded expected heterozygosity (He) in both datasets: Ho = 0.302 and He = 0.252 for the diploid dataset, and Ho = 0.483 and He = 0.222 for the hexaploid dataset (Table
Genetic diversity statistics by sampling site and overall, estimated from diploid and hexaploid datasets for Ulex europaeus in the Central and Eastern Andes of the Northern Andes. Private alleles (Ap), observed heterozygosity (Ho), expected heterozygosity (He), p value of the Hardy-Weinberg equilibrium test (PEHW), p value of the test for excess heterozygosity (PE-Het), and inbreeding coefficient (FIS). *Highly significant p value (< 0.0001).
| Andes | Sampling site | Ap | Ho | He | PEHW | FIS |
| PE-Het | ||||||
| Central | Central-TOL1 | 5 | 0.383 | 0.288 | <0.0001* | -0.369 |
| <0.0001* | ||||||
| Central-TOL2 | 1 | 0.343 | 0.264 | <0.0001* | -0.364 | |
| <0.0001* | ||||||
| Central-CAL | 2 | 0.388 | 0.281 | <0.0001* | -0.443 | |
| <0.0001* | ||||||
| Eastern | East-CUN | 1 | 0.346 | 0.268 | <0.0001* | -0.344 |
| <0.0001* | ||||||
| East-DC | 0 | 0.345 | 0.262 | <0.0001* | -0.356 | |
| <0.0001* | ||||||
| East-BOY | 2 | 0.460 | 0.319 | <0.0001* | -0.493 | |
| <0.0001* | ||||||
| Overall (Diploid) | - | 11 | 0.302 | 0.252 | <0.0001* | -0.325 |
| Overall (Hexaploid) | - | - | 0.483 | 0.222 | - | - |
Among sampling sites, East-BOY exhibited the highest observed heterozygosity (Ho = 0.460) and the lowest FIS (-0.493), suggesting high genetic diversity and excess of heterozygotes. In contrast, East-DC showed no private alleles, while Central-TOL1 had the highest number (Ap = 5), highlighting differences in allelic richness across sampling sites (Table
The evaluation of genetic differentiation revealed moderate population structuring in the diploid dataset and weaker structuring in the hexaploid dataset. In the diploid dataset, the overall fixation index estimated using Weir and Cockerham’s method (FST = 0.100) was nearly identical to the unweighted mean of pairwise values (FSTp = 0.118), indicating consistent differentiation estimates regardless of the approach used. In contrast, the hexaploid dataset showed a lower overall differentiation (FSTp = 0.074), reflecting weaker population structure. Significant pairwise FST values were observed among all sampling sites in both datasets (Fig.
Pairwise estimates of FST among sampling sites from the diploid (below the diagonal) and hexaploid (above the diagonal) datasets for Ulex europaeus in the Central and Eastern Andes of the Northern Andes. All pairwise estimates FST were significant after False Discovery Rate correction (p value < 0.0001).
In the Central Andes, Central-TOL1 and Central-TOL2, the two geographically closest sites, exhibited relatively low genetic differentiation (diploid: FST = 0.084; hexaploid: FST = 0.054) compared to their higher differentiation with Central-CAL (diploid: FST = 0.152 and 0.136; hexaploid: FST = 0.081 and 0.079). Similarly, in the Eastern Andes, East-BOY showed high genetic differentiation from all other sites, while East-DC and East-CUN, being geographically closer, presented lower levels of differentiation (diploid: FST = 0.066; hexaploid: FST = 0,051; Fig.
The Mantel test revealed a moderate but significant positive correlation between genetic and geographic distance, suggesting a pattern of isolation-by-distance (diploid: r = 0.479, p value = 0.022; hexaploid: r = 0.507, p value = 0.012; Suppl. material
In the structure analysis, although the ΔK method identified K = 3 as the most likely number of genetic clusters for both datasets, the average log-likelihood (estLnProbMean) supported K = 6, which reflects the finer-scale genetic structure observed among sampling sites (Suppl. material
Population structure inferred by structure (K = 6) from the diploid (A) and hexaploid (B) datasets for Ulex europaeus in the Central and Eastern Andes of the Northern Andes. The colours represent the genetic clusters; each vertical line corresponds to an individual and indicates the proportion of membership to each group. The labels below the bars indicate the sampling site of each individual, grouped in sets of 30.
Principal component analyses (PCA) confirmed the genetic differentiation among sampling sites, revealing six distinguishable groups in both datasets. The first two principal components explained 15.92% of the total variation in the diploid dataset and 12.08% in the hexaploid dataset (Fig.
The Kosman/Leonard dendrograms showed six well-defined genetic groups corresponding to the sampling sites, which were clustered into two main clades. In the diploid dataset, Central-TOL1 and Central-TOL2 (Central Andes) showed a close genetic similarity with East-CUN (Eastern Andes), while East-BOY and East-DC (Eastern Andes), grouped together and exhibited genetic similarity with Central-CAL (Central Andes) (Fig.
In the ABC-RF analysis, the first step indicated that the most likely scenario for the introduction of U. europaeus to the Eastern Hills involved at least two independent introduction events (scenario 1): one into East-CUN and another into East-DC. In both cases, propagules originated from an ancestral, unsampled source population introduced into the country (Fig.
Graphical illustration of the most likely introduction scenario of Ulex europaeus from the Eastern Andes to the Central Andes in the Northern Andes. Sampling sites are shown using consistent colours across the tree and the map, with the salmon-coloured symbol representing the unsampled ancestral source population. In the tree, inferred dispersal pathways are indicated at the corresponding nodes. On the map, source populations are represented by stars, whereas recipient populations are represented by dots
Scenarios evaluated and results from the two successive steps of the ABC-RF analysis used to infer the introduction history of Ulex europaeus from the Eastern Andes to the Central Andes, based on the diploid dataset. The best (most likely) scenario identified at each step is indicated in bold and marked with an asterisk (*). Prior error, RF votes, and posterior probability values correspond to the average of 10 RF replicates.
| Scenarios | Prior error (%) | RF votes (%) | Posterior probability |
| Step 1. Eastern Hills Three scenarios, 16 summary statistics, 30,000 simulations, and 1,000 decision trees | |||
| 1. Independent introduction of East-CUN and East-DC from an ancestral population* | 0.371 | 46.5 | 0.576 |
| 2. Introduction to East-DC from an ancestral population, and posterior dispersal to East-CUN from East-DC | 24.4 | - | |
| 3. Introduction to East-CUN from an ancestral population, and posterior dispersal to East-DC from East-CUN | 29.1 | - | |
| Step 2. Introduction to Central Andes Four scenarios, 292 summary statistics, 80,000 simulations, and 1,000 decision trees | |||
| 1. Three independent introduction events, one for each population: first Central-TOL2 from East-CUN, then Central-TOL1 from East-CUN, and finally Central-CAL from East-DC | 0.205 | 4.1 | - |
| 2. Three independent introduction events, one for each population: first Central-TOL1 from East-CUN, then Central-TOL2 from East-CUN, and finally Central-CAL from East-DC | 0.7 | - | |
| 3. Two independent introduction events, one for Tolima sampling sites and another for Caldas sampling site: first Central-TOL2 from East-CUN, then Central-TOL1 from Central-TOL2, and finally Central-CAL from East-DC | 44.2 | - | |
| 4. Two independent introduction events, one for Tolima sampling sites and another for Caldas sampling site: first Central-TOL1 from East-CUN, then Central-TOL2 from Central-TOL1, and finally Central-CAL from East-DC* | 51.0 | 0.530 |
The second step of the analysis suggests that the most likely origin of the sampling sites in the Central Andes involved two independent introduction events: one into the Tolima sites (Central-TOL1 and Central-TOL2), originating from East-CUN; and another into Central-CAL, originating from East-DC (Fig.
The invasive success of U. europaeus in the Northern Andes does not appear to be limited by genetic changes associated with its introduction process. The consistently high levels of heterozygosity across sampling sites may support the species’ persistence and adaptive potential in these novel environments. A moderate level of genetic structure was detected, indicating restricted gene flow among sites and revealing populations that are beginning to differentiate. Overall, the observed patterns are consistent with a complex introduction history involving multiple independent introduction events followed by localized dispersal processes.
Genetic diversity and population structure analyses performed with both diploid and hexaploid datasets revealed similar patterns. Diversity estimates showed consistent trends across assumed ploidy models, and the genetic structure analyses indicated similar clustering among sampling sites. These results suggest that U. europaeus populations in the Northern Andes exhibit a predominantly disomic behaviour, such that the diploid model adequately describes the patterns observed in the hexaploid dataset. Any potential segmental behaviour would likely be restricted to specific multicopy genomic regions, such as the ETS and ITS (
The populations analysed showed no evidence of heterozygosity loss. On the contrary, a consistent excess of heterozygotes was detected, which may be explained by several factors. On the one hand, the founding individuals may have been predominantly heterozygous, providing an initial evolutionary advantage (
Beyond these intrinsic mechanisms, demographic processes commonly associated with recent invasions could contribute to the high levels of heterozygosity observed. In particular, introduced populations can exhibit elevated heterozygosity due to the recent admixture of multiple introduced lineages, which rapidly increases genetic variability by combining alleles from different source regions (
Pairwise FST values from both datasets were significant among all sampling sites, indicating moderate population structuring and restricted gene flow (non-random mating) even among geographically close sites. Although the Central and Eastern Andes are separated by the Magdalena River Valley and lie above 2600 m a.s.l., a potential biogeographic barrier, no clear genetic clustering by mountain range (Eastern vs Central Andes) was detected. This suggests that some level of gene flow does occur among sites, but that it is not sufficient to fully homogenize the populations.
Natural dispersal in U. europaeus occurs mainly through explosive dehiscence, zoochory, and hydrochory (
Limited gene flow within and between mountain ranges may be facilitated by human activities. Intentional dispersal, such as the use of U. europaeus in live fences (
Clustering analyses revealed that the East-DC and East-CUN populations, located near the Eastern Hills of Bogotá, exhibited higher genetic admixture and lower differentiation than the other populations. This pattern suggests that propagules introduced into other sampling sites likely originated from areas near Bogotá (
Regarding the colonization of U. europaeus in the Central Andes, the ABC-RF analysis showed some uncertainty about the sequence of establishment of the Tolima populations. However, the most likely scenario indicates that Central-TOL1 was first established from East-CUN, followed by the colonization of Central-TOL2 from Central-TOL1, whereas Central-CAL originated independently from East-DC more recently. This sequence aligns with the introduction times reported by
In conclusion, the results of this study indicate that populations of U. europaeus in the Northern Andes maintain consistently high levels of heterozygosity. This pattern supports the hypothesis that polyploidy may contribute to preserving genetic variation and enhancing the species’ adaptability to a variety of environmental conditions, thereby facilitating its persistence and expansion in invaded ecosystems. Genetic structure analyses revealed recent differentiation among populations and restricted gene flow. While biogeographic features such as the inter-Andean valley may limit dispersal in the region, our results did not detect an effect of this barrier on the clustering pattern and instead suggests that other factors, including human-mediated dispersal, play an important role in shaping genetic connectivity. Finally, the inferred introduction history points to at least two independent introduction events in the Eastern Andes, followed by multiple introductions into the Central Andes, highlighting the complexity of U. europaeus invasion in high mountain ecosystems. Future studies incorporating environmental data and genome-wide selection analyses could provide insights into the role of local adaptation in the success of this species.
The sequencing data generated and analysed during this study have been deposited in the NCBI BioProject database under the BioProject ID PRJNA1249511 (https://www.ncbi.nlm.nih.gov/bioproject/1249511). The data that supports the findings of this study are available in the supplementary material of this article.
We express our gratitude to the Ministerio de Ciencia, Tecnología e Innovación (Minciencias) and the Universidad de Caldas for supporting the project entitled “Population status, reproductive biology and genetic diversity of the invasive plant Ulex europaeus L., in populations of the Colombian Central Cordillera” (projects 1127-852-71470 and 0643120), as well as for funding the ADR master’s degree. We also thank the Centro de Bioinformática y Biología Computacional (BIOS) for providing access to their high-performance computing cluster. We thank the Laboratorio de Colecciones Biológicas and Laboratorio de Biología Vegetal y Molecular of the Universidad de Caldas for providing the space and equipment necessary for the laboratory phase of this research. We thank María Camila Ángel-Vallejo and Eliana Jimena García-Marín for their assistance and contributions during fieldwork. We also extend our thanks to the owners of the property in Alaska for allowing us access to their land, where invasive U. europaeus populations are present. Finally, we sincerely thank the editor and reviewer for their constructive comments and valuable suggestions, which helped to substantially improve the quality of this manuscript.
Additional tables and figures supporting the analyses.