Research Article
Corresponding author: Arthur F. Boom (boomarthur@gmail.com). Academic editor: Isabel Larridon
© 2022 Arthur F. Boom, Jérémy Migliore, Esra Kaymak, Pierre Meerts, Olivier J. Hardy.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Boom AF, Migliore J, Kaymak E, Meerts P, Hardy OJ (2022) Nuclear ribosomal phylogeny of Brachystegia: new markers for new insights about rain forests and Miombo woodlands evolution. Plant Ecology and Evolution 155(2): 301-314. https://doi.org/10.5091/plecevo.91373
Background and aims – Brachystegia is a species-rich tree genus found in tropical Africa and a typical element of Miombo woodlands, a widely distributed subtype of the Zambezian savanna. Plastid DNA has been shown to be largely uninformative for assessing phylogenetic relationships among species because of widespread chloroplast capture. Here, we aim to assess the capacity of nuclear ribosomal DNA (rDNA) to clarify the phylogeny of Brachystegia species while accounting for intra-individual site polymorphisms (2ISPs), which are often present in rDNA and potentially phylogenetically informative.
Material and methods – Genome skimming sequencing on 47 samples representing 27 of the 29 currently recognized Brachystegia species allowed us to retrieve complete nuclear ribosomal cistrons encoding the 18S, 5.8S, and 25S rRNA genes (35S rDNA). We reconstructed the Brachystegia phylogeny using Maximum Likelihood methods based on the standard substitution model or integrating 2ISPs (GENOTYPE implementation in RAxML-NG). We additionally tested the effect of partitioning the data (one partition for rDNA genes and one for the ITS1+ITS2). We also conducted network inferences (Neighbor-Net splits graph), as a strictly bifurcating approach might not properly model topological uncertainty at shallow phylogenetic depth.
Key results – 2ISPs-aware and standard phylogenetic reconstructions are largely congruent. We identified several well-supported main clades clarifying the species relationships, including two clades of Miombo woodlands species. Miombo Group A includes species with ovoid to globose axillary dormant buds, while Miombo Group B species have flattened ones. Two morphologically close Brachystegia species (B. kennedyi and B. leonensis) found in Guineo-Congolian rain forests also form a robustly supported clade. 2ISPs coding allowed us to identify an additional Guineo-Congolian clade (B. eurycoma and B. nigerica). Ribosomal DNA therefore proves more useful than plastid DNA for exploring the generic phylogeny. However, the species relationships within and among the main clades remain poorly resolved, probably due to recent diversification and/or recurrent hybridization, so the diversification of Brachystegia remains to be characterised more fully.
Conclusion – Nuclear and plastid phylogenetic reconstructions of Brachystegia species are discordant. Even if not well-resolved, rDNA phylograms and networks are characterized by taxonomic sorting, while we observe a strictly geographic sorting in the plastid dataset. Most of the species’ relationships remain to be characterized using additional nuclear markers combined with in-depth morphological investigations.
Brachystegia, genome skimming, phylogeny, RAxML-NG, ribosomal DNA, Zambezian woodlands, Miombo woodlands, 2ISPs
Phylogenetic studies are needed to investigate the evolutionary processes that shaped the current African savannas (e.g.
The nuclear-encoded 35S ribosomal DNA (rDNA) cistron, comprising the 18S, 5.8S, and 25S rRNA genes, is the most easily accessible nuclear DNA region using genome skimming, i.e. shallow genomic sequencing (
In addition to 2ISPs, other processes could blur the phylogenetic relationships between species at the nuclear genome, namely hybridization and incomplete lineage sorting (ILS) (
This work aims to investigate the diversification of Brachystegia, bridging the gap between genetic and morphological/taxonomic considerations through a multimarker phylogenetic approach. Namely, we use genome skimming on specimens from museum collections with degraded DNA (
Brachystegia plant material (leaves) was collected from vouchers held in the following four herbaria: BR, BRLU, FHO, and LISC (acronyms according to Index Herbariorum, Thiers continuously updated). In addition, we collected material (n = 10) during fieldwork and dried the leaves in silica gel for DNA extraction. In total, 47 individuals of 27 of the 29 described species of Brachystegia (following
Specimens used for the rDNA phylogeny. Most of them were also included in the plastid phylogeny of
No | Taxon | Voucher | Coordinates | Country | GenBank |
---|---|---|---|---|---|
1 | B. allenii Hutch. & Burtt Davy | Milne-Redhead & Taylor 7663 (FHO) | -10.689, 38.945 | Tanzania | OK335216 |
2 | B. allenii Hutch. & Burtt Davy | White 2406A (BR) | -14.960, 30.246 | Zambia | OK335217 |
3 | B. angustistipulata De Wild. | Jefford, Juniper & Newbould 2799 (BR) | -6.000, 30.000 | Tanzania | OK335218 |
4 | B. bakeriana Hutch. & Burtt Davy | Dechamps, Murta & da Silva 1327 (BR) | -14.817, 18.633 | Angola | OK335219 |
5 | B. boehmii Taub. | Procter 262 (FHO) | -4.832, 29.962 | Tanzania | OK335220 |
6 | B. boehmii Taub. | Duvigneaud 2833 (BRLU) | | DR Congo | OK335221 |
7 | B. bussei Harms | White 2410 (BR) | -14.726, 30.762 | Zambia | OK335222 |
8 | B. bussei Harms | Burtt 4736 (BR) | -6.041, 37.519 | Tanzania | OK335223 |
9 | B. cynometroides Harms | Forest Product Research Laboratory n/a (FHO, collection date 6 Nov. 1969) | | Cameroon | OK335224 |
10 | B. eurycoma Harms | Latilo & Daramola 28945 (BR) | 7.710, 11.480 | Nigeria | OK335225 |
11 | B. eurycoma Harms | Chapman 156 (FHO) | 7.230, 10.628 | Nigeria | OK335226 |
12 | B. floribunda Benth. | Barbosa 11037A (LISC) | -12.179, 17.242 | Angola | OK335227 |
13 | B. floribunda Benth. | Boom 41 (BRLU) | -11.530, 27.467 | DR Congo | OK335228 |
14 | B. gossweileri Hutch. & Burtt Davy | Barbosa 10988 (FHO) | -10.735, 14.981 | Angola | OK335229 |
15 | B. gossweileri Hutch. & Burtt Davy | Mendes dos Santos 1980 (FHO) | -12.148, 18.090 | Angola | OK335230 |
16 | B. kennedyi Hoyle | Meikle & Keay 581 (BR) | 6.105, 5.893 | Nigeria | OK335231 |
17 | B. kennedyi Hoyle | Kennedy 2181 (FHO) | 6.105, 5.893 | Nigeria | OK335232 |
18 | B. laurentii (De Wild.) Louis ex Hoyle | Wieringa 4529 (BR) | -0.974, 10.925 | Gabon | OK335233 |
19 | B. leonensis Hutch. & Burtt Davy | Sesay 51 (BR) | 8.913, -11.728 | Sierra Leone | OK335234 |
20 | B. leonensis Hutch. & Burtt Davy | Jongkind 9067 (BR) | 5.646, -8.135 | Liberia | OK335235 |
21 | B. longifolia Benth. | Boom 37 (BRLU) | -10.915, 28.517 | DR Congo | OK335236 |
22 | B. longifolia Benth. | Dechamps, Murta & da Silva 1400 (BR) | -11.983, 18.283 | Angola | OK335237 |
23 | B. longifolia Benth. | Boom 39 (BRLU) | -11.530, 27.466 | DR Congo | OK335238 |
24 | B. manga De Wild. | Groome & Hoyle 1073 (FHO) | -7.623, 33.403 | Tanzania | OK335239 |
25 | B. manga De Wild. | Duvigneaud 1214 (BR) | -11.187, 27.905 | DR Congo | OK335240 |
26 | B. michelmorei Hoyle | Astle 797 (FHO) | -9.796, 29.295 | Zambia | OK335241 |
27 | B. microphylla Harms | Leippert 6334 (BR) | -4.485, 35.758 | Tanzania | OK335242 |
28 | B. mildbraedii Harms | Wieringa 5552 (BR) | -1.435, 10.478 | Gabon | OK335243 |
29 | B. nigerica Hoyle & A.P.D.Jones | Chesters A124/30 (BR) | 6.155, 6.770 | Nigeria | OK335244 |
30 | B. nigerica Hoyle & A.P.D.Jones | Lapido 19061 (FHO) | 7.134, 3.840 | Nigeria | OK335245 |
31 | B. puberula Hutch. & Burtt Davy | Bamps, Martins & Matos 4473 (BR) | -14.217, 14.033 | Angola | OK335246 |
32 | B. russelliae I.M.Johnst. | Mendes 55 (BR) | | Angola | OK335247 |
33 | B. russelliae I.M.Johnst. | Mendonça 4593 (FHO) | -12.476, 16.295 | Angola | OK335248 |
34 | B. spiciformis Benth. | Liben 1742 (BR) | -5.863, 23.392 | DR Congo | OK335249 |
35 | B. spiciformis Benth. | Boom 61 (BRLU) | -11.488, 27.600 | DR Congo | OK335250 |
36 | B. spiciformis Benth. | Duvigneaud & Timperman 242 B2 (BRLU) | -10.598, 22.345 | DR Congo | OK335251 |
37 | B. spiciformis Benth. | Barbosa, Henriques & Moreno 2164 (BRLU) | -14.831, 13.621 | Angola | OK335252 |
38 | B. stipulata De Wild. | Boom 7 (BRLU) | -11.510, 28.007 | DR Congo | OK335253 |
39 | B. tamarindoides Welw. ex Benth. | Torre 8678 (LISC) | -15.096, 13.565 | Angola | OK335254 |
40 | B. taxifolia Harms | Boom 24 (BRLU) | -11.477, 27.662 | DR Congo | OK335255 |
41 | B. taxifolia Harms | Boom 46 (BRLU) | -11.533, 27.463 | DR Congo | OK335256 |
42 | B. taxifolia Harms | Duvigneaud 3614br2 (BRLU) | -12.019, 27.784 | DR Congo | OK335257 |
43 | B. torrei Hoyle | Torre & Paiva 11521 (LISC) | -15.587, 39.614 | Mozambique | OK335258 |
44 | B. utilis Hutch. & Burtt Davy | Boom silica collection ABo0065 (BRLU) | -11.515, 28.067 | DR Congo | OK335260 |
45 | B. utilis Hutch. & Burtt Davy | Salubeni & Chikuni 6642 (FHO) | -12.069, 33.588 | Malawi | OK335259 |
46 | B. wangermeeana De Wild. | Boom 10 (BRLU) | -11.516, 28.008 | DR Congo | OK335261 |
47 | B. wangermeeana De Wild. | Plancke 154/2025 (BRLU) | -10.693, 23.182 | DR Congo | OK335262 |
48 | J. paniculata (Benth.) Troupin | Boom 51 (BRLU) | -11.432, 27.469 | DR Congo | OK335215 |
First, the quality of the genomic libraries was checked using FastQC v.0.11 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and then they were trimmed to remove low-quality regions and Illumina adapters using Trim Galore! v.0.4.5 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). De novo assemblies of the rDNA sequences were done for each sample using GetOrganelle v.1.6.2 (default parameters for ribosomal assembly;
In order to evaluate the impact of 2ISPs, we performed Maximum Likelihood (ML) based phylogenetic inferences with RAxML-NG software using three different implementations (
Mutation patterns and rates differ between the rRNA genes and the internal transcribed spacers (
Finally, to evaluate the ability of rDNA to retrieve clades with taxonomic significance, we assessed whether sequence divergence was correlated with taxonomy and/or geographical distance using Mantel tests computed with the R package vegan v.2.5-6 (
After the trimming step, we obtained a mean of 1,227,134 reads per library (SD = 642,412). Raw sizes of the different rDNA assemblies ranged between 5,948 bp and 9,048 bp (mean = 7,526 bp, SD = 757 bp), and the length of the extracted 18S–ITS1–5.8S–ITS2–25S sequences ranged between 5,823 bp and 5,843 bp (mean = 5,828 bp). Mean depth coverage per sample ranged between 49X and 1,096X (mean = 260X, median = 192X). This wide range in mean depth coverage was also reflected in the relative rDNA read content: between 0.19% and 3.35% of the total number of reads in each genomic library mapped on the corresponding assembly (mean = 0.81%, SD = 0.66%). The final alignment included a total of 5,973 sites and contained 166 variable and 78 parsimony-informative sites when 2ISPs were not taken into account. When the outgroup J. paniculata was removed, these values dropped to 108 variable and 75 parsimony-informative sites. Taking into account the exclusive 2ISP sites, i.e. sites characterised by a single-defined nucleotide with ambiguity codes, the number of variable sites rose to 394 and 338, with and without J. paniculata, respectively. Details regarding the exact base composition of each sequence are available in Supplementary file 2.
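The site counts reported above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name `classify_sites` and the toy alignment are hypothetical, and the sketch assumes that only A, C, G, and T are counted, so that gaps, N, and IUPAC ambiguity codes are ignored (the "2ISPs not taken into account" setting). A column is treated as variable when it shows at least two nucleotides, and as parsimony-informative when at least two nucleotides each occur in at least two sequences.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")


def classify_sites(alignment):
    """Return (variable, parsimony_informative) column counts.

    `alignment` is a list of equal-length sequence strings. Characters
    other than A, C, G and T (gaps, N, IUPAC ambiguity codes) are
    ignored, which is an assumption of this sketch.
    """
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all sequences must have the same length")
    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(
            base.upper() for base in column if base.upper() in UNAMBIGUOUS
        )
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states, each seen at least twice.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


# Toy alignment (hypothetical data, five sequences of five sites).
toy = ["AAGTC", "AAGTC", "ATGCC", "ATGYC", "ACGTR"]
print(classify_sites(toy))  # (2, 1)
```

In the toy alignment, the second column (A, A, T, T, C) is both variable and informative, the fourth column is variable only because the Y is skipped, and the R in the last column does not make that column variable.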
ML-N and ML-A phylogenetic reconstructions supported similar groupings, with minor differences (more details in Supplementary file 3: Fig. S1). For clarity, we only present the ML-A and ML-I cladograms, i.e. without branch lengths, in the main article (Fig.
Maximum Likelihood phylogenetic inferences of Brachystegia species using rDNA sequences and two different coding schemes. Cladograms were produced using RAxML-NG software (
Partitioning the data (ITS1+ITS2 and rDNA genes) did not produce major differences that are supported with high bootstrap values (Supplementary file 3: Figs S5, S6, and S7b). We again observed three to four robustly supported clades, low BS values for most of the branches, and individuals from the same species were only placed together in four or six cases depending on the RAxML-NG implementation used (Supplementary file 3: Figs S5, S6, and S7).
The Neighbor-Net splits (Fig.
The correlation between the genetic and taxonomic distances was significant for the Miombo species as a whole (r = 0.19, p value = 0.001, n = 35) but not within Miombo Group A (r = 0.11, p value = 0.138, n = 17) or Miombo Group B (r = -0.069, p value = 0.219, n = 18). Hence, rDNA discriminated between different Miombo taxonomic groups but did not provide fine-scale taxonomic information. The correlation between geographical and genetic distances was non-significant for Miombo species as a whole (r = 0.12, p value = 0.028, n = 35) and for Miombo Group B (r = -0.02, p value = 0.551, n = 18) but was significant within Miombo Group A (r = 0.34, p value = 0.001, n = 17). The mean genetic distance between individuals was higher in Miombo Group A than in Miombo Group B (i.e. differences in the nucleotide composition of 0.03% and 0.0055%, respectively) for roughly the same number of species (n = 10) and specimens (n = 17–18). Overall, the number of parsimony-informative sites within Miombo Group A and B was low (n = 17 in both clades).
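For readers unfamiliar with Mantel tests, the logic behind the correlations above can be sketched in plain Python. This is an illustration only and not the vegan implementation used in the study: the function names, the default of 999 permutations, the one-sided p value computed as (hits + 1) / (permutations + 1), and the toy matrices are all assumptions of this sketch. The statistic is the Pearson correlation between the upper triangles of two distance matrices, and significance is assessed by jointly permuting the rows and columns of one matrix.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Upper-triangle entries of `matrix` after relabelling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; fails if either vector has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, one-sided permutation p value) for two square matrices."""
    n = len(dist_a)
    identity = list(range(n))
    x = upper_triangle(dist_a, identity)
    observed = pearson(x, upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of dist_b together
        if pearson(x, upper_triangle(dist_b, order)) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example (hypothetical data): five specimens placed along a line.
positions = [0, 1, 2, 3, 4]
geo = [[abs(p - q) for q in positions] for p in positions]
gen = [[2 * abs(p - q) for q in positions] for p in positions]
r, p_value = mantel(geo, gen)
print(round(r, 3))  # 1.0, since the two matrices are proportional
```

Permuting whole specimens rather than individual matrix cells preserves the dependence among distances that share a specimen, which is why an ordinary correlation test is not used here.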
Network linking rDNA Brachystegia sequences (Neighbor-Net approach – p-distance). The four main clades identified in the different Maximum Likelihood (ML) phylogenetic inferences are delineated with thick coloured lines. The BS support values for these clades are given according to the different ML analyses (ML-N, ML-A, ML-I; one and two partitions; ML-I for ITS1, ITS2, 18S, and 25S). In Miombo Group A, the specimens of B. bakeriana, B. floribunda, and B. spiciformis are clustered together (delineated with the dotted orange line).
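The p-distance underlying the network is the proportion of compared sites at which two aligned sequences differ. The following Python sketch shows one way to compute it. The function names are hypothetical, and the handling of gaps and ambiguity codes (columns are skipped when either sequence carries anything other than A, C, G, or T) is an assumption of this sketch, not necessarily the setting used to build the network.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Columns where either sequence has a gap, N or an IUPAC ambiguity
    code are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(sequences):
    """Symmetric matrix of pairwise p-distances."""
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = p_distance(sequences[i], sequences[j])
    return matrix


# Toy example (hypothetical sequences).
print(p_distance("ACGT-N", "ACTTAC"))  # 0.25: one difference over four compared sites
```

A matrix produced this way could then be passed to a network or Mantel analysis; under this sketch, sites carrying 2ISPs contribute nothing to the distance.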
Explicitly taking into account the retention of polymorphism allows for the detection of clades and species groups that align with morphology. For instance, regardless of the partitioning scheme (one vs two partitions), ML-N placed the specimen B. nigerica 30 (Lapido 19061, Nigeria) as poorly supported sister to the B. kennedyi-B. leonensis clade (BS = 22–28), while the specimen B. nigerica 29 (Chesters A124/30, Nigeria) grouped with B. eurycoma (BS = 61–63). In ML-A, the specimen B. nigerica 30 moved to the B. eurycoma-B. nigerica subtree but with decreased support (BS = 9). With two partitions, however, the specimen B. nigerica 30 remained sister, with low support, to the B. kennedyi-B. leonensis clade (BS = 25). In contrast, in ML-I, regardless of the partitioning scheme, both specimens of B. nigerica were resolved as sisters with moderate support (BS = 53–57) in a well-supported clade composed of B. eurycoma and B. nigerica (BS = 80–84).
As described in
Comparison between the ribosomal (subfigure A, left; ML-I tree) and plastid (subfigure A, right) phylograms of Brachystegia specimens, together with the geographic distribution of the specimens (subfigure B) in the miombo woodlands (MW, in red) and the African tropical rain forests (RF, in green). Four specimens labelled with * are present in the rDNA phylogram but absent in the plastid tree. The plastid phylogeny delineates five geographically coherent clades and two additional singletons (each represented by different shapes in each subfigure), independently of the clades delineated by rDNA. Bootstrap supports are given in Fig.
For the Miombo woodland species, the sorting of individuals according to their taxonomic species in the rDNA trees is the exception, irrespective of the treatment of 2ISPs. However, reciprocal monophyly is supported for several Guineo-Congolian species and for the morphologically distinct Miombo Group A vs Miombo Group B. The correlation between geographic and genetic distances in Miombo Group A could suggest some inter-species gene flow at a local scale but does not rule out past allopatric diversification. On the other hand, no such correlation was found in Miombo Group B. Altogether, rDNA sequences provide insights into the evolutionary history of Brachystegia, even if most of the relationships between species and clades remained unresolved due to insufficient phylogenetic information. This lack of discriminating signal between species could be due to the fairly recent origin of the different species (
Nuclear ribosomal DNA also retrieved clades that are congruent with the taxonomy. The formerly recognized Miombo A and B groups in F.T.E.A. (
Apart from Zambezian species, the relationship between B. leonensis and B. kennedyi is congruent with the views of
The rDNA phylogeny (Fig.
Properly evaluating the effect of coding intra-individual polymorphisms is out of the scope of this paper. However, we note that considering the polymorphism as recommended in
Overall, the rDNA gene trees in this study shed some light on part of the evolutionary history of Brachystegia species, but they cannot resolve relationships between closely related species because the assembled 18S–25S rDNA data are not sufficiently divergent and thus not sufficiently informative. The lack of resolution between Miombo species could be explained mainly by two non-mutually exclusive reasons: a recent history of diversification and/or gene flow between the extant species.
If due to recent diversification, the proper characterisation of a species tree may need the use of several unlinked loci. Targeted enrichment could constitute an interesting strategy, as such methods allow the investigation of young and species-rich genera (e.g. Inga in
If the lack of resolution is due to reticulate history, the characterization of the nature, directionality, and extent of gene flow could be explored using targeted enrichment (e.g. Brownea in
The analysis of the nuclear ribosomal DNA provides an overview of the evolutionary relationships between the different species of the African genus Brachystegia. As in some other tree genera, we found a general fit between nuclear phylogeny and morphology, and a near genus-wide decoupling of geographically sorted plastid signatures. A single gene tree based on rDNA sequences, in a context of recent diversification, did not allow us to fully resolve the Brachystegia species tree. Other genomic approaches (e.g. target enrichment, GBS, RAD-seq) need to be tested towards this end. The data provided the opportunity to test different 2ISP scoring schemes in phylogenetic inferences, including the novel implementation in RAxML-NG. It proved here to be of some use, as it detected clades that would have been overlooked otherwise. The gain in topological accuracy, however, remained marginal in Brachystegia, as the different coding schemes produced congruent and similar cladograms.
We thank all the curators and other persons who helped us with collecting samples for the genetic analyses. Namely, we thank the people from the following herbaria: Steven Janssens and Ann Bogaerts (BR), Tariq Stévart and Geoffrey Fadeur (BRLU), Stephen Harris and Serena Marner (FHO), Maria Cristina Duarte and Maria M. Romeiras (LISC). We extend our thanks to the people involved in the field collections: Michel Hasson, Annie and Richard De Cauwer, Eric Lowele, and Michel Anastassiou. We acknowledge Laurent Grumiau (ULB-EBE Molecular Biology platform, Belgium) and Latifa Karim (GIGA Liège, Belgium) for their technical support. We are indebted to the Faculté des sciences agronomiques de Lubumbashi (D.R. Congo) for the logistic support provided during the field collection (permit number 023/2016).
We are grateful to G.W. Grimm and an anonymous reviewer for their input during the reviewing process. The comments in addition to the analytical input improved both the content and the shape of this manuscript.
The study was funded by the Belgian “Fonds pour la Formation à la Recherche dans l’Industrie et l’Agriculture” – “Fonds National pour la Recherche Scientifique” (FRIA-FNRS PhD grant to A.F.B) and by the BRAIN-be BELSPO research program BR/132/A1/AFRIFORD (postdoctoral grant to J.M.).
The 18S–25S rDNA alignment used in this study. Ambiguities between A and G, C and T, G and C, A and T, G and T, and A and C have been coded as R, Y, S, W, K, and M, respectively.
Nucleotide content of ribosomal DNA sequences for each specimen in the alignment used in this study. Nucleotide content is provided for the full 18S–ITS1–5.8S–ITS2–25S alignment, but also for the different main subunits (18S, ITS1, ITS2, and 25S). Apart from the classic nucleotides (i.e. A, T, G, and C), intra-individual site polymorphisms have been coded using IUPAC recommendations. Namely, ambiguities between A and G, C and T, G and C, A and T, G and T, and A and C have been coded as R, Y, S, W, K, and M, respectively. Some ambiguities are frequent (e.g. Y, mean = 7, range = 1–31) whereas others are rare (e.g. W, mean = 0.5, range = 0–2). After mapping the reads on the reference, low-quality bases, indels, and heterozygous positions with three or more possible nucleotides have been coded as N (mean = 1, range = 0–5). The number of gaps is also reported.
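The coding scheme described above can be written down as a small lookup, shown here as a Python sketch. The names `TWO_BASE_CODES` and `encode_site` are hypothetical. The sketch assumes that the set of nucleotides present at a site has already been called from the mapped reads, and it does not model the read-depth or frequency thresholds used to call a 2ISP, which are not specified here.

```python
# IUPAC two-nucleotide ambiguity codes, as listed in the caption above.
TWO_BASE_CODES = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("GC"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def encode_site(bases):
    """Encode the nucleotides observed at one site as a single character.

    One nucleotide is returned as is, two nucleotides give the matching
    IUPAC code, and anything else (no call, three or more nucleotides,
    unexpected characters) gives N.
    """
    observed = frozenset(base.upper() for base in bases)
    if not observed or not observed <= set("ACGT"):
        return "N"
    if len(observed) == 1:
        return next(iter(observed))
    return TWO_BASE_CODES.get(observed, "N")


# Toy example (hypothetical calls for five sites of one specimen).
calls = ["A", "CT", "AG", "ACG", "G"]
print("".join(encode_site(site) for site in calls))  # AYRNG
```

Using `frozenset` keys makes the lookup independent of the order in which the two nucleotides are reported, so "CT" and "TC" both map to Y.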
The different supplementary figures (Figs S1–S12). Specific captions are provided for each figure.