The vascular plant diversity of Burundi

Background and aims – The vascular plant diversity of Burundi is still insufficiently explored, described, and understood. The goal of this paper is to show the degree of botanical exploration and the spatial patterns of botanical diversity in Burundi to date.
Material and methods – The study is based on a dataset containing virtually all plant collections, observed in herbaria, recorded in databases, or cited in literature, made in Burundi. All data were compiled, cleaned, and each record georeferenced. Various distribution analyses were carried out, some of which were based on a grid of 199 hexagonal cells.
Key results – The dataset comprises 37,200 herbarium collections representing 3,860 species grouped in 1,290 genera and 216 families. The expected species richness is estimated at 4,869. The average number of collections per species is 8.8, but 1,149 species (27%) are sampled only once. The seven most species-rich families are Fabaceae (539 spp.), Poaceae (387), Asteraceae (298), Orchidaceae (286), Cyperaceae (272), Rubiaceae (227), and Acanthaceae (128), which together account for over 50% of the vascular plant flora of Burundi. The seven largest genera are Cyperus (90 spp.), Crotalaria (60), Indigofera (50), Polystachya (48), Habenaria (47), Vernonia (45), and Eragrostis (41). In terms of number of herbarium collections, the six most important families are Poaceae (4,754 collections), Fabaceae (4,300), Asteraceae (2,226), Rubiaceae (2,191), Cyperaceae (1,730), and Lamiaceae (1,275). The four areas most intensively explored and with the highest known species diversity are the Rusizi plain, the Kibira rain forest belonging to the Albertine Rift, the Bururi and Rumonge areas in the west, and the Mosso depression in the east.
Conclusion – With a collecting index of 133 collections per 100 km², the botanical exploration of Burundi can be considered as relatively good. However, 28% of the species are only represented by a single record and some 1,000 species are potentially present but have remained uncollected to date. For every 100 new collections, there are on average 6 new species records, indicating that Burundi’s inventory is still not complete.


INTRODUCTION
Comprehensive inventories that catalogue the occurrence of components of biodiversity in any given geographical region remain fundamental, indeed essential, research tools for conservation planners and other users of biodiversity information (Figueiredo et al. 2009). For many parts of the world, sound data on biodiversity and thorough knowledge of plant species richness and distribution are still lacking (Sosef et al. 2017). In Burundi, these basic but crucial data on plants are extremely scattered, poorly documented, and often obsolete or even lacking (Reekmans and Troupin 1983).
Our knowledge of the botanical wealth of tropical regions is still largely based on information obtained from herbarium specimens (Sosef et al. 2021). These remain, without a doubt, the most reliable and hence preferable source of data (Farjon 2001). Fortunately, with the ongoing digitization of herbarium specimens (Nieva de la Hidalga et al. 2020), a large amount of spatial and temporal data related to these specimens has become available, allowing analyses of species richness and distribution patterns (Sosef et al. 2017) that are precious and urgently needed for global as well as regional conservation efforts (CBD 2011; Schmidt et al. 2017).
Burundi, one of the smallest countries in tropical Africa, is rather mountainous, has a surface area of 27,834 km², and is nestled between the highlands of East Africa and the eastern part of the Democratic Republic of the Congo at about 2°20'S to 4°30'S and 28°50'E to 30°53'E. Despite its modest surface area, the country encompasses an astonishing diversity of natural environments, with for example a wide elevational range of 780-2,670 m (Fig. 1), which partially determines the diversity of the vegetation and the vascular plant flora. Apart from the alpine regions, the country was originally more or less completely covered by forest, but clearance for agriculture and other human purposes has now reduced this to only 5% of its surface. The different vegetation types occurring in this country, which is a crossroads of Guinean, Zambezian, and Sudanese floristic influences, include dry forest, xerophilous gallery forest, forest-grassland mosaic, montane rainforest, mesophilous periguinean forest (i.e. peripheral evergreen Guineo-Congolian rain forest), miombo woodland, wooded grassland, savannas, and marshes (Lewalle 1972; Reekmans 1980a, 1980b; Ntore et al. 2018), with small areas of the ericaceous belt vegetation that still occurs on the top of the Congo-Nile Divide (Hedberg 1951; Ntore et al. 2018).
Until now, it has been widely assumed that the country harbours a comparatively rich botanical diversity (Lewalle 1972; Reekmans 1980a, 1980b). However, this idea has largely been based on informal expert opinions or non-documented estimations (e.g. Bigendako 1997; Bizuru et al. 2003; Maréchal et al. 2014). Apart from the Flore d'Afrique centrale series as well as taxonomic monographs and revisions (e.g. Ndabaneze 1989; Pichi Sermolli 1983, 1985), only a few botanical studies on very limited areas (Lewalle 1972; Reekmans 1980a, 1980b, 1981) are available, although a Red List of the endemic and range-restricted vascular plants was published recently (Ntore et al. 2018). To remedy this situation, the first author undertook an effort to compile a (nearly) comprehensive herbarium specimen database, leading to the first ever Checklist of the Vascular Plants of Burundi, to be published soon. The present paper provides the results of several basic analyses and thus represents the first such study undertaken for the country as a whole. Together, the checklist and the present paper aim to offer a better understanding of the species wealth and to provide temporal and spatial patterns of plant diversity across Burundi.
Using the unique source of herbarium specimen data, we will provide answers to the following questions: (i) What is the overall degree of the botanical collecting effort? (ii) What is the spatial and temporal distribution of the collecting effort? (iii) How many plant species are known to occur in the country? (iv) How is plant species richness distributed across Burundi? and (v) Approximately how many plant species remain to be recorded for the country?

Data compilation
Herbarium collections of vascular plants from Burundi kept at B, BJA, BM, BR, BRLU, EA, FI, G, GENT, GOET, K, JE, LG, MO, P, WAG, and YBI were surveyed (underlined herbaria were visited physically, others were consulted online or their data obtained from literature; herbarium codes follow Thiers, continuously updated). Each collection, which may consist of a single specimen or have several duplicates in different herbaria, constitutes evidence of the presence of a taxon at a specific locality in Burundi, at a specific time. The majority of the records were extracted from the database at BR. We then performed a thorough literature review from which we obtained additional specimen records, for instance from Arbonnier and Geerinck (1993), Geerinck (1992), Geerinck and Arbonnier (1996), Pichi Sermolli (1983, 1985), and Schultze-Motel (1960). The data fields include barcode(s), herbarium code, species name, vernacular name, collector name, collection number, collecting date, elevation, locality description, and latitude/longitude.

Georeferencing
While most collections made during the last two decades have coordinates taken with Global Positioning System (GPS) equipment, many of the older ones lack accurate latitude and longitude data. Most of the collecting localities have their geographic coordinates available from the gazetteer by Bamps (1982), but others needed to be added using other sources, mostly topographic maps and Google Earth. Coordinates assigned using Bamps (1982) or from label information were entered in one column as degrees, minutes, and seconds (DMS), then converted to decimal degrees (DD) in another.
$$DD = D + \frac{M}{60} + \frac{S}{3600}$$
Here D, M, and S are the degrees, minutes, and seconds of the coordinate. For our general mapping and diversity estimations at a country-wide scale, we accepted accuracy values of less than 10 km. Localities that had an estimated accuracy of more than 10 km were not georeferenced. By applying a thorough manual check of the correctness of all geographic data, georeferencing errors were minimized.
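The DMS-to-decimal conversion used during georeferencing can be sketched as follows. This is a minimal illustration, not the authors' actual workflow: the function name, the hemisphere argument, and the convention of giving southern and western coordinates a negative sign are assumptions added here, and the example coordinates are invented.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Assumption (not stated in the text): southern latitudes and
    western longitudes are returned as negative values.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Hypothetical locality, for illustration only.
latitude = dms_to_decimal(3, 22, 30, "S")
longitude = dms_to_decimal(29, 21, 0, "E")
print(latitude, longitude)
```

Keeping the hemisphere as a separate argument avoids the common mistake of applying a negative sign to the degrees only, which gives wrong results for coordinates between 0° and 1°.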

Data analyses
For some of the calculations, a pattern of hexagonal grid cells was defined, each of 296 km² or 10 arcminutes in diameter; at the country border, these were clipped. In total, 199 such cells were defined. All analyses were performed at the species level, unless otherwise indicated. In all taxon diversity analyses, specimens doubtfully identified to species level (indicated by aff. or cf.), or related to hybrids or cultivated material were left out. When counting species, collections identified to genus level only were not taken into account unless the genus was not represented by any species. In such counts, specimens identified to family level only were not included. Mapping and spatial analyses were carried out by using QGIS v.3.14 (QGIS Development Team 2020). To estimate the total number of species of vascular plants potentially present in the country, a Chao2 estimation was applied (Chao 1987).
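The relation between the cell diameter and the cell area can be checked with a short calculation. The sketch below rests on two assumptions that are not stated in the text: that the 10 arcminute "diameter" is the distance between opposite sides of a regular hexagon, and that one arcminute corresponds to roughly 1.852 km on the ground (a reasonable approximation this close to the equator).

```python
import math

# Assumption: one arcminute is about one nautical mile (1.852 km).
KM_PER_ARCMINUTE = 1.852


def hexagon_area_km2(flat_to_flat_km):
    """Area of a regular hexagon from the distance between opposite sides."""
    return (math.sqrt(3) / 2) * flat_to_flat_km ** 2


width_km = 10 * KM_PER_ARCMINUTE      # about 18.5 km
cell_area = hexagon_area_km2(width_km)
print(round(cell_area))               # close to the 296 km² given above
```

Under these assumptions the result is within about 1 km² of the stated cell size; measuring the diameter between opposite corners instead would give a noticeably smaller area.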

Data compilation
The final dataset contains 37,200 unique herbarium collection records, each represented by one or more specimens. A total of 785 collections (ca 2%) could not be georeferenced due to their low precision in the locality information.

Collecting density index
The collecting density index (CDI) of a given region is defined as the number of samples obtained per 100 km² (Campbell and Hammond 1989). These authors consider a value of 100 as an acceptable minimum for a tropical region to be considered 'fairly well known'. Currently, the CDI for Burundi is 133 and it can thus be stated that the flora of Burundi is reasonably well sampled. Although this value may seem low, in tropical countries it is only very rarely achieved. For instance, the CDI is only 1.6 for the neighbouring Democratic Republic of the Congo. However, in Burundi, the collecting efforts are not evenly distributed across the country and thus some regions still remain fairly poorly sampled.
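The CDI follows directly from the number of collections and the surface area given in this paper. The snippet below is a small worked example; the function name is chosen here for illustration.

```python
def collecting_density_index(n_collections, area_km2):
    """Number of collections per 100 km² (Campbell and Hammond 1989)."""
    return n_collections / area_km2 * 100


# Figures taken from the text: 37,200 collections, 27,834 km².
cdi = collecting_density_index(37_200, 27_834)
print(f"CDI = {cdi:.1f} collections per 100 km²")
```

The computed value lies between 133 and 134, in line with the reported CDI of 133.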

History of collecting
At the end of the 19th century, many explorers travelled in Burundi (then called Urundi), searching for the famous sources of the Nile and the legendary 'Mountains of the Moon' (Lewalle 1967), and some of them collected plants.
In the 1890s, the country became part of the German protectorate of East Africa (Deutsch-Ostafrika). It was, however, the British botanist George Francis Scott Elliot who made the first herbarium collections, in 1893, and deposited them at the herbaria of the British Museum (BM) and the Royal Botanic Gardens, Kew (K). German botanists, notably A. Keil, Hans Meyer, and Gustav Albert Peter, collected plants later, between 1910 and 1920, around Bujumbura (then called Usumbura) and in eastern Burundi. Most of their collections, stored at the herbarium of the Botanic Garden and Botanical Museum Berlin-Dahlem (B), were destroyed in the 1943 bombing of Berlin. However, a few of them survived (Sleumer 1949: 173), and 60 of them are digitally accessible on https://www.herbonauten.de, nine of which are also available on https://plants.jstor.org. In 1919, the country was placed under Belgian mandate, and Belgian botanists took over the collecting work, continuing this activity after the independence of Burundi in 1962. […] GENT, or LG, a duplicate was generally deposited at BJA and EA. However, the highest annual record, which amounts to 2,930 botanical specimens, was made shortly before, in 1952, by Georges Michel and J. Reed in south-eastern Burundi (Fig. 2). The largest set was made by Marcel Reekmans, who collected 10,521 herbarium specimens between 1971 and 1981, followed by the set made by José Lewalle (1931-2004), who collected over 6,500 numbers between 1965 and 1972 (Table 1; Reekmans and Troupin 1983). The year 1979 marked a turning point for the botanical exploration of the country. Indeed, Burundian researchers, mainly those preparing their doctoral theses, began to take over this task and achieved an additional ca 7,300 collections (Fig. 2). From 1965 to the present, the annual average sampling rate is 290 collections. Five hundred records do not bear a collecting year.
Figure 3 shows a map on which each point represents a locality where one or more herbarium collections have been made. Although this shows that much of the country has been visited at least once by a botanist, Fig. 4 shows that the intensity of these visits varies greatly. It provides the collecting density across the country, recorded per hexagonal grid cell. The most densely sampled cells (more than 1,000 collections) are found (i) in the west of the country (lower Rusizi, where three cells include respectively 2,618, 2,194, and 1,758 collections), (ii) on the Congo-Nile ridge (with a cell including 2,557 collections), (iii) in the east, in the Mosso region (Kinyinya and Gihofi, with cells including respectively 1,609 and 1,595 collections), and (iv) in the south-west (Rumonge, with a cell including 1,475 collections, and the valley of Siguvyaye with 883 collections). In contrast, the lowest sampling densities are found in the central part and some smaller peripheral areas of the country. It is worth noting that the central region has the highest population density and that virtually all natural vegetation there has been eliminated and replaced by agricultural fields, plantations, and habitations (Ntore et al. 2018).

Species richness
The species richness is a simple count of the number of species known from a specific region. Burundi material not identified to the species level (a total of 1,174 collections) was left out of this calculation, except when the specimen consisted of an unidentified species of a genus for which no other species were recorded. In total, 76 botanical specimens are not identified at all, 371 are only identified to the family level, while 718 are identified only to the genus level. More than 15% of these specimens are kept in BJA and YBI, which together house 2,692 collections that have no duplicates elsewhere.
Currently, the species richness of the vascular flora of Burundi is 3,860. However, given the comparatively large number of collections that remain unidentified at the species level, the sampled richness is presumably slightly higher.
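The counting rules described for species richness can be expressed as a short routine. This is only a sketch under stated assumptions: the record layout (dictionaries with the keys `genus`, `epithet`, `doubtful`, `hybrid`, and `cultivated`) is hypothetical and does not reflect the structure of the actual database, and the toy records are invented.

```python
def count_species(records):
    """Count species following the rules described in the text.

    - records flagged as doubtful (aff./cf.), hybrid, or cultivated are skipped;
    - a genus-only record counts as one species only if no species of that
      genus is otherwise recorded;
    - records without a genus (family level or unidentified) are ignored.
    """
    named = set()        # (genus, epithet) pairs identified to species
    genus_only = set()   # genera that have genus-level records
    for rec in records:
        if rec.get("doubtful") or rec.get("hybrid") or rec.get("cultivated"):
            continue
        genus, epithet = rec.get("genus"), rec.get("epithet")
        if genus and epithet:
            named.add((genus, epithet))
        elif genus:
            genus_only.add(genus)
    genera_with_species = {genus for genus, _ in named}
    return len(named) + len(genus_only - genera_with_species)


# Invented toy records, for illustration only.
toy_records = [
    {"genus": "Cyperus", "epithet": "papyrus"},
    {"genus": "Cyperus", "epithet": "papyrus"},                    # duplicate species
    {"genus": "Cyperus", "epithet": None},                         # genus already represented
    {"genus": "Crotalaria", "epithet": "recta", "doubtful": True}, # cf./aff., skipped
    {"genus": "Habenaria", "epithet": None},                       # counts as one species
    {"genus": None, "epithet": None},                              # family level only
]
print(count_species(toy_records))
```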

Species accumulation curve
The number of species recorded for Burundi, accumulated over time, is shown in Fig. 5. The curve includes segments with a steep slope, corresponding to the peak collecting periods of 1950-1952 and 1965-1981. Apart from reflecting the high collecting effort in these periods, the steep slopes also indicate that the collectors visited previously unexplored regions. For example, in Kinyinya and Gihofi […] (Fig. 6); on average, each species is represented by 8.8 collections. Surprisingly, a total of 1,370 species (28%) were collected only once, 600 species (13%) were collected twice, 410 species (9%) three times, and 220 species (6%) four times. A total of 3,145 species (75%) are

Spatial analysis of collection-based species richness
In order to identify spatial patterns of species richness, the number of species collected within each hexagonal cell was mapped (Fig. 7). Eleven cells contain more than 500 different species. As expected, the most species-rich cells are located in the most densely sampled zones (Fig. 4), i.e. in the west of the country, the Rusizi plain with three cells including respectively 1,018, 832, and 766 species, the Congo-Nile ridge containing 920 species, and Rumonge and Bururi (respectively 682 and 577 species), and in the east of the country, Gihofi and Kinyinya, encompassing respectively 802 and 763 species. The most species-poor cells are the small, clipped cells along the border and those in the centre, the north-east, and the far south of the country. By randomly sampling collections from the dataset without replacement and plotting the increase in the number of species, we can obtain a rarefaction curve (Gotelli and Colwell 2001). Figure 8 shows this curve for Burundi. The curve is still increasing substantially at the top right side, indicating, similar to Fig. 5 above, that the inventory of Burundi is still not complete.
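The resampling idea behind a rarefaction curve can be illustrated with a few lines of code. This is a minimal sketch, not the procedure used to produce Fig. 8: it assumes the dataset is reduced to a list with one species name per collection, it averages the curve over a number of random orderings (the `n_runs` parameter is an assumption introduced here), and the toy data are invented.

```python
import random


def mean_rarefaction(species_per_collection, n_runs=100, seed=0):
    """Mean number of species found after 1, 2, ..., n collections.

    Each run draws all collections in random order (sampling without
    replacement) and records the cumulative species count.
    """
    rng = random.Random(seed)
    n = len(species_per_collection)
    totals = [0] * n
    for _ in range(n_runs):
        order = rng.sample(species_per_collection, n)  # random permutation
        seen = set()
        for i, species in enumerate(order):
            seen.add(species)
            totals[i] += len(seen)
    return [total / n_runs for total in totals]


# Invented toy dataset: ten collections belonging to four species.
toy = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
curve = mean_rarefaction(toy)
print(curve[0], curve[-1])
```

A curve that flattens towards its right-hand end suggests that additional collections add few new species; a curve that is still rising, as described for Burundi, suggests the opposite.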

Assessment of the inventory completeness
The inventory completeness of a geographic sampling unit is the ratio between the observed richness and the expected richness of the unit (multiplied by 100 to obtain a percentage). It can be inferred from available data by several estimators. For our type of data, we need to use a non-parametric estimator, and Chao2 (Gotelli and Chao 2013) has been shown to be a suitable one (see for example Sosef et al. 2017). It predicts the number of unobserved species (that occur in the country but have zero collections) from those found only once or twice in a sampling set. The assumption is that the number of species detected with a low number of records decreases with increasing sampling effort. The formula used is
$$S_{\mathrm{Chao2}} = S_{\mathrm{obs}} + \frac{q_1^2}{2 q_2}$$
where $S_{\mathrm{Chao2}}$ is the estimated total number of species, $S_{\mathrm{obs}}$ is the number of species observed, $q_1$ is the number of singletons (i.e. the number of species known from a single collection), and $q_2$ is the number of doubletons (the number of species known from two collections). For our dataset, the estimated total species richness is 4,869 species. In conclusion, the inventory completeness of Burundi is 79% (3,860 observed out of 4,869 expected species).
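The estimator and the completeness ratio described above can be sketched in a few lines. The code assumes the classic form of Chao2, in which the number of unobserved species is estimated as the squared number of singletons divided by twice the number of doubletons; the function names and the toy counts are introduced here for illustration, while the observed and expected richness (3,860 and 4,869) are the values reported in this paper.

```python
def chao2(s_obs, q1, q2):
    """Classic Chao2 estimate: s_obs + q1**2 / (2 * q2)."""
    if q2 <= 0:
        raise ValueError("the classic form requires at least one doubleton")
    return s_obs + q1 ** 2 / (2 * q2)


def completeness(s_obs, s_expected):
    """Inventory completeness as a percentage."""
    return 100 * s_obs / s_expected


# Toy counts, for illustration only: 100 species observed,
# 30 singletons, 15 doubletons.
toy_estimate = chao2(100, 30, 15)
print(toy_estimate)

# Completeness from the richness values reported in the text.
burundi_completeness = completeness(3_860, 4_869)
print(f"{burundi_completeness:.0f}%")
```

The more singletons there are relative to doubletons, the larger the estimated number of species that remain unrecorded.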

DISCUSSION
Herbarium specimens and their related information constitute a precious and reliable source of baseline data for estimating the botanical richness of a country. Mapping their collecting locality patterns may also provide a broader floristic understanding of a region and could reveal the range decline or range extension of a species, or the disturbance history of an area. They facilitate planning for further explorations, as they highlight areas that are under-collected. They also provide crucial input to IUCN Red List assessments and as such they are crucial to the conservation of botanical diversity. Their management goes hand in hand with training of botanists and the proper scientific curation of these natural history collections. Apart from a general lack of herbarium collections, notably from several poorly known regions but also from more recent times, the greatest limitation met in the present research is the large number of undetermined specimens and/or those with incomplete label data. We do not think, however, that these limitations had any major impact on the conclusions. The results do confirm the general opinion that the country is botanically rich. Furthermore, they highlight that potentially a large number of species, more than 1,000, is yet to be discovered within the country.

Broader floristic knowledge
With 133 specimens collected per 100 km², Burundi can be considered as botanically fairly well known. However, this does not mean that there is no more need for botanical exploration. In fact, most of the species are known from only a few records (see Fig. 6), many even from only a single one. Continued plant collecting is therefore still much needed to consolidate the floristic knowledge of the country. Such activities are required to complete the documentation of (i) the morphological variability within and distinction between species; (ii) the species distributional data; (iii) the changes in the range size of species through time; (iv) the changes in species diversity through time; (v) the tracking of ongoing invasive alien weed range expansion; and (vi) the collecting of species still unknown in Burundi. The latter activity also includes the discovery of species new to science; between 2010 and 2021, 13 species were newly described from Burundi material (Fischer et al. 2021; IPNI 2021).

Range changes and disturbance history
Herbarium specimens provide data related to the botanical disturbance history of an area. Some of the specimens originate from areas where species have disappeared, while others are recorded from areas where they were previously unknown. Data deduced from historical herbarium specimens originating from areas where the species in question has disappeared no longer reflect the reality of current plant distribution and richness in this heavily populated country. Nevertheless, such data remain useful for understanding the historical spatial distribution of species (Sharrock et al. 2018).
Some species are found in areas where they were previously unknown and have presumably been recently introduced. That is particularly the case for a number of invasive weeds that have been proliferating dramatically and continuously over the past 20 years in the country, especially around Bujumbura. Most of these are still under-sampled, e.g.

Perspectives and recommendations for further research
The herbarium dataset used for this study is expected to provide a solid baseline for the future exploration and study of the flora of Burundi, including its conservation planning (Sharrock et al. 2018). It will be transformed and enriched, and published as the Checklist of the Vascular Plants of Burundi (Ntore et al. in prep.).
Since plant collections often have several duplicates distributed to different herbaria, where they may benefit from new identifications or other name changes, a data source where such changes would be linked would be of great value. In the near future, the authors will act to create an e-Flora platform for Burundi, which will provide such a facility and which will then serve as a dynamic Checklist and data source for the vascular plants of Burundi.
Unfortunately, as the country is heavily and increasingly populated, the creation of new protected areas is subject to conflicting interests. Thus, the only possible alternative solution might be ex situ conservation (e.g. in gene banks or botanic gardens) of range-restricted and rare species that are known from degraded habitats and are close to extinction.
Wesley Tack, and Jan Wieringa for their kind help in the realization of this work. We also warmly thank the editor and the reviewers for their careful reading and their excellent remarks.