RRC ID 83280
Author Jean Cossi GANGLO
Title Completeness of Digital Accessible Knowledge of plants across Africa and priorities for future data discovery
Journal Research Square
Abstract Digital Accessible Knowledge (DAK) is of utmost importance for biodiversity conservation. The Global Biodiversity Information Facility (GBIF, www.gbif.org) is a mega data infrastructure holding more than 2.2 billion occurrence records as of 17 January 2023. It is by far the largest initiative assembling and sharing DAK to support scientific research, conservation, and sustainable development. We analyzed plant data published on GBIF for Africa to highlight the continent's contribution to GBIF and to identify data quality issues and data gaps across taxonomic groups and geographic space. We therefore downloaded occurrence data for the kingdom Plantae in Africa on 17 January 2023; the dataset is available at https://doi.org/10.15468/dl.p2n6um. We processed and analyzed the data in R, using several packages and their functions. Although Africa is home to rich biodiversity with many hotspots, the continent's contribution to GBIF (61,176,994 records as of 17 January 2023) remains very low (2.69% of the global total). Furthermore, there are large disparities between African countries, with South Africa alone contributing far more than 50% of the continent's data. The plant data of Africa (9,116,401 occurrence records) accounted for 14.90% of the continent's data, which points to large gaps between taxonomic groups. We noted substantial data loss during data cleaning, which clearly indicates limited data quality from the continent: the records fit for use in the completeness analysis amounted to only 50.94% of the records initially downloaded. Quality checks before data publication on GBIF are still needed across African countries. Magnoliopsida was the dominant plant class, with the highest number of records (71.07%) and the highest number of species (68.36%), followed by Liliopsida, with 22.80% of the records and 19.06% of the species.
In geographic space, plant data gaps are also large across the continent; data completeness is better achieved in West Africa, Southern Africa, East Africa, and Madagascar. Accessibility by road and large protected areas (> 10,000 km²) are limiting factors for data completeness across the continent. The large multidimensional data gaps identified in this study, and the substantial data loss noted during data cleaning, should be addressed as a priority in future data collection across the continent.
Published 2025-1-3
DOI 10.21203/rs.3.rs-2182259/v6
Resource
GBIF; Plant Specimen Database of Tama Forest Science Garden, Forestry and Forest Products Research Institute, Japan; Flora of Japan Specimen Database; Paleobotany collection of National Museum of Nature and Science; AIS Wildtype Populations of Arabidopsis; Ibaraki Nature Museum, Vascular Plants collection; Bryophyte specimens of National Museum of Nature and Science (TNS); Plant specimens in the Museum of Nature and Human Activities, Hyogo Prefecture, Japan; Fossil Specimens of Komatsu City Museum