The World Checklist of Vascular Plants (WCVP) is a global consensus view of vascular plant species. It provides names, synonymy, taxonomy, and distributions for the more than 340,000 vascular plant species known to science.
We have developed rWCVP to make some common tasks that use the WCVP easier. These include standardising a list of taxon names against the WCVP, getting and mapping the distribution of a species, and creating a checklist of taxa found in a particular region.
Accessing the WCVP
For the functions in rWCVP to work, you need to have access to a copy of the WCVP.
One way to load the WCVP is to install the associated data package, rWCVPdata:
if (!require(rWCVPdata)) {
  install.packages("rWCVPdata",
    repos = c(
      "https://matildabrown.github.io/drat",
      "https://cloud.r-project.org"
    )
  )
}
# taxonomy data
names <- rWCVPdata::wcvp_names
# distribution data
distributions <- rWCVPdata::wcvp_distributions
If the data package isn’t available, or you’d prefer to use a different version of the WCVP, you can provide a local copy of the data to the main functions in this package. For instance, to generate a checklist:
# read_csv() comes from the readr package
library(readr)
names <- read_csv("/path/to/wcvp_names.csv")
distributions <- read_csv("/path/to/wcvp_distributions.csv")
checklist <- wcvp_checklist("Acacia",
  taxon_rank = "genus", area_codes = "CPP",
  wcvp_names = names, wcvp_distributions = distributions
)
Be careful if you’re using your own WCVP version! The structure of the WCVP tables sometimes changes between versions. rWCVP should be set up to work with the latest version of WCVP, and any previous versions that share the same structure.
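Because table structure can differ between versions, one cheap safeguard is to check a local copy for the columns you rely on before passing it to rWCVP. The following is a minimal base R sketch, not part of rWCVP: the helper check_wcvp_columns is hypothetical, the table is a toy stand-in, and the column names are assumptions for illustration that should be checked against the WCVP version you actually use.

```r
# Hypothetical helper: report which expected columns are missing from a table.
check_wcvp_columns <- function(tbl, expected) {
  missing_cols <- setdiff(expected, colnames(tbl))
  if (length(missing_cols) > 0) {
    warning("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(missing_cols)
}

# Toy stand-in for a local distributions table (not real WCVP data).
toy_distributions <- data.frame(
  plant_name_id = c(1L, 2L),
  area_code_l3 = c("BZC", "BZN"),
  stringsAsFactors = FALSE
)

# Column names assumed for illustration only.
expected_cols <- c("plant_name_id", "area_code_l3", "introduced")

# Warnings are suppressed here so the result can be inspected directly.
missing_cols <- suppressWarnings(
  check_wcvp_columns(toy_distributions, expected_cols)
)
print(missing_cols)
```

A non-empty result suggests that the local copy has a different structure from the one you expected.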
Filtering the WCVP
Some rWCVP functions involve filtering the WCVP to generate lists or summaries of vascular plant species in particular areas. These functions accept two arguments for filtering the WCVP:
- taxon: the name of a valid taxon with a taxonomic rank of species or higher (e.g. the species “Myrcia almasensis”, the genus “Myrcia”, or the family “Myrtaceae”).
- area_codes: a vector of WGSRPD level 3 codes for the regions you want to focus on.
These arguments can be combined in the wcvp_checklist, wcvp_occ_mat, and wcvp_summary functions to generate outputs for focal taxa in a desired area. For example, Myrtaceae species in Brazil:
checklist <- wcvp_checklist("Myrtaceae",
  taxon_rank = "family",
  area_codes = c("BZC", "BZN", "BZS", "BZE", "BZL")
)
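To make the idea of combining a taxon filter with an area filter concrete, here is a conceptual sketch in base R on invented toy tables. It is not how rWCVP implements its filtering: the helper filter_taxa_by_area is hypothetical, the rows are made up, and the column names are assumptions for illustration.

```r
# Toy tables standing in for WCVP names and distributions (invented rows).
toy_names <- data.frame(
  plant_name_id = 1:4,
  family = c("Myrtaceae", "Myrtaceae", "Poaceae", "Myrtaceae"),
  label = c("taxon A", "taxon B", "taxon C", "taxon D"),
  stringsAsFactors = FALSE
)
toy_distributions <- data.frame(
  plant_name_id = 1:4,
  area_code_l3 = c("BZC", "AFG", "BZN", "BZN"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep taxa in one family that occur in the given areas.
filter_taxa_by_area <- function(names_tbl, dist_tbl, family, area_codes) {
  focal <- names_tbl[names_tbl$family == family, ]
  in_area <- dist_tbl[dist_tbl$area_code_l3 %in% area_codes, ]
  # The inner join keeps only taxa that pass both filters.
  merge(focal, in_area, by = "plant_name_id")
}

result <- filter_taxa_by_area(
  toy_names, toy_distributions,
  family = "Myrtaceae", area_codes = c("BZC", "BZN")
)
print(result)
```

Only the toy Myrtaceae rows recorded in one of the requested areas remain; the Poaceae row and the Myrtaceae row recorded elsewhere are dropped.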
A note on filtering by taxon
When filtering by taxon, you need to tell the function the taxonomic rank of the name you’re providing, using the taxon_rank argument.
For example, making a summary table for the genus Poa:
wcvp_summary("Poa", taxon_rank = "genus")
#> $Taxon
#> [1] "Poa"
#>
#> $Area
#> [1] "the world"
#>
#> $Grouping_variable
#> [1] "area_code_l3"
#>
#> $Total_number_of_species
#> [1] 573
#>
#> $Number_of_regionally_endemic_species
#> [1] 573
#>
#> $Summary
#> # A tibble: 280 × 6
#>    area_code_l3 Native Endemic Introduced Extinct Total
#>    <chr>         <int>   <int>      <int>   <int> <int>
#>  1 ABT              21       0          5       0    26
#>  2 AFG              23       1          0       0    23
#>  3 AGE              13       2          5       0    18
#>  4 AGS              24       0         10       0    34
#>  5 AGW              34       8          3       0    37
#>  6 ALA               6       0          3       0     9
#>  7 ALB              17       0          0       0    17
#>  8 ALG               8       0          1       0    11
#>  9 ALT              30       3          1       0    31
#> 10 ALU               7       0          3       0    10
#> # … with 270 more rows
You can provide taxon names of the ranks species, genus, family, order, or higher. The WCVP only provides taxonomic information up to the family rank. We have included a table called taxonomic_mapping to map families to orders and higher taxonomies, based on APG IV.
head(taxonomic_mapping)
#>        higher       order          family
#> 1 Angiosperms    Acorales       Acoraceae
#> 2 Angiosperms Alismatales    Alismataceae
#> 3 Angiosperms Alismatales Aponogetonaceae
#> 4 Angiosperms Alismatales         Araceae
#> 5 Angiosperms Alismatales      Butomaceae
#> 6 Angiosperms Alismatales   Cymodoceaceae
We use this table behind the scenes to allow you to filter using these higher taxonomic ranks.
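As a rough illustration of how such a mapping can turn an order or higher-rank name into a set of families to filter on, here is a base R sketch. It uses only the six rows printed above, re-typed as a toy table, and the helper families_in is hypothetical; the package's own internal logic may differ.

```r
# The six rows printed by head(taxonomic_mapping), re-typed as a toy table.
toy_mapping <- data.frame(
  higher = rep("Angiosperms", 6),
  order = c("Acorales", rep("Alismatales", 5)),
  family = c(
    "Acoraceae", "Alismataceae", "Aponogetonaceae",
    "Araceae", "Butomaceae", "Cymodoceaceae"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: list the families that belong to an order or higher group.
families_in <- function(mapping, taxon, rank = c("order", "higher")) {
  rank <- match.arg(rank)
  unique(mapping$family[mapping[[rank]] == taxon])
}

alismatales_families <- families_in(toy_mapping, "Alismatales", rank = "order")
print(alismatales_families)
```

The resulting family names could then be matched against the family column of a names table, since family is the highest rank the WCVP itself records.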
For example, making a summary table for all Poales:
wcvp_summary("Poales", taxon_rank = "order")
#> $Taxon
#> [1] "Poales"
#>
#> $Area
#> [1] "the world"
#>
#> $Grouping_variable
#> [1] "area_code_l3"
#>
#> $Total_number_of_species
#> [1] 23770
#>
#> $Number_of_regionally_endemic_species
#> [1] 23770
#>
#> $Summary
#> # A tibble: 368 × 6
#>    area_code_l3 Native Endemic Introduced Extinct Total
#>    <chr>         <int>   <int>      <int>   <int> <int>
#>  1 ABT             359       0         70       0   429
#>  2 AFG             485      16         15       0   504
#>  3 AGE             908      31        166       0  1074
#>  4 AGS             389      20         88       0   477
#>  5 AGW             890     130        105       0   995
#>  6 ALA             711       2        153       0   864
#>  7 ALB             363       4         14       0   387
#>  8 ALD              44       7         13       0    57
#>  9 ALG             445      11         41       0   497
#> 10 ALT             383       9          6       0   389
#> # … with 358 more rows
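The printed output above suggests that wcvp_summary returns a list whose Summary element is a table with one row per area. Assuming that structure, the sketch below shows how the table could be pulled out and ranked. It builds a toy list from five of the Poales rows shown above rather than calling the package, so the object here is a stand-in, not a real wcvp_summary result.

```r
# Toy stand-in for a wcvp_summary() result, using five rows printed above.
toy_summary <- list(
  Taxon = "Poales",
  Grouping_variable = "area_code_l3",
  Summary = data.frame(
    area_code_l3 = c("ABT", "AFG", "AGE", "AGS", "AGW"),
    Native = c(359L, 485L, 908L, 389L, 890L),
    Endemic = c(0L, 16L, 31L, 20L, 130L),
    Introduced = c(70L, 15L, 166L, 88L, 105L),
    stringsAsFactors = FALSE
  )
)

# Extract the per-area table and rank areas by number of endemic species.
summary_tbl <- toy_summary$Summary
ranked <- summary_tbl[order(summary_tbl$Endemic, decreasing = TRUE), ]
print(ranked$area_code_l3)
```

With a real summary object, the same indexing should work as long as the element and column names match those in the printed output.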
A note on filtering by area
The WCVP lists taxon distributions using the World Geographic Scheme for Recording Plant Distributions (WGSRPD) at level 3. This level corresponds to “botanical countries”, which mostly follow the boundaries of countries, except where large countries are split or outlying areas omitted.
The functions in rWCVP expect area_codes to be provided as a vector of WGSRPD level 3 codes. These can be tedious to look up for the entire region you’re interested in. For example, to filter by species in Brazil you need to provide a vector of 5 codes.
To make things easier, rWCVP has a function that converts the name of a region to a vector of WGSRPD level 3 codes.
get_wgsrpd3_codes("Brazil")
#> [1] "BZC" "BZE" "BZL" "BZN" "BZS"
This can be input directly into functions that filter the WCVP by area.
wcvp_summary("Poa", taxon_rank = "genus", area_codes = get_wgsrpd3_codes("Southern Hemisphere"))
#> $Taxon
#> [1] "Poa"
#>
#> $Area
#> [1] "Southern Hemisphere (incl. equatorial Level 3 areas)"
#>
#> $Grouping_variable
#> [1] "area_code_l3"
#>
#> $Total_number_of_species
#> [1] 264
#>
#> $Number_of_regionally_endemic_species
#> [1] 237
#>
#> $Summary
#> # A tibble: 74 × 6
#>    area_code_l3 Native Endemic Introduced Extinct Total
#>    <chr>         <int>   <int>      <int>   <int> <int>
#>  1 AGE              13       2          5       0    18
#>  2 AGS              24       0         10       0    34
#>  3 AGW              34       8          3       0    37
#>  4 ANT               0       0          1       0     1
#>  5 ASC               0       0          1       0     1
#>  6 ASP               2       1          3       0     5
#>  7 ATP               8       1          3       0    11
#>  8 BOL              31       2          3       0    34
#>  9 BOR               2       1          0       0     2
#> 10 BUR               3       0          0       0     3
#> # … with 64 more rows
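If you want to cover several named regions at once, the code vectors can be combined into a single vector before filtering. The sketch below illustrates this with a hand-written lookup that stands in for get_wgsrpd3_codes, so it runs without the package. The helper codes_for_regions and the lookup list are hypothetical. The Brazil codes are those shown above, and the Argentina codes are the level 3 codes for that country.

```r
# Hand-written lookup standing in for get_wgsrpd3_codes() (toy data only).
toy_region_codes <- list(
  Brazil = c("BZC", "BZE", "BZL", "BZN", "BZS"),
  Argentina = c("AGE", "AGS", "AGW")
)

# Hypothetical helper: merge the level 3 codes of several regions into one vector.
codes_for_regions <- function(regions, lookup = toy_region_codes) {
  unknown <- setdiff(regions, names(lookup))
  if (length(unknown) > 0) {
    stop("Unknown region(s): ", paste(unknown, collapse = ", "))
  }
  sort(unique(unlist(lookup[regions], use.names = FALSE)))
}

area_codes <- codes_for_regions(c("Brazil", "Argentina"))
print(area_codes)
```

With the package loaded, the equivalent would presumably be to concatenate the results of two get_wgsrpd3_codes calls with c() and pass the result as area_codes.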