Chapter 3 Accessing trait data in R

3.1 Learning Objectives

In this chapter you will learn:

  • How to create a summary of available data to query from a TERRA REF season
  • How to query a specific trait
  • How to visualize query results

3.2 Introduction

In this chapter, we go over how to query TERRA REF trait data using the traits package. The traits package provides a common way to query several sources of species trait data, including BETYdb, NCBI, the Coral Traits Database, and others. We use BETYdb as our trait source because it contains the TERRA REF data that we are interested in.

Our example will show how to query for season 6 data and visualize canopy height. In addition to the traits package we will also be using some of the tidyverse packages, which allow us to manipulate the data in an efficient, understandable way. If you are unfamiliar with tidyverse syntax, we recommend checking out some of the resources here.

3.3 Query for available traits

3.3.1 Getting Started

First, we need to install the traits package from CRAN and load it into our environment, along with the other packages we will use in this tutorial.
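A minimal sketch of this setup step follows. The package list beyond traits is an assumption based on the packages named elsewhere in this chapter (dplyr and lubridate), plus ggplot2 for plotting. The CRAN mirror URL is also an assumption; any mirror should work.

```r
# Packages assumed for this chapter; only traits is named explicitly at this point.
pkgs <- c("traits", "dplyr", "lubridate", "ggplot2")

# Install only the packages that are not already available.
to_install <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}

# Attach the packages to the current session.
library(traits)
library(dplyr)
library(lubridate)
library(ggplot2)
```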

3.3.2 Setting options

The function used to query BETYdb is called betydb_query. To reduce the number of arguments we need to pass to this function on every call, we can set some global options using options. In this case, we set the URL used in the query and the API version.
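As a sketch, the two options could be set as shown here. The option names betydb_url and betydb_api_version, the TERRA REF server URL, and the version string "v1" are assumptions that are not stated in the text above; check them against the traits documentation and your own BETYdb instance.

```r
# Assumed option names and values; adjust to the BETYdb instance you are querying.
options(
  betydb_url = "https://terraref.ncsa.illinois.edu/bety/",  # assumed TERRA REF BETYdb URL
  betydb_api_version = "v1"                                 # assumed API version
)

# Confirm what is currently set.
getOption("betydb_url")
getOption("betydb_api_version")
```

Setting these once per session means later calls to betydb_query do not need to repeat them.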

3.3.3 Querying available traits

The TERRA REF database contains trait data for many seasons of observation, and the available data may vary by season. Here, we get a visual summary of the available traits and methods of measurement for one season.

First, we construct a general query that returns all Season 4 data. The function betydb_query takes key = "value" pairs as arguments, where each key is a column in the database to query. In this example, we use the sitename column to select Season 4 data and set limit to "none" to return all records. By default, the function queries the search table, which holds the trait and yield data. To query a different table, use the table argument.

The return value for the betydb_query function is just a data.frame so we can work with it like any other data.frame in R.
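The general query described above might look like the following sketch. It assumes the TERRA REF BETYdb server is reachable, that the option names and URL shown are correct for your setup, and that a leading "~" in the sitename value requests a pattern match rather than an exact match. None of these details are confirmed by the text above.

```r
library(traits)

# Assumed connection settings (see the options step earlier in this chapter).
options(
  betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
  betydb_api_version = "v1"
)

# Request every record whose site name matches "Season 4".
# The "~" prefix is assumed to mean "matches this pattern".
season_4 <- betydb_query(sitename = "~Season 4", limit = "none")

# The result is an ordinary data.frame, so the usual tools apply.
dim(season_4)
head(names(season_4))
```

Because limit = "none" returns every matching record, this query can take a while to finish; a small numeric limit is a quicker way to check that the connection works.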

Let’s plot a time series of all traits returned. You might first notice that the relevant date columns in the season_4 data.frame are returned as character strings rather than dates. Before plotting, let’s convert our raw_date column to a proper date-time format and time zone using functions from dplyr and lubridate.
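The conversion and plot can be sketched on a small stand-in data frame so the example runs without a database connection. Every value in toy_traits is made up for illustration. The column names other than raw_date (trait, method_name, mean), the timestamp format, and the America/Phoenix time zone are assumptions about the query result and the field site.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Toy stand-in for a query result; all values are invented for illustration.
toy_traits <- data.frame(
  raw_date = c("2017-05-01 12:00:00", "2017-05-08 12:00:00",
               "2017-05-01 12:00:00", "2017-05-08 12:00:00"),
  trait = c("canopy_height", "canopy_height", "leaf_length", "leaf_length"),
  method_name = c("method A", "method A", "method B", "method B"),
  mean = c(20, 35, 110, 150),
  stringsAsFactors = FALSE
)

# Parse the character timestamps (read as UTC), then express them in local time.
toy_traits <- toy_traits %>%
  mutate(trans_date = with_tz(ymd_hms(raw_date), "America/Phoenix"))

# One panel per trait, with points colored by measurement method.
trait_plot <- ggplot(toy_traits,
                     aes(x = trans_date, y = mean, color = method_name)) +
  geom_point() +
  facet_wrap(~trait, scales = "free_y") +
  labs(x = "Date", y = "Trait mean", color = "Method")

print(trait_plot)
```

Using scales = "free_y" lets each trait keep its own y-axis range, which helps when traits are measured in different units.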

3.4 Querying a specific trait

3.4.1 Querying season 6 canopy height data

After constructing a general query as above, you may want to query only a specific trait. Here, we query for the canopy height trait by adding the key-value pair trait = "canopy_height" to our query function. Note that the limit is set to return only 250 records, for demonstration purposes.
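A sketch of the trait-specific query is shown here. As before, the connection options, the "~Season 6" site name pattern, and the availability of the server are assumptions; only trait = "canopy_height" and the 250-record limit come from the text above.

```r
library(traits)

# Assumed connection settings (see the options step earlier in this chapter).
options(
  betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
  betydb_api_version = "v1"
)

# Restrict the query to one trait and cap the number of records returned.
canopy_height <- betydb_query(
  trait = "canopy_height",
  sitename = "~Season 6",  # assumed pattern for selecting season 6 sites
  limit = 250
)

# Quick check of how many records came back.
nrow(canopy_height)
```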