Chapter 5 Retrieve source RGB image files
5.1 Learning Objectives
In this chapter you will learn how to:
- Find and retrieve a list of available sensor names
- Retrieve files by downloading them to your system
5.2 Introduction
In this chapter we will show how to use the Python terrautils library to retrieve the list of available sensor names.
The terrautils library provides a number of functions for working with data that is stored in TerraRef.
The names of the sensors we retrieve using terrautils indicate what types of Level 1 data are available.
Our examples also show how to retrieve files of interest from TerraRef by using the available API (Application Programming Interface).
5.3 Getting Started
First, we need to install the terrautils library into the Python environment.
We can do this by using the pip utility to install the library from PyPI.
Simply run pip install terrautils in a terminal to install terrautils.
All the terrautils functions are now available in Python, although we will only use a small number of them.
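Before moving on, it can help to confirm that the library is visible to the Python interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name is_installed is our own and is not part of terrautils.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located in the current environment."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if is_installed("terrautils"):
    print("terrautils is available")
else:
    print("terrautils not found; run `pip install terrautils` first")
```

If the check reports that the library is missing even though pip succeeded, pip may have installed it into a different Python environment than the one running this code.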
5.4 Retrieving sensor names
In this section we retrieve the names of the different sensor types that are available. This will help you understand what files may be available beyond those containing RGB image data.
In order to run Python functions, including those from the terrautils library, within this R Markdown document, we have to install and set up reticulate.
if(!require(reticulate)){
  install.packages("reticulate")
  reticulate::py_install("terrautils")
}
library(reticulate)
use_virtualenv("r-reticulate")
We will first use the get_sensor_list function to retrieve all the data on available sensors.
We will then use the unique_sensor_names function to extract only the sensor names from the data we just retrieved.
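# Both helper functions come from the terrautils.products module.
from terrautils.products import get_sensor_list, unique_sensor_names

url = 'https://terraref.ncsa.illinois.edu/clowder/'
key = ''
sensors = get_sensor_list(None, url, key)
names = unique_sensor_names(sensors)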
The variable names will now contain the list of all available sensors.
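Once the sensor names are available, a common next step is to narrow them down to the sensors of interest. The sketch below assumes that names is an iterable of strings; the helper filter_sensor_names and the sample values are hypothetical and are used only so the example runs without contacting the server.

```python
def filter_sensor_names(names, keyword):
    """Return the sorted sensor names that contain `keyword`, ignoring case."""
    keyword = keyword.lower()
    return sorted(name for name in names if keyword in name.lower())


# Hypothetical sample values, used only so the sketch runs without a server.
# In practice, pass the `names` variable produced by unique_sensor_names().
sample_names = ["rgb_geotiff", "ir_geotiff", "rgb_fullfield", "laser3d_las"]

rgb_names = filter_sensor_names(sample_names, "rgb")
print(rgb_names)
```

Sorting the result makes the output stable regardless of the order in which the server returns the sensors.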
5.5 Retrieving the images
Once we have a list of files and their IDs we can retrieve them one-by-one. We do this by creating a URL that identifies the file to retrieve, making the API call to retrieve the file contents, and writing the contents to disk.
To create the correct URL we start with the one defined before and append the keyword ‘files/’ followed by the ID of each file. For example, assuming we have a file ID of ‘111’, the final URL for retrieving the file would be:
https://terraref.ncsa.illinois.edu/clowder/files/111
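As a minimal sketch, the URL construction described above can be wrapped in a small helper. The function name file_url is our own and is not part of terrautils; it strips any trailing slash from the base URL so that the result is the same whether or not the base ends in one.

```python
def file_url(base_url, file_id):
    """Join the base URL, the 'files/' keyword, and a file ID."""
    # Strip any trailing slash first so exactly one slash separates each part.
    return base_url.rstrip("/") + "/files/" + str(file_id)


url = "https://terraref.ncsa.illinois.edu/clowder/"
print(file_url(url, "111"))
```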
By looping through each of our files, and using their ID and filename, we can retrieve the files from the server and store them locally.
We stream the data returned from our server request (stream=True in the code below) because the files are likely to be large.
If the stream=True parameter were omitted, the file’s entire contents would be loaded into memory in the r variable before they could be written to the local file.
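To show why chunked writing keeps memory use bounded, the sketch below isolates the write loop in a helper that accepts any iterable of byte chunks, such as the one returned by r.iter_content(chunk_size=1024). Only one chunk is held in memory at a time. The helper name write_chunks is hypothetical, and a short list of byte strings stands in for a real server response.

```python
import os
import tempfile


def write_chunks(chunks, path):
    """Write an iterable of byte chunks to `path`; return the bytes written."""
    written = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                out.write(chunk)
                written += len(chunk)
    return written


# Stand-in for r.iter_content(chunk_size=1024): two data chunks and one empty.
fake_chunks = [b"abc", b"", b"defg"]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
print(write_chunks(fake_chunks, demo_path))
```

Returning the byte count gives a simple way to compare the size of what was saved against what was expected.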
To illustrate how this might work we are going to pre-populate an array of file names and their associated Clowder IDs.
files = [
    {"id": "5c507cb74f0c4b0cbe6705f2",
     "filename": "rgb_geotiff_L1_ua-mac_2018-06-02__14-12-05-077_right.tif"},
    {"id": "5c507cb84f0cfd2aedf5a75a",
     "filename": "rgb_geotiff_L1_ua-mac_2018-06-02__14-12-05-077_left.tif"},
    {"id": "5c507eaf4f0c4b0cbe6716cd",
     "filename": "rgb_geotiff_L1_ua-mac_2018-05-05__11-35-13-442_left.tif"},
    {"id": "5c507eaf4f0cfd2aedf5b680",
     "filename": "rgb_geotiff_L1_ua-mac_2018-05-05__11-37-40-442_right.tif"}
]
The following code shows how to download the image files.
First we format the base URL for our query allowing us to reuse it for each file.
Next we loop through our array, creating a customized URL for each file and fetching the data using the requests interface.
Finally we open the output file and use a loop to write the retrieved data.
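import requests
from io import open

# We are using the same `url` and `key` variables declared in the previous example above.
filesurl = url + 'files/'
params = {'key': key}

for f in files:
    # Stream the response so large files are not held in memory all at once.
    r = requests.get(filesurl + f["id"], params=params, stream=True)
    # Stop on an HTTP error rather than saving an error page as an image.
    r.raise_for_status()
    with open(f["filename"], 'wb') as o:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # skip empty chunks
                o.write(chunk)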
The images are now stored on the local file system.
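A quick way to confirm that every download completed is to check that each expected file exists and is not empty. The following is a minimal sketch; the helper missing_or_empty is our own, and the demonstration uses a temporary directory with placeholder files in place of real downloads.

```python
import os
import tempfile


def missing_or_empty(files, directory="."):
    """Return the filenames that are absent or zero bytes in `directory`."""
    problems = []
    for f in files:
        path = os.path.join(directory, f["filename"])
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(f["filename"])
    return problems


# Placeholder entries and files, used only to demonstrate the check.
demo_dir = tempfile.mkdtemp()
demo_files = [{"id": "1", "filename": "ok.tif"},
              {"id": "2", "filename": "empty.tif"},
              {"id": "3", "filename": "absent.tif"}]
with open(os.path.join(demo_dir, "ok.tif"), "wb") as fh:
    fh.write(b"data")
open(os.path.join(demo_dir, "empty.tif"), "wb").close()

print(missing_or_empty(demo_files, demo_dir))
```

With the real downloads, calling missing_or_empty(files) checks the current working directory, which is where the download loop writes its output.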