Getting Started with Cloud-Native HLS Data in Python

Extracting an EVI Time Series from Harmonized Landsat-8 Sentinel-2 (HLS) data in the Cloud using CMR's SpatioTemporal Asset Catalog (CMR-STAC)

This tutorial demonstrates how to work with the HLS Landsat-8 (HLSL30.002) and Sentinel-2 (HLSS30.002) data products.

The Harmonized Landsat Sentinel-2 (HLS) project produces seamless, harmonized surface reflectance data from the Operational Land Imager (OLI) and Multi-Spectral Instrument (MSI) aboard the Landsat-8 and Sentinel-2 Earth-observing satellites, respectively. The aim is to produce seamless products with normalized parameters, including atmospheric correction, cloud and cloud-shadow masking, geographic co-registration and common gridding, bidirectional reflectance distribution function (BRDF) normalization, and spectral band adjustment. The result is global observation of the Earth's surface every 2-3 days at 30 meter (m) spatial resolution. One of the major applications that will benefit from HLS is agricultural assessment and monitoring, which serves as the use case for this tutorial.

NASA's Land Processes Distributed Active Archive Center (LP DAAC) archives and distributes HLS products in the LP DAAC Cumulus cloud archive as Cloud Optimized GeoTIFFs (COGs). This tutorial demonstrates how to query and subset HLS data using the NASA Common Metadata Repository (CMR) SpatioTemporal Asset Catalog (STAC) application programming interface (API). Because these data are stored as COGs, you can load subsets of individual files into memory for just the bands you are interested in, a paradigm shift from the more common workflow of downloading a .zip/HDF file containing every band over the entire scene/tile. The tutorial covers how to process HLS data (quality filtering and EVI calculation), visualize and "stack" the scenes over a region of interest into an xarray data array, calculate statistics for an EVI time series, and export the results as a comma-separated values (CSV) file, providing all of the information you need for your area of interest without having to download the source data files. The Enhanced Vegetation Index (EVI) is a vegetation index similar to NDVI that has been found to be more sensitive to ground cover below the vegetated canopy and to saturate less over areas of dense green vegetation.
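As a preview of the processing covered in Section 5, EVI in this tutorial follows the standard formulation with gain G = 2.5, canopy background adjustment L = 1, and aerosol resistance coefficients C1 = 6 and C2 = 7.5. Below is a minimal sketch; the function name and the reflectance values are illustrative, not taken from the tutorial:

```python
import numpy as np

def calc_evi(blue, red, nir):
    """Compute the Enhanced Vegetation Index (EVI) from scaled surface
    reflectance arrays (values between 0 and 1), using the standard
    coefficients G=2.5, C1=6, C2=7.5, L=1."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

# Illustrative reflectance values for a healthy vegetated pixel
blue = np.array([0.05])
red = np.array([0.08])
nir = np.array([0.45])
print(calc_evi(blue, red, nir))  # roughly 0.59 for this example pixel
```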

NOTE: This tutorial no longer uses the PROVISIONAL Version 1.5 daily 30 meter (m) global Harmonized Landsat Sentinel-2 (HLS) Sentinel-2 Multi-spectral Instrument Surface Reflectance (HLSS30) and Landsat-8 OLI Surface Reflectance (HLSL30) products.


Use Case Example

This tutorial was developed using an example use case for crop monitoring over a single large farm field in northern California. The goal is to observe HLS-derived mean EVI over this field without downloading the entirety of the HLS source data.

This tutorial shows how to use the CMR-STAC API to investigate the HLS collections available in the cloud; search for and subset to the specific time period, bands (layers), and region of interest for our use case; load subsets of the desired COGs into a Jupyter Notebook directly from the cloud; quality filter the data and calculate EVI; stack and visualize the time series; and export a CSV of statistics on the EVI of the single farm field.
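For orientation, here is a minimal sketch of the kind of STAC item search performed in Section 3. The LPCLOUD endpoint and HLS collection IDs reflect the catalog this tutorial works with, but the bounding box and date range below are illustrative placeholders, not the tutorial's actual field:

```python
import requests

# CMR-STAC search endpoint for the LPCLOUD provider, which hosts the HLS collections
stac_url = 'https://cmr.earthdata.nasa.gov/stac/LPCLOUD/search'

params = {
    'collections': ['HLSS30.v2.0', 'HLSL30.v2.0'],  # HLS Sentinel-2 and Landsat-8 collections
    'bbox': [-122.0, 39.0, -121.9, 39.1],           # placeholder bounding box (W, S, E, N)
    'datetime': '2021-07-01T00:00:00Z/2021-08-31T23:59:59Z',  # placeholder date range
    'limit': 100,
}

response = requests.post(stac_url, json=params)
response.raise_for_status()
items = response.json()['features']
print(f'{len(items)} STAC items found')
```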


Data Used in the Example

  - Daily 30 meter (m) global HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance (HLSS30) Version 2.0 (HLSS30.002)
  - Daily 30 meter (m) global HLS Landsat-8 OLI Surface Reflectance (HLSL30) Version 2.0 (HLSL30.002)

Topics Covered

  1. Getting Started
    1.1 Import Packages and Set up the Working Environment
  2. Navigating the CMR-STAC API
    2.1 Introduction to the CMR-STAC API
  3. CMR-STAC API: Searching for Items
    3.1 Spatial Querying via Bounding Box
    3.2 Temporal Querying
  4. Extracting HLS COGs from the Cloud
    4.1 Subset by Band
    4.2 Load a Spatially Subset HLS COG into Memory
  5. Processing HLS Data
    5.1 Apply Scale Factor and Calculate EVI
    5.2 Quality Filtering
    5.3 Export to COG
  6. Automation
  7. Stacking HLS Data
    7.1 Open COGs and Stack Using Xarray
    7.2 Visualize Stacked Time Series
    7.3 Export Statistics

Before Starting this Tutorial

Setup and Dependencies

It is recommended to use Conda, an environment manager, to set up a compatible Python environment. Download Conda for your OS here: https://www.anaconda.com/download/. Once you have Conda installed, follow the instructions below to successfully set up a Python environment on Linux, MacOS, or Windows.

This Python Jupyter Notebook tutorial has been tested using Python version 3.7. Conda was used to create the Python environment.

> conda create -n hlstutorial -c conda-forge --yes python=3.7 gdal=3.2 rasterio shapely geopandas geoviews holoviews xarray matplotlib cartopy scikit-image hvplot pyepsg  

> conda activate hlstutorial  

> conda install jupyter notebook --yes  

> jupyter notebook

TIP: Having trouble activating your environment, or loading specific packages once you have activated your environment? Try the following in your command line interface:

> conda update conda

or

> conda update --all

NOTE: This tutorial has been successfully tested using GDAL 3.0, 3.1, and 3.2. Python environments with GDAL 3.3 may experience issues accessing HLS data from the Cumulus cloud archive. If you are unsuccessful in accessing HLS data, try installing GDAL 3.0, 3.1, or 3.2 in your Conda environment to see if that resolves your issue. The Python environment setup instructions above install GDAL 3.2.

Still having trouble getting a compatible Python environment set up? Contact LP DAAC User Services at: https://lpdaac.usgs.gov/lpdaac-contact-us/

If you prefer not to install Conda, the same setup and dependencies can be achieved using another package manager such as pip.
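For example, a roughly equivalent pip-based setup might look like the line below. Treat this as an untested sketch: GDAL in particular often requires system libraries and is usually easier to install via Conda, which is why it is omitted here.

> pip install rasterio shapely geopandas geoviews holoviews xarray matplotlib cartopy scikit-image hvplot pyepsg jupyter notebook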


You will need to have a netrc file set up in your home directory in order to successfully run the code below. Check out the Setting up a netrc File section in the README.
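Authentication to the cloud archive works through that netrc file together with a cookie file that GDAL maintains during requests. Below is a minimal sketch of the GDAL configuration commonly used when streaming HLS COGs, via rasterio's Env context; the cookie file path and the URL variable are assumptions, so adjust them for your system:

```python
import os
import rasterio

# GDAL configuration for streaming HLS COGs from the LP DAAC Cumulus cloud
# archive. Earthdata Login credentials are read from your netrc file; the
# session cookie is cached in the file below (path is an assumption).
cookie_path = os.path.expanduser('~/cookies.txt')
gdal_env = rasterio.Env(
    GDAL_HTTP_COOKIEFILE=cookie_path,
    GDAL_HTTP_COOKIEJAR=cookie_path,
    GDAL_DISABLE_READDIR_ON_OPEN='EMPTY_DIR',  # skip sidecar/directory listing requests
    CPL_VSIL_CURL_ALLOWED_EXTENSIONS='TIF',    # only request GeoTIFF files over HTTP
)

# Usage sketch: open a COG URL (hls_cog_url is a placeholder) inside the context
# with gdal_env:
#     with rasterio.open(hls_cog_url) as src:
#         band = src.read(1)
```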

Source Code used to Generate this Tutorial

The repository containing all of the required files is located at: https://git.earthdata.nasa.gov/projects/LPDUR/repos/hls-tutorial/browse


1. Getting Started

1.1 Import Packages and Set up the Working Environment

Import the required packages and set the input/working directory to run this Jupyter Notebook locally.
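A representative import cell is sketched below, based on the packages installed in the Conda environment above; the exact imports used in the notebook may differ, and the working directory path is a placeholder:

```python
import os
import requests
import numpy as np
import geopandas as gp
import rasterio as rio
import xarray as xr
import hvplot.xarray  # registers the .hvplot plotting accessor on xarray objects

# Set the working directory where inputs/outputs for this notebook will live
# (placeholder path; point this at your local copy of the tutorial repository)
in_dir = os.path.join(os.path.expanduser('~'), 'hls-tutorial')
os.makedirs(in_dir, exist_ok=True)
os.chdir(in_dir)
```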