Working with Daily NASA VIIRS Surface Reflectance Data


Objective:

This tutorial demonstrates how to work with the daily Suomi-NPP NASA VIIRS Surface Reflectance (VNP09GA.001) data product.

The Land Processes Distributed Active Archive Center (LP DAAC) distributes the NASA-produced surface reflectance data products from the Visible Infrared Imaging Radiometer Suite (VIIRS) sensor aboard the joint NOAA/NASA Suomi-National Polar-orbiting Partnership (Suomi-NPP) satellite. The Suomi-NPP NASA VIIRS Surface Reflectance products are archived and distributed in the HDF-EOS5 file format. The Hierarchical Data Format version 5 (HDF5) is a NASA selected format of choice for standard science product archival and distribution and is the underlying format for HDF-EOS5. HDF-EOS is the standard format and I/O library for the Earth Observing System (EOS) Data and Information System (EOSDIS). HDF-EOS5 extends the capabilities of the HDF5 storage format, adding support for the various EOS data types (e.g., point, swath, grid) into the HDF5 framework.

In this tutorial, you will use Python to define the coordinate reference system (CRS) and export science dataset (SDS) layers as GeoTIFF files that can be loaded into a GIS and/or Remote Sensing software program.


Example: Converting VNP09GA files into quality-filtered vegetation indices and examining the trend in vegetation greenness during July 2018 to observe drought in Europe.

Data Used in the Example:


Topics Covered:

  1. Getting Started
    1a. Import Packages
    1b. Set Up the Working Environment
    1c. Retrieve Files
  2. Importing and Interpreting Data
    2a. Open a VIIRS HDF5-EOS File and Read File Metadata
    2b. Subset SDS Layers and read SDS Metadata
  3. Generating an RGB Composite
    3a. Apply a Scale Factor
    3b. Create an Image
  4. Quality Filtering
    4a. Decode Bit Values
    4b. Apply Quality and Land/Water Masks
  5. Functions
    5a. Defining Functions
    5b. Executing Functions
  6. Visualizing Results
    6a. Plot Quality-Filtered VIs
    6b. Scatterplot Comparison of NDVI vs. EVI
  7. Georeferencing
  8. Exporting Results
    8a. Set Up a Dictionary
    8b. Export Masked Arrays as GeoTIFFs
  9. Working with Time Series
    9a. Automation of Steps 2-5, 7-8
    9b. Time Series Analysis
    9c. Time Series Visualization

Prerequisites:

A NASA Earthdata Login account is required to download the data used in this tutorial. You can create an account at the link provided.


Procedures:

NOTE: This tutorial was developed for NASA VIIRS Daily Surface Reflectance HDF-EOS5 files and should only be used for the VNP09GA.001 product.


Python Environment Setup

2. Setup

  • Using your preferred command line interface (command prompt, terminal, cmder, etc.) type the following to successfully create a compatible python environment:
    • conda create -n viirstutorial -c conda-forge --yes python=3.7 h5py numpy pandas matplotlib scikit-image gdal
    • conda activate viirstutorial
    • jupyter notebook

If you do not have jupyter notebook installed, you may need to run:

  • conda install jupyter notebook

TIP: Having trouble activating your environment, or loading specific packages once you have activated your environment? Try the following:

Type: 'conda update conda'

If you prefer to not install Conda, the same setup and dependencies can be achieved by using another package manager such as pip and the requirements.txt file listed above.
Additional information on setting up and managing Conda environments.

Still having trouble getting a compatible Python environment set up? Contact LP DAAC User Services.


1. Getting Started

1a. Import Packages

Import the python packages required to complete this tutorial.


1b. Set Up the Working Environment

Define the input directory containing the VNP09GA.001 files, and create an output directory to store the results from this tutorial.

Change the inDir variable below to the path of the directory where your data is stored (e.g., C:/Data/VIIRS/VNP09A1/).

If you plan to execute this tutorial on your own OS, `inDir` above needs to be changed.


1c. Retrieve Files

In this section, a VNP09GA .h5 file has been downloaded to the inDir defined above.

Make sure that you download the file into the inDir directory defined above to follow along in the tutorial.

The standard format for VIIRS filenames is as follows:

VNP09GA: Product Short Name
A2018183: Julian Date of Acquisition (AYYYYDDD)
h18v03: Tile Identifier (horizontalXXverticalYY)
001: Collection Version
2018184074906: Julian Date of Production (YYYYDDDHHMMSS)

Below, take the Julian Date of Acquisition and convert to calendar date using the datetime package.


2. Importing and Interpreting Data

2a. Open a VIIRS HDF-EOS5 File and Read File Metadata

Read in a VNP09GA HDF-EOS5 file using the h5py package.

The VNP09GA HDF-EOS5 file contains two groups in which data and metadata are stored.

First, the HDFEOS group contains the VNP09GA SDS layers.

Continue into the GRIDS directory.

The VNP09GA data product provides data for three imagery bands (I1, I2, I3) at nominal 500 meter resolution (~ 463 meter) and nine moderate-resolution bands (M1, M2, M3, M4, M5, M7, M8, M10, M11) at nominal 1 kilometer (km) (~926 meter) resolution. This example uses the nominal 1 km M bands.

Below, list the SDS contained in the HDF-EOS5 file at 1 km resolution.

Second, the HDFEOS INFORMATION group contains the metadata for the file. The StructMetadata.0 object stores the file metadata.

Below, read the file metadata and store as a list.

Identify all the objects in the VNP09GA HDF-EOS5 file below.


2b. Subset SDS Layers and read SDS Metadata

Identify and generate a list of all the SDS layers in the HDF-EOS5 file.

Read in the near infrared (SurfReflect_M7), red (SurfReflect_M5), green (SurfReflect_M4), and blue (SurfReflect_M3) SDS layers from the HDF-EOS5 file. These layers will be used later in the tutorial to calculate daily vegetation indices.

List the attributes for one of the SDS.

Extract the scale factor and fill value from the SDS metadata.


3. Generating an RGB Composite

3a. Apply a Scale Factor

Extract the data for each SDS and apply the scale factor defined in the previous section. The scale factor is the same for all surface reflectance M-bands, so you can simply use the scale factor from the red band for the other bands.

Create an RGB array using np.dstack() and set the fill values equal to nan.


3b. Create an Image

Before filtering the data for poor quality observations such as clouds, visualize the non-filtered RGB image using the matplotlib package.

Below, use the skimage package's exposure module (http://scikit-image.org/docs/dev/api/skimage.exposure.html) to apply a contrast stretch and gamma correction to the RGB composite for a better looking image.

Note: stretching the array above inherently changes the values of the array, and those values should be used for visualization purposes only.

Take the stretched RGB composite and plot the results.

Above, the natural color RGB composite shows some cloud cover over VIIRS tile h18v03. This tile covers parts of England, France, Belgium, the Netherlands, Germany, Poland, Denmark, Norway, and Sweden. Notice the dry conditions (tan-brown) particularly evident over Denmark, Germany, and Poland.


4. Quality Filtering

This section demonstrates how to decode bit values, determine which values to keep and which values to mask, and how to apply masks to the VNP09GA data arrays. Begin by importing the Quality Flag 5 from the VNP09GA HDF-EOS5 file.

Surface Reflectance Quality Flag QF5 contains an overall quality determination (good/bad) for each 1 km surface reflectance band (M1-M7).

Table 14. Surface Reflectance Quality Flags (QF5).

Above are the bit field mappings for Quality Flag layer 5. This table can also be found in the VIIRS SR Version 1.7 User Guide under Section 3.3, Table 14 (pp. 34-35). Below, use the 4th, 5th, 6th, and 7th bits to assure that the overall quality for the bands used in this example (M3-5, M7) are set to 0, or Good.

Note: Remember that bits are read from right to left (bit 0 is the Least Significant Bit).


4a. Decode Bit Values

Below, set the number of bits and create a list of all possible combinations (bit values). Based on the table above, only include pixels where the QF5 layer values are equal to 0 for bit numbers 4-7, meaning that the overall quality for bands M3-M5 and M7 are set to Good.

Next, loop through all possible values, convert numeric to binary, find values where bits 4-7 = 0000, and add them to the list of good values.

Notice above, all of the values determined to be good quality contain 0000 reading left to right, which represent bits 4-7.


4b. Apply Quality and Land/Water Masks

Use np.ma.MaskedArray() and the list of good quality values to apply a QA mask to the surface reflectance data from the QF5 layer.

Below, run through the same steps as above, this time using the Surface Reflectance Quality Flag QF2, which contains a land/water mask that can be used to mask out water pixels in the surface reflectance arrays. More information on QF2 can be found in Section 3.3, Table 11 (pp. 32-33) in the VIIRS SR Version 1.7 User Guide.

Below, exclude each value where bits 0-2 are classified as:

010: Inland water
011: Sea Water


5. Functions

5a. Defining Functions

Below, define two functions for calculating the Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI).

In Python, to create a function, first define (def) the function which includes a name ndviCalc() and add variable names for the inputs needed to be passed into the function (red, nir). Adding return on the next line means that every time that the ndviCalc() function is executed, it will return the results of the equation ((nir - red)/(nir + red)) based on the input provided (red, nir).


5b. Executing Functions

Next, apply the functions to the masked surface reflectance data by passing the bands needed to calculate each Vegetation Index.


6. Visualizing Results

6a. Plot Quality-Filtered VIs

In this section, begin by highlighting the functionality of the matplotlib plotting package. First, make basic plots of the NDVI and EVI results. Next, combine both plots into a single image and add some additional parameters to the plot. Finally, export the completed plot to an image file.

Above, adding ; to the end of the call suppresses the text output from appearing in the notebook. The output is a basic matplotlib plot of the quality filtered, water masked NDVI array centered over Germany and Denmark in Europe. Below, create the same plot for EVI.

Next, add some additional parameters to the visualization and combine both plots into a single figure. Export the figure as a .png file to the output directory created earlier in the tutorial.


6b. Scatterplot Comparison of NDVI vs. EVI

Plot all NDVI vs. EVI values for the quality filtered, land-only data derived from VIIRS Surface Reflectance on July 2, 2018.


7. Georeferencing

In this section, identify the upper-left longitude and latitude coordinates from the metadata, the output pixel size, and the proj.4 string that will later be used to export the quality filtered, land-only VIIRS Vegetation Indices data that were created earlier in the tutorial as GeoTIFFs.

Currently, VIIRS HDF-EOS5 files do not contain information regarding the spatial resolution of the dataset within, so here set to the standard nominal resolution of 1 km for this product.

The VNP09GA.001 product is referenced to a projected Sinusoidal coordinate system identical to the one used for the Moderate Resolution Imaging Spectroradiometer (MODIS) collection, defined below.


8. Exporting Results

8a. Set Up a Dictionary

In this section, create a dictionary containing each of the arrays that will be exported as GeoTIFFS, including the Surface Reflectance bands, the derived vegetation indices, and quality flag layers used above and define a band name for each that will be used to set the output filename.


8b. Export Masked Arrays as GeoTIFFs

Now that the data have been imported, subset, filtered, and functions have been executed, steps to export the results as masked GeoTIFFs using a for loop are provided in this section.


9. Working with Time Series

Here, automate steps 2-5, 7-8 from above to get data for the entire month of July for visualization and statistical analysis of the progression of drought conditions in Europe.

The cell below will find all the .h5 files in that directory, create a list of all the file names, and print the results.

9a. Automation of Steps 2-5, 7-8

This section provides steps to loop through each daily VNP09GA file from July 2018 and import the data, scale the data, apply a land/water mask and filter by quality, calculate VIs, georeference, and export as GeoTIFFs.

Ultimately, near the end of the code block below, append each file into a 3D array containing all NDVI observations for July 2018.

Notice above that the for loop also stacked the monthly observations into a three-dimensional array that we will use to generate time series visualizations in the final section.


9b. Time Series Analysis

This section provides steps for how to calculate basic statistics on the NDVI time series created above, export the results to a .csv file, and visualize the mean and standard deviation of NDVI over Central Europe to gain a better understanding of the progression of drought in the region.

First set a pandas dataframe to store the statistics.

Statistics: The statistics calculated below include box plot statistics, such as the total number of values/pixels (n), the median, and the interquartile range (IQR) of the sample data for values within a feature from a selected date/layer. Also calculated are the `Lower 1.5 IQR`, which represent the lowest datum still within 1.5 IQR of the lower quartile, and the `Upper 1.5 IQR`, which represent the highest datum still within 1.5 IQR of the upper quartile. Additional statistics generated on each layer/observation/feature combination include min/max/range, mean, standard deviation, and variance.

Below, use Numpy functions to calculate statistics on the quality-masked, land-only data. The format( , '4f') command will ensure that the statistics are rounded to four decimal places.

Next, export the Pandas dataframe containing the statistics to a csv file.

Use the previously calculated statistics to plot a time series of Mean NDVI for the month of July 2018. Taking the figure one step further, plot the standard deviation as a confidence interval around the mean NDVI as a time series.

Above, based on the linear regression trend line, it appears that NDVI is decreasing throughout the month of July 2018 over Germany, Denmark, and the surrounding countries. This could indicate that the drought conditions were persisting throughout the month.


9c. Time Series Visualization

Now import the imageio package to create a gif of the daily RGB composites from earlier in the tutorial to visually observe drought conditions taking place over the month of July 2018.

Lastly, visualize the gif created above in the Jupyter Notebook.

Note: If the gif won't display, try increasing the notebook data rate limit when starting a new jupyter notebook: `jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10`


Citations


Contact Information

Contact: LPDAAC@usgs.gov

Voice: +1-866-573-3222

Organization: Land Processes Distributed Active Archive Center (LP DAAC)

Website: https://lpdaac.usgs.gov/

Date last modified: 02-23-2022

Work performed under USGS contract G15PD00467 for LP DAAC.

LP DAAC Work performed under NASA contract NNG14HH33I.