Cloud coverage identification over a parameterized area

The aim of this project is to let students experiment with various Earth observation datasets, specifically with cloud coverage information. Filtering out cloud-contaminated data is a common problem for many remote sensing techniques that cannot measure through clouds. A notable exception is SAR imaging, which is not affected by clouds. In this project, students should work with data in various formats, coordinate systems, projections, etc. These should be unified into a common format so that a cloud mask can be created by joining multiple partial masks or by selecting the mask that provides the highest-quality data for a given location and time.

The result of the project should be three functions and corresponding command-line utilities.

  • The first function reports cloud presence at a given geographic location. It takes the geographic coordinates of a point and a time as input and returns information about cloud presence, at least in binary form (cloudy or clear). Preferably, there should also be an option to return the type of clouds above the specified point at that time. The function should consider all available cloud data and select the dataset that gives the highest-quality information for the given point and time.
  • The second function returns a raster or vector image of cloud coverage over the geographical area specified by boundary coordinates and time. The choice of output format is up to the developers; however, raster output may be easier to implement, since some sources provide cloud masks only as rasters.
  • The third function returns cloud coverage as a single number over the geographical area specified by boundary coordinates and time. Otherwise, it is similar to the first function.
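The three functions above can be sketched on a toy raster mask. This is only an illustration under stated assumptions, not a prescribed design: the `CloudMask` structure, the function names, and the cell encoding (1 = cloud, 0 = clear, `None` = no data) are all hypothetical, the mask is assumed to be north-up and already selected for the requested time, and "coverage as a number" is assumed to mean the cloudy fraction of the cells that contain data.

```python
from dataclasses import dataclass
from typing import List, Optional

Cell = Optional[int]  # 1 = cloud, 0 = clear, None = no data (assumed encoding)


@dataclass
class CloudMask:
    """Hypothetical north-up raster mask; row 0 is the northernmost row."""
    x_min: float      # x coordinate of the left edge
    y_max: float      # y coordinate of the top edge
    cell_size: float  # cell width and height in coordinate units
    cells: List[List[Cell]]


def cloud_at_point(mask: CloudMask, x: float, y: float) -> Optional[bool]:
    """Return True/False for cloudy/clear, or None if there is no data at (x, y)."""
    col = int((x - mask.x_min) // mask.cell_size)
    row = int((mask.y_max - y) // mask.cell_size)
    if 0 <= row < len(mask.cells) and 0 <= col < len(mask.cells[0]):
        value = mask.cells[row][col]
        return None if value is None else bool(value)
    return None  # the point lies outside the mask


def cloud_mask_for_area(mask: CloudMask, x_min: float, y_min: float,
                        x_max: float, y_max: float) -> List[List[Cell]]:
    """Return the sub-raster of cells whose centres fall inside the bounding box."""
    result = []
    for r, row in enumerate(mask.cells):
        centre_y = mask.y_max - (r + 0.5) * mask.cell_size
        if not (y_min <= centre_y <= y_max):
            continue
        selected = [
            value for c, value in enumerate(row)
            if x_min <= mask.x_min + (c + 0.5) * mask.cell_size <= x_max
        ]
        if selected:
            result.append(selected)
    return result


def cloud_fraction(mask: CloudMask, x_min: float, y_min: float,
                   x_max: float, y_max: float) -> Optional[float]:
    """Return the cloudy share of the valid cells in the box, or None if none are valid."""
    values = [v for row in cloud_mask_for_area(mask, x_min, y_min, x_max, y_max)
              for v in row if v is not None]
    if not values:
        return None
    return sum(values) / len(values)


# Toy 3 x 3 mask covering x in [0, 3] and y in [0, 3].
toy = CloudMask(x_min=0.0, y_max=3.0, cell_size=1.0,
                cells=[[1, 1, 0],
                       [0, 1, None],
                       [0, 0, 0]])
print(cloud_at_point(toy, 0.5, 2.5))
print(cloud_mask_for_area(toy, 0, 1, 2, 3))
print(cloud_fraction(toy, 0, 1, 2, 3))
```

A real implementation would additionally need to pick the mask by time and data quality, and to handle coordinate reference systems, both of which this sketch leaves out.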

The documentation provided alongside the project should compare the considered cloud datasets and provide reasons why particular datasets were selected.

The input data should consist of at least three different cloud datasets. The choice is up to the students. Some possible input datasets could be the following:

  • Sentinel-2
  • MODIS
  • Planet
  • EUMETSAT MSG
  • Meteosat-8
  • CAMS global reanalysis (EAC4) (maybe)

Some steps of the work

  1. Review of existing datasets containing cloud coverage information and selection of the three datasets to be used in the project.
  2. Review of the state of the art: how cloud coverage is calculated in published literature, existing software, articles, etc.
  3. Preparation of a toy dataset for experimenting with the procedures developed in the project.
  4. Development of a procedure for unifying the datasets into a common format, coordinate system, projection, etc. It should handle the different resolutions, source data formats, and projections of the data. The output of the procedure should be a cloud mask.
  5. Development of a new method/program, or use of an existing one, that gives information about cloud presence for the geographic coordinates of a point and a time.
  6. Development of a new method/program, or use of an existing one, that calculates cloud coverage for the geographic coordinates of an area (x_min, y_min, x_max, y_max) and a time. It is up to you how the cloud masks are used in this procedure.
  7. Preparation of a presentation of the project that demonstrates the results using the toy dataset prepared in this project.
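The unification in step 4 can be illustrated with a minimal sketch. It makes strong assumptions: both masks are plain Python grids that already share a coordinate system and the same top-left corner and differ only in cell size, cells are encoded as 1 = cloud, 0 = clear, `None` = no data, and the quality ranking is given simply by the order of the list. The function names are hypothetical, and nearest-neighbour resampling followed by a priority merge is only one possible approach; reprojection between coordinate systems is not shown.

```python
from typing import List, Optional

Grid = List[List[Optional[int]]]  # 1 = cloud, 0 = clear, None = no data (assumed)


def resample_nearest(cells: Grid, src_size: float, dst_size: float,
                     dst_rows: int, dst_cols: int) -> Grid:
    """Resample a grid to a new cell size by nearest neighbour.

    Source and target grids are assumed to share the same top-left corner.
    Target cells that fall outside the source grid become None.
    """
    out: Grid = []
    for r in range(dst_rows):
        # Source row that contains the centre of the target cell.
        src_r = int((r + 0.5) * dst_size / src_size)
        row = []
        for c in range(dst_cols):
            src_c = int((c + 0.5) * dst_size / src_size)
            if src_r < len(cells) and src_c < len(cells[0]):
                row.append(cells[src_r][src_c])
            else:
                row.append(None)
        out.append(row)
    return out


def merge_by_priority(masks: List[Grid]) -> Grid:
    """Merge same-shaped masks, taking each cell from the first mask that has data.

    The list is assumed to be ordered from highest to lowest quality.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    merged: Grid = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for mask in masks:
                if mask[r][c] is not None:
                    merged[r][c] = mask[r][c]
                    break
    return merged


# A fine mask (cell size 1) with gaps, and a coarse mask (cell size 2).
fine = [[None, 0, 0, 0],
        [None, None, 0, 1],
        [0, 0, 1, 1],
        [0, 0, 1, None]]
coarse = [[1, 0],
          [0, None]]

coarse_on_fine_grid = resample_nearest(coarse, src_size=2, dst_size=1,
                                       dst_rows=4, dst_cols=4)
unified = merge_by_priority([fine, coarse_on_fine_grid])
for row in unified:
    print(row)
```

Here the fine mask is treated as the higher-quality source, so the coarse mask only fills cells where the fine mask has no data; a cell stays `None` when neither source covers it.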

Computational resources

A remote computer for long-running code can be provided to students after discussion.

Semester 2020/2021

This project is assigned to Peter Janič and Adam Tkáč.

All source code and documentation for this project should be located here:
https://git.kpi.fei.tuke.sk/groups/svd/cloud-coverage-identification

Links/Resources