Practical Lab WS 23/24 Scalable Machine Learning

Scalable Machine Learning

Under direction of
Dr. Alexander Rüttgers
Assistant
Lukas Troska

Dates

Date: Every Friday, 10:15 to 12:00

Location: INS, room 2.035, Friedrich-Hirzebruch-Allee 7

First meeting: Friday, October 13th 2023, 10:15-12:00, INS, room 2.035, Friedrich-Hirzebruch-Allee 7

Contact: alexander dot ruettgers at dlr dot de

Image: Earth observation data collected by the two German radar satellites TanDEM-X and TerraSAR-X. Both satellites are operated by the German Aerospace Center (DLR).
Credit: DLR (CC BY-NC-ND 3.0)

Content

In this practical lab, we focus on machine learning techniques for analyzing large image datasets, more precisely Earth observation data from satellites. Processing these datasets within a reasonable amount of time requires an efficient parallelization strategy. Topics covered in this lab are:

  • Programming in Python and in PyTorch
  • Deep neural network parallelism schemes (data parallelism, model parallelism, layer pipelining) and basic principles of the mpi4py library
  • Clustering, classification and anomaly detection tasks applied to earth observation data
  • Hyperparameter optimization

Each topic will be discussed in a lecture and then implemented in Python by the participants, following a given programming sheet. These sheets can be solved alone or in groups of up to 3 people. The time allotted for a programming sheet is usually 2 weeks.
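
To give a flavor of the data parallelism scheme listed among the topics, here is a minimal sketch in plain Python. It is not taken from the lab's programming sheets: it simulates several workers inside a single process, and the names local_gradient, all_reduce_mean and train, the toy data and the hyperparameters are all assumptions made for illustration. In a real setup the averaging step would be carried out by a communication library such as mpi4py.

```python
# Illustrative sketch only: data parallelism simulated in a single process.
# All function names, data and hyperparameters are hypothetical.

def local_gradient(w, shard):
    """Gradient w.r.t. w of the mean of 0.5 * (w*x - y)**2 over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for a collective all-reduce that averages across workers."""
    return sum(values) / len(values)


def train(data, num_workers=4, lr=0.05, steps=200):
    # Split the dataset into equally sized shards, one per simulated worker.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0  # every worker holds an identical copy of the model parameter
    for _ in range(steps):
        # Each worker computes a gradient on its own shard only.
        grads = [local_gradient(w, shard) for shard in shards]
        # The averaged gradient is applied identically on every worker,
        # so all model copies stay in sync.
        w -= lr * all_reduce_mean(grads)
    return w


# Toy regression problem: y = 3 * x
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data)
```

Because the shards have equal size, the average of the per-shard gradients coincides with the gradient over the full dataset, so the simulated workers take the same steps as a single worker processing all the data.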

Background

Earth observation data covers all kinds of information about planet Earth. One viable way to obtain such data is remote sensing with satellites, for instance the Sentinel-1 to Sentinel-6 satellites of the Copernicus programme of the European Commission. These European satellites gather large amounts of data about Earth (optical, radar, …), a large share of which is available to everyone.

Data from Earth can be used to better predict and adapt to climate change, to measure land-use changes such as deforestation, or to assist in natural disaster events (flooding, fire, …). However, this requires not only capturing the data but also employing efficient and scalable machine learning algorithms to extract actual knowledge from it. This is, to a certain extent, the main topic of this practical lab.

Registration

Registration for the practical lab must be done via Basis within the usual university deadlines (“Prüfungsanmeldung und -abmeldung”, i.e. exam registration and deregistration). More information will be provided in the first meeting on October 13th 2023.

Requirements

Basic programming experience in Python is required. The required software (a Python IDE or code editor such as PyCharm Community Edition, and a Python package manager) can be installed on your personal computer. Furthermore, Linux workstations are available for the participants at the institute. All operating systems (Linux, macOS, Windows) can be used, but assistance in the lab is only possible for Linux. If you do not want to install Linux on your personal computer, running Linux in a virtual machine, for example with VMware Workstation Player, might be an alternative. For Master's students there are no further formal requirements.

Exam

All work during the practical lab will be considered for the final grade. Additionally, there will be an oral exam at the end of the semester.