Specifics of the HAL cluster

Using the S1Tiling Lmod module on HAL

S1Tiling has been installed on HAL since June 2023. It is available through Lmod; see also the HAL user guides.

# Use the latest version
ml s1tiling

# Check the available versions
ml av s1tiling

# Activate a specific version
ml s1tiling/1.0.0rc2-otb7.4
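
Once a version is loaded, a quick sanity check: ml list is standard Lmod, and the module is expected to put the S1Processor entry point on the PATH.

# List the loaded modules, and locate the S1Processor entry point
ml list
which S1Processor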

Note

For the moment, only S1Tiling 1.0 RC2 is installed; it depends on OTB 7.4, Python 3.8.4 and G++ 8.2.

Installation on HAL

You may prefer to install S1Tiling yourself. In that case, there are essentially two by two ways to install S1Tiling on HAL: from the available OTB module, or from the released OTB binaries; and, in each case, with either pip or conda.

If you want to install S1Tiling from the sources instead of from PyPI, set up the following context first. Then, in the later steps, pass "${S1TILING_SRC_DIR}" instead of s1tiling as the pip parameter, as shown below.

# Proposed directories where it could be installed
TST_DIR=/work/scratch/${USER}/S1Tiling/install
S1TILING_ROOT_DIR=/work/scratch/${USER}/S1Tiling/
S1TILING_SOURCES=sources
S1TILING_SRC_DIR=${S1TILING_ROOT_DIR}/${S1TILING_SOURCES}

cd "${S1TILING_ROOT_DIR}"
git clone git@gitlab.orfeo-toolbox.org:s1-tiling/s1tiling.git ${S1TILING_SOURCES}
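
With this context in place, the pip installation step shown in the following subsections becomes:

# Install from the cloned sources instead of from PyPI
TMPDIR=/work/scratch/${USER}/tmp/ python -m pip install "${S1TILING_SRC_DIR}"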

…from the available OTB module (and w/ pip)

ml otb/7.4-python3.8.4-gcc8.2

# Create a pip virtual environment
python -m venv install_with_otb_module

# Configure the environment with:
source install_with_otb_module/bin/activate
# - an up-to-date pip
python -m pip install --upgrade pip
# - setuptools pinned at version 57.5.0
python -m pip install --upgrade setuptools==57.5.0

# Finally, install S1Tiling (from PyPI; pass "${S1TILING_SRC_DIR}" instead of
# s1tiling to install from the sources).
# TMPDIR is redirected to the scratch space for pip's temporary build files.
mkdir -p /work/scratch/${USER}/tmp
TMPDIR=/work/scratch/${USER}/tmp/ python -m pip install s1tiling

deactivate
ml purge

To use it

ml purge
ml otb/7.4-python3.8.4-gcc8.2
source install_with_otb_module/bin/activate

S1Processor requestfile.cfg

deactivate
ml purge

…from the available OTB module (and w/ conda)

ml otb/7.4-python3.8.4-gcc8.2

# Create a conda environment
ml conda
# NB: the Python version must match the one the OTB module was built for (3.8.4)
conda create --prefix ./conda_install_with_otb_module python==3.8.4

# Configure the environment with:
conda activate "${TST_DIR}/conda_install_with_otb_module"
# - an up-to-date pip
python -m pip install --upgrade pip
# - setuptools pinned at version 57.5.0
python -m pip install --upgrade setuptools==57.5.0

# Finally, install S1Tiling (from PyPI; use "${S1TILING_SRC_DIR}" instead to install from the sources)
mkdir -p /work/scratch/${USER}/tmp
TMPDIR=/work/scratch/${USER}/tmp/ python -m pip install s1tiling

conda deactivate
ml purge

To use it

ml purge
ml conda
ml otb/7.4-python3.8.4-gcc8.2
conda activate "${TST_DIR}/conda_install_with_otb_module"

S1Processor requestfile.cfg

conda deactivate
ml purge

…from released OTB binaries…

Since otbenv.profile cannot be unloaded, prefer the above methods, which are based on the OTB module.

First, let's install the OTB binaries somewhere in your personal (or project) environment.

# Start from a clean environment
ml purge
cd "${TST_DIR}"
# Install OTB binaries
wget https://www.orfeo-toolbox.org/packages/OTB-7.4.1-Linux64.run
bash OTB-7.4.1-Linux64.run

# Patch gdal-config
cp "${S1TILING_SRC_DIR}/s1tiling/resources/gdal-config" OTB-7.4.1-Linux64/bin/
# Patch LD_LIBRARY_PATH
echo "export LD_LIBRARY_PATH=\"$(readlink -f OTB-7.4.1-Linux64/lib)\${LD_LIBRARY_PATH:+:\$LD_LIBRARY_PATH}\"" >> OTB-7.4.1-Linux64/otbenv.profile

Note

gdal-config is either available from the sources (${S1TILING_SRC_DIR}/s1tiling/resources/gdal-config) or can be downloaded from here: gdal-config.
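
As an optional check, the patched gdal-config should answer the standard --version query:

# Should print the version of the GDAL shipped within the OTB binaries
OTB-7.4.1-Linux64/bin/gdal-config --version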

…and with conda

Given the OTB binaries are installed, we still need to rebuild the Python bindings for the chosen version of Python.

# Create a conda environment
ml conda
conda create --prefix ./conda_install_with_otb_distrib python==3.7.2

# Configure the environment with:
conda activate "${TST_DIR}/conda_install_with_otb_distrib"
# - an up-to-date pip
python -m pip install --upgrade pip
# - setuptools pinned at version 57.5.0
python -m pip install --upgrade setuptools==57.5.0
# - numpy, needed to compile the OTB Python bindings for Python 3.7.2
python -m pip install numpy

# - load OTB binaries
source OTB-7.4.1-Linux64/otbenv.profile
# Load cmake and gcc to compile the bindings
ml cmake gcc
# And rebuild the bindings
(cd OTB-7.4.1-Linux64/ && ctest -S share/otb/swig/build_wrapping.cmake -VV)
ml unload cmake gcc
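
# Optional sanity check: the rebuilt bindings should now import in this Python
python -c "import otbApplication"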

# Finally, install S1Tiling (from PyPI; use "${S1TILING_SRC_DIR}" instead to install from the sources)
mkdir -p /work/scratch/${USER}/tmp
TMPDIR=/work/scratch/${USER}/tmp/ python -m pip install s1tiling

conda deactivate
ml purge

To use it

ml purge
ml conda
conda activate "${TST_DIR}/conda_install_with_otb_distrib"
source "${TST_DIR}/OTB-7.4.1-Linux64/otbenv.profile"

S1Processor requestfile.cfg

conda deactivate
ml purge

…and with pip

Given the OTB binaries are installed, we still need to rebuild the Python bindings for the chosen version of Python.

# Create a pip virtual environment
ml python
python -m venv install_with_otb_binaries

# Configure the environment with:
source install_with_otb_binaries/bin/activate
# - an up-to-date pip
python -m pip install --upgrade pip
# - setuptools pinned at version 57.5.0
python -m pip install --upgrade setuptools==57.5.0
# - numpy, needed to compile the OTB Python bindings
python -m pip install numpy

# - load OTB binaries
source OTB-7.4.1-Linux64/otbenv.profile
# Load cmake and gcc to compile the bindings
ml cmake gcc
# And rebuild the bindings
(cd OTB-7.4.1-Linux64/ && ctest -S share/otb/swig/build_wrapping.cmake -VV)
ml unload cmake gcc

# Finally, install S1Tiling (from PyPI; use "${S1TILING_SRC_DIR}" instead to install from the sources)
mkdir -p /work/scratch/${USER}/tmp
TMPDIR=/work/scratch/${USER}/tmp/ python -m pip install s1tiling

deactivate
ml purge

To use it

ml purge
source install_with_otb_binaries/bin/activate
source "${TST_DIR}/OTB-7.4.1-Linux64/otbenv.profile"

S1Processor requestfile.cfg

deactivate
ml purge

Executing S1 Tiling as a job

The theory

A few options deserve our attention when running S1 Tiling as a job on a cluster like HAL.

[PATHS].tmp
    Temporary files shall not be generated on the GPFS; instead, they are best
    generated locally, in $TMPDIR. Set this option to %(TMPDIR)s/s1tiling for
    instance.

    [PATHS]
    tmp : %(TMPDIR)s/s1tiling

    Warning

    Of course, $TMPDIR shall not be used when running S1 Tiling on visu nodes.
    More generally, S1 Tiling should not be used for intensive computations on
    nodes that are not dedicated to computation.

[PATHS].srtm
    The original SRTM files are stored in /work/datalake/static_aux/MNT/SRTM_30_hgt.

    [PATHS]
    srtm : /work/datalake/static_aux/MNT/SRTM_30_hgt

[Processing].cache_srtm_by
    SRTM files should be copied locally into [PATHS].tmp instead of being
    symlinked over the GPFS.

    [Processing]
    cache_srtm_by : copy

[Processing].nb_otb_threads
    The number of threads used by each OTB application pipeline.

[Processing].nb_parallel_processes
    The number of OTB application pipelines executed in parallel.

[Processing].ram_per_process
    The RAM allowed per OTB application pipeline, in MB.

PBS resources
  • At this time, S1 Tiling does not support multiple cooperating jobs. We can
    have multiple jobs, but they shall use different working spaces, and so on.
    This means the select value shall be 1.
  • The number of CPUs should equal nb_otb_threads * nb_parallel_processes; it
    shall never be less than the product of these two options.
  • The required memory shall be greater than nb_parallel_processes times
    ram_per_process.

This means that, for

# The request file
[Processing]
nb_parallel_processes: 10
nb_otb_threads: 2
ram_per_process: 4096

Then the job request shall contain at least

#PBS -l select=1:ncpus=20:mem=40gb
# always 1 for select
# cpu = 2 * 10 => 20
# mem = 10 * 4096 => 40gb

TL;DR: here is an example

PBS job file

#!/bin/bash
#PBS -N job-s1tiling
#PBS -l select=1:ncpus=20:mem=40gb
#PBS -l walltime=1:00:00

# NB: this allocates 2 GB per CPU (40 GB / 20 CPUs)

# The number of allocated CPUs is in the select statement; let's extract it
# automatically
NCPUS=$(qstat -f "${PBS_JOBID}" | awk '/Resource_List.ncpus/{print $3}')
# Let's use 2 threads in each OTB application pipeline
export NB_OTB_THREADS=2
# Let's deduce the number of OTB application pipelines to run in parallel
export NB_OTB_PIPELINES=$(($NCPUS / $NB_OTB_THREADS))
# These two variables have been exported to be automatically used from the
# S1 tiling request file.

# Let's use the S1Tiling module (available on HAL; see the first section).
# To use a manual installation instead, adapt with the previous sections.
ml s1tiling

mkdir -p "${PBS_O_WORKDIR}/${PBS_JOBID}"
cd "${PBS_O_WORKDIR}/${PBS_JOBID}"
S1Processor S1Processor.cfg || {
    code=$?
    echo "Echec de l'exécution de programme" >&2
    exit ${code}
}

S1 Tiling request file: S1Processor.cfg

[PATHS]
tmp : %(TMPDIR)s/s1tiling
srtm : /work/datalake/static_aux/MNT/SRTM_30_hgt
...

[Processing]
cache_srtm_by: copy
# Let's use the exported environment variables thanks to "%()s" syntax
nb_parallel_processes: %(NB_OTB_PIPELINES)s
nb_otb_threads: %(NB_OTB_THREADS)s
ram_per_process: 4096
...
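
Assuming the job file above is saved as job-s1tiling.pbs (a name chosen here for the example) next to S1Processor.cfg, submit it with:

qsub job-s1tiling.pbs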