Design notes

Pipelines

Internally, S1 Tiling defines a series of pipelines. More precisely, it distinguishes pipeline descriptions from actual pipelines. The actual pipelines are generated from their descriptions and input files, and they are handled internally; they won’t be described here.

Each pipeline corresponds to a series of processings. The original design intent is a direct match: one processing == one OTB application, which allows OTB applications to be chained in memory through the OTB Python bindings.

However, a processing doesn’t always translate into the execution of an OTB application: sometimes we need to do other computations, like calling a Python function or executing an external program. Other times, we just need to do some analysis whose result will be reused later in the pipeline.

When files need to be produced at some point, we end a pipeline; the next one(s) can take over from that point.

s1tiling.libs.otbpipeline.PipelineDescriptionSequence(...)

This class is the main entry point to describe pipelines.

s1tiling.libs.otbpipeline.FirstStepFactory(...)

Defines the prototype of FirstStep factory functions accepted in PipelineDescriptionSequence.register_inputs().

Simple pipelines

In simple cases, we can chain the output of an in-memory pipeline of OTB applications into the next pipeline.

At this moment, the following sequence of pipelines is defined:

pipelines = PipelineDescriptionSequence(config)
pipelines.register_pipeline([AnalyseBorders, Calibrate, CutBorders], 'PrepareForOrtho', product_required=False)
pipelines.register_pipeline([OrthoRectify],                          'OrthoRectify',    product_required=False)
pipelines.register_pipeline([Concatenate],                                              product_required=True)
if config.mask_cond:
    pipelines.register_pipeline([BuildBorderMask, SmoothBorderMask], 'GenerateMask',    product_required=True)

For instance, to minimize disk usage, we could chain in-memory orthorectification directly after the border cutting by removing the second pipeline and registering the following steps into the first pipeline instead:

pipelines.register_pipeline([AnalyseBorders, Calibrate, CutBorders, OrthoRectify],
                            'OrthoRectify', product_required=False)

Complex pipelines

In more complex cases, the product of a pipeline is used as input to several other pipelines. A pipeline can also have several inputs coming from different pipelines.

To support this, each pipeline is given a name, which can then be referenced as an input by other pipelines.

For instance, the LIA-producing pipelines are described this way:

pipelines = PipelineDescriptionSequence(config, dryrun=dryrun)
dem = pipelines.register_pipeline([AgglomerateDEMOnS1],
    'AgglomerateDEMOnS1',
    inputs={'insar': 'basename'})
demproj = pipelines.register_pipeline([ExtractSentinel1Metadata, SARDEMProjection],
    'SARDEMProjection',
    is_name_incremental=True,
    inputs={'insar': 'basename', 'indem': dem})
xyz = pipelines.register_pipeline([SARCartesianMeanEstimation],
    'SARCartesianMeanEstimation',
    inputs={'insar': 'basename', 'indem': dem, 'indemproj': demproj})
lia = pipelines.register_pipeline([ComputeNormals, ComputeLIAOnS1],
    'Normals|LIA',
    is_name_incremental=True,
    inputs={'xyz': xyz})

# "inputs" parameter doesn't need to be specified in all the following
# pipeline declarations but we still use it for clarity!
ortho  = pipelines.register_pipeline([filter_LIA('LIA'), OrthoRectifyLIA],
    'OrthoLIA',
    inputs={'in': lia},
    is_name_incremental=True)
concat = pipelines.register_pipeline([ConcatenateLIA],
    'ConcatLIA',
    inputs={'in': ortho})
select = pipelines.register_pipeline([SelectBestCoverage],
    'SelectLIA',
    product_required=True,
    inputs={'in': concat})
ortho_sin  = pipelines.register_pipeline([filter_LIA('sin_LIA'), OrthoRectifyLIA],
    'OrthoSinLIA',
    inputs={'in': lia},
    is_name_incremental=True)
concat_sin = pipelines.register_pipeline([ConcatenateLIA],
    'ConcatSinLIA',
    inputs={'in': ortho_sin})
select_sin = pipelines.register_pipeline([SelectBestCoverage],
    'SelectSinLIA',
    product_required=True,
    inputs={'in': concat_sin})

Pipeline inputs

In order to build the Directed Acyclic Graph (DAG) of tasks that will be executed through the described pipelines, we need to inject inputs.

Pipeline inputs need to be registered explicitly. This is done through FirstStepFactories passed to PipelineDescriptionSequence.register_inputs. Each FirstStepFactory is in charge of returning a list of FirstSteps, which are expected to hold the metadata used to generate the DAG of tasks. They may also fetch related products on the fly. For instance, s1_raster_first_inputs_factory() and eof_first_inputs_factory() first check which products are already on disk before trying to download the missing ones.

e.g.:

pipelines.register_inputs('basename', s1_raster_first_inputs_factory)
pipelines.register_inputs('basename', tilename_first_inputs_factory)
pipelines.register_inputs('basename', eof_first_inputs_factory)
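To make the expected shape concrete, here is a minimal, self-contained sketch of a first-inputs factory. The FirstStep dataclass and demo_first_inputs_factory below are stand-ins invented for illustration (the real classes live in s1tiling.libs.steps); only the overall contract follows the text above: a callable that returns a list of first steps holding metadata.

```python
# Illustrative sketch only: "FirstStep" here is a stand-in dataclass, not the
# real s1tiling.libs.steps.FirstStep.
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class FirstStep:                       # stand-in for s1tiling.libs.steps.FirstStep
    meta: Dict[str, Any] = field(default_factory=dict)

def demo_first_inputs_factory(cfg, **extra) -> List[FirstStep]:
    """Hypothetical factory: expose the metadata of known products so the
    DAG of tasks can be generated from it."""
    known_files = extra.get("known_files", [])   # would be discovered/downloaded
    return [FirstStep(meta={"basename": f, "tile_name": extra.get("tile_name")})
            for f in known_files]

steps = demo_first_inputs_factory(
    cfg=None, known_files=["s1a_33NWB_vv.tiff"], tile_name="33NWB")
```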

As the PipelineDescriptionSequence tries to stay as independent of the actual domain as possible, it doesn’t know which information each registered FirstStepFactory expects. By default, the Configuration information is passed. Any other information needs to be declared in one or several calls to PipelineDescriptionSequence.register_extra_parameters_for_input_factories.

e.g.:

pipelines.register_extra_parameters_for_input_factories(
    tile_name=tilename,               # Used by all
)

pipelines.register_extra_parameters_for_input_factories(
    dag=dag,                          # Used by eof_first_inputs_factory
    s1_file_manager=s1_file_manager,  # Used by s1_raster_first_inputs_factory
    dryrun=dryrun,                    # Used by all
)
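The mechanism can be pictured with a small mock. MiniSequence below is not the real PipelineDescriptionSequence; it only illustrates the idea that extra parameters are registered once, possibly over several calls, and that each factory consumes only the keyword arguments it declares:

```python
# Mock sketch of the extra-parameter mechanism; the implementation is a guess
# for illustration, not the real PipelineDescriptionSequence.
import inspect

class MiniSequence:
    def __init__(self, config):
        self._config = config
        self._extra = {}
        self._factories = []

    def register_extra_parameters_for_input_factories(self, **kwargs):
        self._extra.update(kwargs)        # may be called several times

    def register_inputs(self, key, factory):
        self._factories.append((key, factory))

    def generate_first_steps(self):
        results = {}
        for key, factory in self._factories:
            # pass only the parameters this factory declares
            wanted = inspect.signature(factory).parameters
            kwargs = {k: v for k, v in self._extra.items() if k in wanted}
            results.setdefault(key, []).extend(factory(self._config, **kwargs))
        return results

def eof_like_factory(cfg, dryrun=False):          # only consumes "dryrun"
    return [f"eof(dryrun={dryrun})"]

seq = MiniSequence(config=None)
seq.register_inputs('basename', eof_like_factory)
seq.register_extra_parameters_for_input_factories(tile_name="33NWB", dryrun=True)
first_steps = seq.generate_first_steps()
```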

Note

In simplified developer jargon: we use the Factory Method design pattern to invert dependencies.

Dask: tasks

Given the pipeline descriptions, a requested S2 tile and its intersecting S1 images, S1 Tiling builds a set of dependent Dask tasks. Each task corresponds to an actual pipeline, which will transform a given image into another named image product.
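As an illustration of the induced task graph, the following sketch uses the stdlib graphlib module in place of Dask, with task names borrowed from the LIA example above; the real scheduling is performed by Dask:

```python
# Conceptual sketch, not the real scheduler: the pipeline descriptions induce
# a DAG of tasks; stdlib graphlib stands in for Dask to show a valid order.
from graphlib import TopologicalSorter

# task -> set of tasks it depends on (names from the LIA example above)
dag = {
    "SARDEMProjection":           {"AgglomerateDEMOnS1"},
    "SARCartesianMeanEstimation": {"AgglomerateDEMOnS1", "SARDEMProjection"},
    "Normals|LIA":                {"SARCartesianMeanEstimation"},
    "OrthoLIA":                   {"Normals|LIA"},
    "ConcatLIA":                  {"OrthoLIA"},
    "SelectLIA":                  {"ConcatLIA"},
}
order = list(TopologicalSorter(dag).static_order())
```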

Processing Classes

Again, the processing classes are split into two families:

Step Factories

Step factories are the main entry point to add new processings. They are meant to inherit from one of OTBStepFactory, AnyProducerStepFactory, or ExecutableStepFactory.

They describe processings, and they are used to instantiate the actual steps that do the processing.
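The split can be pictured with a minimal mock, where the factory describes the processing and create_step() instantiates the step that runs it. The names echo s1tiling.libs.steps but the implementation is purely illustrative:

```python
# Minimal mock of the factory/step split: the factory *describes* a
# processing, create_step() *instantiates* the step that runs it.
class MockStepFactory:
    def __init__(self, name):
        self.name = name
    def create_step(self, inputs):
        raise NotImplementedError

class MockAnyProducerStepFactory(MockStepFactory):
    """Analogue of AnyProducerStepFactory: wraps a Python callable."""
    def __init__(self, name, action):
        super().__init__(name)
        self._action = action
    def create_step(self, inputs):
        return MockAnyProducerStep(self._action, inputs)

class MockAnyProducerStep:
    """Analogue of AnyProducerStep: runs the wrapped callable."""
    def __init__(self, action, inputs):
        self._action, self._inputs = action, inputs
    def execute(self):
        return self._action(self._inputs)

factory = MockAnyProducerStepFactory("double", lambda xs: [2 * x for x in xs])
step = factory.create_step([1, 2, 3])
result = step.execute()
```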

Inheritance diagram of s1tiling.libs.steps.OTBStepFactory, s1tiling.libs.steps.ExecutableStepFactory, s1tiling.libs.steps.AnyProducerStepFactory, s1tiling.libs.steps._FileProducingStepFactory, s1tiling.libs.steps.Store

s1tiling.libs.steps.StepFactory(name, ...)

Abstract factory for AbstractStep

s1tiling.libs.steps._FileProducingStepFactory(...)

Abstract class that factorizes filename transformations and parameter handling for Steps that produce files, either with OTB or through external calls.

s1tiling.libs.steps.OTBStepFactory(cfg, ...)

Abstract StepFactory for all OTB Applications.

s1tiling.libs.steps.AnyProducerStepFactory(...)

Abstract StepFactory for executing any Python made step.

s1tiling.libs.steps.ExecutableStepFactory(...)

Abstract StepFactory for executing any external program.

s1tiling.libs.steps.Store(appname, *argv, ...)

Factory for Artificial Step that forces the result of the previous app sequence to be stored on disk by breaking in-memory connection.

Steps

Step types are usually instantiated automatically. They are documented for convenience, but they are not expected to be extended.

  • FirstStep is instantiated automatically by the program from existing files (downloaded, or produced by a pipeline earlier in the sequence of pipelines)

  • MergeStep is also instantiated automatically, as an alternative to FirstStep, for steps that expect several input files of the same type. This is for instance the case of Concatenate inputs. A step is recognized as awaiting several inputs when the dependency analysis phase finds several possible inputs that lead to the same product.

  • Step is the main class for steps that execute an OTB application.

  • AnyProducerStep is the main class for steps that execute a Python function.

  • ExecutableStep is the main class for steps that execute an external application.

  • AbstractStep is the root class of the step hierarchy. It still gets instantiated automatically for steps not related to any kind of application.
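The FirstStep/MergeStep selection rule can be summarized with a tiny hypothetical helper (not the actual implementation):

```python
def make_input_step(found_inputs):
    """Hypothetical helper, not the actual S1 Tiling code: mimic the rule
    that a product with several possible inputs of the same type gets a
    MergeStep, while a single input gets a FirstStep."""
    if len(found_inputs) > 1:
        return ("MergeStep", list(found_inputs))
    return ("FirstStep", found_inputs[0])

single = make_input_step(["s1a_33NWB_vv.tiff"])
merged = make_input_step(["s1a_t1.tiff", "s1a_t2.tiff"])
```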

Inheritance diagram of s1tiling.libs.steps.Step, s1tiling.libs.steps.FirstStep, s1tiling.libs.steps.ExecutableStep, s1tiling.libs.steps.AnyProducerStep, s1tiling.libs.steps.MergeStep, s1tiling.libs.steps.StoreStep, s1tiling.libs.steps._ProducerStep, s1tiling.libs.steps._OTBStep, s1tiling.libs.steps.SkippedStep

s1tiling.libs.steps.AbstractStep(...)

Internal root class for all actual steps.

s1tiling.libs.steps.FirstStep(*argv, **kwargs)

First step instances are the pipeline starting points.

s1tiling.libs.steps.MergeStep(...)

Kind of FirstStep that merges the results of one or several other steps of the same kind.

s1tiling.libs.steps._ProducerStep(...)

Root class for all Steps that produce files.

s1tiling.libs.steps._OTBStep(app, *argv, ...)

Step that holds a reference to an OTB application.

s1tiling.libs.steps.Step(app, *argv, **kwargs)

Internal specialized Step that holds a binding to an OTB Application.

s1tiling.libs.steps.SkippedStep(app, *argv, ...)

Kind of OTB Step that forwards the OTB application of the previous step in the pipeline.

s1tiling.libs.steps.AnyProducerStep(action, ...)

Generic step for running any Python code that produce files.

s1tiling.libs.steps.ExecutableStep(exename, ...)

Generic step for calling any external application.

s1tiling.libs.steps.StoreStep(previous)

Artificial Step that takes care of executing the last OTB application in the pipeline.

Existing processings

The domain processings are defined through StepFactory subclasses, which in turn instantiate domain-unaware subclasses of AbstractStep for the actual processing.

Main processings

s1tiling.libs.otbwrappers.ExtractSentinel1Metadata(cfg)

Factory that takes care of extracting meta data from S1 input files.

s1tiling.libs.otbwrappers.AnalyseBorders(cfg)

StepFactory that analyses whether image borders need to be cut as described in Margins cutting documentation.

s1tiling.libs.otbwrappers.Calibrate(cfg)

Factory that prepares steps that run SARCalibration as described in SAR Calibration documentation.

s1tiling.libs.otbwrappers.CutBorders(cfg)

Factory that prepares steps that run ResetMargin as described in Margins cutting documentation.

s1tiling.libs.otbwrappers.OrthoRectify(cfg)

Factory that prepares steps that run OrthoRectification as described in Orthorectification documentation.

s1tiling.libs.otbwrappers.Concatenate(cfg)

Abstract factory that prepares steps that run Synthetize as described in Concatenation documentation.

s1tiling.libs.otbwrappers.BuildBorderMask(cfg)

Factory that prepares the first step that generates border masks as described in Border mask generation documentation.

s1tiling.libs.otbwrappers.SmoothBorderMask(cfg)

Factory that prepares the first step that smooths border masks as described in Border mask generation documentation.

s1tiling.libs.otbwrappers.SpatialDespeckle(cfg)

Factory that prepares the step that applies spatial despeckle filtering as described in Spatial despeckle filtering documentation.

Processings for advanced calibration

These processings make it possible to produce Local Incidence Angle maps for σ0NORMLIM calibration.

s1tiling.libs.otbwrappers.AgglomerateDEMOnS2(...)

Factory that produces a Step that builds a VRT from a list of DEM files.

s1tiling.libs.otbwrappers.ProjectDEMToS2Tile(cfg)

Factory that produces a ExecutableStep that projects DEM onto target S2 tile as described in Project DEM to S2 tile.

s1tiling.libs.otbwrappers.ProjectGeoidToS2Tile(cfg)

Factory that produces a Step that projects any kind of Geoid onto target S2 tile as described in Project Geoid to S2 tile.

s1tiling.libs.otbwrappers.SumAllHeights(cfg)

Factory that produces a Step that adds DEM + Geoid heights that cover the same footprint, as described in Sum DEM + Geoid.

s1tiling.libs.otbwrappers.ComputeGroundAndSatPositionsOnDEMFromEOF(cfg)

Factory that prepares steps that run Applications/app_SARComputeGroundAndSatPositionsOnDEM as described in Compute ECEF ground and satellite positions on S2 documentation to obtain the XYZ ECEF coordinates of the ground and of the satellite positions associated to the pixels from the input height file.

s1tiling.libs.otbwrappers.ComputeNormalsOnS2(cfg)

Factory that prepares steps that run ExtractNormalVector on images in S2 geometry as described in Normals computation documentation.

s1tiling.libs.otbwrappers.ComputeLIAOnS2(cfg)

Factory that prepares steps that run SARComputeLocalIncidenceAngle on images in S2 geometry as described in LIA maps computation documentation.

s1tiling.libs.otbwrappers.filter_LIA(LIA_kind)

Generates a new StepFactory class that filters which LIA product shall be processed: LIA maps or sin LIA maps.

s1tiling.libs.otbwrappers.ApplyLIACalibration(cfg)

Factory that concludes σ0 with NORMLIM calibration.

Deprecated processings for advanced calibration

The following processings were used in v1.0 of S1Tiling, alongside some of the previous ones. Starting from v1.1, they are deprecated.

s1tiling.libs.otbwrappers.AgglomerateDEMOnS1(...)

Factory that produces a Step that builds a VRT from a list of DEM files.

s1tiling.libs.otbwrappers.SARDEMProjection(cfg)

Factory that prepares steps that run SARDEMProjection as described in Normals computation documentation.

s1tiling.libs.otbwrappers.SARCartesianMeanEstimation(cfg)

Factory that prepares steps that run SARCartesianMeanEstimation as described in Normals computation documentation.

s1tiling.libs.otbwrappers.OrthoRectifyLIA(cfg)

Factory that prepares steps that run OrthoRectification on LIA maps.

s1tiling.libs.otbwrappers.ComputeNormalsOnS1(cfg)

Factory that prepares steps that run ExtractNormalVector on images in S1 geometry as described in Normals computation documentation.

s1tiling.libs.otbwrappers.ComputeLIAOnS1(cfg)

Factory that prepares steps that run SARComputeLocalIncidenceAngle on images in S1 geometry as described in LIA maps computation documentation.

s1tiling.libs.otbwrappers.ConcatenateLIA(cfg)

Factory that prepares steps that run Synthetize on LIA images.

s1tiling.libs.otbwrappers.SelectBestCoverage(cfg)

StepFactory that helps select only one path after LIA concatenation: the one with the best coverage of the target S2 tile.

Filename generation

At each step, product filenames are automatically generated by the StepFactory.update_filename_meta function. This function is first used to generate the task execution graph. (It is still called a second time, live, but this should change eventually.)

The exact filename generation is handled by the StepFactory.build_step_output_filename and StepFactory.build_step_output_tmp_filename functions, which define the final filename and the working filename (used while the associated product is being computed).

In a few very specific cases where no product is generated, these functions need to be overridden. Otherwise, a default behaviour is provided by the _FileProducingStepFactory constructor, through the following parameters:

  • gen_tmp_dir: defines where temporary files are produced.

  • gen_output_dir: defines where final files are produced. When this parameter is left unspecified, the final product is considered to be an intermediary file and is stored in the temporary directory. This distinction matters for final, required products.

  • gen_output_filename: defines the naming policy for both temporary and final filenames.
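The default behaviour these parameters describe can be sketched as follows. This mock mirrors the documented intent, not the actual _FileProducingStepFactory code; in particular, the ".tmp" suffix for the working filename is an assumption made for the example:

```python
# Illustrative mock of the default filename behaviour: gen_tmp_dir,
# gen_output_dir and gen_output_filename decide where the working and final
# files go. Not the actual _FileProducingStepFactory implementation.
import os

class MockFileProducingStepFactory:
    def __init__(self, gen_tmp_dir, gen_output_dir=None, gen_output_filename=None):
        self.gen_tmp_dir = gen_tmp_dir
        # Unspecified output dir => intermediary product, kept in the tmp dir
        self.gen_output_dir = gen_output_dir or gen_tmp_dir
        self.gen_output_filename = gen_output_filename

    def build_step_output_filename(self, meta):
        return os.path.join(self.gen_output_dir, self.gen_output_filename(meta))

    def build_step_output_tmp_filename(self, meta):
        # working name used while the product is being computed
        # (the ".tmp" suffix is an assumed convention for this sketch)
        return os.path.join(self.gen_tmp_dir,
                            self.gen_output_filename(meta) + ".tmp")

fac = MockFileProducingStepFactory(
    gen_tmp_dir="/tmp/S1Tiling",
    gen_output_dir="/data/out",
    gen_output_filename=lambda meta: meta["basename"].replace(".tiff", "_ortho.tiff"))
final = fac.build_step_output_filename({"basename": "s1a.tiff"})
tmp = fac.build_step_output_tmp_filename({"basename": "s1a.tiff"})
```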

Important

As the filenames are used to define the task execution graph, it’s important that every possible product (and associated production task) can be uniquely identified without any risk of ambiguity. Failure to comply will destabilise the data flows.

If for some reason you need to define a complex data flow where an output is used several times as input to different Steps, or where a Step has several inputs of the same or different kinds, or where several products are concurrent and only one is eventually selected, please check all the StepFactories related to the LIA data flow.

Available naming policies

Inheritance diagram of s1tiling.libs.file_naming.ReplaceOutputFilenameGenerator, s1tiling.libs.file_naming.TemplateOutputFilenameGenerator, s1tiling.libs.file_naming.OutputFilenameGeneratorList

Three filename generators are available by default. They apply a transformation to the basename meta information.

s1tiling.libs.file_naming.ReplaceOutputFilenameGenerator(...)

Given a pair [text_to_search, text_to_replace_with], replace the exact matching text with new text in basename metadata.

s1tiling.libs.file_naming.TemplateOutputFilenameGenerator(...)

Given a template: "text{key1}_{another_key}_..", inject the metadata instead of the template keys.

s1tiling.libs.file_naming.OutputFilenameGeneratorList(...)

Some steps produce several products.
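Their documented behaviours can be re-implemented in a few lines for illustration (the real classes live in s1tiling.libs.file_naming and may differ in interface):

```python
# Minimal re-implementations of the described behaviours, for illustration
# only; class names are shortened stand-ins for the real generators.
class ReplaceGen:
    """Replace [text_to_search, text_to_replace_with] in the basename."""
    def __init__(self, pair):
        self.search, self.replace = pair
    def generate(self, basename, meta):
        return basename.replace(self.search, self.replace)

class TemplateGen:
    """Inject metadata into '{key}' placeholders of a template."""
    def __init__(self, template):
        self.template = template
    def generate(self, basename, meta):
        return self.template.format(**meta)

class GenList:
    """Some steps produce several products: one generator per product."""
    def __init__(self, generators):
        self.generators = generators
    def generate(self, basename, meta):
        return [g.generate(basename, meta) for g in self.generators]

meta = {"tile_name": "33NWB", "polarisation": "vv"}
gens = GenList([ReplaceGen([".tiff", "_calibrated.tiff"]),
                TemplateGen("lia_{tile_name}_{polarisation}.tiff")])
names = gens.generate("s1a.tiff", meta)
```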

Hooks

StepFactory._update_filename_meta_pre_hook

Sometimes it’s necessary to analyse the input files, and/or their names before being able to build the output filename(s). This is meant to be done by overriding StepFactory._update_filename_meta_pre_hook method. Lightweight analysing is meant to be done here, and its result can then be stored into meta dictionary, and returned.

It’s typically used alongside TemplateOutputFilenameGenerator.

StepFactory._update_filename_meta_post_hook

StepFactory.update_filename_meta provides various values to metadata. This hook permits overriding the values associated with task names, product existence tests, and so on.