.. # define a hard line break for HTML
.. |br| raw:: html

   <br/>
.. _developers:
.. index:: Developer documentation
======================================================================
Design notes
======================================================================
.. contents:: Contents:
:local:
:depth: 4
.. _dev_pipeline:
.. index:: Pipelines
Pipelines
---------
Internally, S1 Tiling defines a series of pipelines. More precisely, it distinguishes
**pipeline descriptions** from actual pipelines. The actual pipelines are
generated from their description and input files; they are handled
internally and won't be described here.
Each pipeline corresponds to a series of :ref:`processings <dev_processings>`.
The intended and original design is a direct match: one processing ==
one OTB application, which allows chaining OTB applications in memory through
the OTB Python bindings.
However, a processing doesn't always turn into the execution of an OTB
application: sometimes we need to do other computations, like calling a Python
function or executing an external program. Other times, we just need to do some
analysis whose result will be reused later on in the pipeline.
When files need to be produced at some point, we end a pipeline; the next
one(s) can take over from that point.
.. autosummary::
:toctree: api
s1tiling.libs.otbpipeline.PipelineDescriptionSequence
s1tiling.libs.otbpipeline.FirstStepFactory
Simple pipelines
++++++++++++++++
In simple cases, we can chain the output of an in-memory pipeline of OTB
applications into the next pipeline.
At this moment, the following sequence of pipelines is defined:
.. code:: python
pipelines = PipelineDescriptionSequence(config)
pipelines.register_pipeline([AnalyseBorders, Calibrate, CutBorders], 'PrepareForOrtho', product_required=False)
pipelines.register_pipeline([OrthoRectify], 'OrthoRectify', product_required=False)
pipelines.register_pipeline([Concatenate], product_required=True)
if config.mask_cond:
pipelines.register_pipeline([BuildBorderMask, SmoothBorderMask], 'GenerateMask', product_required=True)
For instance, to minimize disk usage, we could chain in-memory
orthorectification directly after the border cutting by removing the second
pipeline, and by registering the following step into the first pipeline
instead:
.. code:: python
pipelines.register_pipeline([AnalyseBorders, Calibrate, CutBorders, OrthoRectify],
'OrthoRectify', product_required=False)
Complex pipelines
+++++++++++++++++
In more complex cases, the product of a pipeline is used as input of
several other pipelines. Also, a pipeline can have several inputs coming from
different pipelines.
To support this, each pipeline is named, so that its name can be used as input
of other pipelines.
For instance, the LIA-producing pipelines are described this way:
.. code:: python
pipelines = PipelineDescriptionSequence(config, dryrun=dryrun)
dem = pipelines.register_pipeline([AgglomerateDEMOnS1],
'AgglomerateDEMOnS1',
inputs={'insar': 'basename'})
demproj = pipelines.register_pipeline([ExtractSentinel1Metadata, SARDEMProjection],
'SARDEMProjection',
is_name_incremental=True,
inputs={'insar': 'basename', 'indem': dem})
xyz = pipelines.register_pipeline([SARCartesianMeanEstimation],
'SARCartesianMeanEstimation',
inputs={'insar': 'basename', 'indem': dem, 'indemproj': demproj})
lia = pipelines.register_pipeline([ComputeNormals, ComputeLIAOnS1],
'Normals|LIA',
is_name_incremental=True,
inputs={'xyz': xyz})
# "inputs" parameter doesn't need to be specified in all the following
# pipeline declarations but we still use it for clarity!
ortho = pipelines.register_pipeline([filter_LIA('LIA'), OrthoRectifyLIA],
'OrthoLIA',
inputs={'in': lia},
is_name_incremental=True)
concat = pipelines.register_pipeline([ConcatenateLIA],
'ConcatLIA',
inputs={'in': ortho})
select = pipelines.register_pipeline([SelectBestCoverage],
'SelectLIA',
product_required=True,
inputs={'in': concat})
ortho_sin = pipelines.register_pipeline([filter_LIA('sin_LIA'), OrthoRectifyLIA],
'OrthoSinLIA',
inputs={'in': lia},
is_name_incremental=True)
concat_sin = pipelines.register_pipeline([ConcatenateLIA],
'ConcatSinLIA',
inputs={'in': ortho_sin})
select_sin = pipelines.register_pipeline([SelectBestCoverage],
'SelectSinLIA',
product_required=True,
inputs={'in': concat_sin})
.. _dev_pipeline_inputs:
Pipeline inputs
+++++++++++++++
In order to build the `Directed Acyclic Graph (DAG)` of tasks that will be
executed through the described pipelines, we need to inject inputs.
Pipeline inputs need to be registered explicitly. This is done through
``FirstStepFactories`` passed to
:func:`PipelineDescriptionSequence.register_inputs
<s1tiling.libs.otbpipeline.PipelineDescriptionSequence.register_inputs>`.
Each :class:`FirstStepFactory <s1tiling.libs.otbpipeline.FirstStepFactory>`
takes care of returning a list of :class:`FirstSteps
<s1tiling.libs.steps.FirstStep>`. These ``FirstSteps`` are expected to hold the
metadata that will be used to generate the DAG of tasks. They may also obtain
related products on-the-fly. For instance,
:func:`s1_raster_first_inputs_factory` and :func:`eof_first_inputs_factory`
first check which products are already on disk before trying to download the
missing ones.
e.g.:
.. code:: python
pipelines.register_inputs('basename', s1_raster_first_inputs_factory)
pipelines.register_inputs('basename', tilename_first_inputs_factory)
pipelines.register_inputs('basename', eof_first_inputs_factory)
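
A first-inputs factory is essentially a callable that receives the configuration
(plus the registered extra parameters, see below) and returns a list of
``FirstStep`` objects carrying the metadata needed to build the DAG. The
following minimal sketch is only an illustration: the ``tile_name`` parameter,
the ``output_preprocess`` attribute and the metadata keys used are assumptions,
not the documented API.

.. code:: python

   from pathlib import Path
   from s1tiling.libs.steps import FirstStep

   def my_first_inputs_factory(cfg, **extra_parameters):
       # Hypothetical factory: wrap files already present on disk as FirstSteps.
       tile_name = extra_parameters.get('tile_name')
       input_dir = Path(getattr(cfg, 'output_preprocess', '.'))
       # Assumption: FirstStep keeps its keyword arguments as the step metadata.
       return [FirstStep(tile_name=tile_name, basename=str(f))
               for f in sorted(input_dir.glob('*.tif'))]

Such a factory would then be registered like the ones above, e.g.
``pipelines.register_inputs('basename', my_first_inputs_factory)``.
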
As the ``PipelineDescriptionSequence`` tries to be as independent of the actual
domain as possible, it doesn't know which information is expected by all the
registered ``FirstStepFactories``. By default, the
:class:`Configuration <s1tiling.libs.configuration.Configuration>` information
is passed. Some other information needs to be declared in one or several
calls to
:func:`PipelineDescriptionSequence.register_extra_parameters_for_input_factories
<s1tiling.libs.otbpipeline.PipelineDescriptionSequence.register_extra_parameters_for_input_factories>`.
e.g.:
.. code:: python
pipelines.register_extra_parameters_for_input_factories(
tile_name=tilename, # Used by all
)
pipelines.register_extra_parameters_for_input_factories(
dag=dag, # Used by eof_first_inputs_factory
s1_file_manager=s1_file_manager, # Used by s1_raster_first_inputs_factory
dryrun=dryrun, # Used by all
)
.. note:: In simplified developer jargon: we use the `Factory Method` design
   pattern to invert dependencies.
Dask: tasks
-----------
Given :ref:`pipeline descriptions <dev_pipeline>`, a requested S2 tile and its
intersecting S1 images, S1 Tiling builds a set of dependent
:external:doc:`Dask tasks <graphs>`. Each task corresponds to an actual
pipeline which will transform a given image into another named image product.
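
To give a feel for what such a graph looks like, here is a minimal, hand-written
Dask graph using Dask's low-level ``dict`` interface. The task keys, functions
and file names are made up for the illustration and do not reflect S1 Tiling's
actual naming scheme.

.. code:: python

   from dask.threaded import get

   # Each entry maps a product name to (function, *inputs); inputs that are
   # themselves keys create the dependencies between tasks.
   def cut(src):      return f"cut({src})"
   def ortho(src):    return f"ortho({src})"
   def concat(a, b):  return f"concat({a}, {b})"

   graph = {
       'cut_1':   (cut,    's1a_image_1.tiff'),
       'cut_2':   (cut,    's1a_image_2.tiff'),
       'ortho_1': (ortho,  'cut_1'),
       'ortho_2': (ortho,  'cut_2'),
       'final':   (concat, 'ortho_1', 'ortho_2'),
   }

   print(get(graph, 'final'))  # executes the dependent tasks in order
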
.. _dev_processings:
Processing Classes
------------------
Again, the processing classes are split into two families:

- the factories: :class:`StepFactory <s1tiling.libs.steps.StepFactory>`
- the instances: :class:`Step <s1tiling.libs.steps.Step>`
Step Factories
++++++++++++++
Step factories are the main entry point to add new processings. They are meant
to inherit from one of :class:`OTBStepFactory
<s1tiling.libs.steps.OTBStepFactory>`, :class:`AnyProducerStepFactory
<s1tiling.libs.steps.AnyProducerStepFactory>`, or :class:`ExecutableStepFactory
<s1tiling.libs.steps.ExecutableStepFactory>`.
They describe processings, and they are used to instantiate the actual
steps (described below) that do the processing.
.. inheritance-diagram:: s1tiling.libs.steps.OTBStepFactory s1tiling.libs.steps.ExecutableStepFactory s1tiling.libs.steps.AnyProducerStepFactory s1tiling.libs.steps._FileProducingStepFactory s1tiling.libs.steps.Store
:parts: 1
:top-classes: s1tiling.libs.steps.StepFactory
:private-bases:
.. autosummary::
:toctree: api
s1tiling.libs.steps.StepFactory
s1tiling.libs.steps._FileProducingStepFactory
s1tiling.libs.steps.OTBStepFactory
s1tiling.libs.steps.AnyProducerStepFactory
s1tiling.libs.steps.ExecutableStepFactory
s1tiling.libs.steps.Store
Steps
+++++
Step types are usually instantiated automatically. They are documented for
convenience, but they are not expected to be extended.
- :class:`FirstStep <s1tiling.libs.steps.FirstStep>` is instantiated
  automatically by the program from existing files (downloaded, or produced by
  a pipeline earlier in the sequence of pipelines).
- :class:`MergeStep <s1tiling.libs.steps.MergeStep>` is also instantiated
  automatically, as an alternative to :class:`FirstStep
  <s1tiling.libs.steps.FirstStep>`, in the case of steps that expect
  several input files of the same type. This is for instance the case of
  :class:`Concatenate <s1tiling.libs.otbwrappers.Concatenate>` inputs. A step
  is recognized to await several inputs when the dependency analysis phase
  finds several possible inputs that lead to the same product.
- :class:`Step <s1tiling.libs.steps.Step>` is the main class for steps
  that execute an OTB application.
- :class:`AnyProducerStep <s1tiling.libs.steps.AnyProducerStep>` is the
  main class for steps that execute a Python function.
- :class:`ExecutableStep <s1tiling.libs.steps.ExecutableStep>` is the
  main class for steps that execute an external application.
- :class:`AbstractStep <s1tiling.libs.steps.AbstractStep>` is the root
  class of the step hierarchy. It still gets instantiated automatically for
  steps not related to any kind of application.
.. inheritance-diagram:: s1tiling.libs.steps.Step s1tiling.libs.steps.FirstStep s1tiling.libs.steps.ExecutableStep s1tiling.libs.steps.AnyProducerStep s1tiling.libs.steps.MergeStep s1tiling.libs.steps.StoreStep s1tiling.libs.steps._ProducerStep s1tiling.libs.steps._OTBStep s1tiling.libs.steps.SkippedStep
:parts: 1
:top-classes: s1tiling.libs.steps.AbstractStep
:private-bases:
.. autosummary::
:toctree: api
s1tiling.libs.steps.AbstractStep
s1tiling.libs.steps.FirstStep
s1tiling.libs.steps.MergeStep
s1tiling.libs.steps._ProducerStep
s1tiling.libs.steps._OTBStep
s1tiling.libs.steps.Step
s1tiling.libs.steps.SkippedStep
s1tiling.libs.steps.AnyProducerStep
s1tiling.libs.steps.ExecutableStep
s1tiling.libs.steps.StoreStep
Existing processings
++++++++++++++++++++
The domain processings are defined through
:class:`StepFactory <s1tiling.libs.steps.StepFactory>` subclasses, which in
turn will instantiate domain-unaware subclasses of :class:`AbstractStep
<s1tiling.libs.steps.AbstractStep>` for the actual processing.
Main processings
~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: api
s1tiling.libs.otbwrappers.ExtractSentinel1Metadata
s1tiling.libs.otbwrappers.AnalyseBorders
s1tiling.libs.otbwrappers.Calibrate
s1tiling.libs.otbwrappers.CutBorders
s1tiling.libs.otbwrappers.OrthoRectify
s1tiling.libs.otbwrappers.Concatenate
s1tiling.libs.otbwrappers.BuildBorderMask
s1tiling.libs.otbwrappers.SmoothBorderMask
s1tiling.libs.otbwrappers.SpatialDespeckle
Processings for advanced calibration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These processings produce the Local Incidence Angle maps used for
σ\ :sub:`0`\ :sup:`NORMLIM` calibration.
.. autosummary::
:toctree: api
s1tiling.libs.otbwrappers.AgglomerateDEMOnS2
s1tiling.libs.otbwrappers.ProjectDEMToS2Tile
s1tiling.libs.otbwrappers.ProjectGeoidToS2Tile
s1tiling.libs.otbwrappers.SumAllHeights
s1tiling.libs.otbwrappers.ComputeGroundAndSatPositionsOnDEMFromEOF
s1tiling.libs.otbwrappers.ComputeNormalsOnS2
s1tiling.libs.otbwrappers.ComputeLIAOnS2
s1tiling.libs.otbwrappers.filter_LIA
s1tiling.libs.otbwrappers.ApplyLIACalibration
Deprecated processings for advanced calibration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The following processings were used in v1.0 of S1Tiling, along with some of
the previous ones. Starting from v1.1, they are deprecated.
.. autosummary::
:toctree: api
s1tiling.libs.otbwrappers.AgglomerateDEMOnS1
s1tiling.libs.otbwrappers.SARDEMProjection
s1tiling.libs.otbwrappers.SARCartesianMeanEstimation
s1tiling.libs.otbwrappers.OrthoRectifyLIA
s1tiling.libs.otbwrappers.ComputeNormalsOnS1
s1tiling.libs.otbwrappers.ComputeLIAOnS1
s1tiling.libs.otbwrappers.ConcatenateLIA
s1tiling.libs.otbwrappers.SelectBestCoverage
Filename generation
+++++++++++++++++++
At each step, product filenames are automatically generated by the
:func:`StepFactory.update_filename_meta
<s1tiling.libs.steps.StepFactory.update_filename_meta>` function.
This function is first used to generate the task execution graph. (It's still
used a second time, at execution time, but this should change eventually.)
The exact filename generation is handled by the
:func:`StepFactory.build_step_output_filename <s1tiling.libs.steps.StepFactory.build_step_output_filename>` and
:func:`StepFactory.build_step_output_tmp_filename <s1tiling.libs.steps.StepFactory.build_step_output_tmp_filename>`
functions, which define the final filename and the working filename (used while
the associated product is being computed).
In a few very specific cases, where no product is generated, these functions
need to be overridden. Otherwise, a default behaviour is provided by the
:class:`_FileProducingStepFactory <s1tiling.libs.steps._FileProducingStepFactory>` constructor,
through the following parameters (a sketch follows the list):
- ``gen_tmp_dir``: defines where temporary files are produced.
- ``gen_output_dir``: defines where final files are produced. When this
  parameter is left unspecified, the product is considered to be an
  intermediary file and it will be stored in the temporary directory. The
  distinction is useful to tell final, required products apart from
  intermediary ones.
- ``gen_output_filename``: defines the naming policy for both temporary
  and final filenames.
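
As a rough, non-normative sketch, a concrete factory would forward these
parameters from its constructor. Only ``gen_tmp_dir``, ``gen_output_dir`` and
``gen_output_filename`` come from the description above; the ``appname`` and
``name`` keyword arguments, the ``cfg.tmpdir`` attribute and the template string
are assumptions made for the illustration.

.. code:: python

   import os
   from s1tiling.libs.steps import OTBStepFactory
   from s1tiling.libs.file_naming import TemplateOutputFilenameGenerator

   class MyProcessing(OTBStepFactory):
       def __init__(self, cfg):
           super().__init__(
               cfg,
               appname='BandMath',   # assumed way to name the OTB application
               name='MyProcessing',
               # Working files go to the temporary directory;
               # gen_output_dir is left unspecified, so the product is an
               # intermediary file also stored in the temporary directory.
               gen_tmp_dir=os.path.join(cfg.tmpdir, 'S2'),
               gen_output_filename=TemplateOutputFilenameGenerator(
                   '{tile_name}_my_product.tif'),
           )
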
.. important::
As the filenames are used to define the task execution graph, it's
important that every possible product (and associated production task) can
be uniquely identified without any risk of ambiguity. Failure to comply
will destabilise the data flows.
If for some reason you need to define a complex data flow where an output
can be used several times as input in different Steps, or where a Step has
several inputs of same or different kinds, or where several products are
concurrent and only one would be selected, please check all
:class:`StepFactories ` related to
:ref:`LIA dataflow `.
Available naming policies
~~~~~~~~~~~~~~~~~~~~~~~~~
.. inheritance-diagram:: s1tiling.libs.file_naming.ReplaceOutputFilenameGenerator s1tiling.libs.file_naming.TemplateOutputFilenameGenerator s1tiling.libs.file_naming.OutputFilenameGeneratorList
:parts: 1
:top-classes: s1tiling.libs.file_naming.OutputFilenameGenerator
:private-bases:
Three filename generators are available by default. They apply a transformation
on the ``basename`` meta information.
.. autosummary::
:toctree: api
s1tiling.libs.file_naming.ReplaceOutputFilenameGenerator
s1tiling.libs.file_naming.TemplateOutputFilenameGenerator
s1tiling.libs.file_naming.OutputFilenameGeneratorList
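
As a rough illustration of how they are typically parametrised (the constructor
arguments shown here are assumptions, not a normative reference):

.. code:: python

   from s1tiling.libs.file_naming import (
       ReplaceOutputFilenameGenerator,
       TemplateOutputFilenameGenerator,
       OutputFilenameGeneratorList,
   )

   # Assumption: replaces one substring of the 'basename' meta with another.
   calibrated = ReplaceOutputFilenameGenerator(['.tiff', '_calibrated.tiff'])

   # Assumption: expands a format-style template against the meta dictionary.
   ortho = TemplateOutputFilenameGenerator('{flying_unit_code}_{tile_name}_{polarisation}.tif')

   # Assumption: chains several generators, yielding one filename per generator.
   both = OutputFilenameGeneratorList([calibrated, ortho])
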
Hooks
~~~~~
:func:`StepFactory._update_filename_meta_pre_hook <s1tiling.libs.steps.StepFactory._update_filename_meta_pre_hook>`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sometimes it's necessary to analyse the input files, and/or their names, before
being able to build the output filename(s). This is meant to be done by
overriding the
:func:`StepFactory._update_filename_meta_pre_hook <s1tiling.libs.steps.StepFactory._update_filename_meta_pre_hook>`
method. Lightweight analysis is meant to be done here; its result can
then be stored into the ``meta`` dictionary, and returned.
It's typically used alongside
:class:`TemplateOutputFilenameGenerator <s1tiling.libs.file_naming.TemplateOutputFilenameGenerator>`.
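
As a minimal, non-normative sketch of such an override (the ``polarisation`` key
and the parsing rule are made-up; only the idea of enriching ``meta`` and
returning it follows from the description above):

.. code:: python

   from s1tiling.libs.steps import OTBStepFactory

   class MyProcessing(OTBStepFactory):
       def _update_filename_meta_pre_hook(self, meta):
           # Lightweight analysis only: derive an extra key from the input basename.
           meta = super()._update_filename_meta_pre_hook(meta)
           meta['polarisation'] = 'vv' if '-vv-' in meta['basename'] else 'vh'
           return meta
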
:func:`StepFactory._update_filename_meta_post_hook <s1tiling.libs.steps.StepFactory._update_filename_meta_post_hook>`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:func:`StepFactory.update_filename_meta <s1tiling.libs.steps.StepFactory.update_filename_meta>`
provides various values to the metadata. This hook permits overriding the values
associated with task names, product existence tests, and so on.
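
For instance, a post-hook override could look like the following sketch. The
``task_name`` key and the idea that the hook returns the updated ``meta`` are
assumptions made for the illustration.

.. code:: python

   import re
   from s1tiling.libs.steps import OTBStepFactory

   class MyOtherProcessing(OTBStepFactory):
       def _update_filename_meta_post_hook(self, meta):
           # Illustrative: tweak the task name so that concurrent products
           # (e.g. both polarisations) map to a single task.
           meta = super()._update_filename_meta_post_hook(meta)
           meta['task_name'] = re.sub(r'-v[vh]-', '-vxx-', meta['task_name'])
           return meta
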