Interactions between iota2 and the configuration file

The configuration file is the only way for the user to communicate setting information to iota2. This information must be checked before iota2 is started. This check must be as exhaustive as possible: check that the format is correct (string, integer, list, …), that the files listed exist or that the parameters are consistent with each other.

We have chosen to use pydantic to model the parameters and to make the connection between the configuration file and iota2, because pydantic makes it simple to model the constraints on each parameter. This documentation is divided into two parts: first, how to integrate a new section into the configuration file and/or complete an existing one; then, the specifics of how pydantic is used in iota2.

Integration of a new section with its parameters

Adding a section without constraints on the parameters

Let’s take the example of creating a section called ‘my_new_section’ which contains 3 parameters: ‘param_1’, ‘param_2’ and ‘param_3’.

Step 1: create the section class

from typing import Any, ClassVar

from pydantic import Field

from iota2.configuration_files.sections.cfg_utils import Iota2ParamSection

class MyNewSection(Iota2ParamSection):
    """This defines my new cfg section, containing new parameters."""

    section_name: ClassVar[str] = "my_new_section"

    param_1: str = Field(
        None,
        doc_type="None",
        short_desc="param_1 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_1'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Classification"],
        mandatory_on_builders=[])

    param_2: Any = Field(
        None,
        doc_type="None",
        short_desc="param_2 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_2'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Classification"],
        mandatory_on_builders=[])

    param_3: Any = Field(
        None,
        doc_type="None",
        short_desc="param_3 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_3'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Classification"],
        mandatory_on_builders=[])

Step 2: add the section to the list of sections imported by iota2

The new section class must be added to the list of possible sections in the class that reads the configuration file: iota2.configuration_files.read_config_file.read_config_file.

The class attribute i2_available_sections must contain the new section:

i2_available_sections = [
        BuilderSection, ChainSection, ClassifSection, TaskRetry, TrainSection,
        SensorsDataInterpSection,
        I2FeatureExtractionSection, DimRedSection, ExternalFeaturesSection,
        PythonDataManagingSection, SciKitSection, SimplificationSection,
        Landsat8Section, Landsat8OldSection, Sentinel2TheiaSection,
        Sentinel2S2CSection, Sentinel2L3ASection, Landsat5OldSection,
        UserFeatSection, OBIASection, MyNewSection
    ]

Once steps 1 and 2 are completed, the parameters of the new section can be requested.

Step 3: access the parameter values

The parameters are accessed with the getParam() method.

Considering a configuration file containing

my_new_section :
{
    param_1:"some string"
    param_2:2
    param_3:[1,"2", 3]
}

the following code

import iota2.configuration_files.read_config_file as rcf

# path to the configuration file shown above
i2_cfg_params = rcf.read_config_file("/path/to/my_cfg.cfg")

print(i2_cfg_params.getParam("my_new_section", "param_1"))
print(i2_cfg_params.getParam("my_new_section", "param_2"))
print(i2_cfg_params.getParam("my_new_section", "param_3"))

will return

some string
2
[1, '2', 3]

Details about the construction of the section and its parameters

Inheritance

All sections must inherit from the Iota2ParamSection class. This class provides several services, for example unrecognized_fields which automatically detects whether the field in the configuration file is known or not. For example, without changing anything in our MyNewSection class, if the configuration file contains :

my_new_section :
{
    param_1:"some string"
    param_2:2
    param_3:[1,"2", 3]
    param_4:"some value"
}

Then, when iota2 is launched, a warning in the terminal informs the user that the param_4 parameter is unknown:

ConfigNotRecognisedParamWarning:  iota2 configuration file warning : 'param_4' parameter not recognised
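
The detection principle can be sketched in a few lines of plain Python. The helper warn_unrecognized below is hypothetical and only illustrates the idea; it is not the iota2 unrecognized_fields implementation.

```python
import warnings


def warn_unrecognized(known_fields, user_fields):
    """Warn about each user field that the section does not declare.

    Hypothetical helper, not the iota2 'unrecognized_fields' code.
    """
    unknown = [name for name in user_fields if name not in known_fields]
    for name in unknown:
        warnings.warn(f"'{name}' parameter not recognised")
    return unknown


known = ["param_1", "param_2", "param_3"]
from_cfg = ["param_1", "param_2", "param_3", "param_4"]
print(warn_unrecognized(known, from_cfg))  # ['param_4']
```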

The dict, deactivate_fields and add_fields methods of the class Iota2ParamSection allow iota2 to handle the total removal of certain fields. This functionality is used to remove fields that are exclusive to a certain builder. We will see this in more detail, with a concrete example, in the section "Make a parameter mandatory for a builder".

Typing

In the example above, param_1 has the type hint str, while the others are typed Any. pydantic automatically uses these type hints to check the type of each field. Conventional Python types as well as custom classes can be used.
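
As a minimal sketch of this check outside iota2, the model below uses a plain pydantic BaseModel rather than Iota2ParamSection, and the class name TypedSection is hypothetical. A list given where a str is expected is rejected at instantiation, while a field typed Any accepts any value.

```python
from typing import Any

from pydantic import BaseModel, ValidationError


class TypedSection(BaseModel):
    """Hypothetical section: one typed field, one untyped field."""

    param_1: str = "default"
    param_2: Any = None


# values matching the type hints are accepted
ok = TypedSection(param_1="some string", param_2=[1, "2", 3])

# a list is not a valid str: pydantic raises a ValidationError
try:
    TypedSection(param_1=[1, 2])
    type_error_raised = False
except ValidationError:
    type_error_raised = True
```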

Using the Field class

It is strongly recommended to use the Field class provided by pydantic. Natively, this class can store extra arguments. In iota2, we have chosen to add the doc_type, short_desc, long_desc, available_on_builders and mandatory_on_builders arguments. The doc_type, short_desc, long_desc and available_on_builders arguments are dedicated solely to the automatic generation of the documentation. The mandatory_on_builders argument lists the builders for which a parameter cannot receive a default value. Because iota2 implements several builders, fields that are specific to a builder not used at runtime must be deactivated, especially when they have no default value. You must therefore tell iota2 for which builders the parameter is available and for which it is mandatory. Whether a parameter has a default value for a given builder is the first of the constraints we will see.

Note

The first argument of the Field class is the default value. If there is no default value, then the parameter becomes mandatory.
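
The difference can be seen with a plain pydantic model. This is only a sketch: DefaultsSection and its field names are hypothetical, and the standard description argument is used instead of the iota2-specific ones.

```python
from pydantic import BaseModel, Field, ValidationError


class DefaultsSection(BaseModel):
    """Hypothetical section illustrating default vs mandatory fields."""

    # first argument given: this is the default value
    with_default: str = Field("fallback", description="optional parameter")
    # no default value: the parameter is mandatory
    without_default: str = Field(description="mandatory parameter")


section = DefaultsSection(without_default="set by the user")

# omitting the mandatory parameter raises a ValidationError
try:
    DefaultsSection()
    missing_raised = False
except ValidationError:
    missing_raised = True
```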

Adding constraints on parameters

Make a parameter mandatory for a builder

Consider that in our my_new_section the parameter param_1 will be shared between the builders I2Classification and I2Regression. The param_2 will be only used by the I2Classification builder and param_3 only by the I2Regression builder. Also, param_1 and param_3 will not have default values. Our MyNewSection class then becomes :

class MyNewSection(Iota2ParamSection):
    """This defines my new cfg section, containing new parameters."""

    section_name: ClassVar[str] = "my_new_section"

    param_1: str = Field(
        doc_type="None",
        short_desc="param_1 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_1'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Classification", "I2Regression"],
        mandatory_on_builders=["I2Classification", "I2Regression"])
    param_2: Any = Field(
        None,
        doc_type="None",
        short_desc="param_2 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_2'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Classification"],
        mandatory_on_builders=[])
    param_3: Any = Field(
        doc_type="None",
        short_desc="param_3 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_3'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Regression"],
        mandatory_on_builders=["I2Regression"])

The mandatory_on_builders value of param_3 is ["I2Regression"], while that of param_1 is ["I2Classification", "I2Regression"].

If the configuration file uses the I2Regression builder and contains :

my_new_section :
{
    param_1:"some string"
    param_2:2
}

builders:
{
     builders_class_name : ["I2Regression"]
}

the following error is thrown :

pydantic.error_wrappers.ValidationError: 1 validation error for MyNewSection
param_3
  field required (type=value_error.missing)

However, when using the I2Classification builder instead of the I2Regression one :

my_new_section :
{
    param_1:"some string"
    param_2:2
}

builders:
{
     builders_class_name : ["I2Classification"]
}

no error is thrown, even though param_3 has no default value in our MyNewSection class. This is possible because iota2 disables unnecessary fields before instantiating the class.
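
The rule can be summarised by the following plain Python sketch. The helper required_fields is hypothetical: it only mirrors the behaviour described above and is not the iota2 implementation.

```python
from typing import Dict, List


def required_fields(mandatory_on_builders: Dict[str, List[str]],
                    requested_builders: List[str]) -> List[str]:
    """Return the fields that stay mandatory for the requested builders.

    Hypothetical helper: it only mirrors the rule described above,
    it is not the iota2 implementation.
    """
    return [
        field for field, builders in mandatory_on_builders.items()
        if any(builder in builders for builder in requested_builders)
    ]


# 'mandatory_on_builders' values of the MyNewSection example
MANDATORY = {
    "param_1": ["I2Classification", "I2Regression"],
    "param_2": [],
    "param_3": ["I2Regression"],
}

print(required_fields(MANDATORY, ["I2Classification"]))  # ['param_1']
print(required_fields(MANDATORY, ["I2Regression"]))  # ['param_1', 'param_3']
```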

Literal

It is possible to constrain the acceptable values of a parameter using the Literal type hint. For example, if we want the values of param_2 to be in [1, 2, "3"] :

from typing import Literal

param_2: Literal[1,2,"3"] = Field(None,
                                  doc_type="None",
                                  short_desc="param_2 short description",
                                  long_desc=("This is the not mandatory 'long_desc' of the 'param_2'. "
                                             "This parameter is useful to describe more precisely the "
                                             "use of the parameter (constraints, limits...) "),
                                  available_on_builders=["I2Classification"],
                                  mandatory_on_builders=[])

Note

It is possible to have different types of values in the list of expected values.
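
A short sketch of the resulting behaviour, again with a plain pydantic BaseModel and a hypothetical class name: each listed value is accepted, any other value raises a ValidationError.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class LiteralSection(BaseModel):
    """Hypothetical section with a constrained parameter."""

    param_2: Literal[1, 2, "3"] = 1


# each value of the Literal is accepted as is
accepted = [LiteralSection(param_2=value).param_2 for value in (1, 2, "3")]

# 4 is not one of the expected values
try:
    LiteralSection(param_2=4)
    literal_error_raised = False
except ValidationError:
    literal_error_raised = True
```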

Parameters validators : pre=False/True, always, root validator

Pydantic comes with a field validation system based on validators. Validators can check constraints other than the data type and raise a Python exception if a constraint is not met. To add a validator to a field, decorate a function with the validator decorator, whose first argument is the name of the field to validate.

Below are some examples of how to use validators or root validators with the pre and/or always option.

from pydantic import Field, root_validator, validator

class MyNewSection(Iota2ParamSection):
    """This defines my new cfg section, containing new parameters."""

    section_name: ClassVar[str] = "my_new_section"

    # the default value is a string, so the type hint must be str
    param_1: str = Field(
        "param_1_default",
        doc_type="None",
        short_desc="param_1 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_1'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Classification", "I2Regression"],
        mandatory_on_builders=["I2Classification", "I2Regression"])

    param_2: Literal[1, 2, "3"] = Field(
        None,
        doc_type="None",
        short_desc="param_2 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_2'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Classification"],
        mandatory_on_builders=[])

    param_3: Any = Field(
        None,
        doc_type="None",
        short_desc="param_3 short description",
        long_desc=("This is the not mandatory 'long_desc' of the 'param_3'. "
                   "This parameter is useful to describe more precisely the "
                   "use of the parameter (constraints, limits...) "),
        available_on_builders=["I2Regression"],
        mandatory_on_builders=["I2Regression"])

    @validator("param_1", pre=False)
    @classmethod
    def val_1(cls, param):
        print(f"input val_1 : {param}")
        if param is not None:
            param = param + "_val_1"
        return param

    @validator("param_1", pre=True, always=True)
    @classmethod
    def val_2(cls, param):
        print(f"input val_2 : {param}")
        if param is not None:
            param = param + "_val_2"
        return param

    @root_validator()
    @classmethod
    def my_root_val(cls, values):
        """Automatically enable external features on conditions."""
        print(f"root validaror inputs : {values}")
        return values

Every call of read_config_file("/path/to/my_cfg.cfg") will produce the trace :

input val_2 : some string
input val_1 : some string_val_2
root validaror inputs : {'param_1': 'some string_val_2_val_1', 'param_2': 1, 'param_3': None, 'builders': ['I2Classification']}

The argument pre=True means that the validator is called before the other validators of this parameter, including the type validation done by pydantic. The root_validator receives the entire section as input: this is the place to validate parameters at section level. The argument always=True means that every value is validated, even the default ones, i.e. if param_1 has no value in the configuration file, the trace becomes :

input val_2 : param_1_default
input val_1 : param_1_default_val_2
root validaror inputs : {'param_1': 'param_1_default_val_2_val_1', 'param_2': 1, 'param_3': None, 'builders': ['I2Classification']}

Warning

It is also interesting to note that the root validator receives the builders section. This section is here for information purposes and to modulate the behaviour of the field according to the builders requested by the user.

The above examples are only an overview of the use of pydantic validators; more information is available in the pydantic documentation.
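
As a sketch of an actual section-level check, the hypothetical model below rejects a section where param_3 is set while param_2 is not. The rule is invented for illustration, the model is a plain BaseModel, and skip_on_failure=True is used so that the check only runs once every field has passed its own validation.

```python
from typing import Any, Optional

from pydantic import BaseModel, ValidationError, root_validator


class ConsistentSection(BaseModel):
    """Hypothetical section whose parameters depend on each other."""

    param_2: Optional[int] = None
    param_3: Any = None

    @root_validator(skip_on_failure=True)
    def param_3_needs_param_2(cls, values):
        """Invented rule: 'param_3' is only meaningful with 'param_2'."""
        if values.get("param_3") is not None and values.get("param_2") is None:
            raise ValueError("'param_3' requires 'param_2' to be set")
        return values


valid = ConsistentSection(param_2=2, param_3=[1, "2", 3])

# 'param_3' without 'param_2' breaks the section-level rule
try:
    ConsistentSection(param_3=[1, "2", 3])
    consistency_error_raised = False
except ValidationError:
    consistency_error_raised = True
```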

Adding constraints across different sections

The scope of a section is limited to its own class: section classes cannot see the values of other sections. In iota2, we decided that it is up to the builders to take the overall view and check the compatibility between parameters from different sections. This check is done by calling the pre_check method. Thanks to the self.i2_cfg_params attribute present in all builders, all parameters of the configuration file are accessible via the getParam(section_name, field_value) method.
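
The sketch below illustrates what such a check could look like. Only the names pre_check, i2_cfg_params and getParam come from iota2; the classes FakeConfig and MyBuilder, as well as the rule itself, are hypothetical and stand in for the real objects.

```python
class FakeConfig:
    """Hypothetical stand-in for the object returned by read_config_file."""

    def __init__(self, sections):
        self._sections = sections

    def getParam(self, section_name, field_name):
        return self._sections[section_name][field_name]


class MyBuilder:
    """Hypothetical builder holding the configuration parameters."""

    def __init__(self, i2_cfg_params):
        self.i2_cfg_params = i2_cfg_params

    def pre_check(self):
        """Invented cross-section rule, for illustration only."""
        param_1 = self.i2_cfg_params.getParam("my_new_section", "param_1")
        builders = self.i2_cfg_params.getParam("builders",
                                               "builders_class_name")
        if "I2Regression" in builders and param_1 is None:
            raise ValueError(
                "'param_1' must be set when running 'I2Regression'")


cfg = FakeConfig({
    "my_new_section": {"param_1": "some string"},
    "builders": {"builders_class_name": ["I2Regression"]},
})
MyBuilder(cfg).pre_check()  # compatible parameters: nothing is raised
```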

Adding a new builder

You only need to modify the pydantic class which contains the builders section. For example, if you need to add the builder new_builder :

class BuilderSection(BaseModel):
    """Parameters of the section 'builders'."""

    section_name: ClassVar[str] = I2_CONST.i2_builders_section
    avail_builders: ClassVar[Tuple[str]] = ("I2Classification",
                                            "I2Regression",
                                            "I2FeaturesMap", "I2Obia",
                                            "new_builder")
    builders_paths: List[str] = Field(
        None,
        doc_type="str",
        short_desc="The path to user builders",
        long_desc=("If not indicated, the iota2 source directory"
                   " is used: */iota2/sequence_builders/"),
        available_on_builders=("I2Classification", "I2Regression",
                               "I2FeaturesMap", "I2Obia", "new_builder"))

    builders_class_name: List[str] = Field(
        ["I2Classification"],
        doc_type="list",
        short_desc="The name of the class defining the builder",
        long_desc=("Available builders are : 'I2Classification', "
                   "'I2FeaturesMap', 'I2Obia' and 'I2Regression'"),
        available_on_builders=("I2Classification", "I2Regression",
                               "I2FeaturesMap", "I2Obia", "new_builder"))

and possibly modify the builders_compatibility validator from the class BuilderSection which checks the consistency of builders to be run.

Note

The class BuilderSection must not inherit from Iota2ParamSection.

Clarification on the operation of the base class Iota2ParamSection

The reading of the configuration file must pass only through the read_config_file class. The role of this class is to read the configuration file, to check the typing of the fields in the configuration file and to check the consistency of the parameters within the same section, not across different sections. It is also via this class and in particular the get_params_descriptions() method that the documentation is generated.

The design of the configuration file class was mainly thought to answer the need to gather under the same class the reading of configuration files dedicated to different builders which may or may not share fields. Knowing that iota2 can run several builders sequentially or in different executions, the difficulty lies in disabling certain fields (which may not have default values) depending on the builders involved.

The implemented solution is to first read the builders section of the configuration file, then to add the requested builders as a new field in each section thanks to the class method add_fields. When the class is instantiated, the root_validator deactivate_fields (from Iota2ParamSection) disables any field that is not used by the builders requested by the user. This is why all section classes must inherit from Iota2ParamSection, and why there is no global root_validator in the class that models the fields. Validations across sections are then done at the builder level, after all sections have been instantiated.

Below is a snippet of class Iota2ParamSection

class Iota2ParamSection(BaseModel, extra=Extra.allow):

    @root_validator(pre=True)
    @classmethod
    def deactivate_fields(cls, values):
        """Deactivate non-mandatory parameters (regarding builders)."""
        current_builders = cls.schema()["properties"][
            I2_CONST.i2_builders_section]["default"]
        avail_fields = list(cls.__fields__.keys())
        for field in avail_fields:
            mandatory_on_builders = cls.schema()["properties"][field].get(
                "mandatory_on_builders", {})
            if mandatory_on_builders:
                buff = []
                for current_builder in current_builders:
                    buff.append(current_builder in mandatory_on_builders)
                if not any(buff):
                    # deactivate
                    cls.__fields__.get(field).required = False
                else:
                    cls.__fields__.get(field).required = True
        return values

    @classmethod
    def add_fields(cls, **field_definitions: Any):
        """Add fields to an existing BaseModel.

        Tribute to https://github.com/samuelcolvin/pydantic/issues/1937

        Note
        ----
        If the field already exists, it will be overwritten
        """
        new_fields: Dict[str, ModelField] = {}
        new_annotations: Dict[str, Optional[type]] = {}

        for f_name, f_def in field_definitions.items():
            if isinstance(f_def, tuple):
                try:
                    f_annotation, f_value = f_def
                except ValueError as err:
                    raise Exception(
                        ("field definitions should either be "
                         "a tuple of (<type>, <default>) or just a "
                         "default value, unfortunately this means tuples as "
                         "default values are not allowed")) from err
            else:
                f_annotation, f_value = None, f_def

            if f_annotation:
                new_annotations[f_name] = f_annotation
            new_fields[f_name] = ModelField.infer(name=f_name,
                                                  value=f_value,
                                                  annotation=f_annotation,
                                                  class_validators=None,
                                                  config=cls.__config__)

        # remove before update
        for field_name, _ in new_fields.items():
            cls.__fields__.pop(field_name, None)
        for annotation, _ in new_annotations.items():
            cls.__annotations__.pop(annotation, None)

        cls.__fields__.update(new_fields)
        cls.__annotations__.update(new_annotations)
        for _, field_meta in new_fields.items():
            cls.schema()["properties"][
                I2_CONST.i2_builders_section]["default"] = field_meta.default

The method deactivate_fields is a root_validator(pre=True), which allows iota2 to get the raw data from the configuration file before any field validator is run, as explained in https://pydantic-docs.helpmanual.io/usage/validators/#root-validators.

New developers are encouraged to examine existing fields (in module iota2.configuration_files.sections) in order to understand how to deal with errors and incompatibility.