Operational Tools Framework


General overview. Internal perspective

Much as WSDL describes web services, PIXEL defines a structure to describe models and predictive algorithms (PAs). The structure is divided into several blocks, though we will mainly handle the GeneralInfo and DockerInfo structures for publication purposes (associated with getInfo.json).

OT_framework_general_overview


Specification of a model (getInfo.json)


Introduction

A model (referring to both PIXEL models and PIXEL predictive algorithms) can be specified by a set of parameters related to:

  • name, version, description, category: name, version, description and category of the model.

  • type: type of model (model or pa). The term 'pa' stands for predictive algorithm

  • invocation model: describes how the model is invoked (synchronous execution, asynchronous execution and subscription model).

  • inputs: describes the different inputs (mandatory or optional) needed by the model to be executed

  • outputs: describes the different outputs generated by the model after its execution

  • logging: describes the logging capabilities of the model

The previous fields are specified in a JSON format that is described in the next chapter. This JSON is embedded as a Docker LABEL when the Dockerfile of a (PIXEL) model is created, so that once the image is uploaded to the Docker repository (e.g. Dockerhub) it carries its full description. A user or a software component can run docker inspect to get the description of the model. With this information, the user or software component can generate another JSON (known as the instance JSON) with the specific information needed to execute the model (e.g. some input parameters within a given timeframe).
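As a rough illustration of the docker inspect step, the following Python sketch extracts the description from the text printed by docker inspect. It assumes the usual layout of that output (a JSON list with one object per image, with labels under Config.Labels) and the label name getInfo mentioned later in this document; the function name and the trimmed sample are hypothetical.

```python
import json


def extract_get_info(inspect_output: str, label: str = "getInfo") -> dict:
    """Return the model description stored in a Docker label.

    `inspect_output` is the JSON text printed by `docker inspect <image>`,
    which is a list with one object per inspected image.
    """
    images = json.loads(inspect_output)
    labels = images[0].get("Config", {}).get("Labels") or {}
    if label not in labels:
        raise KeyError(f"label {label!r} not found in image")
    # The label value is itself a (minified) JSON string.
    return json.loads(labels[label])


# Hypothetical, heavily trimmed `docker inspect` output for illustration only.
sample = json.dumps([
    {"Config": {"Labels": {
        "getInfo": '{"name":"pingcount","version":"0.1","type":"model"}'
    }}}
])

info = extract_get_info(sample)
print(info["name"], info["version"])
```

In a real setup the input string would come from running docker inspect on the pulled image rather than from a hand-written sample.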


Publication of a model in the PIXEL platform. General flow

There are several steps to perform if you want to publish a model in the PIXEL platform, as depicted in the Figure below.

publication of a model (Steps)

  • Step 1: Publish the model by providing the essential information, basically the part related to DockerInfo (see the OT Framework overview). Here the operation is performed by the Dashboard, but for testing purposes one can also use the OT API directly (the Dashboard itself uses it).

  • Step 2: Get the Docker image from the public repository (docker pull). In PIXEL we are using Dockerhub under the index pixelh2020

  • Step 3: Inspect the Docker image to get the description of the model (docker inspect). This is stored under the label getInfo.

  • Step 4: Now the Dashboard can request the OT to get information from available models, e.g. before building an instance for executing the model.

Step 3 assumes that the model developer has included a JSON structure as a label in the Dockerfile; this structure is described in the next chapter. Because it is included as a label, it is important to minify it. The model developer can use an online service (e.g. https://jsonformatter.org/json-minify) to convert a standard JSON file.
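As an alternative to an online service, the minification can also be done locally. The following is a minimal Python sketch using only the standard json module; the function name minify_json and the sample content are illustrative assumptions, not part of the PIXEL tooling.

```python
import json


def minify_json(text: str) -> str:
    """Re-serialize JSON text without insignificant whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)


# A small, hand-written fragment in the style of getInfo.json.
pretty = """
{
   "name": "pingcount",
   "version": "0.1",
   "supportExecAsync": true
}
"""

minified = minify_json(pretty)
print(minified)  # {"name":"pingcount","version":"0.1","supportExecAsync":true}
```

Whitespace inside string values (e.g. descriptions) is preserved; only the whitespace between JSON tokens is removed.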


JSON format

In order to better understand the JSON format (instead of providing the schema), we will use as an example the basic pingcount model, available at https://gitpixel.satrdlab.upv.es/benmomo/ngsi-agents-thpa/src/master/thpa-model-pingcount. The model counts the number of pings in a given set of elements of type Ping. This is a PIXEL data model generated by a basic PingTest NGSI agent (available at https://gitpixel.satrdlab.upv.es/benmomo/ngsi-agents-thpa/src/master/thpa-ping).

The JSON format is as follows:

{
   "name":"pingcount",
   "version":"0.1",
   "description":"PingCount model. It checks for incoming pings from an NGSI Ping Agent and counts them within a timeframe. It serves as documentation example",
   "type":"model",
   "category":"ping",
   "supportSubscription":false,
   "supportExecSync":false,
   "supportExecAsync":true,
   "system":{
      "connectors":[
         {
            "type":"ih-api",
            "description": "Needed connection data to reach the Information Hub API in the PIXEL platform. This API is typically used by models/PA to get data",
            "options":[
               {
                  "name":"url",
                  "type":"string",
                  "pattern":null,
                  "description":"URL string representing the API to reach the IH. If a value is given in getInfo.json, it represents a default value",
                  "required": true,
                  "value": "http://172.24.1.17:8080/archivingSystem/extractor/v1"
               },
               {
                  "name":"user",
                  "type":"string",
                  "pattern": null,
                  "description":"Credentials (user) if IH requires authentication. Currently not needed (future use)",
                  "required": false,
                  "value":null
               },
               {
                  "name":"password",
                  "type":"string",
                  "pattern": null,
                  "description":"Credentials (password) if IH requires authentication. Currently not needed (future use)",
                  "required": false,
                  "value":null
               }
            ]
         },
         {
            "type":"es-api",
            "description": "Needed connection data to reach the Elasticsearch API in the PIXEL platform. This API is typically used by models/PA to store (result) data",
            "options":[
               {
                  "name":"url",
                  "type":"string",
                  "pattern": null,
                  "description":"URL string representing the API to reach Elasticsearch. If a value is given in getInfo.json, it represents a default value",
                  "required": true,
                  "value": "http://172.24.1.11:9200"
               },
               {
                  "name":"user",
                  "type":"string",
                  "pattern": null,
                  "description":"credentials (user) if Elastic requires authentication. Currently not needed (future use)",
                  "required": false,
                  "value":null
               },
               {
                  "name":"password",
                  "type":"string",
                  "pattern": null,
                  "description":"credentials (password) if Elastic requires authentication. Currently not needed (future use)",
                  "required": false,
                  "value":null
               }
            ]
         }

      ]
   },
   "input":[
      {
         "name":"pingset",
         "type":"[urn:pixel:DataSource:Ping]",
         "description":"array of JSON documents of type urn:pixel:DataSource:Ping. Additional info to retrieve the data is given in 'options' array",
         "supportedConnectors": ["ih-api"],
         "metadata":{},
         "required":true,
         "forceinput": true,
         "options":[
            {
               "name":"sourceId",
               "type":"string",
               "value": "ping",
               "description":"sourceId (mapped to Index) within the IH where the NGSI generated pings are stored. If a value is given in getInfo.json, it represents a default value",
               "pattern":null,
               "metadata":{},               
               "required": true
            },
            {
               "name":"from",
               "type":"date-time",
               "value": "",
               "description":"From timestamp. ISO8601 format. If a value is given in getInfo.json, it represents a default value",
               "pattern":null,
               "metadata":{},               
               "required": true
            },
            {
               "name":"to",
               "type":"date-time",
               "value": "",
               "description":"To timestamp. ISO8601 format. If a value is given in getInfo.json, it represents a default value",
               "pattern":null,
               "metadata":{},               
               "required": true
            }
         ]
      }    

   ],   
   "output":[
      {
         "name":"output",
         "type":"urn:pixel:DataModelResult:PingCount",         
         "required":true,
         "description":"output provided by the PingCount execution",
         "supportedConnectors": ["es-api"],
         "metadata":{},
         "options":[
            {
               "name": "es_index",
               "type": "string",
               "description": " Elasticsearch index to store the result  of the execution. Typically models will store output under index models-output-<name>",
               "pattern": null,
               "required": false,
               "value": "models-output-pingcount"
            }            
         ]
      }
   ],
   "logging": [
        {
            "name": "ping-logging",
            "type": "pixel-logging-format",
            "supportedConnectors": ["es-api"],
            "description": "activity logging for the pingCount model. id (id_execution) and idRef (id_model) are given by the OT at invocation time",
            "required": true,
            "verbose": null,
            "metadata": {},
            "options":[
                {
                   "name": "es_index",
                   "type": "string",
                   "description": " Elasticsearch index to store the logs of the execution (start, end, error). All models will store output under same index",
                   "pattern": null,
                   "required": false,
                   "value": "models-logging"
                }            
            ]            
        }
   ]
}

We will go element by element to clarify them:

  • name (mandatory): this is the name of your model (string).

  • version (mandatory): this is the version of your model (string).

  • description (optional): this is the description of your model (string).

  • type (mandatory): this is the type of your model (string). Two possible values: model or pa, the latter for predictive algorithms. It is just a convention, as PIXEL differentiates between models and predictive algorithms.

  • category (mandatory): this is the category of your model (string). No specific list of categories has been defined in PIXEL, but some have been pre-identified according to the models developed in PIXEL, such as environmental, traffic, pas, etc.

  • supportSubscription (mandatory): indicates whether your model supports subscription mode during execution (boolean). This is the case of some predictive algorithms developed in PIXEL.

  • supportExecSync (mandatory): indicates whether your model supports synchronous mode during execution (boolean). This is not the case for the models developed in PIXEL, but it could be useful in the future.

  • supportExecAsync (mandatory): indicates whether your model supports asynchronous mode during execution (boolean). This is the typical case for the models developed in PIXEL. There is no waiting time at invocation, and the outputs are stored in the IH when the model finishes executing.

  • system (mandatory): this is an element intended to encompass general information of the model. Currently it includes one single element (array) called connectors, which will be described later.

  • input (mandatory): this is an array intended to encompass all needed inputs. Such input array will be described later. In the example there is only one single input element, but another model might need more inputs.

  • output (mandatory): this is an array intended to encompass all needed outputs. Such output array will be described later. Typically there will be only one output element, which can be mapped as a JSON document stored in a database (IH in PIXEL)

  • logging (mandatory): this is an array intended to encompass all needed logging mechanisms. Such logging array will be described later. Typically there will be only one logging element, which generates all logging info as JSON documents to be stored in a database (IH in PIXEL)
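The top-level rules listed above can be checked mechanically before publishing a model. The following Python sketch is one possible way to do so; the function name and the wording of the messages are hypothetical, and only the field names and the mandatory/optional status come from the list above.

```python
# Top-level fields marked as mandatory in the list above.
MANDATORY_FIELDS = (
    "name", "version", "type", "category",
    "supportSubscription", "supportExecSync", "supportExecAsync",
    "system", "input", "output", "logging",
)


def check_get_info(info: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [
        f"missing mandatory field '{key}'"
        for key in MANDATORY_FIELDS
        if key not in info
    ]
    if "type" in info and info["type"] not in ("model", "pa"):
        problems.append("type must be 'model' or 'pa'")
    for flag in ("supportSubscription", "supportExecSync", "supportExecAsync"):
        if flag in info and not isinstance(info[flag], bool):
            problems.append(f"'{flag}' must be a boolean")
    return problems


# A deliberately incomplete draft to show the kind of feedback produced.
draft = {"name": "pingcount", "version": "0.1", "type": "algo",
         "supportExecAsync": "yes"}
for problem in check_get_info(draft):
    print(problem)
```

This only covers the top level; the nested connectors, input, output and logging elements described next would need their own checks.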

Now let's dive into the details of the different elements.

The connectors element defines the ability of a model to connect to different services (endpoints) to get or store data. These can be open data APIs, databases, etc. In PIXEL, there are basically two connectors that need to be supported natively: (i) the Information Hub ih-api connector and (ii) the Elasticsearch es-api connector. A model will typically get information from the IH via the ih-api and store data in the IH via the es-api. Connectors are defined following a common format, as you can see in the JSON example. The different fields are:

  • type (mandatory): this is the supported type (string). Typically ih-api or es-api

  • description (optional): this is the description of this element (string).

  • options (mandatory): this is an array of elements containing the different items to access the service endpoint. The elements given in the example refer to the URL endpoint and credentials (if needed). Let's focus only on the first element, as the format is the same for the others:

    • name (mandatory): name of the element.
    • type (mandatory): type of the element. For URLs it is a string
    • pattern (optional): specifies whether the string has to follow a specific pattern. It makes it possible to pre-check that the string is URL-compliant before launching a model. This field is only useful for the Dashboard.
    • description (optional): description of the element.
    • required (mandatory): indicates whether this element is mandatory to build the connector (boolean). If it is not required, the element might not appear in the JSON instance. This field is useful for the Dashboard, which can block the user from executing the model until this element is given.
    • value (optional): default value of the element. This field is useful to the Dashboard in order to simplify the insertion of data for the user before launching a model. For the PIXEL platform, default endpoints for the ih-api and the es-api are given in the example.
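To illustrate how a consumer such as the Dashboard might read connector defaults, here is a small Python sketch that collects the default values of one connector from a getInfo structure. The function name is hypothetical and the sample is a trimmed copy of the example above.

```python
def connector_defaults(info: dict, connector_type: str) -> dict:
    """Map option name -> default value for one connector, skipping null values."""
    for connector in info.get("system", {}).get("connectors", []):
        if connector.get("type") == connector_type:
            return {
                option["name"]: option["value"]
                for option in connector.get("options", [])
                if option.get("value") is not None
            }
    raise KeyError(f"connector {connector_type!r} is not declared by the model")


# Trimmed version of the pingcount getInfo.json shown above.
get_info = {
    "system": {"connectors": [{
        "type": "ih-api",
        "options": [
            {"name": "url", "type": "string", "required": True,
             "value": "http://172.24.1.17:8080/archivingSystem/extractor/v1"},
            {"name": "user", "type": "string", "required": False, "value": None},
            {"name": "password", "type": "string", "required": False, "value": None},
        ],
    }]}
}

print(connector_defaults(get_info, "ih-api"))
```

Only url has a default in the example, so user and password are left out of the result.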

The input array includes all inputs needed by the model. Inputs are defined following a common format, as you can see in the JSON example. The different fields are:

  • name (mandatory): name of the input (string).

  • type (mandatory): this is the data source (data format) of this input (string). PIXEL specifies a set of different data models trying to follow FIWARE. Formats are important because otherwise models will have to convert them.

  • description (optional): this is the description of this element (string).

  • supportedConnectors (mandatory): this is the connector that the model will use to get the input element (an array of strings in the example). In PIXEL it will typically be the ih-api connector. This means that the input is located in the IH and the model will need to contact the IH API to obtain this particular piece of information.

  • metadata (optional): optional element to include extra information specific for the model.

  • required (mandatory): indicates whether this input is mandatory to execute the model (boolean). If it is not required, the input might not appear in the JSON instance. This field is useful for the Dashboard, which can block the user from executing the model until this input is given.

  • forceinput (mandatory): indicates whether this input can be forced by the user when executing the model (boolean). A forced input appears in a different array in the JSON instance; an example is given in the chapter Execution of a model (instance.json). Forcing inputs is interesting for what-if scenarios, where the user may work with fictitious data to get results without needing to have this data previously stored in the IH. This field is important for the Dashboard, which can let the user provide the input either (i) as an input element within the input array or (ii) as a forceinput element. When the model starts execution and checks for inputs, the forceinput array has priority. This is also explained in the chapter Execution of a model (instance.json).

  • options (mandatory): this is an array of elements containing the different items needed to access the input through the service endpoint. The elements given in the example refer to the sourceId and the lower and upper temporal limits. The model therefore obtains the input by using both the ih-api connector (and thus its options array) and the options array of the input element. With the ih-api connector alone the model only knows the API; to reach specific data it needs to build the specific request from the additional parameters. In the example, sourceId maps to an index of the database, and the temporal limits select all data within that time interval. Let's focus only on the format of the first element, as the format is the same for the others:

    • name (mandatory): name of the element (string).
    • type (mandatory): type of the element (string).
    • value (optional): default value of the element. This field is useful to the Dashboard in order to simplify the insertion of data for the user before launching a model. For the PIXEL platform with inputs coming from NGSI agents, values are typically arh-lts-.
    • description (optional): description of the element.
    • pattern (optional): specifies whether the string has to follow a specific pattern. This field is only useful for the Dashboard.
    • metadata (optional): optional element to include extra information specific for the model.
    • required (mandatory): indicates whether this element is mandatory to build the input (boolean). If it is not required, the element might not appear in the JSON instance. This field is useful for the Dashboard, which can block the user from executing the model until this element is given.
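The blocking behaviour attributed to the Dashboard above (do not let the user execute until required elements are given) could be approximated as in the following Python sketch. The function name is hypothetical, and treating both null and the empty string as "not given" is an assumption based on the empty from/to defaults in the example.

```python
def unfilled_required_options(element: dict) -> list:
    """Names of required options that still have no usable value."""
    missing = []
    for option in element.get("options", []):
        # Assumption: null and "" both mean that no value has been provided yet.
        if option.get("required") and option.get("value") in (None, ""):
            missing.append(option["name"])
    return missing


# The pingset input from the example, reduced to the relevant fields.
pingset = {
    "name": "pingset",
    "required": True,
    "options": [
        {"name": "sourceId", "type": "string", "value": "ping", "required": True},
        {"name": "from", "type": "date-time", "value": "", "required": True},
        {"name": "to", "type": "date-time", "value": "", "required": True},
    ],
}

print(unfilled_required_options(pingset))  # ['from', 'to']
```

With the defaults of the example, the user would still have to supply the from and to timestamps before launching the model.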

The output array follows a format similar to the input array, as you can see in the JSON example. The different fields are:

  • name (mandatory): name of the output (string). Typically, if you are having just one element in the array, you can name it output

  • type (mandatory): this is the data source (data format) of this output (string). FIWARE data models are more related to NGSI agents and inputs; for outputs there is no strong need to follow such models (although it is possible).

  • required (mandatory): indicates if this output is mandatory for the model (boolean). For one single element in the output array this will typically be true.

  • description (optional): this is the description of this element (string).

  • supportedConnectors (mandatory): this is the connector that the model will use to store the output element (an array of strings in the example). In PIXEL it will typically be the es-api connector. This means that the output will be stored in the IH and the model will need to contact the Elasticsearch API to save this particular piece of information.

  • metadata (optional): optional element to include extra information specific for the model.

  • options (mandatory): this is an array of elements containing the different items that allow the model to store the output in a particular place, so that it can later be accessed by other components (e.g. a visualization component). Let's focus only on the format of the first (and only) element:

    • name (mandatory): name of the element (string).
    • type (mandatory): type of the element (string).
    • description (optional): description of the element.
    • pattern (optional): specifies whether the string has to follow a specific pattern. This field is only useful for the Dashboard.
    • required (mandatory): indicates whether this element is mandatory to build the output (boolean). For one single element it is true, unless the model does not need to store any output.
    • value (optional): default value of the element. This field is useful to the Dashboard in order to simplify the insertion of data for the user before launching a model. For the PIXEL platform, values are typically models-output-<name>. Therefore, any component looking for results from this model can easily locate all results under this index.

The logging array follows a format similar to the input and output arrays, as you can see in the JSON example. The different fields are:

  • name (mandatory): name of the logging (string). Typically, if you have just one element in the array, you can name it <model-name>-logging.

  • type (mandatory): this is the data source (data format) of this logging (string). In PIXEL we have already defined a specific format containing iso_8601_timestamp, id_execution, id_model, type and message. Models should log at least start and end time.

  • supportedConnectors (mandatory): this is the connector that the model will use to store the logging element (an array of strings in the example). In PIXEL it will typically be the es-api connector. This means that the logging will be stored in the IH and the model will need to contact the Elasticsearch API to save this particular piece of information.

  • description (optional): this is the description of this element (string).

  • required (mandatory): indicates if this logging is mandatory for the model (boolean). For one single element in the logging array this will typically be true.

  • verbose (optional): verbosity level for the model (string). A model might provide different levels for logging (e.g. DEBUG, INFO, WARNING, ERROR, etc.)

  • metadata (optional): optional element to include extra information specific for the model.

  • options (mandatory): this is an array of elements containing the different items that allow the model to store the logging in a particular place, so that it can later be accessed by other components (e.g. a visualization component). Let's focus only on the format of the first (and only) element:

    • name (mandatory): name of the element (string).
    • type (mandatory): type of the element (string).
    • description (optional): description of the element.
    • pattern (optional): specifies whether the string has to follow a specific pattern. This field is only useful for the Dashboard.
    • required (mandatory): indicates whether this element is mandatory to build the logging (boolean). For one single element it is true, unless the model does not need to store any logging.
    • value (optional): default value of the element. This field is useful to the Dashboard in order to simplify the insertion of data for the user before launching a model. For the PIXEL platform, the value is typically models-logging. All models will then log data under the same index, but the entries can be filtered by id_model, id_execution, etc.

The logged info is basically described in the Figure below:

publication of a model (Steps)

The timestamp is given in ISO 8601 format since the current version v0.2 (in the previous version, v0.1, it was in UNIX time).
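A log entry with the fields listed for the pixel-logging-format could be built as in the following Python sketch. The key names are taken from the field list given earlier (iso_8601_timestamp, id_execution, id_model, type, message); the exact document layout is an assumption, as are the function name and the sample message.

```python
from datetime import datetime, timezone


def build_log_record(id_execution, id_model, log_type, message, now=None):
    """Build one log document; `now` can be injected to make the result reproducible."""
    now = now or datetime.now(timezone.utc)
    return {
        "iso_8601_timestamp": now.isoformat(timespec="seconds"),
        "id_execution": id_execution,   # "id" in instance.json
        "id_model": id_model,           # "idRef" in instance.json
        "type": log_type,               # e.g. start, end, error
        "message": message,
    }


record = build_log_record(
    id_execution="999d771ae2cabc05ec59a999",
    id_model="601088c715f76f0007542876",
    log_type="start",
    message="pingcount execution started",
    now=datetime(2021, 1, 20, 11, 11, 11, tzinfo=timezone.utc),
)
print(record["iso_8601_timestamp"])  # 2021-01-20T11:11:11+00:00
```

Such a document would then be stored through the es-api connector under the configured es_index (models-logging in the example).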

Execution of a model (instance.json)

Introduction. Instances

An instance can be seen as a particularization of a model with specific values that allow the execution of that model. The set of parameters is very similar to that of the model, but here the different value fields are filled according to the user's needs.

All fields are specified in a JSON format that will be described in the next chapter, but you may expect a similar structure as for models, with some remarks.


Execution of a model (instance) in the PIXEL platform. General flow

There are several steps to perform if you want to run a model in the PIXEL platform, as depicted in the Figure below. Here we will focus mainly on the interaction between the OT and the model, but you can assume that either the user or the Dashboard passes the instance.json structure to the OT before running the model. The Docker image is composed of two main parts: (i) the core model itself, as developed within WP4, and (ii) the OT adaptor responsible for interacting with external entities (OT, IH).

publication of a model (Steps)

Note that the OT adaptor sketched in the previous Figure is a scheme of all logical components needed to correctly process the execution request, but it is up to the model developer to decide how to implement them (they can appear as different independent blocks or not).

  • Step 1: the instance.json file is passed to the controller of the OT adaptor, which is responsible for parsing the JSON file and controlling the flow of the execution. The JSON file is passed as a string (argv[1]). The OT will try to escape all potentially problematic characters so that the parsing in the controller is smooth, but be careful with this if you are a model developer.

  • Step 2: Once the JSON has been parsed (and checked), the input retriever of the I/O adaptor is responsible for obtaining the inputs, which will typically be stored in the IH. For forced inputs, the data is available directly in the instance.json.

  • Step 3: Depending on the format of the obtained input, an input transformer might be needed to adapt it to the native input format of the core model. In PIXEL we try to work with (FIWARE) data models so that the conversion is avoided or at least minimized.

  • Step 4: The model is executed invoking the core model with the (transformed) input parameters.

  • Step 5: Output data might need to be transformed by the output transformer if it has to be stored in a specific format.

  • Step 6: The controller uses the output writer to store the result, typically in the IH.

For all steps (2-6), the controller also includes a logging module able to log all activity. At least it should log the start and end of the execution.
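The six steps above can be summarized in a small controller skeleton. The Python sketch below is only one possible arrangement: all function and parameter names are hypothetical, the I/O components are passed in as plain callables, and the stubs used in the usage example (including the {"count": n} output document) are placeholders rather than the real pingcount implementation.

```python
import json


def run_instance(instance_json, retrieve, to_model_input, core_model,
                 to_output_doc, write, log):
    """Drive one execution; each callable stands for one logical block of the adaptor."""
    instance = json.loads(instance_json)         # Step 1: parse instance.json
    log("start", f"execution {instance['id']}")
    try:
        raw = retrieve(instance)                 # Step 2: input retriever
        model_input = to_model_input(raw)        # Step 3: input transformer
        result = core_model(model_input)         # Step 4: core model
        document = to_output_doc(result)         # Step 5: output transformer
        write(instance, document)                # Step 6: output writer
    except Exception as exc:
        log("error", str(exc))
        raise
    log("end", f"execution {instance['id']}")
    return document


# Usage with in-memory stubs. In a real adaptor the JSON string would be
# read from sys.argv[1] and the callables would talk to the IH / Elasticsearch.
events, stored = [], []
instance_json = json.dumps({"id": "999d771ae2cabc05ec59a999", "input": [], "output": []})
result = run_instance(
    instance_json,
    retrieve=lambda inst: [{"id": "Ping"}, {"id": "Ping"}, {"id": "Ping"}],
    to_model_input=lambda docs: docs,
    core_model=len,
    to_output_doc=lambda count: {"count": count},
    write=lambda inst, doc: stored.append(doc),
    log=lambda kind, message: events.append(kind),
)
print(result, events)
```

Passing the components as callables keeps the controller independent of how each block is implemented, which matches the remark that the blocks may or may not be separate in practice.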


JSON format (normal)

In order to better understand the JSON format (instead of providing the schema), we will use as an example the basic pingcount model, available at https://gitpixel.satrdlab.upv.es/benmomo/ngsi-agents-thpa/src/master/thpa-model-pingcount. The model counts the number of pings in a given set of elements of type Ping. This is a PIXEL data model generated by a basic PingTest NGSI agent (available at https://gitpixel.satrdlab.upv.es/benmomo/ngsi-agents-thpa/src/master/thpa-ping).

The JSON format is as follows:

{
   "idRef": "601088c715f76f0007542876",
   "id": "999d771ae2cabc05ec59a999",
   "name": "pingcount-execution1",
   "description": "pingcount execution 1",
   "mode": "ExecAsync",
   "input": [{
         "name": "pingset",
         "category": "ih-api",
         "type": "[urn:pixel:DataSource:Ping]",
         "description": "array of JSON documents of type urn:pixel:DataSource:Ping",
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach the IH",
               "value": "http://172.24.1.17:8080/archivingSystem/extractor/v1/data"
            }, {
               "name": "sourceId",
               "type": "string",
               "description": "sourceId, mapped to Index, within the IH where the NGSI generated pings are stored",
               "value": "urn:pixel:DataSource:Ping"
            }, {
               "name": "from",
               "type": "date-time",
               "description": "From timestamp. ISO8601 format",
               "value": "2021-01-20T11:11:11+02:00"
            }, {
               "name": "to",
               "type": "date-time",
               "description": "To timestamp. ISO8601 format",
               "value": "2021-01-20T11:20:11+02:00"
            }
         ]
      }
   ],   
   "output": [{
         "name": "output",
         "category": "es-api",
         "type": "urn:pixel:DataModelResult:PingCount",
         "description": "output provided by the PingCount execution",
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach Elasticsearch",
               "value": "http://172.24.1.11:9200"
            }, {
               "name": "es_index",
               "type": "string",
               "description": "Elasticsearch index to store the result  of the execution.",
               "value": "models-output-pingcount"
            }
         ]
      }
   ],
   "logging": [{
         "name": "logging",
         "category": "es-api",
         "type": "pixel-logging-format",
         "description": "activity logging for the pingCount model. id_execution, idRef/id_model are given by the OT at invocation time",
         "verbose" : null,
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach Elasticsearch",
               "value": "http://172.24.1.11:9200"
            }, {
               "name": "es_index",
               "type": "string",
               "description": "Elasticsearch index to store the logs of the execution. start, end, error",
               "value": "models-logging"
            }
         ]
      }
   ]
}

Note that the format is very similar to that of the model; in fact, it extends it. Thus we will not comment on it line by line (you can re-read the getInfo.json chapter if needed), but instead we will focus on specific aspects, differences and remarks:

  • id: this is the ID of the execution itself. It is given by the OT at execution time and allows the model to log activity in the IH including this id, so that the execution is uniquely identified.

  • idRef: this is the ID of the model itself. When the model is published in the OT, it gets an ID. This is also useful for logging and outputting info, so that later it is possible to search for all executions (and results) of a given model.

  • mode: this refers to the way the model is executed (Subscription, ExecSync, ExecAsync). These were given as independent fields in the getInfo.json format.

Note also that some fields from getInfo.json, such as version, type and category, are not needed here.

Note that there is no system or connectors element. Instead, the connector options are ported directly to each input element, as the format of the options element is the same. In our example (pingcount), this means that the single input element pingset includes the url options element, as it uses the ih-api connector. As the other elements (user, password) are not required, they do not need to be present.

The same reasoning applies for the output and logging elements. Here they include the url options element from the es-api connector.

Note also as final remark that some fields from the getInfo.json are not necessary for the instance, such as pattern and required, as they are interpreted by the Dashboard and have no further relevance.
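The mapping from a getInfo.json element to its instance counterpart, as described in the remarks above, could look like the following Python sketch. It is an illustration under assumptions: the function name is hypothetical, the first entry of supportedConnectors is taken as the connector to use, and optional options without a value are simply dropped.

```python
def to_instance_element(info: dict, element: dict, user_values=None) -> dict:
    """Port connector options into the element and drop Dashboard-only fields."""
    user_values = user_values or {}
    connector_type = element["supportedConnectors"][0]  # assumption: first one wins
    connector = next(c for c in info["system"]["connectors"]
                     if c["type"] == connector_type)
    options = []
    for option in connector["options"] + element["options"]:
        value = user_values.get(option["name"], option.get("value"))
        if value in (None, "") and not option.get("required"):
            continue  # optional and unset, e.g. user/password
        slim = {k: option[k] for k in ("name", "type", "description") if k in option}
        slim["value"] = value  # pattern and required are not carried over
        options.append(slim)
    return {
        "name": element["name"],
        "category": connector_type,
        "type": element["type"],
        "description": element.get("description", ""),
        "metadata": element.get("metadata", {}),
        "options": options,
    }


# Trimmed getInfo.json content for pingcount.
get_info = {
    "system": {"connectors": [{"type": "ih-api", "options": [
        {"name": "url", "type": "string", "pattern": None, "required": True,
         "value": "http://172.24.1.17:8080/archivingSystem/extractor/v1"},
        {"name": "user", "type": "string", "pattern": None, "required": False, "value": None},
    ]}]},
    "input": [{"name": "pingset", "type": "[urn:pixel:DataSource:Ping]",
               "supportedConnectors": ["ih-api"], "metadata": {}, "options": [
        {"name": "sourceId", "type": "string", "value": "ping", "pattern": None, "required": True},
        {"name": "from", "type": "date-time", "value": "", "pattern": None, "required": True},
        {"name": "to", "type": "date-time", "value": "", "pattern": None, "required": True},
    ]}],
}

element = to_instance_element(
    get_info, get_info["input"][0],
    user_values={"from": "2021-01-20T11:11:11+02:00", "to": "2021-01-20T11:20:11+02:00"},
)
print([option["name"] for option in element["options"]])  # ['url', 'sourceId', 'from', 'to']
```

The same function can be applied to output and logging elements, since they follow the same options format with the es-api connector.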


JSON format (forceinput)

In the previous section we provided a normal instance.json with inputs and outputs. However, some inputs may be forced by the user in order to analyse what-if scenarios. In such cases the data is provided raw within the instance.json, and the user has to type it in directly. This is only practical (user friendly) for simple inputs, not for long data structures.

Forced inputs are declared at model level (getInfo.json) with the boolean parameter forceinput set to true. If this is the case and the user includes raw data, then such data appears in a new array called forceinput. We will see this with an example (pingcount), where the only input, pingset, is provided as a forced input.
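On the model side, the rule that the forceinput array has priority over the input array could be implemented roughly as in the Python sketch below. The function name and the returned tuple are hypothetical; only the array names and the name/value/options keys come from the instance format.

```python
def resolve_input(instance: dict, name: str):
    """Return ("forced", raw_data) if the input was forced, else ("retrieve", options)."""
    for forced in instance.get("forceinput", []):
        if forced.get("name") == name:
            return "forced", forced["value"]      # raw data typed by the user
    for regular in instance.get("input", []):
        if regular.get("name") == name:
            return "retrieve", regular["options"]  # parameters to query the IH
    raise KeyError(f"input {name!r} not present in the instance")


# Hypothetical instance where pingset appears in both arrays.
instance = {
    "input": [{"name": "pingset", "options": [{"name": "sourceId", "value": "ping"}]}],
    "forceinput": [{"name": "pingset", "value": [{"id": "Ping", "type": "Ping"}]}],
}

source, payload = resolve_input(instance, "pingset")
print(source, len(payload))  # forced 1
```

When the input is not forced, the returned options are what the input retriever would use to build the request to the IH.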

The JSON format is as follows:

{
   "idRef": "5e3d771ae2cabc05ec59a14c",
   "id": "999d771ae2cabc05ec59a999",
   "name": "pingcount-execution1",
   "description": "pingcount execution 1",
   "mode": "ExecAsync",
   "input": [],
   "forceinput": [
        {
            "name": "pingset",
            "value":[
                {
                    "id": "Ping",
                    "type": "Ping",
                    "source": {
                        "value": "urn:pixel:DataSource:Ping",
                        "type": "Text"
                    },
                    "from": {
                        "value": "059501e23885-172.17.0.2",
                        "type": "Text"
                    },
                    "when": {
                        "value": "2021-01-20T18:21:40+02:00",
                        "type": "Text"
                    }
                },
                {
                    "id": "Ping",
                    "type": "Ping",
                    "source": {
                        "value": "urn:pixel:DataSource:Ping",
                        "type": "Text"
                    },
                    "from": {
                        "value": "059501e23885-172.17.0.2",
                        "type": "Text"
                    },
                    "when": {
                        "value": "2021-01-20T18:22:40+02:00",
                        "type": "Text"
                    }
                },
                {
                    "id": "Ping",
                    "type": "Ping",
                    "source": {
                        "value": "urn:pixel:DataSource:Ping",
                        "type": "Text"
                    },
                    "from": {
                        "value": "059501e23885-172.17.0.2",
                        "type": "Text"
                    },
                    "when": {
                        "value": "2021-01-20T18:23:40+02:00",
                        "type": "Text"
                    }
                },
                {
                    "id": "Ping",
                    "type": "Ping",
                    "source": {
                        "value": "urn:pixel:DataSource:Ping",
                        "type": "Text"
                    },
                    "from": {
                        "value": "059501e23885-172.17.0.2",
                        "type": "Text"
                    },
                    "when": {
                        "value": "2021-01-20T18:24:40+02:00",
                        "type": "Text"
                    }
                }               
            ]
        }   

   ],  
   "output": [{
         "name": "output",
         "category": "es-api",
         "type": "urn:pixel:DataModelResult:PingCount",
         "description": "output provided by the PingCount execution",
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach Elasticsearch",
               "value": "http://localhost:9200"
            }, {
               "name": "es_index",
               "type": "string",
               "description": "Elasticsearch index to store the result  of the execution.",
               "value": "models-output-pingcount"
            }
         ]
      }
   ],
   "logging": [{
         "name": "logging",
         "category": "es-api",
         "type": "pixel-logging-format",
         "description": "activity logging for the pingCount model. id (id_execution) and idRef (id_model) are given by the OT at invocation time",
         "verbose" : null,
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach Elasticsearch",
               "value": "http://localhost:9200"
            }, {
               "name": "es_index",
               "type": "string",
               "description": "Elasticsearch index to store the logs of the execution (start, end, error)",
               "value": "models-logging"
            }
         ]
      }
   ]
}

Note the differences between this example and the one in the previous section:

  • input: this element appears empty, and may even be omitted, if all inputs are provided in the forceinput array. If a model has several inputs, some of them may appear here whereas others appear in the forceinput array. From the point of view of the model logic, when the model starts checking for inputs, it should FIRST check the forceinput array, if it exists, and THEN, if the input is not found there, check the input array. This means that forceinput has priority over input.

  • forceinput: this represents the raw data introduced directly here, as the value field of the pingset element. In theory, this should be the same input (maybe with some transformations) that the model would obtain by invoking the IH with the url, sourceId, from and to options parameters.

The arrays output and logging remain the same.
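The lookup order described above (forceinput first, then input) can be sketched as follows. This is only an illustration of the priority rule from the model's point of view; the function names find_by_name and resolve_input are hypothetical and not part of the OT.

```python
from typing import Any, Optional, Tuple


def find_by_name(elements: Optional[list], name: str) -> Optional[dict]:
    """Return the first element whose "name" matches, or None."""
    for element in elements or []:
        if element.get("name") == name:
            return element
    return None


def resolve_input(instance: dict, name: str) -> Tuple[str, Any]:
    """Look up an input, giving forceinput priority over input.

    Returns ("forceinput", raw_value) when raw data was supplied, or
    ("input", element) when the data still has to be fetched through
    the connector described by the element's options.
    """
    forced = find_by_name(instance.get("forceinput"), name)
    if forced is not None:
        return "forceinput", forced["value"]
    regular = find_by_name(instance.get("input"), name)
    if regular is not None:
        return "input", regular
    raise KeyError(f"input {name!r} not found in forceinput or input")


# Reduced version of the pingcount instance shown above.
instance = {
    "input": [],
    "forceinput": [
        {"name": "pingset", "value": [{"id": "Ping", "type": "Ping"}]}
    ],
}
source, data = resolve_input(instance, "pingset")
print(source, len(data))
```

Both arrays are treated as optional here, since the input array may be missing when every input is forced, and forceinput is absent in a normal instance.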


Scheduling executions of a model (scheduledInstance.json)

Introduction. Instances

A scheduledInstance is an extension of an instance (see previous section) that can be launched several times according to a certain schedule. The set of parameters is identical to that of an instance, but there is an additional element called scheduleInfo:

"scheduleInfo" : {
        "start" : "2021-01-20T11:11:11+02:00",
        "unit" : "minute",
        "value" : 5
    }

This element tells the OT when the model needs to be run:

  • First execution: will start at start time (in the example, 2021-01-20T11:11:11+02:00). The date-time value is given in ISO 8601 format in the new version v0.2 (in the previous version, v0.1, it was UNIX time).

  • Next executions: will be launched every 5 minutes starting from start time.

If the start time has already passed by the time the scheduledInstance is created, the OT will act as if the schedule had actually started at that time, and will launch the first execution at the nearest 5-minute interval (according to the example) from the current date.
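The scheduling rule can be illustrated with a short sketch. It assumes that "nearest interval" means the next point on the grid start + k * interval that is not earlier than the current date; the actual OT implementation may round differently. The function next_execution and the UNIT_SECONDS mapping are hypothetical (only the unit "minute" appears in the example above).

```python
from datetime import datetime, timedelta

# Hypothetical unit mapping; only "minute" is confirmed by the example.
UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


def next_execution(schedule_info: dict, now: datetime) -> datetime:
    """Return the first execution time that is not earlier than now."""
    start = datetime.fromisoformat(schedule_info["start"])
    interval = timedelta(
        seconds=UNIT_SECONDS[schedule_info["unit"]] * schedule_info["value"]
    )
    if now <= start:
        return start
    elapsed = now - start
    # Ceiling division: whole intervals needed to reach or pass now.
    steps = -(-elapsed // interval)
    return start + steps * interval


schedule_info = {
    "start": "2021-01-20T11:11:11+02:00",
    "unit": "minute",
    "value": 5,
}
now = datetime.fromisoformat("2021-01-20T11:20:00+02:00")
print(next_execution(schedule_info, now).isoformat())
# prints 2021-01-20T11:21:11+02:00
```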


JSON format

In the previous section we provided two instance.json examples (with and without forceinput). For scheduledInstances this distinction is irrelevant, as the main difference lies in the inclusion of the scheduleInfo element. Typically, a scheduledInstance will use input rather than forceinput, so we provide an example with input for the pingcount model.

The JSON format is as follows:

{
   "idRef": "601088c715f76f0007542876",   
   "name": "pingcount-scheduledExecution1",
   "description": "pingcount execution 1",
   "mode": "ExecAsync",
   "user": null,
   "active":true,
   "scheduleInfo" : {
        "start" : "2021-01-20T11:11:11+02:00",
        "unit" : "minute",
        "value" : 5
    },
   "input": [{
         "name": "pingset",
         "category": "ih-api",
         "type": "[urn:pixel:DataSource:Ping]",
         "description": "array of JSON documents of type urn:pixel:DataSource:Ping",
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach the IH",
               "value": "http://172.24.1.17:8080/archivingSystem/extractor/v1/data"
            }, {
               "name": "sourceId",
               "type": "string",
               "description": "sourceId, mapped to Index, within the IH where the NGSI generated pings are stored",
               "value": "urn:pixel:DataSource:Ping"
            }, {
               "name": "from",
               "type": "date-time",
               "description": "From timestamp. ISO8601 format",
               "value": "${DATE_MINUTE_INIT}"
            }, {
               "name": "to",
               "type": "date-time",
               "description": "To timestamp. ISO8601 format",
               "value": "${DATE_MINUTE_LAST}"
            }
         ]
      }
   ],   
   "output": [{
         "name": "output",
         "category": "es-api",
         "type": "urn:pixel:DataModelResult:PingCount",
         "description": "output provided by the PingCount execution",
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach Elasticsearch",
               "value": "http://172.24.1.11:9200"
            }, {
               "name": "es_index",
               "type": "string",
               "description": "Elasticsearch index to store the result  of the execution.",
               "value": "models-output-pingcount"
            }
         ]
      }
   ],
   "logging": [{
         "name": "logging",
         "category": "es-api",
         "type": "pixel-logging-format",
         "description": "activity logging for the pingCount model. id_execution, idRef/id_model are given by the OT at invocation time",
         "verbose" : null,
         "metadata": {},
         "options": [{
               "name": "url",
               "type": "string",
               "description": "URL string representing the API to reach Elasticsearch",
               "value": "http://172.24.1.11:9200"
            }, {
               "name": "es_index",
               "type": "string",
               "description": "Elasticsearch index to store the logs of the execution. start, end, error",
               "value": "models-logging"
            }
         ]
      }
   ]
}

As you can see, there is practically no difference compared to an instance, except for:

  • scheduleInfo: this element has already been described above.

  • active: this field is no longer used since version v0.2 (you can leave it set to true, or just remove it).

  • timing functions: you may have noticed that timing parameters in the input element, such as from and to, have special values (${DATE_MINUTE_INIT}, ${DATE_MINUTE_LAST}). This makes sense, as otherwise the pingcount model would always perform the same operation on the same input. By means of these special functions, data will always refer to the current minute, every 5 minutes (according to scheduleInfo in the example). Thus:

    • every 5 minutes the pingcount model will be launched
    • input data will be taken from the current minute
    • calculation will be performed and the result will be stored in the IH

The timing functions are pre-processed by the OT, so the pingcount model will get an ISO 8601 date every time it is launched. This functionality is particularly useful for periodic calculations every hour, day, week or month. The set of timing functions currently supported by the OT (v0.2 version) is:

Format                 Description (Unix format - millis)                               Potential Use
---------------------  ---------------------------------------------------------------  ---------------------------
${DATE_current}        Current date                                                     Models started by triggers?
${DATE_MINUTE_INIT}    Date of the first second of the current minute                   test, RT data
${DATE_MINUTE_LAST}    Date of the last second of the current minute                    test, RT data
${DATE_HOUR_INIT}      Date of the first second of the current hour                     traffic, weather
${DATE_HOUR_LAST}      Date of the last second of the current hour                      traffic, weather
${DATE_DAY_INIT}       Date of the first second of the current day                      PAS
${DATE_DAY_LAST}       Date of the last second of the current day                       PAS
${DATE_WEEK_INIT}      Date of the first second of the current week (starts on Sunday)  PEI
${DATE_WEEK_LAST}      Date of the last second of the current week (ends on Saturday)   PEI
${DATE_WEEK_AGO_INIT}  Date of the first second of the last week (starts on Sunday)     PEI, PAS
${DATE_WEEK_AGO_LAST}  Date of the last second of the last week (ends on Saturday)      PEI, PAS
${DATE_MONTH_INIT}     Date of the first second of the current month                    PEI
${DATE_MONTH_LAST}     Date of the last second of the current month                     PEI

Note: Additional timing functions could be added (on demand) in future versions of the OT.
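To make the pre-processing step more concrete, the sketch below expands a subset of the timing functions (current, minute, hour and day) into ISO 8601 strings. It is not the OT code: the names timing_values and expand_timing are hypothetical, the output format follows the ISO 8601 statement above, and the week and month functions are left out, so their placeholders are returned unchanged.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches placeholders such as ${DATE_MINUTE_INIT} or ${DATE_current}.
PLACEHOLDER = re.compile(r"\$\{(DATE_[A-Za-z_]+)\}")


def timing_values(now: datetime) -> dict:
    """Compute the date for each supported timing function."""
    minute = now.replace(second=0, microsecond=0)
    hour = minute.replace(minute=0)
    day = hour.replace(hour=0)
    one_second = timedelta(seconds=1)
    return {
        "DATE_current": now,
        "DATE_MINUTE_INIT": minute,
        "DATE_MINUTE_LAST": minute + timedelta(minutes=1) - one_second,
        "DATE_HOUR_INIT": hour,
        "DATE_HOUR_LAST": hour + timedelta(hours=1) - one_second,
        "DATE_DAY_INIT": day,
        "DATE_DAY_LAST": day + timedelta(days=1) - one_second,
    }


def expand_timing(text: str, now: datetime) -> str:
    """Replace supported placeholders in text with ISO 8601 dates."""
    values = timing_values(now)

    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)  # unsupported here: leave untouched
        return values[name].isoformat(timespec="seconds")

    return PLACEHOLDER.sub(substitute, text)


now = datetime(2021, 1, 20, 18, 21, 40, tzinfo=timezone(timedelta(hours=2)))
print(expand_timing("${DATE_MINUTE_INIT}", now))  # 2021-01-20T18:21:00+02:00
print(expand_timing("${DATE_MINUTE_LAST}", now))  # 2021-01-20T18:21:59+02:00
```

In a scheduledInstance, such a function would be applied to the value of each option (for example from and to) right before every launch.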

JSON schemas

For completeness, JSON schemas are also provided for getInfo, instances and scheduledInstances. You can therefore validate a particular JSON file with your favourite libraries or with an online tool (e.g. https://www.jsonschemavalidator.net/).

  • You can see the getInfo schema here

  • You can see the instance schema here

  • You can see the scheduledInstance schema here

Using the schemas, you can also generate online forms with tools such as JSON Editor. You can find more examples and a JSON-Editor Interactive Playground in this GitHub repo.