Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
saveriospa
Partner - Contributor III
Partner - Contributor III

How does an entity configured as Incremental works in Qlik Catalog?

Hi All,

we're using  Qlik Catalog to import huge tables (containing millions of records) as entities and we would like to leverage the Incremental feature. We read the Qlik Catalog Help page Incremental vs. snapshot base type in distribution tables but is not clear how it works.

We tried to define an entity as incremental (in the Source module, for the property Entity Base Type) and onboard data twice.  The first time the full dataset has been onboarded, and the second time the same happened (we expected the delta record there).

  • How does the Incremental works?
  • How the delta records are obtained?
    • There is a place where I can configure the field to use?
  • I'm following the wrong steps to onboard data in an incremental way? If yes, which are the steps to follow?

Please, can you help me?

 

Thanks in advance!

Saverio

Labels (1)
1 Solution

Accepted Solutions
brunnovb_in
Partner - Creator
Partner - Creator

Hello Saveriospa!

I'd like to share my experience about this subject.

The field " Entity Base Type" determines what will happen with loads in table. This way if you do full loads twice you can have:

Snapshot = The lastest load data.

- Incremental = The sum of the two data sets (as you said).

 

So I usually do it like this:

  • First load: I use the default configuration "Entity Base Type" = Snapshot.
  • Other loads: I use the "Entity Base Type" = Incremental, but I always manually change the property "src.file.glob" adding a clause WHERE with last id loaded.

How can I automate this load process?

  • I could create and use a script (shell script) calling the API (method /qdc/entity/v1/loadDataForEntities between others) and automatically control and change the "src.file.glob" property using the last id or using some timestamp field.

 

Regards,

Brunno.

View solution in original post

4 Replies
brunnovb_in
Partner - Creator
Partner - Creator

Hello Saveriospa!

I'd like to share my experience about this subject.

The field " Entity Base Type" determines what will happen with loads in table. This way if you do full loads twice you can have:

Snapshot = The lastest load data.

- Incremental = The sum of the two data sets (as you said).

 

So I usually do it like this:

  • First load: I use the default configuration "Entity Base Type" = Snapshot.
  • Other loads: I use the "Entity Base Type" = Incremental, but I always manually change the property "src.file.glob" adding a clause WHERE with last id loaded.

How can I automate this load process?

  • I could create and use a script (shell script) calling the API (method /qdc/entity/v1/loadDataForEntities between others) and automatically control and change the "src.file.glob" property using the last id or using some timestamp field.

 

Regards,

Brunno.

saveriospa
Partner - Contributor III
Partner - Contributor III
Author

Hello Brunno,
thanks a lot for your quick answer!

I tried what you suggested and it works!

When I execute the second time the data onboard (defining the Entity Base Type as Incremental) and the New radio button in the Load Wizard the system create a separate partition for the delta. After, if I query the entity in the Discover Module, the correct total number of record is shown.

I guess it works also if I select the Append radio button (selecting the first partition created) in the Load Wizard, but I think it is better to separate the different delta in ad-hoc partition.

I'm going to deepen the Qlik Catalog APIs available and the shell script, too!

 

Thanks again for your help!

Saverio

albertogiorgi
Contributor II
Contributor II

Hello Brunno,

do you have an example that uses the "src.file.glob" property in the API call /qdc/entity/v1/loadDataForEntities?

I am not sure where this property must be set inside the JSON.

[
  {
    "badRecordCount": 0,
    "chaffRecordCount": 0,
    "deliveryId": "string",
    "endTime": 0,
    "entityId": 0,
    "entityName": "string",
    "expectedRecordCount": 0,
    "externalData": {
      "additionalProp1": "string",
      "additionalProp2": "string",
      "additionalProp3": "string"
    },
    "goodRecordCount": 0,
    "id": 0,
    "infoMessage": "string",
    "lastUpdTs": 0,
    "loadTime": 0,
    "loadTimeInMillis": 0,
    "loadType": "APPEND",
    "orderSize": 0,
    "recordCount": 0,
    "sampleRecordCount": 0,
    "sourceId": 0,
    "sourceName": "string",
    "sourceType": "BUILTIN",
    "startTime": 0,
    "status": "FAILED",
    "uglyRecordCount": 0,
    "version": 0,
    "workorderProp": [
      {
        "createdBy": "string",
        "createdTs": 0,
        "createdUserId": 0,
        "externalData": {
          "additionalProp1": "string",
          "additionalProp2": "string",
          "additionalProp3": "string"
        },
        "id": 0,
        "lastUpdTs": 0,
        "modifiedBy": "string",
        "modifiedUserId": 0,
        "name": "string",
        "properties": [
          {
            "createdBy": "string",
            "createdTs": 0,
            "createdUserId": 0,
            "description": "string",
            "displayName": "string",
            "externalData": {
              "additionalProp1": "string",
              "additionalProp2": "string",
              "additionalProp3": "string"
            },
            "id": 0,
            "isRequired": true,
            "lastUpdTs": 0,
            "modifiedBy": "string",
            "modifiedUserId": 0,
            "name": "string",
            "nullable": true,
            "properties": [
              null
            ],
            "sortIndex": 0,
            "timeZoneOffset": 0,
            "value": "string",
            "version": 0
          }
        ],
        "timeZoneOffset": 0,
        "value": "string",
        "version": 0
      }
    ]
  }
]

Thank you

brunnovb_in
Partner - Creator
Partner - Creator

Hello albertogiorgi!

I'm attaching an example below.

 

[{
        "timeZoneOffset": 0,
        "id": 1094,
        "version": 0,
        "lastUpdTs": 1661433058312,
        "loadTime": 1661433058126,
        "startTime": null,
        "endTime": null,
        "status": "INITIALIZED",
        "orderSize": 0,
        "recordCount": 0,
        "deliveryId": "DEMO_SYNAPSE.funcionario.20220825131058",
        "infoMessage": null,
        "sampleRecordCount": 0,
        "loadTimeInMillis": 1661433058126,
        "entityId": 921,
        "sourceName": "DEMO_SYNAPSE",
        "sourceId": 207,
        "entityName": "funcionario",
        "sourceType": "JDBC_SOURCE",
        "goodRecordCount": 0,
        "badRecordCount": 0,
        "uglyRecordCount": 0,
        "chaffRecordCount": 0,
        "expectedRecordCount": 0,
        "loadType": "DATA",
        "workorderProp": [{
                "id": 10334,
                "version": 0,
                "createdTs": 1661433058337,
                "createdBy": "bsb",
                "lastUpdTs": 1661433058337,
                "modifiedBy": "bsb",
                "timeZoneOffset": 0,
                "name": "enable.profiling",
                "value": "true"
            }, {
                "id": 10333,
                "version": 0,
                "createdTs": 1661433058322,
                "createdBy": "bsb",
                "lastUpdTs": 1661433058322,
                "modifiedBy": "bsb",
                "timeZoneOffset": 0,
                "name": "enable.validation",
                "value": "true"
            }, {
                "id": 115797,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.dataset.expected.record.count.adjustment",
                "value": "0"
            }, {
                "id": 115758,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.default.field.embedded.enclosure.scheme",
                "value": "DOUBLE_EMBEDDED_ENCLOSURE"
            }, {
                "id": 115800,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.default.field.max.legal.char.length",
                "value": "0"
            }, {
                "id": 115778,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.default.field.min.legal.char.length",
                "value": "0"
            }, {
                "id": 115782,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.default.field.nullif.pattern.is.regex",
                "value": "false"
            }, {
                "id": 115743,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.default.field.trim.left",
                "value": "true"
            }, {
                "id": 115779,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.default.field.trim.right",
                "value": "true"
            }, {
                "id": 115753,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.distribution.excludes",
                "value": null
            }, {
                "id": 115806,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.enable.distribution",
                "value": null
            }, {
                "id": 115792,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.entity.base.type",
                "value": "Snapshot"
            }, {
                "id": 115732,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.entity.level",
                "value": "MANAGED"
            }, {
                "id": 115783,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.header.byte.count",
                "value": "0"
            }, {
                "id": 115819,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.header.defines.field.names",
                "value": "false"
            }, {
                "id": 115805,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.header.line.count",
                "value": "1"
            }, {
                "id": 115772,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.header.record.count",
                "value": "0"
            }, {
                "id": 115738,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.header.validation.pattern",
                "value": "\"id_func\",\"nome\",\"dt_nascimento\"\\r\\n"
            }, {
                "id": 115798,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.header.validation.pattern.is.regex",
                "value": "false"
            }, {
                "id": 115786,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.mapred.job.arguments",
                "value": null
            }, {
                "id": 115763,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.original.name",
                "value": "funcionario"
            }, {
                "id": 115759,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.qs.publish.count",
                "value": "0"
            }, {
                "id": 115820,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.characterset",
                "value": "UTF_8"
            }, {
                "id": 115750,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.charencoding.confidence",
                "value": "0.00"
            }, {
                "id": 115757,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.close.quote",
                "value": "\""
            }, {
                "id": 115801,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.field.delimiter",
                "value": ","
            }, {
                "id": 115827,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.last.field.has.delimiter",
                "value": "false"
            }, {
                "id": 115821,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.layout",
                "value": "VARIABLE_CHAR_LENGTH_TERMINATED"
            }, {
                "id": 115751,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.max.byte.count",
                "value": "262144"
            }, {
                "id": 115760,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.min.byte.count",
                "value": "1"
            }, {
                "id": 115781,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.open.quote",
                "value": "\""
            }, {
                "id": 115737,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.record.record.terminator",
                "value": "\\r\\n"
            }, {
                "id": 115817,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.src.file.glob",
                "value": "SELECT [id_func],[nome],[dt_nascimento] FROM [rh].[funcionario] WHERE [id_func] > 100"
            }, {
                "id": 115822,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.trailer.byte.count",
                "value": "0"
            }, {
                "id": 115823,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.trailer.line.count",
                "value": "0"
            }, {
                "id": 115734,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.trailer.record.count",
                "value": "0"
            }, {
                "id": 115740,
                "version": null,
                "timeZoneOffset": 0,
                "name": "entityProp.trailer.validation.pattern.is.regex",
                "value": "false"
            }, {
                "id": 115723,
                "version": null,
                "timeZoneOffset": 0,
                "name": "sourceProp.default.entity.level",
                "value": "MANAGED"
            }, {
                "id": 115724,
                "version": null,
                "timeZoneOffset": 0,
                "name": "sourceProp.default.field.allow.control.chars",
                "value": "true"
            }, {
                "id": 115726,
                "version": null,
                "timeZoneOffset": 0,
                "name": "sourceProp.default.field.allow.non.ascii.chars",
                "value": "true"
            }, {
                "id": 115727,
                "version": null,
                "timeZoneOffset": 0,
                "name": "sourceProp.distribution.excludes",
                "value": null
            }, {
                "id": 115722,
                "version": null,
                "timeZoneOffset": 0,
                "name": "sourceProp.enable.distribution",
                "value": null
            }, {
                "id": 115725,
                "version": null,
                "timeZoneOffset": 0,
                "name": "sourceProp.mapred.job.arguments",
                "value": null
            }, {
                "id": 115721,
                "version": null,
                "timeZoneOffset": 0,
                "name": "sourceProp.originalSourceName",
                "value": "rh"
            }
        ]
    }
]

 

 

 

You can see this snippet:

....

"name": "entityProp.src.file.glob",
"value": "SELECT [id_func],[nome],[dt_nascimento] FROM [rh].[funcionario]  WHERE [id_func] > 100"

....

 

Best,

Brunno.