I would like to validate the content of several fields against a schema. For simple tests like non-null, I use the tSchemaComplianceCheck component, and it works fine.
For fields where I need to check that the value is in an authorized list, I have created a Validation Rule on the schema. But I have to use a different component each time to check each rule, so if I want to check 10 fields, I have to go through 10 components. Is that correct?
And if I want to test them all independently, is the best strategy a tReplicate, one branch per test, catching all the rejects? But then I can't get the lines that validate ALL the tests; a tUnite would give me all the lines that validate at least one test.
So either I test them in sequence, but each test will block some lines that the next ones won't see (bad), or I test them in parallel, so I can check each line against each test, but then I can't easily get the lines that validate all the tests (except maybe with a tMap afterwards, with an inner join between all the branches?).
Or maybe there is an easier way to do that? 🙂 Thanks in advance!
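For illustration, here is a minimal plain-Java sketch of the parallel idea (this is not Talend-generated code, and the rule and field names are invented): each rule filters the rows on its own branch, and intersecting the surviving row ids is exactly what an inner join between all branches on a row id would give you.

```java
import java.util.*;
import java.util.function.Predicate;

// Hypothetical sketch: several validation rules applied "in parallel",
// keeping only the rows that pass every rule.
public class ParallelValidation {

    // One rule = one predicate, like one Validation Rule on the schema.
    // Field names ("country", "currency") are made up for the example.
    static final List<Predicate<Map<String, String>>> RULES = List.of(
        row -> row.get("country") != null,                          // non-null check
        row -> List.of("EUR", "USD").contains(row.get("currency"))  // authorized list
    );

    // Returns the ids of rows that pass ALL rules (the "inner join" result).
    public static Set<Integer> passAll(List<Map<String, String>> rows) {
        Set<Integer> survivors = new HashSet<>();
        for (int id = 0; id < rows.size(); id++) survivors.add(id);
        for (Predicate<Map<String, String>> rule : RULES) {
            Set<Integer> branch = new HashSet<>();   // rows this branch lets through
            for (int id = 0; id < rows.size(); id++) {
                if (rule.test(rows.get(id))) branch.add(id);
            }
            survivors.retainAll(branch);             // intersection = inner join on id
        }
        return survivors;
    }
}
```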
Hi
Components like tSchemaComplianceCheck or tFilterRow will reject the bad lines. My idea is to add a sequence id to each line if there is no key field, use a tMap to generate a different output for each field (containing only the sequence id and that one field), and do the validation on each output. In the next subjob, do an inner join to merge all the fields back.
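The split-and-merge idea above can be sketched in plain Java (in the actual job this would be tMap outputs plus a join, not hand-written code; column positions and allowed values are invented):

```java
import java.util.*;

// Sketch: each row is tagged with a sequence id (its index here), each
// field is validated independently as an (id, value) pair, and the
// surviving ids are merged back with an inner join.
public class SplitAndMerge {

    // Validate one field over all rows; return the ids whose value is allowed.
    static Set<Integer> validField(List<String[]> rows, int col, Set<String> allowed) {
        Set<Integer> ok = new HashSet<>();
        for (int id = 0; id < rows.size(); id++) {
            if (allowed.contains(rows.get(id)[col])) ok.add(id);
        }
        return ok;
    }

    // Inner join on the sequence id: keep ids present in every per-field result.
    static Set<Integer> innerJoin(List<Set<Integer>> perField) {
        Set<Integer> merged = new HashSet<>(perField.get(0));
        for (Set<Integer> s : perField) merged.retainAll(s);
        return merged;
    }
}
```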
Regards
Shong
Clear, indeed it sounds like the simplest way to do it. Thanks for your input 🙂
I have another question though: if I set some float columns to be non-nullable and use a tSchemaComplianceCheck to check the data, I get a NullPointerException when sending some lines with null values in one of those columns. The non-null test seems to be working for strings, though. Is it not the case for floats?
Also, I was surprised to see that if I use a tFileInputExcel with this schema and lines with some null values, it does NOT refuse those lines; instead, it puts 0 in the column! Very surprising and unexpected. Am I getting something wrong here?
When you read null values from an Excel file, make sure the nullable box is checked on the schema.
But the purpose of the job is to validate schema compliance. So for the initial component (the Excel input), I should NOT use the schema for validation, and should allow nullable values on this component?
Then why is tSchemaComplianceCheck failing with a NullPointerException when I try to validate content with null values in a non-nullable column? It works fine with String columns: they are also non-nullable, and tSchemaComplianceCheck sends the lines to the reject flow with the right error message. But for float columns, NullPointerException. Is that how it should be?
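For what it's worth, this String-vs-float difference is consistent with a plain Java behaviour: a null String check is only a reference comparison, but using a null Float in arithmetic or a comparison auto-unboxes it and throws NullPointerException. A small sketch of that behaviour (my own guess at the cause, not Talend's actual generated code):

```java
// Why a null Float can blow up where a null String does not: Java
// auto-unboxes Float to float in comparisons, and unboxing null
// throws NullPointerException. A String null check never unboxes.
public class NullUnboxing {

    static boolean safeStringCheck(String s) {
        return s != null;           // plain reference check, never throws
    }

    static boolean unsafeFloatCheck(Float f) {
        return f > 0f;              // auto-unboxing: NPE when f is null
    }

    static boolean safeFloatCheck(Float f) {
        return f != null && f > 0f; // guard before unboxing
    }
}
```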
It sounds like a bug. If I select the 'Check all columns from schema' mode and uncheck the nullable box on the component schema, I get the same NullPointerException; however, if I select the 'Custom defined' mode with a customized schema, or the 'Use another schema' mode to validate, it works.
Hi,
I have a CSV file:
name; FirstName ; numero;adresse
1 a;b;12;xx
2 c;b;13;yy
3 x;y;47;zz
4 e;r;45;tt
I want to identify the duplicate rows by FirstName (here rows 1 and 2 have FirstName = b),
and I want to update the CSV file to add a new column "status" with NotUnique, like this:
name; FirstName ; numero;adresse;status
1 a;b;12;xx;NotUnique
2 c;b;13;yy;NotUnique
3 x;y;47;zz;unique
4 e;r;45;tt;unique
Can someone help me?
Thank you very much in advance.
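For reference, the flagging described above boils down to counting how often each FirstName occurs and appending a status column. A plain-Java sketch of that logic (not a Talend job; in the tool this would typically be an aggregate-count joined back onto the flow):

```java
import java.util.*;

// Sketch: flag rows whose FirstName appears more than once as "NotUnique".
public class FlagDuplicates {

    // rows are {name, firstName, numero, adresse}; returns the same rows
    // with a fifth "status" column appended.
    public static List<String[]> addStatus(List<String[]> rows) {
        // First pass: count occurrences of each FirstName (column 1).
        Map<String, Integer> counts = new HashMap<>();
        for (String[] r : rows) counts.merge(r[1], 1, Integer::sum);
        // Second pass: append the status based on the count.
        List<String[]> out = new ArrayList<>();
        for (String[] r : rows) {
            String status = counts.get(r[1]) > 1 ? "NotUnique" : "unique";
            out.add(new String[]{r[0], r[1], r[2], r[3], status});
        }
        return out;
    }
}
```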
@dduser, I think it is easy to achieve, but could you please open a new topic for your question?
Regards
Shong