Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Obsyky
Contributor III
Contributor III

Do not load a data if an attribute value is already present

Hello,
I have these two files

ID VALUE
AA56 158
AB89 85
BA51 89
ID VALUE
AA56 158
AB89  
GB89 1000

 

If I load both files I get this table :

ID VALUE
AA56 158
AB89 85
AB89  
BA51 89
GB89 1000

 

But I would like the second AB89 (the one that has no value), not to be loaded.

So I use Exists in this way: Where Not Exists(ID)

Except that it doesn't load any files. What am I doing wrong?

Labels (1)
1 Solution

Accepted Solutions
RsQK
Creator II
Creator II

Okay, so, I assumed that column "C" = VALUE. I have two excel files in my folder that are identical to your example data. The script:

//Adjust this for your connection, the rest should work:
SET g_path_data = 'LIB://DATA';

RESULT:
LOAD
0 AS ID
AUTOGENERATE 0;

FOR EACH vFile IN FILELIST('$(g_path_data)/Files_indu_*.xlsx')

CONCATENATE (RESULT)
LOAD
A,
ID,
C, //In my excel files this column represents "VALUE"
D,
"E"
FROM [$(vFile)] (ooxml, embedded labels, table is indu)
WHERE NOT EXISTS(ID);

NEXT;

 

Obviously, you will need to adjust the columns that are loaded (could aswell do LOAD * if possible). 

View solution in original post

6 Replies
justISO
Specialist
Specialist

Hi, maybe use

WHERE not isnull(VALUE) and len(VALUE)>0

instead Where Not Exists(ID)

RsQK
Creator II
Creator II

Hi, try:

LOAD * INLINE [
ID,VALUE
AA56,158
AB89,85
BA51,89
];

LOAD * INLINE [
ID,VALUE
AA56,158
AB89,
GB89,1000
]
WHERE NOT EXISTS(ID);

Obsyky
Contributor III
Contributor III
Author

I can't do what you told me. The data I gave, is only an example. I have a dozen files, with a hundred attributes with thousands of lines.

The value attribute could have been filled in very well for both AA89. But I only want to keep the first one that comes to me

RsQK
Creator II
Creator II

That script wasnt meant to be taken as a ready solution, but more of an idea. To achieve what you need, you should use WHERE NOT EXISTS. If you can provide your current script example, i can probably adjust it as needed.

Obsyky
Contributor III
Contributor III
Author

TABLE:
LOAD Distinct
    "A" AS A,
    "ID" AS ID,
    "C" AS C,
    "D" AS D,
    "E" AS E,
    "F" AS F,
    "G" AS G,
    "H" AS H,
    "I" AS I,
    "J" AS J,
    "K" AS K,
    "L" AS L,
    "M" AS M,
    "N" AS N,
    "O" AS O,
    "P" AS P,
    "Q" AS Q,
    "R" AS R,
    "S" AS S,
    "T" AS T,
    "U" AS U,
    "V" AS V,
    "W" AS W,
    "X" AS X,
    "Y" AS Y,
    "Z" AS Z,
    "AA" AS AA,
    "AB" AS AB,
    "AC" AS AC,
    "AD" AS AD,
    "AE" AS AE,
    "AF" AS AF,
    "AG" AS AG,
    "AH" AS AH,
    "AI" AS AI,
    "AJ" AS AJ,
    "AK" AS AK,
    "AL" AS AL,
FROM [Files_indu_*.xlsm]
(ooxml, embedded labels, header is 4 lines, table is [indu]);

 

My code is very simple. I just load all the files that start with "Files_indu_"
I don't understand why it doesn't work.
I tried loading a first file, then after loading the others with Where Not Exists, but it doesn't work either

RsQK
Creator II
Creator II

Okay, so, I assumed that column "C" = VALUE. I have two excel files in my folder that are identical to your example data. The script:

//Adjust this for your connection, the rest should work:
SET g_path_data = 'LIB://DATA';

RESULT:
LOAD
0 AS ID
AUTOGENERATE 0;

FOR EACH vFile IN FILELIST('$(g_path_data)/Files_indu_*.xlsx')

CONCATENATE (RESULT)
LOAD
A,
ID,
C, //In my excel files this column represents "VALUE"
D,
"E"
FROM [$(vFile)] (ooxml, embedded labels, table is indu)
WHERE NOT EXISTS(ID);

NEXT;

 

Obviously, you will need to adjust the columns that are loaded (could aswell do LOAD * if possible).