Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
File1 has a few million records. timePersonnel is Time +Personnel
Time Stamp | Personnel Id | Info1 | Info2 | timePersonnel |
08:00 | 1001 | test1001 | test2001 | 08:00-1001 |
08:00 | 1002 | test1002 | test2002 | 08:00-1002 |
08:00 | 1003 | test1003 | test2003 | 08:00-1003 |
08:00 | 1004 | test1004 | test2004 | 08:00-1004 |
08:00 | 1005 | test1005 | test2005 | 08:00-1005 |
09:00 | 1001 | abc1001 | abc2001 | 09:00-1001 |
File2.csv has the following content
timePersonnel |
08:00-1001 |
08:00-1002 |
08:00-1003 |
09:00-1001 |
I would like to obtain the results similar to the below.
Time Stamp | Personnel Id | Info1 | Info2 | timePersonnel |
08:00 | 1001 | test1001 | test2001 | 08:00-1001 |
08:00 | 1002 | test1002 | test2002 | 08:00-1002 |
08:00 | 1003 | test1003 | test2003 | 08:00-1003 |
08:00 | 1004 | test1004 | test2004 | 08:00-1004 |
08:00 | 1005 | test1005 | test2005 | 08:00-1005 |
09:00 | 1001 | abc1001 | abc2001 | 09:00-1001 |
------------------------
SET ThousandSep=',';
SET DecimalSep='.';
SET MoneyThousandSep=',';
SET MoneyDecimalSep='.';
SET MoneyFormat='$#,##0.00;($#,##0.00)';
SET TimeFormat='h:mm:ss TT';
SET DateFormat='D/M/YYYY';
SET TimestampFormat='D/M/YYYY h:mm:ss[.fff] TT';
SET MonthNames='Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec';
SET DayNames='Mon;Tue;Wed;Thu;Fri;Sat;Sun';
Directory;
LOAD [Time Stamp],
[Personnel Id],
Info1,
Info2,
timePersonnel
FROM
(txt, utf8, embedded labels, delimiter is '\t', msq);
Where not exists
LOAD timePersonel
FROM
(txt, utf8, no labels, delimiter is ',', msq);
------------------------
I would like to exclude the items that are found in file2. I read somewhere about "where not exists". but it does not seem to work. Can i know how to exclude the IDs found in file2 in the final output? Previouls i have asked about inner join, to output the items found in file2.
thanks.
Try it the other way around:
T1:
LOAD timePersonel
FROM
(txt, utf8, no labels, delimiter is ',', msq);
T2:
LOAD [Time Stamp],
[Personnel Id],
Info1,
Info2,
timePersonnel
FROM
(txt, utf8, embedded labels, delimiter is '\t', msq)
where not exists(timePersonnel);
// optional:
// drop table T1;
Can someone let me know what i have missed out?
I have encountered the following error, and load fails. Did i do something wrong?
Thanks a lot.
-----------
Field not found - <timePesonnel>
T2:
LOAD [Time Stamp],
[Personnel Id],
Info1,
Info2,
timePersonnel
FROM
(txt, codepage is 936, embedded labels, delimiter is ',', msq)
where not exists (timePesonnel)
-----------
My script reads like this:
-----------
Directory;
T1:
LOAD timePersonnel
FROM
(txt, codepage is 936, embedded labels, delimiter is '\t', msq);
T2:
LOAD [Time Stamp],
[Personnel Id],
Info1,
Info2,
timePersonnel
FROM
(txt, codepage is 936, embedded labels, delimiter is ',', msq)
where not exists (timePesonnel);
Can someone advise? Apologies that this might be a simple question.
Find attachment It may help you..
If you need clarification reply me..
Karthik
Field not found - <timePesonnel>
Typo: timePesonnel instead of timePersonnel