Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
Hope you could help.
How do I exclude a certain list of IDs from loading?
I have the following script:
LOAD [Id],
[TimeStamp],
[Total Count],
[Info1]
[Info2]
[New ID]
FROM
(txt, utf8, embedded labels, delimiter is ',', msq);
I would like to exclude a list of [New ID], stored in C:\newID.csv]
THe file contains about 3million records, not "," delimited. Just a list of IDs.
Could you pls help? THanks in advance.
Could you please put your script here or share your application? I believe you are getting the "Table not found" error due to some misplacement of JOIN statement or so(probably the Temp table is getting auto concatenated with some earlier table), otherwise the script i posted would do the job for you.
Thanks.
TempTable:
Load * Inline
[
temp_id
1
2
3
4
5
];
Data:
Load *
Where Not Exists (temp_id,id)
;
Load * Inline
[
id, Sales
1, 10
2, 20
3, 30
4, 40
5, 50
6, 60
7, 70
8, 80
9, 90
10,100
];
Drop Table TempTable;
File1 has a few million records. timePersonnel is Time +Personnel
Time Stamp | Personnel Id | Info1 | Info2 | timePersonnel |
08:00 | 1001 | test1001 | test2001 | 08:00-1001 |
08:00 | 1002 | test1002 | test2002 | 08:00-1002 |
08:00 | 1003 | test1003 | test2003 | 08:00-1003 |
08:00 | 1004 | test1004 | test2004 | 08:00-1004 |
08:00 | 1005 | test1005 | test2005 | 08:00-1005 |
09:00 | 1001 | abc1001 | abc2001 | 09:00-1001 |
File2.csv has the following content
timePersonnel |
08:00-1001 |
08:00-1002 |
08:00-1003 |
09:00-1001 |
I would like to obtain the results similar to the below.
Time Stamp | Personnel Id | Info1 | Info2 | timePersonnel |
08:00 | 1001 | test1001 | test2001 | 08:00-1001 |
08:00 | 1002 | test1002 | test2002 | 08:00-1002 |
08:00 | 1003 | test1003 | test2003 | 08:00-1003 |
08:00 | 1004 | test1004 | test2004 | 08:00-1004 |
08:00 | 1005 | test1005 | test2005 | 08:00-1005 |
09:00 | 1001 | abc1001 | abc2001 | 09:00-1001 |
------------------------
SET ThousandSep=',';
SET DecimalSep='.';
SET MoneyThousandSep=',';
SET MoneyDecimalSep='.';
SET MoneyFormat='$#,##0.00;($#,##0.00)';
SET TimeFormat='h:mm:ss TT';
SET DateFormat='D/M/YYYY';
SET TimestampFormat='D/M/YYYY h:mm:ss[.fff] TT';
SET MonthNames='Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec';
SET DayNames='Mon;Tue;Wed;Thu;Fri;Sat;Sun';
Directory;
LOAD [Time Stamp],
[Personnel Id],
Info1,
Info2,
timePersonnel
FROM
(txt, utf8, embedded labels, delimiter is '\t', msq);
Inner Join
LOAD timePersonel
FROM
(txt, utf8, no labels, delimiter is ',', msq);
------------------------
With the script above,
error i encounter says:
Field not found - <timePersonnel>
Inner Join
LOAD timePersonnel
FROM
(txt, utf8, no labels, delimiter is ',', msq)
------------------------
Changing 2nd part of my code to
LOAD @1
FROM
(txt, utf8, no labels, delimiter is ',', msq);
------------------------
it just hangs
Can someone help me here?
thanks!
Changing 2nd part of my code to
LOAD @1
FROM
(txt, utf8, no labels, delimiter is ',', msq);
------------------------
it just hangs
Write this portion as
Load @1 as timePersonnel
FROM
(txt, utf8, no labels, delimiter is ',', msq);
It should work.
Thanks.
It still seems to hang... if i have about 5 million lines... will it take a long time to load and will it casue problems? I have been waiting for about 10 min.
Try 'Limited Load' under Debug window. Then you can figure out if the data size is the issue or not.