Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi experts,
Basically, I'm pulling many small files and putting them into 1 large table, making them unique by using FileName and a Field to make a Key.
[Text] is the same for every record in each File. It shows up as either
A,
B,
or AB
Here's my Script
LOAD
Filename() as [File],
@1 as [ID],
@2 as [TEXT],
Filename()&@2&@1 as [Key]
FROM
(txt, codepage is 1252, no labels, delimiter is ';', msq);
Here's Example Data
File1
ID TEXT
1 A
2 A
3 A
4 A
File2
ID TEXT
6 AB
7 AB
8 AB
9 AB
If it has AB in it to split these records into 2. So basically pull the entire file with all A's, then pull it again with all B's.
File2
ID TEXT
6 A
7 A
8 A
9 A
6 B
7 B
8 B
9 B
My [Key] file should make sure there are no duplicates in this case, since there can't be duplicate file names.
Any help would be greatly appreciated. Thanks
Here is a more dynamic solution:
Table:
LOAD Filename() as [File],
ID,
TEXT,
Len(TEXT) as Length
FROM
*.txt
(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);
FinalTable:
LOAD *,
File&TEXT&ID as [Key];
LOAD File,
ID,
Mid(TEXT, IterNo(), 1) as TEXT
Resident Table
While IterNo() <= Length;
DROP Table Table;
May be like this:
Table:
LOAD Filename() as [File],
ID,
If(Len(TEXT) = 2, SubField(Left(TEXT, 1) & '|' & Right(TEXT, 1), '|'), TEXT) as TEXT
FROM
*.txt
(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);
FinalTable:
LOAD *,
File&TEXT&ID as [Key]
Resident Table;
DROP Table Table;
Hi Sunny, this worked great for my Sample data. Unfortunately I simplified it a bit from my real life data.
What would you do to the Text column if it had the possibility of being anywhere from 1 character to 5 characters. And the results to act the exact same way as your solution did.
So instead of Just A or AB, it can be ABCDE
Here is a more dynamic solution:
Table:
LOAD Filename() as [File],
ID,
TEXT,
Len(TEXT) as Length
FROM
*.txt
(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);
FinalTable:
LOAD *,
File&TEXT&ID as [Key];
LOAD File,
ID,
Mid(TEXT, IterNo(), 1) as TEXT
Resident Table
While IterNo() <= Length;
DROP Table Table;
you are a genius, works perfectly for the 3 columns. Thank you!
If there are more detail columns in the First table, how do I ensure these are brought over into the FinalTable as well?
Have not tested this, but try this:
Table:
LOAD Filename() as [File],
*,
Len(TEXT) as Length
FROM
*.txt
(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);
FinalTable:
LOAD *,
File&TEXT1&ID as [Key];
LOAD *,
Mid(TEXT, IterNo(), 1) as TEXT1
Resident Table
While IterNo() <= Length;
DROP Table Table;
DROP Field TEXT;
RENAME FIELD TEXT1 to TEXT;
Thanks Again, I seem to be getting out of object memory alot now though (with many files being pulled)
With a lot of data in your application, you would need a lot of RAM to handle it
Sunny, what is the term for the Double LOAD that we do for table2? I want to understand more about it. Also I cannot seem to get a Lookup to work against this table. Keeps coming back as Field not found in FinalTable (whether I try looking for the alias I've created, or the original Fieldname) would you know why?
It is called Preceding load: Preceding Load
Would you be able to show the error message of share log file?