Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Duplicate a table if a field has specific characters

Hi experts,

Basically, I'm pulling many small files and putting them into 1 large table, making them unique by using FileName and a Field to make a Key.

[Text] is the same for every record in each File. It shows up as either

A,

B,

or AB

Here's my Script

LOAD

     Filename() as [File],

     @1 as [ID],

     @2 as [TEXT],

     Filename()&@2&@1 as [Key]

FROM

(txt, codepage is 1252, no labels, delimiter is ';', msq);

Here's Example Data

File1

ID     TEXT

1          A

2          A

3          A

4          A

File2

ID     TEXT

6          AB

7          AB

8          AB

9          AB

If it has AB in it to split these records into 2. So basically pull the entire file with all A's, then pull it again with all B's.

File2

ID     TEXT

6          A

7          A

8          A

9          A

6          B

7          B

8          B

9          B

My [Key] file should make sure there are no duplicates in this case, since there can't be duplicate file names.

Any help would be greatly appreciated. Thanks

1 Solution

Accepted Solutions
sunny_talwar

Here is a more dynamic solution:

Table:

LOAD Filename() as [File],

  ID,

  TEXT,

  Len(TEXT) as Length

FROM

*.txt

(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

FinalTable:

LOAD *,

  File&TEXT&ID as [Key];

LOAD File,

  ID,

  Mid(TEXT, IterNo(), 1) as TEXT

Resident Table

While IterNo() <= Length;

DROP Table Table;

View solution in original post

11 Replies
sunny_talwar

May be like this:

Table:

LOAD Filename() as [File],

  ID,

    If(Len(TEXT) = 2, SubField(Left(TEXT, 1) & '|' & Right(TEXT, 1), '|'), TEXT) as TEXT

FROM

*.txt

(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

FinalTable:

LOAD *,

  File&TEXT&ID as [Key]

Resident Table;

DROP Table Table;


Capture.PNG

Not applicable
Author

Hi Sunny, this worked great for my Sample data. Unfortunately I simplified it a bit from my real life data.

What would you do to the Text column if it had the possibility of being anywhere from 1 character to 5 characters. And the results to act the exact same way as your solution did.

So instead of Just A or AB, it can be  ABCDE

sunny_talwar

Here is a more dynamic solution:

Table:

LOAD Filename() as [File],

  ID,

  TEXT,

  Len(TEXT) as Length

FROM

*.txt

(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

FinalTable:

LOAD *,

  File&TEXT&ID as [Key];

LOAD File,

  ID,

  Mid(TEXT, IterNo(), 1) as TEXT

Resident Table

While IterNo() <= Length;

DROP Table Table;

Not applicable
Author

you are a genius, works perfectly for the 3 columns. Thank you!

If there are more detail columns in the First table, how do I ensure these are brought over into the FinalTable as well?

sunny_talwar

Have not tested this, but try this:

Table:

LOAD Filename() as [File],

*,

  Len(TEXT) as Length

FROM

*.txt

(txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

FinalTable:

LOAD *,

  File&TEXT1&ID as [Key];

LOAD *,

  Mid(TEXT, IterNo(), 1) as TEXT1

Resident Table

While IterNo() <= Length;

DROP Table Table;


DROP Field TEXT;

RENAME FIELD TEXT1 to TEXT;

Not applicable
Author

Thanks Again, I seem to be getting out of object memory alot now though (with many files being pulled)

sunny_talwar

With a lot of data in your application, you would need a lot of RAM to handle it

Not applicable
Author

Sunny, what is the term for the Double LOAD that we do for table2? I want to understand more about it. Also I cannot seem to get a Lookup to work against this table. Keeps coming back as Field not found in FinalTable (whether I try looking for the alias I've created, or the original Fieldname) would you know why?

sunny_talwar

It is called Preceding load: Preceding Load

Would you be able to show the error message of share log file?