Load script - cleaning fragmented data


Hi all,

I am hoping someone could advise.

I have a dataset with over 2000 variables/columns. A lot of the data is fragmented, so I have to manipulate it within the load script. As an example, say I have 4 columns whose data should really all be in one column:

Col_a  Col_b  Col_c  Col_d

To place all the values in one column within the load script, I currently have to write the following:

LOAD Col_a as col_x FROM [file address here]  (ooxml, embedded labels) ; Concatenate
LOAD Col_b as col_x FROM [file address here]  (ooxml, embedded labels) ; Concatenate
LOAD Col_c as col_x FROM [file address here]  (ooxml, embedded labels) ; Concatenate
LOAD Col_d as col_x FROM [file address here]  (ooxml, embedded labels) ;

The data is now within column col_x.


But this is a very inefficient way of achieving what I need. It does work, but it slows down the loading of the script. Are there any more efficient approaches anyone can think of?

I would be very grateful for some advice here.

Regards

Revlin

2 Replies
Not applicable
Author

There is no need for the Concatenate keyword when every load produces the same single field name, col_x. Qlik automatically concatenates tables that have identical field sets.

Table1:

LOAD Col_a as col_x FROM [file address here]  (ooxml, embedded labels) ;

LOAD Col_b as col_x FROM [file address here]  (ooxml, embedded labels) ;

LOAD Col_c as col_x FROM [file address here]  (ooxml, embedded labels) ;

LOAD Col_d as col_x FROM [file address here]  (ooxml, embedded labels) ;

I don't think it will be any faster, though.

Regards,

Kabilan K.

Not applicable
Author

I am hoping there is a way with only one load (instead of 4 as in my example) where I can minimise the load time. I am guessing accessing/loading the file takes up a lot of time. So ideally, just one load.
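
One approach that might give a single read of the file is the CrossTable prefix, which turns several columns into rows in one load. The following is only a sketch under assumptions: row_id and src_col are hypothetical helper field names that are not in the original script, and the file path and format spec are left exactly as in the example above. CrossTable keeps the first loaded field as a qualifier, so RecNo() is loaded first to play that role.

```
// Sketch only: row_id and src_col are hypothetical helper field names.
Tmp:
CrossTable(src_col, col_x)
LOAD
    RecNo() as row_id,   // qualifier field; CrossTable leaves the first field untouched
    Col_a,
    Col_b,
    Col_c,
    Col_d
FROM [file address here] (ooxml, embedded labels);

// src_col holds the original column name for each value.
// Drop the helper fields if only col_x is needed.
DROP FIELDS row_id, src_col;
```

This reads the file once instead of four times. Whether it is actually faster would need to be checked on the real data, since the unpivoting itself takes some time.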