Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello everyone,
hope you're doing well !
please find below my datasctructure for today's problem:
colonne1 | colonne2 | colonne3 | colonne4 | colonne5 | colonne6 | colonne7 |
NA | ligne1.colonne2 | ligne1.colonne3 | ligne1.colonne4 | ligne1.colonne5 | ligne1.colonne6 | ligne1.colonne7 |
ligne2.colonne1 | NA | ligne2.colonne3 | ligne2.colonne4 | ligne2.colonne5 | ligne2.colonne6 | ligne2.colonne7 |
ligne3.colonne1 | ligne3.colonne2 | NA | ligne3.colonne4 | ligne3.colonne5 | ligne3.colonne6 | ligne3.colonne7 |
ligne4.colonne1 | ligne4.colonne2 | ligne4.colonne3 | NA | ligne4.colonne5 | ligne4.colonne6 | ligne4.colonne7 |
ligne5.colonne1 | ligne5.colonne2 | ligne5.colonne3 | ligne5.colonne4 | NA | ligne5.colonne6 | ligne5.colonne7 |
ligne6.colonne1 | ligne6.colonne2 | ligne6.colonne3 | ligne6.colonne4 | ligne6.colonne5 | NA | ligne6.colonne7 |
ligne7.colonne1 | ligne7.colonne2 | NA | NA | ligne7.colonne5 | ligne7.colonne6 | ligne7.colonne7 |
For every single line, I need to replace every NA value with next column value, like the result below:
colonne1 | colonne2 | colonne3 | colonne4 | colonne5 | colonne6 | colonne7 |
ligne1.colonne2 | ligne1.colonne3 | ligne1.colonne4 | ligne1.colonne5 | ligne1.colonne6 | ligne1.colonne7 | |
ligne2.colonne1 | ligne2.colonne3 | ligne2.colonne4 | ligne2.colonne5 | ligne2.colonne6 | ligne2.colonne7 | |
ligne3.colonne1 | ligne3.colonne2 | ligne3.colonne4 | ligne3.colonne5 | ligne3.colonne6 | ligne3.colonne7 | |
ligne4.colonne1 | ligne4.colonne2 | ligne4.colonne3 | ligne4.colonne5 | ligne4.colonne6 | ligne4.colonne7 | |
ligne5.colonne1 | ligne5.colonne2 | ligne5.colonne3 | ligne5.colonne4 | ligne5.colonne6 | ligne5.colonne7 | |
ligne6.colonne1 | ligne6.colonne2 | ligne6.colonne3 | ligne6.colonne4 | ligne6.colonne5 | ligne6.colonne7 | |
ligne7.colonne1 | ligne7.colonne2 | ligne7.colonne5 | ligne7.colonne6 | ligne7.colonne7 |
like you noticed on the line 7, there is multiple NA values, the idea is to replace them all with next columns values, and all cases are possible (only one NA, multiple NA side by side, multiple NA separated by columns values, etc...)
One solution consists of using A LOT of nested IF statements (A nightmare to write and maintain) so I'm searching for a slightly more efficient solution..
Thank you for your help and have a nice weekend 🙂
Youssef B
I used multiple preload steps to verify that it worked.
Data:
Load SubField(str3,',',1) as colonne1,
SubField(str3,',',2) as colonne2,
SubField(str3,',',3) as colonne3,
SubField(str3,',',4) as colonne4,
SubField(str3,',',5) as colonne5,
SubField(str3,',',6) as colonne6,
SubField(str3,',',7) as colonne7;
Load *,
str2&Repeat(',',7-FieldCount-1) as str3;
load *,
SubStringCount(str2,',') as FieldCount;
Load *,
replace(str1,'NA,','') as str2;
Load colonne1&','&colonne2&','&colonne3&','&colonne4&','&colonne5&','&colonne6&','&colonne7 as str1;
Load * Inline [
colonne1,colonne2,colonne3,colonne4,colonne5,colonne6,colonne7
NA,ligne1.colonne2,ligne1.colonne3,ligne1.colonne4,ligne1.colonne5,ligne1.colonne6,ligne1.colonne7
ligne2.colonne1,NA,ligne2.colonne3,ligne2.colonne4,ligne2.colonne5,ligne2.colonne6,ligne2.colonne7
ligne3.colonne1,ligne3.colonne2,NA,ligne3.colonne4,ligne3.colonne5,ligne3.colonne6,ligne3.colonne7
ligne4.colonne1,ligne4.colonne2,ligne4.colonne3,NA,ligne4.colonne5,ligne4.colonne6,ligne4.colonne7
ligne5.colonne1,ligne5.colonne2,ligne5.colonne3,ligne5.colonne4,NA,ligne5.colonne6,ligne5.colonne7
ligne6.colonne1,ligne6.colonne2,ligne6.colonne3,ligne6.colonne4,ligne6.colonne5,NA,ligne6.colonne7
ligne7.colonne1,ligne7.colonne2,NA,NA,ligne7.colonne5,ligne7.colonne6,ligne7.colonne7
];
I used multiple preload steps to verify that it worked.
Data:
Load SubField(str3,',',1) as colonne1,
SubField(str3,',',2) as colonne2,
SubField(str3,',',3) as colonne3,
SubField(str3,',',4) as colonne4,
SubField(str3,',',5) as colonne5,
SubField(str3,',',6) as colonne6,
SubField(str3,',',7) as colonne7;
Load *,
str2&Repeat(',',7-FieldCount-1) as str3;
load *,
SubStringCount(str2,',') as FieldCount;
Load *,
replace(str1,'NA,','') as str2;
Load colonne1&','&colonne2&','&colonne3&','&colonne4&','&colonne5&','&colonne6&','&colonne7 as str1;
Load * Inline [
colonne1,colonne2,colonne3,colonne4,colonne5,colonne6,colonne7
NA,ligne1.colonne2,ligne1.colonne3,ligne1.colonne4,ligne1.colonne5,ligne1.colonne6,ligne1.colonne7
ligne2.colonne1,NA,ligne2.colonne3,ligne2.colonne4,ligne2.colonne5,ligne2.colonne6,ligne2.colonne7
ligne3.colonne1,ligne3.colonne2,NA,ligne3.colonne4,ligne3.colonne5,ligne3.colonne6,ligne3.colonne7
ligne4.colonne1,ligne4.colonne2,ligne4.colonne3,NA,ligne4.colonne5,ligne4.colonne6,ligne4.colonne7
ligne5.colonne1,ligne5.colonne2,ligne5.colonne3,ligne5.colonne4,NA,ligne5.colonne6,ligne5.colonne7
ligne6.colonne1,ligne6.colonne2,ligne6.colonne3,ligne6.colonne4,ligne6.colonne5,NA,ligne6.colonne7
ligne7.colonne1,ligne7.colonne2,NA,NA,ligne7.colonne5,ligne7.colonne6,ligne7.colonne7
];