Hello,
I need to extract and transpose some text under the following conditions (the "|" delimiter should be removed).
To summarize, the key points of the problem are:
- How to split on "|" (and transpose), but only where there is a value between two "|" characters.
- If a row has no value at all, drop the row.
For example:
||9000|-75000||2000| --> parse 3 values : 9000, -75000 and 2000
||3000|| --> parse 1 value : 3000
|||-100|||40||| --> parse 2 values: -100 and 40
Input:
COLUMN A// COLUMN B
1.X||9000|-75000||2000|
2.Y||3000||
3.Z|||-100|||40|||
4.C||
5.D||
Expected output:
COLUMN A// COLUMN B
1.X// 9000
1.X// -75000
1.X// 2000
2.Y// 3000
3.Z// -100
3.Z// 40
(There are no rows for 4.C and 5.D because there is no value between their pipes.)
I already tried tNormalize, but I did not get the desired result.
I guess I have to use tJavaRow with a regex pattern-matching function, but I don't know how to write it properly.
I really appreciate your time and help.
Hi sakura99, is this really a good example? You wrote COLUMN A//COLUMN B, but I don't see that field separator repeated in the data. You have two or three pipes in a row. How do you tell where the first column ends?
Hello,
Could you please let us know if this community knowledge article helps:
https://community.talend.com/s/article/Converting-columns-to-rows-ok7sf
Best regards
Sabrina
Hi, thank you for your time.
Yes, the columns are separated by //.
To summarize, the key points of the problem are:
- How to split on "|" (and transpose), but only where there is a value between two "|" characters.
- If a row has no value at all, drop the row.
For example:
||9000|-75000||2000| --> parse 3 values : 9000, -75000 and 2000
||3000|| --> parse 1 value : 3000
|||-100|||40||| --> parse 2 values: -100 and 40
Expected output:
COLUMN A// COLUMN B
1.X// 9000
1.X// -75000
1.X// 2000
2.Y// 3000
3.Z// -100
3.Z// 40
(There are no rows for 4.C and 5.D because there is no value between their pipes.)
Would you mind writing it for tJavaRow? My columns are dynamic, or even worse, there could be 40 million+ columns (yes, the process will be automated).
So far, I have written this:
output.B = input.B;
//
class StrSplit
{
public static void main(String []args)
{
String strMain = input.BT;
String[] arrSplit = strMain.split("|");
for (int i=0; i < arrSplit.length; i++)
{
System.out.println(arrSplit[i]);
}
}
}
But the code has a problem with the static and public parts.
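Two things stand out in the attempt above. First, as far as I understand, the body of a tJavaRow is pasted into a method of the generated job, so wrapping it in a class with `public static void main` is probably what the compiler rejects. Second, `split` takes a regular expression, and an unescaped `|` is the regex alternation operator, so `split("|")` cuts the string between every character instead of at the pipes. Here is a minimal standalone sketch of the difference; the class and method names (`PipeSplitDemo`, `splitOnPipe`, `splitOnPipeQuoted`) are made up for illustration and are not part of Talend.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, meant to be run outside Talend.
class PipeSplitDemo {

    // Split on a literal pipe by escaping it for the regex engine.
    static String[] splitOnPipe(String s) {
        return s.split("\\|");
    }

    // Same result, letting Pattern.quote build the escaped pattern.
    static String[] splitOnPipeQuoted(String s) {
        return s.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String sample = "||3000||";

        // Unescaped "|" matches the empty string at every position,
        // so the string is broken into single characters.
        System.out.println(Arrays.toString(sample.split("|")));

        // Escaped pipe: empty tokens appear where pipes are adjacent,
        // and trailing empty tokens are dropped by split().
        System.out.println(Arrays.toString(splitOnPipe(sample)));
    }
}
```

Inside a tJavaRow, only the statements would be kept (no class, no `main`), reading from `input_row` and writing to `output_row`.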
Thank you for your time, Sabrina.
Unfortunately, this will run on Spark, so I can't use the tpivot.
Hi sakura99,
Here is a sample of tJavaRow code. It may not be the best (I'm a Java beginner), but it's close to what you want. You still have to work out how to remove rows that have no values after the first column (like 4.C and 5.D).
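// split() takes a regex, so the pipe must be escaped to be treated literally.
String strMain = input_row.input;
String[] arrSplit = strMain.split("\\|");
// The first token is the key (e.g. "1.X"); "//" is the output separator.
String output = arrSplit[0] + "//";
for (int i = 1; i < arrSplit.length; i++) {
    // Skip the empty tokens produced by consecutive pipes.
    if (!arrSplit[i].isEmpty()) {
        output = output + arrSplit[i] + "//";
    }
}
System.out.println("Input:" + input_row.input);
// Drop the trailing "//" before printing.
System.out.println("Output:" + output.substring(0, output.length() - 2));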
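The snippet above joins all values of a key onto one line. To get the one-row-per-value layout from the question, and to drop keys such as 4.C and 5.D that have no values, the same split can feed a list instead. This is only a sketch in plain Java: `PipeRows` and `explode` are hypothetical names, and the "// " separator is copied from the expected output in the question. How the resulting rows are emitted in a Talend or Spark job is left open here, since as far as I know a tJavaRow produces one output row per input row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, meant to be run outside Talend.
class PipeRows {

    // Returns one "key// value" string per non-empty value in the line.
    // A line with no values (e.g. "4.C||") returns an empty list,
    // which is how such keys disappear from the output.
    static List<String> explode(String line) {
        List<String> rows = new ArrayList<>();
        // Limit -1 keeps trailing empty tokens, so the array always
        // has at least one element and parts[0] is safe to read.
        String[] parts = line.split("\\|", -1);
        String key = parts[0].trim();
        for (int i = 1; i < parts.length; i++) {
            String value = parts[i].trim();
            if (!value.isEmpty()) {
                rows.add(key + "// " + value);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] input = {
            "1.X||9000|-75000||2000|",
            "2.Y||3000||",
            "3.Z|||-100|||40|||",
            "4.C||",
            "5.D||"
        };
        for (String line : input) {
            for (String row : explode(line)) {
                System.out.println(row);
            }
        }
    }
}
```

Filtering on empty tokens rather than matching digits with a regex keeps the sketch agnostic about what a value looks like (negative numbers, text, and so on).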
Hey, thank you. Would you mind if I asked you by private email or something?
Hi,
just send me a private message 😉