Hello,
column 1 | column 2 | column 3
q,w,e,r,t | a,s,d,f,g | z,x,c,v,b
I want to get this:
column 1 | column 2 | column 3
q | a | z
w | s | x
etc.
I attached an example file. Can someone help me? The other columns are duplicated when I use tNormalize.
The output file should be Excel.
Hi
Split the columns, normalize each column separately, add a sequence id to each row, and store the temporary results in memory. In the next subjob, read the results from memory, do an inner join based on the sequence id, and generate an output that contains all columns. The job looks like:
tFileInputDelimited--main--tMap--out1--tNormalize1--tMap2-->tHashOutput1
                               --out2--tNormalize2--tMap3-->tHashOutput2
                               --out3--tNormalize3--tMap4-->tHashOutput3
--OnSubjobOk--
tHashInput1--main--tMap5-...tLogRow
tHashInput2--lookup--
tHashInput3--lookup--
out1: contains only column1; out2: contains only column2; out3: contains only column3
tMap2: add a new column called id and set its expression to: Numeric.sequence("s1",1,1)
tMap3: add a new column called id and set its expression to: Numeric.sequence("s2",1,1)
tMap4: add a new column called id and set its expression to: Numeric.sequence("s3",1,1)
tHashInput1: reads data from tHashOutput1
tHashInput2: reads data from tHashOutput2
tHashInput3: reads data from tHashOutput3
tMap5: do an inner join based on the sequence id, and generate an output that contains the three columns.
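To make the logic of the job easier to follow, here is a rough sketch of the same idea in plain Java, outside of Talend. It is only an illustration: the class name ColumnNormalizeSketch and the methods normalizeColumn and normalizeAndJoin are made up for this example and are not Talend APIs. Each column is split on its own, every value gets a running id (the role Numeric.sequence plays in tMap2 to tMap4), the numbered values are kept in a map (standing in for tHashOutput), and the maps are then inner-joined on the id (the role of tMap5).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Talend.
class ColumnNormalizeSketch {

    // Stands in for tNormalize + tMap on one column: split every cell on the
    // separator and number the resulting values 1, 2, 3, ...
    static Map<Integer, String> normalizeColumn(List<String[]> rows, int col, String sep) {
        Map<Integer, String> out = new LinkedHashMap<>();
        int id = 0;
        for (String[] row : rows) {
            for (String item : row[col].split(Pattern.quote(sep))) {
                out.put(++id, item.trim());
            }
        }
        return out;
    }

    // Stands in for tMap5: inner join of all normalized columns on the id.
    // The first column is the main flow, the others act as lookups.
    static List<String[]> normalizeAndJoin(List<String[]> rows, int columnCount, String sep) {
        List<Map<Integer, String>> columns = new ArrayList<>();
        for (int c = 0; c < columnCount; c++) {
            columns.add(normalizeColumn(rows, c, sep));
        }
        List<String[]> result = new ArrayList<>();
        for (Integer id : columns.get(0).keySet()) {
            String[] joined = new String[columnCount];
            boolean matched = true;
            for (int c = 0; c < columnCount; c++) {
                String value = columns.get(c).get(id);
                if (value == null) { // inner join: drop ids missing in a lookup
                    matched = false;
                    break;
                }
                joined[c] = value;
            }
            if (matched) {
                result.add(joined);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"q,w,e,r,t", "a,s,d,f,g", "z,x,c,v,b"});
        for (String[] r : normalizeAndJoin(rows, 3, ",")) {
            System.out.println(String.join(" | ", r)); // first line: q | a | z
        }
    }
}
```

Because each column is numbered independently, the values no longer multiply against each other the way they do when several tNormalize components are chained on the same flow. Note that with an inner join, rows are dropped if one column has fewer values than the others.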
Hope it helps you!
Regards
Shong