Solved: Normalizing more than one column without using a c... - Qlik Community

DM_J · ‎2019-07-09

Hi,

tNormalize normalize only one row, I want to normalize 3 columns which are related to each other:

Sample data:

id	address	email	phone
1	add1~add2~add3	a@a.com~a@b.com	12345678~8765432~12345
2	add1~add2	b@b.com~c@c.com	12345678~8765432
3	add1~add2~add3	a@a.com~a@b.com~d@d.com	12345678~8765432~12345
4	add1~add2	b@b.com	123456~8765
5	add1~add2~add3
6	add1	b@b.com	8765432
7	add1~add2~add3	b@b.com~c@c.com	8765432

I want to normalize this table in a way which

first phone and first email go to the first address
the second phone and the second email go to the second address if they exist.
and so on...

Desired output for sample data

id	address	email	phone
1	add1	a@a.com	12345678
1	add2	a@b.com	8765432
1	add3		12345
2	add1	b@b.com	12345678
2	add2	c@c.com	8765432
3	add1	a@a.com	12345678
3	add2	a@b.com	12345678
3	add3	d@d.com	12345
4	add1	b@b.com	123456
4	add2		8765
5	add1
5	add2
5	add3
6	add1	b@b.com	8765432
7	add1	b@b.com	8765432
7	add2	c@c.com
7	add3

I there any other component that I can use?

Or any java thought that I can put in a tjava?

Thank you.

lennelei · ‎2019-07-10

Hi,

I see 2 options :

1) use a tJavaFlex. Assuming you are confident that you'll never have more that 3 (for example) address, you can use a tJavaFlex followed by a tFilterRow to produce the desired result:

<data_source> --(main)--> tJavaFlex --(main)--> tFilterRow

The idea is to create 3 output rows for each input rows and in each output row, put the nth item of the input row if it exists. The tFilterRow will simply discard rows with no data at all (or no address or whatever).

The tJavaFlex code will contain something like this :
BEGIN (replace 3 with the maximum number of data you can have) :

for (int i=0; i<3; i++) {

MAIN (row1 is the input row, row2 the output):

	String[] add;
	String[] email;
	String[] phone;
	
	row2.id = row1.id;
	
	add=row1.add.split("~", -1);
	if (add != null && add.length > i)
		row2.add=add[i];
	else
		row2.add="";
		
	email=row1.email.split("~", -1);
	if (email != null && email.length > i)
		row2.email=email[i];
	else
		row2.email="";		
		
	phone=row1.phone.split("~", -1);
	if (phone != null && phone.length > i)
		row2.phone=phone[i];
	else
		row2.phone="";

Please note that it's certainly possible to optimize this code! Take this as a guideline

The idea is to split each input data and take the nth item if it exists.

2) Another option would be to first split your data into 3 table (with a tMap for example) :
id,address / id,email / id,phone
You then normlize each table separatly and add a sequence for each "row" of the same id:
id,row,address / id,row,email / id,row,phone
with row being something such as Numeric.sequence("address_"+id,1,1)
And at the end, you join all 3 datasets back.

I think it should also work assuming the order is preserved will going through the tNormalize.

In doubt, I would probably prefer the first option: it has more java code but it's probably safer and more efficient.

Regards.

View solution in original post

lennelei · ‎2019-07-10

Hi,