Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
beaubellamy
Partner - Contributor III
Partner - Contributor III

concat distinct strings on load

I am trying to create a list of locations associated with each value in a column (eg stopping pattern for a bus route), but the data sometimes has each location twice recording the arrival and departure. I am currently getting the list correct, but it includes the location twice when it occurs twice in the data. I would only like to see the location once even if the data records the arrival and departure.

The order of the location is important as it describes the order of stops.

eg

temp:
load * inline
[
busRoute, sequence, activity, location, time
B1, 1, O, loc_A, 9:00
B1, 10, A, loc_B, 9:10
B1, 11, D, loc_B, 9:11
B1, 20, A, loc_C, 9:20
B1, 21, D, loc_C, 9:21
B1, 31, T, loc_D, 9:30
B2, 1, O, loc_P, 9:00
B2, 10, A, loc_Q, 10:10
B2, 11, D, loc_Q, 10:11
B2, 20, A, loc_R, 10:20
B2, 21, D, loc_R, 10:21
B2, 31, T, loc_S, 10:30
];

// O - Origin

// A - Arrival

// D - Departure

// T - Terminate

left Join
load busRoute,
concat(DISTINCT location,'-',sequence) as stop_pattern
Resident temp
Group by busRoute;

produces stopping patterns for 

B1: loc_A-loc_B-loc_B-loc_C-Loc_C-loc_D

B2: loc_P-loc_Q-loc_Q-loc_R-Loc_R-loc_S

 

What I'm looking for is:

B1: loc_A-loc_B-Loc_C-loc_D

B2: loc_P-loc_Q-loc_R-loc_S

Is there a way to achieve this?

 

 

 

Labels (4)
1 Solution

Accepted Solutions
Saravanan_Desingh

Can you check this?

temp:
load * inline
[
busRoute, sequence, activity, location, time
B1, 1, O, loc_D, 9:00
B1, 10, A, loc_C, 9:10
B1, 11, D, loc_C, 9:11
B1, 20, A, loc_B, 9:20
B1, 21, D, loc_B, 9:21
B1, 31, T, loc_A, 9:30
B2, 1, O, loc_P, 9:00
B2, 10, A, loc_Q, 10:10
B2, 11, D, loc_Q, 10:11
B2, 20, A, loc_R, 10:20
B2, 21, D, loc_R, 10:21
B2, 31, T, loc_S, 10:30
];

Left Join(temp)
LOAD busRoute, Concat(DISTINCT If(activity <> 'A',location),'-',sequence) As Path
Resident temp
Group By busRoute
;

View solution in original post

5 Replies
Lisa_P
Employee
Employee

With this simple example, this works, not sure if your real data is more complex..

temp:
load * inline
[
busRoute, sequence, activity, location, time
B1, 1, O, loc_A, 9:00
B1, 10, A, loc_B, 9:10
B1, 11, D, loc_B, 9:11
B1, 20, A, loc_C, 9:20
B1, 21, D, loc_C, 9:21
B1, 31, T, loc_D, 9:30
B2, 1, O, loc_P, 9:00
B2, 10, A, loc_Q, 10:10
B2, 11, D, loc_Q, 10:11
B2, 20, A, loc_R, 10:20
B2, 21, D, loc_R, 10:21
B2, 31, T, loc_S, 10:30
];

// O - Origin

// A - Arrival

// D - Departure

// T - Terminate


Distinct:
Load distinct busRoute,
location as stop
Resident temp;

left Join(temp)
load busRoute,
concat(stop,'-') as stop_pattern
Resident Distinct
Group by busRoute;

Drop table Distinct;

Saravanan_Desingh

Using Hierarchy.

tab1:
Hierarchy(location, ParentLoc,Node,,location,,'-')
LOAD *, location As Node, If(activity<>'O',Peek(location)) As ParentLoc
Where activity <> 'A';
LOAD * INLINE [
busRoute, sequence, activity, location, time
B1, 1, O, loc_A, 9:00
B1, 10, A, loc_B, 9:10
B1, 11, D, loc_B, 9:11
B1, 20, A, loc_C, 9:20
B1, 21, D, loc_C, 9:21
B1, 31, T, loc_D, 9:30
B2, 1, O, loc_P, 9:00
B2, 10, A, loc_Q, 10:10
B2, 11, D, loc_Q, 10:11
B2, 20, A, loc_R, 10:20
B2, 21, D, loc_R, 10:21
B2, 31, T, loc_S, 10:30
];

tab2:
NoConcatenate
LOAD busRoute&':'&Path As Output
Resident tab1
Where activity = 'T';

Drop Table tab1;
Saravanan_Desingh

Output.

commQV43.PNG

commQV44.PNG

beaubellamy
Partner - Contributor III
Partner - Contributor III
Author

This one doesn't quite work the way i was hoping. The example may have just been a bit too simple.

If we reverse the location labels of one bus route (B1), the sequence is in alphabetical order, and i need it in the order of sequence. I added sequence to the distinct load, but that just allows the duplicate label to be seen as a distinct label because it has a distinct sequence number.

temp:
load * inline
[
busRoute, sequence, activity, location, time
B1, 1, O, loc_D, 9:00
B1, 10, A, loc_C, 9:10
B1, 11, D, loc_C, 9:11
B1, 20, A, loc_B, 9:20
B1, 21, D, loc_B, 9:21
B1, 31, T, loc_A, 9:30
B2, 1, O, loc_P, 9:00
B2, 10, A, loc_Q, 10:10
B2, 11, D, loc_Q, 10:11
B2, 20, A, loc_R, 10:20
B2, 21, D, loc_R, 10:21
B2, 31, T, loc_S, 10:30
];

 

In this example the desired output is:

B1: loc_D-loc_C-loc_B-loc_A

B2: loc_P-loc_Q-loc_R-loc_S

 

 

Saravanan_Desingh

Can you check this?

temp:
load * inline
[
busRoute, sequence, activity, location, time
B1, 1, O, loc_D, 9:00
B1, 10, A, loc_C, 9:10
B1, 11, D, loc_C, 9:11
B1, 20, A, loc_B, 9:20
B1, 21, D, loc_B, 9:21
B1, 31, T, loc_A, 9:30
B2, 1, O, loc_P, 9:00
B2, 10, A, loc_Q, 10:10
B2, 11, D, loc_Q, 10:11
B2, 20, A, loc_R, 10:20
B2, 21, D, loc_R, 10:21
B2, 31, T, loc_S, 10:30
];

Left Join(temp)
LOAD busRoute, Concat(DISTINCT If(activity <> 'A',location),'-',sequence) As Path
Resident temp
Group By busRoute
;