Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
annick
Contributor III
Contributor III

Left Join duplicates records

Hello,

I have this load script:

TableA:
LOAD * INLINE [
    BusNumber,  FromStation, ToStation, Month, Departure, Arrival, Transit
    201, A, C, Jan,   10:00, 18:00, B
    201, A, B, Jan,   10:00, 12:00, None
    201, B, C, Jan,   14:00, 18:00, None
    201, A, C, Feb,   11:00, 19:00, B
    201, A, B, Feb,   11:00, 13:00, None
    201, B, C, Feb,   15:00, 19:00, None   
];
 
NewTableA:
NoConcatenate
LOAD *
Resident TableA;
Left Join (NewTableA)
LOAD ToStation as Transit,
  FromStation as FromTransitStation,
  ToStation as ToTransitStation,
  Departure as TransitDeparture,
  Arrival as TransitArrival
Resident TableA;

Drop table TableA;
 
which produces this table:
 
BusNumberFromStationToStationDepartureArrivalFromTransitStationToTransitStationMonthTransitDepartureTransitDepartureTransitDeparture
201AB10:0012:00--Jan---
201AC10:0018:00ABJan10:0010:0010:00
201AC10:0018:00ABJan11:0011:0011:00
201BC14:0018:00--Jan---
201AB11:0013:00--Feb---
201AC11:0019:00ABFeb10:0010:0010:00
201AC11:0019:00ABFeb11:0011:0011:00
201BC15:0019:00--Feb--

-

 

 

 
 
There are some extra lines duplicated by month which are not correct. When performing the join, I would like to have a match on the departure month, in order to obtain the following table:
 
BusNumberFromStationToStationDepartureArrivalFromTransitStationToTransitStationMonthTransitDepartureTransitDepartureTransitDeparture
201AB10:0012:00--Jan---
201AC10:0018:00ABJan10:0010:0010:00
201BC14:0018:00--Jan---
201AB11:0013:00--Feb---
201AC11:0019:00ABFeb11:0011:0011:00
201BC15:0019:00--Feb---
 
Many thanks in advance for your help,
 
Annick
 
Labels (1)
  • join

2 Solutions

Accepted Solutions
Gysbert_Wassenaar

Apparently you don't want that the data is joined on only the Transit field, but also on the Departure field.
So try this:

TableA:
LOAD * INLINE [
BusNumber, FromStation, ToStation, Month, Departure, Arrival, Transit
201, A, C, Jan, 10:00, 18:00, B
201, A, B, Jan, 10:00, 12:00, None
201, B, C, Jan, 14:00, 18:00, None
201, A, C, Feb, 11:00, 19:00, B
201, A, B, Feb, 11:00, 13:00, None
201, B, C, Feb, 15:00, 19:00, None
];

NewTableA:
NoConcatenate
LOAD * Resident TableA;

Left Join (NewTableA)

LOAD
    ToStation as Transit,
    FromStation as FromTransitStation,
    ToStation as ToTransitStation,
    Departure as TransitDeparture,
    Departure,
    Arrival as TransitArrival
Resident TableA;

Drop table TableA;

 


talk is cheap, supply exceeds demand

View solution in original post

Vegar
MVP
MVP

It looks like you at not including month in your join. This is causing double"hits" for transit B

NewTableA:
NoConcatenate
LOAD *
Resident TableA;
Left Join (NewTableA)
LOAD ToStation as Transit,
Month as Month, //add this row
FromStation as FromTransitStation,
ToStation as ToTransitStation,
Departure as TransitDeparture,
Arrival as TransitArrival
Resident TableA;

View solution in original post

2 Replies
Gysbert_Wassenaar

Apparently you don't want that the data is joined on only the Transit field, but also on the Departure field.
So try this:

TableA:
LOAD * INLINE [
BusNumber, FromStation, ToStation, Month, Departure, Arrival, Transit
201, A, C, Jan, 10:00, 18:00, B
201, A, B, Jan, 10:00, 12:00, None
201, B, C, Jan, 14:00, 18:00, None
201, A, C, Feb, 11:00, 19:00, B
201, A, B, Feb, 11:00, 13:00, None
201, B, C, Feb, 15:00, 19:00, None
];

NewTableA:
NoConcatenate
LOAD * Resident TableA;

Left Join (NewTableA)

LOAD
    ToStation as Transit,
    FromStation as FromTransitStation,
    ToStation as ToTransitStation,
    Departure as TransitDeparture,
    Departure,
    Arrival as TransitArrival
Resident TableA;

Drop table TableA;

 


talk is cheap, supply exceeds demand
Vegar
MVP
MVP

It looks like you at not including month in your join. This is causing double"hits" for transit B

NewTableA:
NoConcatenate
LOAD *
Resident TableA;
Left Join (NewTableA)
LOAD ToStation as Transit,
Month as Month, //add this row
FromStation as FromTransitStation,
ToStation as ToTransitStation,
Departure as TransitDeparture,
Arrival as TransitArrival
Resident TableA;