Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi There,
I'm hoping someone can assist me in improving a script I have running. The idea is to get the start time for each till, which is the minimum transaction time. The problem is that my source data contains many millions of rows, and the group by in TABLE2 takes a long time to complete. Are there any alternatives I can try? Any help would be greatly appreciated. The below is my current code
FOR each file in FileList('Myfilepath.qvd')
TABLE1
LOAD
Branch,
Till,
Date,
Time
FROM
$(file) (qvd);
NEXT
TABLE2:
LEFT JOIN(TABLE1)
LOAD
Branch,
Till,
Date,
TIME(MIN([Time])) as [Time]
Resident TABLE1
Group by Till,
Date,
Branch;
FOR each file in FileList('Myfilepath.qvd')
TABLE1:
LOAD
AutoNumber( Branch & '-' & Till & '-' & Date) AS %Key,
Branch,
Till,
Date,
Time
FROM
$(file) (qvd);
NEXT
MIN_TABLE:
LOAD
AutoNumber( Branch & '-' & Till & '-' & Date) AS %Key,
Min(Time) AS MinTime
RESIDENT TABLE1
GROUP BY Till,Date,Branch;
So you effectively create yourself a Min Date dimension.
I suspect that it is not the Group By that is the problem - instead the Join most likely is. So, I would try not joining.
FOR each file in FileList('Myfilepath.qvd')
TABLE1:
LOAD
Branch & '|' & Till & '|' & Date as Key,
Branch, Till, Date, Time
FROM $(file) (qvd);
NEXT
TABLE2:
LOAD
Key,
TIME(MIN([Time])) as [Time]
Resident TABLE1
Group by Key;
HIC