Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Guys ive got an issue at the moment I cannot resolve, I am trying to exclude duplicate string values. I am wanting to load values with only a 1:1 relationship here is an example (FROM EXCEL)
INSTALATION | PREMISE
11223344 10011001
44332211 10011001
55555555 22222222
66666666 22222222
Above the 'Premise' has a many to one relationship
I only want to load where 'Premise' has only one Instalation entry on the Spreadsheet.
Help would be much appreciated.
Paul
Untested and thinking off the top of my head...there are probably better ways.
Data_temp:
LOAD
Instalation
,Premise
,Count(Instalation) AS InstalationCount
FROM....
GROUP BY Premise;
Data:
LOAD
Instalation
,Premise
RESIDENT Data_temp
WHERE InstalationCount = 1;
DROP TABLE Data_temp;
Hope this helps,
Jason
The ONLY() aggregate function is good for this - if there's only one distinct value in the set, then it returns that value; otherwise, it returns null. So, you'd use it like this:
LOAD
PREMISE,
INSTALATION
where not isnull(INSTALATION)
;
LOAD
PREMISE,
ONLY(INSTALATION) AS INSTALATION
From YourData (..)
GROUP BY PREMISE;
Note that I've done a chained LOAD here - the 2nd LOAD statement is evaluated first, and the resulting (temporary) table is used as the input to the 1st LOAD statement.
One thing to watch out for: if you're doing a Resident LOAD rather than a From LOAD (I've used a From load in the example i.e. 'FROM YourData (...)', then alter the 1st LOAD statement so that at least one of the fields is renamed (using AS).
Angus.