Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I need help to find only unique rows (postnr) from a excel list containing duplicated postnr.
The table below shows the excel file containing swedish zip codes (postnr) and information about that zip code.
postnr | postort | länskod | län | kommunkod | kommun | A-region | H-region | H-region info |
---|---|---|---|---|---|---|---|---|
15593 | NYKVARN | 1 | STOCKHOLM | 140 | NYKVARN | 1 | H1 | Stockholm/Södertälje A-region |
15593 | NYKVARN | 4 | STOCKHOLM | 486 | STRÄNGNÄS | 7 | H4 | Mellanbygden |
15594 | NYKVARN | 1 | STOCKHOLM | 140 | NYKVARN | 1 | H1 | Stockholm/Södertälje A-region |
15594 | NYKVARN | 4 | STOCKHOLM | 461 | GNESTA | 5 | H3 | Större städer |
The column postnr is joined with the customer table - used for analysing on how many customers we have in a certain zip code and what sort of H-region that zip code has.
As you can see in the table above we have some duplicated postnr - because one zip code can have many H-region (we only interested in the lowest H-region).
How can I - preferably from the loading script - only load the postnr with the lowest H-region (going from H1 - H9)?
The result of the table above should be, after the loading script:
postnr | postort | länskod | län | kommunkod | kommun | A-region | H-region | H-region info |
---|---|---|---|---|---|---|---|---|
15593 | NYKVARN | 1 | STOCKHOLM | 140 | NYKVARN | 1 | H1 | Stockholm/Södertälje A-region |
15594 | NYKVARN | 1 | STOCKHOLM | 140 | NYKVARN | 1 | H1 | Stockholm/Södertälje A-region |
Thanks ahead!
Hey,
You could use something like the following code:
Test:
LOAD *
FROM
C:\test\Duplicate.csv
(txt, codepage is 1252, embedded labels, delimiter is ';', msq)
;
//Sort the data like you want;
Test2:
NoConcatenate Load *
Resident Test
order by postnr,[H-region];
drop tables Test;
//Create a flag which indicates a duplicate record;
Test3:
NoConcatenate Load
postnr,
[H-region],
if(Previous(postnr)=postnr,1,0) as Select
Resident Test2;
drop tables Test2;
//Select the non duplicate records;
Test4:
NoConcatenate load
postnr,
[H-region]
Resident Test3
where Select=0;
drop tables Test3;
Hope this helps
Hi
Is the lowest H region always H1?
Regards
Jonathan
Hi,
Yes its always H1.
/Fredrik
Hi, here's a simple way to do it without relaying that the lowest region is always H1
hi,
Try the attach the application.
Deepak
Hi
Then all you need is something like:
LOAD *
From Source
Where [H-Region] = 'H1'
Hope that helps
Jonathan
Hello,
Use below code
Temp:
Load HRegion from Table;
Test:
LOAD HRegion as HR Resident Temp Order By HRegion asc;
LET vHRegion =Peek('HR',0,'Test');
DROP Table Test;
DROP Table Temp;
Main:
LOAD * From Table
Where HRegion='$(vHRegion)';
Hope it helps you
Cheers!
Jagan