Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

How do i remove duplicates on this load

hello

below is a table i have by concatenating two temp tables together.

but i get duplicate rows. like below..  (3 lines for   bisqe-7335-0001707-007335-hash)

how can i do it where i would only get one.. and preferably the one that has a value in the http://148 section?

dup.JPG

thanks

8 Replies
tschullo
Creator III
Creator III

Load Field1,

Field2,

Field3,

Field4,

firstsortedvalue(Field5, Len(Field5)*RowNo()+RowNo(),-1) as URL

Resident Table1

group by Field1,Field2, Field3, Field4;

Len(Field5)*RowNo()+RowNo() will generate a unique value for each row within the matching set.

-1 says sort in descending order, so largest number will be returned, likely one that has a URL

Anonymous
Not applicable
Author

hi

i keep getting an error

join(FabImages)

load

bcItemCode as BCItemCode,

BCColorCode as BCColorcode,

Swatch as swlocation,

firstsortedvalue(Swatch,Len(Swatch)*RowNo()+RowNo(),-1) as swlocation2,

hash128(bcItemCode,BCColorCode) as fabcolor

resident temp14;

tschullo
Creator III
Creator III

You must have the Group By statement or it will not work

I'm not sure what you are joining to or on

Your group by needs to be by all your columns except the one you are sorting

and if you are pulling Swatch as swlocation AND sorting by it, it may not like that.

Anonymous
Not applicable
Author

yup.. i needed the grouping..

it works.. it narrowed it down.. but for some reason im still getting blanks.. even if i have one with a link

so close..

dup2.JPG

i tried grouping by the hash and it doesnt work..

join(FabImages)

load

bcItemCode as BCItemCode,

BCColorCode as BCColorcode,

//Swatch as swlocation,

firstsortedvalue(Swatch,Len(Swatch)*RowNo()+RowNo(),-1) as swlocation,

hash128(bcItemCode,BCColorCode) as fabcolor

resident temp14

group by bcItemCode,BCColorCode;

tschullo
Creator III
Creator III

So the only bummer is that if you have two winners, i.e. your sort value is the same for your two top records, it returns NULL.

So I gave you a start at generating a unique sort value, but you may need to test out and display those values so you can see if you are getting dupes and if so maybe come up with a better formula for a sort value.

tschullo
Creator III
Creator III

I think what I've done in the past is tantamount to "Len(string).rowNo()"

so:

0.1

0.2

5.3

5.4

22.5

23.6

23.7

So you will always have a unique winner, in this case 23.7 - I'd have to go look at how I built that float.

Anonymous
Not applicable
Author

i think the issue is null is there.. and its not being sorted..  i did the show value thing..

all the dups are null..  and the other record is max value from formula... so im not sure how to fix this

or i need to do it in another temp and do it from there

stonecold111
Creator III
Creator III

replace null values with 0's . try len(trim(field)=0,'0',field)   and then sort

you can also use expression in place of field