Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Bucharest on Sept 18th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
qluser01
Creator
Creator

Remove duplicates from table

Hi all!

I have a table like:

Load * inline

[

RiskID, RiskSeverity, RiskProbability, otherStuff

1, high, low, abc

2, low, average, abcddd

3, average, high, adesds

4, low, high, adefsd

3, high, low, abcewew

2, high, high, aderwc

];

etc

bad quality input data

i.d. not distinct RiskID,  RiskID has several Severity and Probability values

I need to create a new table and load into this table:

1. distinct RiskID

2. the highest Probability (among all probabilities available for this RiskID) there are only 3 types (low. average, high)

3. the Highest Severity - same as p. 2

4. otherStuff (only 1 version exists for each RiskID, no duplicates here) - that are all other fields in the table

So the idea is to clean up table from bad data in columns RiskSeverity and RiskProbability - need to take the highest only and make RiskID distinct.

Thanks!

1 Solution

Accepted Solutions
uacg0009
Partner - Specialist
Partner - Specialist

Hi eeacrw,

I have a method that is a little similar with Marcus'.

First you need to left join twice.

Data:

Load * inline

[

RiskID, RiskSeverity, RiskProbability, otherStuff

1, high, low, abc

2, low, average, abcddd

3, average, high, adesds

4, low, high, adefsd

3, high, low, adesds

2, high, high, abcddd

];

left join

Load * inline

[

RiskSeverity, SeverityNum

high, 3

average, 2

low, 1

];

left join

Load * inline

[

RiskProbability, ProbabilityNum

high, 3

average, 2

low, 1

];

Final:

noconcatenate

Load distinct RiskID,

                    otherStuff,

                    if(max(SeverityNum) = 3,'high', if(max(SeverityNum) = 2,'average','low')) as RiskSeverity,

                    if(max(ProbabilityNum) = 3,'high', if(max(ProbabilityNum) = 2,'average','low')) as RiskProbability

resident Data

group by RiskID,otherStuff;


drop table Data;


Thanks.


Aiolos Zhao

View solution in original post

4 Replies
vishsaggi
Champion III
Champion III

What is your expected output based on your Inline Load should be ?

marcus_sommer

One way might be to add an additional numeric field for your risks or probably better to use dual-values for it like:

dual('low', 1)

dual('average', 2)

an then you could just make an aggregation-query like:

// optional: inner join (YourLoadedTable)

load RiskID, max(RiskProbability) as MaxRiskProbability resident YourLoadedTable;

- Marcus

qluser01
Creator
Creator
Author

The result should look like:

Load * inline

[

RiskID, RiskSeverity, RiskProbability, otherStuff

1, high, low, abc

2, high, high, abcddd

3, high, high, adesds

4, low, high, adefsd

];

Sorry, I've shown not unique data in otherStuff table... actually it is unique for each RiskID

the correct source data is:

Load * inline

[

RiskID, RiskSeverity, RiskProbability, otherStuff

1, high, low, abc

2, low, average, abcddd

3, average, high, adesds

4, low, high, adefsd

3, high, low, adesds

2, high, high, abcddd

];

etc

uacg0009
Partner - Specialist
Partner - Specialist

Hi eeacrw,

I have a method that is a little similar with Marcus'.

First you need to left join twice.

Data:

Load * inline

[

RiskID, RiskSeverity, RiskProbability, otherStuff

1, high, low, abc

2, low, average, abcddd

3, average, high, adesds

4, low, high, adefsd

3, high, low, adesds

2, high, high, abcddd

];

left join

Load * inline

[

RiskSeverity, SeverityNum

high, 3

average, 2

low, 1

];

left join

Load * inline

[

RiskProbability, ProbabilityNum

high, 3

average, 2

low, 1

];

Final:

noconcatenate

Load distinct RiskID,

                    otherStuff,

                    if(max(SeverityNum) = 3,'high', if(max(SeverityNum) = 2,'average','low')) as RiskSeverity,

                    if(max(ProbabilityNum) = 3,'high', if(max(ProbabilityNum) = 2,'average','low')) as RiskProbability

resident Data

group by RiskID,otherStuff;


drop table Data;


Thanks.


Aiolos Zhao