Skip to main content
Announcements
Qlik Introduces a New Era of Visualization! READ ALL ABOUT IT
cancel
Showing results for 
Search instead for 
Did you mean: 
kelsie_muller
Partner - Contributor II
Partner - Contributor II

Loading only Duplicate values in a Load Script

Hello,

I am attempting to load only duplicate values in a large data set with the following query in Qlik Sense. The data load source is a Snowflake integration, so I have some limitations on the formatting of the query.

I'd like to only return rows where there is more than one value  in the SI_Master_S3_Key value. 

 

LIB CONNECT TO 'Snowflake hda76102';
select * from (

select clip_id
, max(supplier_name) as Owner
, max(clip_name) as Resource_Name
, max(concat(year(clip_ingested),'-',month(clip_ingested),'-',day(clip_ingested))) as Ingest_Date
, max(case when CLIP_DATA_NAME = 'SI.Master.S3.Bucket' then CLIP_DATA_VALUE else null end) as SI_Master_S3_Bucket
, max(case when CLIP_DATA_NAME = 'SI.Master.S3.Key' then CLIP_DATA_VALUE else null end) as SI_Master_S3_Key
, max(case when CLIP_DATA_NAME = 'SI.Master.Filename' then CLIP_DATA_VALUE else null end) as SI_Master_Filename
, max(case when CLIP_DATA_NAME = 'Format.FileSize' then CLIP_DATA_VALUE else null end) as Format_FileSize
, max(case when CLIP_DATA_NAME = 'SI.Master.Extension' then CLIP_DATA_VALUE else null end) as SI_Master_Extension
, max(case when CLIP_DATA_NAME = 'TE.OriginalName' then CLIP_DATA_VALUE else null end) as TE_OriginalName
from ANALYTICS.REPORTING.DMH_CLIP_DATA
where supplier_name = 'SPS'
group by clip_id
order by SI_Master_S3_Key
)xx;

Can you please help with this query? Alternatively if there is a way to filter a table to only display duplicate values on the whole data set - that would work as well. 

Labels (2)
4 Replies
kelsie_muller
Partner - Contributor II
Partner - Contributor II
Author

To clarify - I do not want to only load where there is more than one value - I would like to load only where there are more than one of the same value in the SI_Master_S3_Key value. 

steeefan
Luminary
Luminary

My suggestion is to leave the query as-is and to load the data into a QS table. Based on that table, you could do:

 

Duplicate:
NOCONCATENATE LOAD
  *
WHERE
  SI_Master_S3_Key_Count > 1;
LOAD
  SI_Master_S3_Key,
  COUNT(SI_Master_S3_Key) AS SI_Master_S3_Key_Count
RESIDENT
  S3Table
GROUP BY
  SI_Master_S3_Key;

 

This gives you a table containing all the values of SI_Master_S3_Key for which there is more than one entrie in your S3 table.

kelsie_muller
Partner - Contributor II
Partner - Contributor II
Author

Thank you for assisting, Steeefan!

Can you please provide the full load script? When I add that piece either before or after the existing portions of my load script I get an error:

The following error occurred:
Table 'S3Table' not found
steeefan
Luminary
Luminary

Sure thing: You have to first load the data you are getting from the SQL query into a Qlik Sense table, the one I named S3Table in my script.

S3Table:
NOCONCATENATE LOAD
  *;
SQL SELECT
  //..
)xx;

 Replace SELECT //..)xx; with your statement you posted above. After this block, place the script I posted earlier.