Skip to main content
Announcements
See why Qlik is a Leader in the 2024 Gartner® Magic Quadrant™ for Analytics & BI Platforms. Download Now
cancel
Showing results for 
Search instead for 
Did you mean: 
cbaqir
Specialist II
Specialist II

Add/Append Data

I have a dashboard that identifies duplicate records for data integrity cleanup.

The source data does not currently have a date stamp but I would like to show trending over time. Therefore, I want additional files to be appended to the data already there. I DO NOT want duplicates removed. This doesn't seem to work though because After the second load, I only see today's date and the 5/15/17 records are removed. Not sure what's wrong.

On the first load, my script looks like:

DUPE_PROV_RECORDS:
LOAD Npi_of_providers as NPI,
Lastname_of_providers,
Firstname_of_providers,
Middlename_of_providers,
Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers as FULL_NAME,
Npi_of_providers & ' ' & Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers & ' ' &  Specialty_rtk_of_providerspecialties as PROVIDER,
Displaydegrees_short_of_providers as CREDENTIALS,
Specialty_rtk_of_providerspecialties as SPECIALTIES,
Entity_k_of_entityassociations as FACILITY,
date('5/15/2017') as ETL_DATE
// date(Now()) as ETL_DATE
FROM
[..\External_Data\Duplicate Specialty Listing w NPI.xlsx]
(
ooxml, embedded labels, table is Sheet1);

Subsequent loads would use:

DUPE_PROV_RECORDS:
ADD LOAD Npi_of_providers as NPI,
Lastname_of_providers,
Firstname_of_providers,
Middlename_of_providers,
Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers as FULL_NAME,
Npi_of_providers & ' ' & Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers & ' ' &  Specialty_rtk_of_providerspecialties as PROVIDER,
Displaydegrees_short_of_providers as CREDENTIALS,
Specialty_rtk_of_providerspecialties as SPECIALTIES,
Entity_k_of_entityassociations as FACILITY,
//date('5/15/2017') as ETL_DATE
   date(Now()) as ETL_DATE
FROM
[..\External_Data\Duplicate Specialty Listing w NPI.xlsx]
(
ooxml, embedded labels, table is Sheet1);

9 Replies
Anonymous
Not applicable

DUPE_PROV_RECORDS:
LOAD Npi_of_providers as NPI,
Lastname_of_providers,
Firstname_of_providers,
Middlename_of_providers,
Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers as FULL_NAME,
Npi_of_providers & ' ' & Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers & ' ' &  Specialty_rtk_of_providerspecialties as PROVIDER,
Displaydegrees_short_of_providers as CREDENTIALS,
Specialty_rtk_of_providerspecialties as SPECIALTIES,
Entity_k_of_entityassociations as FACILITY,
date('5/15/2017') as ETL_DATE
// date(Now()) as ETL_DATE
FROM
[..\External_Data\Duplicate Specialty Listing w NPI.xlsx]
(
ooxml, embedded labels, table is Sheet1);

concatenate

DUPE_PROV_RECORDS:
ADD LOAD Npi_of_providers as NPI,
Lastname_of_providers,
Firstname_of_providers,
Middlename_of_providers,
Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers as FULL_NAME,
Npi_of_providers & ' ' & Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers & ' ' &  Specialty_rtk_of_providerspecialties as PROVIDER,
Displaydegrees_short_of_providers as CREDENTIALS,
Specialty_rtk_of_providerspecialties as SPECIALTIES,
Entity_k_of_entityassociations as FACILITY,
//date('5/15/2017') as ETL_DATE
  date(Now()) as ETL_DATE
FROM
[..\External_Data\Duplicate Specialty Listing w NPI.xlsx]
(
ooxml, embedded labels, table is Sheet1);

cbaqir
Specialist II
Specialist II
Author

That fails for me. Here is the app.

Anonymous
Not applicable

I can't reload it without sample input files, don't understand, why you use twice:

[..\External_Data\Duplicate Specialty Listing w NPI.xlsx] ?!?


Shouldn't that be 2 different inputs?

cbaqir
Specialist II
Specialist II
Author

I'm not loading it twice. The first time I hard code the date. After that, the date is based on the current date.

Anonymous
Not applicable

could you provide your excel file with some dummy data?

So your plan is you load it everyday with a partial reload with the current date?

cbaqir
Specialist II
Specialist II
Author

I am trying to add trending and since I don't have a date field in the file, I need to create one during load. Loads will be every week or so. I am trying to add all new rows from the new files to the current tables to see the trend on changes to the number of issues.

Anonymous
Not applicable

OK now I understand...

Step one you have to save an initial version:

DUPE_PROV_RECORDS:
LOAD Npi_of_providers as NPI,
Lastname_of_providers,
Firstname_of_providers,
Middlename_of_providers,
Lastname_of_providers & ', ' & Firstname_of_providers & ' ' & Middlename_of_providers as FULL_NAME,
Npi_of_providers & ' ' & Lastname_of_providers & ', ' & Firstname_of_providers & ' '  & Middlename_of_providers & ' ' &  Specialty_rtk_of_providerspecialties as PROVIDER,
Displaydegrees_short_of_providers as CREDENTIALS,
Specialty_rtk_of_providerspecialties as SPECIALTIES,
Entity_k_of_entityassociations as FACILITY,
date('5/15/2017') as ETL_DATE
FROM
[.\Duplicate Specialty Listing w NPI.xlsx]  (
ooxml, embedded labels, table is Sheet1);

STORE DUPE_PROV_RECORDS into DUPE_PROV_RECORDS.qvd (qvd) ;

Then you created an initial qvd, after that you change your script and add data day by day like that:

DUPE_PROV_RECORDS:
LOAD * from  DUPE_PROV_RECORDS.qvd (qvd);

Concatenate

LOAD Npi_of_providers as NPI,
Lastname_of_providers,
Firstname_of_providers,
Middlename_of_providers,
Lastname_of_providers & ', ' & Firstname_of_providers & ' '  & Middlename_of_providers as FULL_NAME,
Npi_of_providers & ' ' & Lastname_of_providers & ', ' & Firstname_of_providers & ' '  & Middlename_of_providers & ' ' &  Specialty_rtk_of_providerspecialties as PROVIDER,
Displaydegrees_short_of_providers as CREDENTIALS,
Specialty_rtk_of_providerspecialties as SPECIALTIES,
Entity_k_of_entityassociations as FACILITY,
Date(Today(1)) as ETL_DATE
FROM [.\Duplicate Specialty Listing w NPI.xlsx] (ooxml, embedded labels, table is Sheet1);

STORE DUPE_PROV_RECORDS into DUPE_PROV_RECORDS.qvd (qvd);

cbaqir
Specialist II
Specialist II
Author

The date is still overwriting the original date. Not sure why. I want it to add, not overwrite.

Anonymous
Not applicable

I just tried it, for me it works...

you may use Step1.qvw only once.