Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi again. Question for all you experts out there, especially the folks working in the medical field. I am still working on a presentation regarding Ovarian cancer and am at the point where I am creating outputs and drill down around pharmaceutical usage at the drug class category, drug class, and generic name levels.
My problem seems to be synthetic keys due to the nature of medical claims data. Drug claims come in from pharmacies under an NDC code. But they also come in from clinics and physician offices as a HCPCS code which is on the claim as a procedure code (proc_cde). The claims (FACT) table contains both sets of codes in separate columns since they follow completely different schemas. Problem is that the procedure codes also represent all of the management services and procedures done by physicians and other providers in the same column, not just pharmacy. There is a record type field and a rectype of 'P' means a pharmacy claim so finding/isolating these is not difficult.
What I am creating is an aggregate of the pharmacy utilization and cost so it needs to aggregate the pharma claims from both of these columns. In sql this is easy to do but I am struggling with QlikView. I have the needed reference tables and I have tried creating two separate reference tables (one for the ndc codes and one for the pharma procedure codes) but I end up with a synthetic key and the data does not link or show throughout the document. I also created a master pharmacy reference table but I still have to have separate columns for NDC and proc_cde and the same thing happens.
I also used an alias changing the name in the load process of the proc_cde to nds by an AS statement and that seems to help somewhat but still does not fix things.
I really do not see a way to create a unique key or a linked table because the proc_cde in the claims table is not just pharma data and I need all the other data for other charts, tables and queries. It seems to me that it may be possible to do this if a filter restricting the proc_cde to a rectype of 'P' could be added to the definition of the key or link but I have not seen where or how to do this. The base claims table is downloaded directly from SQL Server and these are typically very large so trying to do data transformation in the claims table is not practical and they are also used for other projects and other queries so they really cannot be changed anyway. This particular claims table is 1.5 million rows and that is small in my world.
Any help/suggestions appreciated. Great group of folks out there and us newbies do appreciate this forum and your help.
Steve
I have tried creating two separate reference tables (one for the ndc codes and one for the pharma procedure codes
Why two reference tables? Why not concatenate them and add a field to distinguish between ndc and pharma codes?
What's in these reference tables? Is joining them to the fact table an option?
It would help if you could provide more information about your data model. A picture would help. A small qlikview document that illustrates the problem would be great. See this document for how to create such an example document safely:Preparing examples for Upload - Reduction and Data Scrambling
Thanks for answering and the suggestions. Not sure about uploading a data example (this is a client's data) but the reference table has all of the detailed information about the drugs. It does not have to have two. I tried it both ways. I added a field for rectype (Good Idea!) but it still does not seem to solve the issue and I am still creating a second synthetic key.
In SQL all you need to do is to add an OR statement so that the query looks at ndc codes OR proc_cde where rectype='P' and you can pull and aggregate all of the data at the same time but I do not see a way to do that here and the joins and synthetic keys are automatically formed.
There are a whole series of tabs with different views, tables,and charts in this project and it was requested that way.
Here is the current data model.
that diagram is neither correct nor accurate. Not sure why it will not upload correctly. Let me try again'
Here is what the data model actually looks like.
Not sure if this would help or not but here is part (not the complete script) of the sql query we use to populate the table (in aggregate fashion) in Excel. Do not want to aggregate it here because I want it to link to the other parts of the workbook and they are tied to pat_id, but it shows the UNION statement used to do this in SQL.
SELECT
[GPI_Drug_Class_Category]'Drug Class Category',
pcl.Imputed_Claimno 'Claims',
pcl.pat_id 'Patients',
pop.pat_region 'Region',
pop.Patient_Age_Category 'Age_Group',
pop.der_sex 'Gender'
-- iif(pcl.quan=9999,0,pcl.quan) 'Quantity',
-- iif(pcl.dayssup=9999,0,pcl.dayssup) 'Days Supply'
FROM [dbo].[OvarianCA_Year3_Claims] pcl,
[dbo].[Pharmacotherapy_Reference] pr,
[dbo].[OvarianCA_Year3_Study_Population] pop
WHERE
pcl.ndc = pr.ndc and
pop.pat_id = pcl.pat_id
UNION
SELECT
pm.[Drug_Class_Category] 'Drug Class Category',
pcl.Imputed_Claimno 'Claims',
pcl.pat_id 'Patients',
pop.pat_region 'Region',
pop.Patient_Age_Category 'Age_Group',
pop.der_sex 'Gender'
-- iif(pcl.quan=9999,0,pcl.quan)'Quantity',
-- iif(pcl.dayssup=9999,0,pcl.dayssup) 'Days Supply'
FROM [dbo].[OvarianCA_Year3_Claims] pcl,
[dbo].[Pharmacotherapy_Markers] pm,
[dbo].[OvarianCA_Year3_Study_Population] pop
WHERE
pop.pat_id = pcl.pat_id and
pcl.proc_cde = pm.code_from_range and
pm.drug_type = 'J-Code'
)
pharma
group by
pharma.[Drug Class Category],
pharma.[Age_Group],
pharma.[Region],
pharma.[Gender]
) Pharma_totals
group by
Pharma_totals.[Drug Class Category]
order by 1,2
Do I understand the problem correctly if I say that you want Pharma_Ref_Combined associated with the claims table using the field proc_cde if rectype is 'P' and otherwise using the field ncr? If you've got rectype in both tables then you can use that to create the key field to associate the two tables, If(rectype='P', proc_cde, ncr) as MyKey, and rename ncr and proc_cde in one or both tables so they're not used to associate the tables.
Hi,
Tried to respond on the website but it would not let me. I cannot do what you suggest because the proc_cde’s also represent a lot of the other services that are done by physicians and other providers to the patient. Only those records which have a procedure code and a record type of pharmacy are pharmacy claims.
Because I need those other codes for a lot of other analyses, in this project and others, I cannot rename either the ndc code column or the proc_cde (CPT code) column.
That claims table is a pretty standard format for medical claims except that I added some additional demographic columns to it so I would not need a separate population table. But I need to be able to find out how to use Qlik with this format of claims table because they will all be similar in other projects as well. The ndc codes are only drug claims and they only come from pharmacies. But the proc_cde can be drug (if the rectype is P) but they can also be inpatient, outpatient, management or surgical claims. Each claim line represents a different thing that was done to and/or for the patient.
Steve
Thank you, thank you! Got up this morning and reread your suggestions and it clicked. I was thinking about this backwards thinking I had to delete data in the fact table (the claims). Other way around!
I created a new column (Pharma_Key) in the claims concatenating the ndc, proc_cde, and record type but only where the record type was P. Then created the same key in the dimension table by concatenating the same three fields and deleted the ndc and proc_cde columns from the dimension table.
Bingo. It worked.
Thank you again
Steve
Great! Glad to hear it worked for you.