Hello all,
I am having an issue joining data from multiple tables: values are not showing correctly because of null values. Below is a dummy example of what part of my data load editor script looks like:
LOAD
Date,
special_id,
campaign,
measure1
FROM [lib://QS Data/table_1.qvd]
(qvd);
LOAD
Date,
campaign,
measure2
FROM [lib://QS Data/table_2.qvd]
(qvd);
The special_id field does not have a value for every value in the campaign field, so when I create a table with campaign, measure1, and measure2, the measure2 value only shows when there is a special_id associated with the campaign, even though measure2 actually has a value.
Can someone help me restructure my load script so that measure2 still shows even when there are no values in special_id?
I think you have lots of synthetic keys if you are using more than one field to associate your tables. That could be one of the reasons behind the problem, but I am not sure of the complete situation here.
It would be helpful if you could share a data model viewer image so we can see how you are connecting your tables. If possible, make a composite key, or rename fields so that they belong to a single table only. The ideal situation is to have one single key field in a table to associate it with any other table. If your logic needs two fields to associate with another table, make a new field by combining them, e.g. FieldA & '-' & FieldB AS NewKeyField, and do that in both of the tables that require the association, as in the sketch below.
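Here is a minimal sketch of that idea, using the dummy table_1/table_2 loads from the first post (the table labels are my additions): the two shared fields are combined into one key, and the plain fields are kept in only one of the tables so that the key is the only association.

Table1:
LOAD
    Date & '-' & campaign AS CampaignKey,    // composite key built from the two shared fields
    Date,
    campaign,
    measure1
FROM [lib://QS Data/table_1.qvd]
(qvd);

Table2:
LOAD
    Date & '-' & campaign AS CampaignKey,    // same key expression, so the tables associate on one field only
    measure2
FROM [lib://QS Data/table_2.qvd]
(qvd);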
Thanks,
Hi, there must be something else going on; if the campaign is the same for measure1 and measure2, the value of special_id should not matter. Something else must be affecting the value.
Maybe you are calculating measure2 as a measure without an aggregation function (Sum, Avg, ...). If there is more than one possible value for measure2, it will show null.
It could have other causes, maybe a join done later, or selections that filter values by special_id, but with only this script it should work.
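As a quick illustration of the aggregation point (just a sketch, not the actual master measure in this app): in a chart, a bare field reference behaves like Only() and returns null when a dimension row has more than one value, while an aggregation function returns a value.

measure2                    // behaves like Only(measure2): null when a campaign has several measure2 rows
Sum(measure2)               // aggregated: returns a value even with multiple rows per campaign
Count(DISTINCT measure2)    // another option, counting the distinct values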
Hi @Ruben,
measure2 has been converted to a master measure using a distinct count aggregation. I've only deduced that special_id is causing the issue because when I remove it from my load script, the measure2 data shows correctly by campaign in my table.
My load script actually has multiple tables with special_id in them. For some of those tables, not every day has a value in every field because there is simply no result on that particular day; when I remove special_id from the tables that don't have results every day and leave it only on the tables that do, the data seems to come through as expected.
Hi, maybe you have a join between the tables: when special_id is on that table it tries to find records that match by both date and special_id, and when you remove it, it only matches by date.
You can add "exit script;" statements to find where the data disappears. Put "exit script;" after the table that loads measure2 and check whether the data exists; if it is there, move the "exit script;" further down and check again... or post more of your script so we can see what it is doing.
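A minimal sketch of that approach, using the dummy table_2 load from the first post (the exact placement in the real script will differ):

LOAD
    Date,
    campaign,
    measure2
FROM [lib://QS Data/table_2.qvd]
(qvd);

exit script;    // the reload stops here: open the data model viewer and check whether measure2 looks right,
                // then move this line further down the script and reload again to narrow down where it breaks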
Hi Ruben,
Below is the full script:
LOAD
datetime,
"Date",
click.Date,
"Date time local",
message_id,
click.event_id,
click.id,
campaign_name,
client_name,
client_os,
client_os_family,
client_type,
contact_email,
person__id,
flow_id,
person__email,
email_uuid
FROM [lib://QS Data/email_clicks.qvd]
(qvd);
LOAD
"Date",
receive.Date,
"Date time local",
campaign_name,
flow_id,
receive_id,
email_domain
FROM [lib://QS Data/email_receive.qvd]
(qvd);
LOAD
"Date",
unsubscribe.Date,
"Date time local",
campaign_name,
flow_id,
unsub_id,
contact_email
FROM [lib://QS Data/email_unsubscribe.qvd]
(qvd);
LOAD
"Date",
bounce.Date,
"Date time local",
message_id,
bounce_id,
flow_id,
campaign_name,
contact_email
FROM [lib://QS Data/email_bounce.qvd]
(qvd);
LOAD
"Date",
subtolist.Date,
"Date time local",
list,
subtolist_id,
contact_email as unsub_email,
contact_email
FROM [lib://QS Data/email_subtolist.qvd]
(qvd);
LOAD
"Date",
markasspam.Date,
"Date time local",
campaign_name,
mark_as_spam.id,
contact_email,
flow_id,
markasspam.email,
markasspam.personid
FROM [lib://QS Data/email_markasspam.qvd]
(qvd);
LOAD
ga_user_id,
ga_session_Id,
"date"as Date,
totalTransactionRevenue,
transactions,
campaign_name,
medium,
channelGrouping,
source
FROM [lib://QS Data/BQ_web_sessions.qvd]
(qvd)
Where channelGrouping = 'Email';
I highlighted the last table (the BQ_web_sessions load) in orange, as that data source is the only one that seems to be impacted by the tables being joined; only measures from this table are incorrect. It is also the only table whose QVD file comes from a different data source.
The flow_id field is the special_id example I was talking about. When I use campaign_name in a table, the campaigns without a flow_id show no result when there should be one.
In general, I am finding that when the tables that don't have values every day are in the load editor, the orange table does not show the correct value.
You might need to check whether the chart uses anything from the other tables. If the chart uses only fields from the orange table, it should show correct values in the measure. On another note, I see contact_email, Date, etc. in many tables of your script; I hope you have handled that by renaming them to avoid synthetic keys / circular references, for example as below.
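Just a sketch of the renaming idea, using one of the tables from your script (the new field name is only a suggestion): a shared field that is not meant to be a key can be renamed so it no longer links the tables.

LOAD
    "Date",
    unsubscribe.Date,
    "Date time local",
    campaign_name,
    flow_id,
    unsub_id,
    contact_email AS unsubscribe.contact_email    // renamed so contact_email no longer associates this table
FROM [lib://QS Data/email_unsubscribe.qvd]
(qvd);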
Thanks,
Hi,
In both scenarios, where the chart measure comes only from the orange table, and where it is mixed with fields from the other tables, the data is not populating under campaign_name.
Regarding the renaming: I have kept those names on purpose because there are common values that can be joined on those fields. However, there are instances where a joined field has shared values plus additional ones; for example, the orange table's campaign_name field has common values plus other values that don't exist in the other tables.
Do you think there is something with my naming that I need to change?
Hi, I agree with @Digvijay_Singh, the relations could be making a complex model with a lot of synthetic keys. A quick fix could be to concatenate all the tables into one big fact table: just add "Concatenate (FirstTableName)" before each of these loads, as in the sketch below.
It would not be the best solution, but it's a start.
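A minimal sketch of what that could look like, assuming the first load is given the label Clicks (the label and the shortened field lists are my additions):

Clicks:
LOAD
    "Date",
    campaign_name,
    click.id
FROM [lib://QS Data/email_clicks.qvd]
(qvd);

Concatenate (Clicks)
LOAD
    "Date",
    campaign_name,
    receive_id
FROM [lib://QS Data/email_receive.qvd]
(qvd);

// repeat "Concatenate (Clicks)" in front of each of the remaining LOADs so everything lands in one fact table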
I cleaned up my load script so that there was only one synthetic key, and it looks like it worked. Thank you!
I thought it would be fine just to add fields as long as there are common values in other tables, despite having many synthetic keys, but it looks like I was wrong...