Skip to main content
Announcements
Happy New Year! Cheers to another year of collaboration, connections and success.
cancel
Showing results for 
Search instead for 
Did you mean: 
martynwb
Contributor III
Contributor III

Table not joining correctly despite field names matching

I have an app with two tables that need to be joined via the transaction table. There is a foreign key field in both tables which is meant to relate them: %Key_HOST_312E40B6043C9FFE As QualysHostKey.

However, these tables are not relating. No information in either table is relating to information in the other table, even though I've checked the raw data and all of the key information is there and matches. 

The load script is below. Any help would be appreciated:

Concatenate(Transactions)
LOAD
'Daily' As DataType,
'QUALYS_Detection_Daily' as TransactionSource,
QualysHostID,
QualysHostIP,
QualysHostTrkMethod,
QualysHostOS,
QualysHostLastScan,
QualysHostScanDuration,
%Key_HOST_312E40B6043C9FFE As QualysHostKey,
QualysHostLastVMScanDate,
QualysHostLastVMScanDur,
QualysHostDNS,
QualysHostOSCPE,
QualysHostNetBios,
QualysHostQGHostID,
QualysHostLastVMAuthScanDate,
QualysHostLastVMAuthScannedDur,
QualysHostHostname,
QualysHostDomain,
QualysHostFQDN,
%Key_HOST_LIST_VM_DETECTION_OUTPUT_087173C108D6BD52,
SubDivision,
ApplyMap('MAP_DIV',SubDivision,'NoDivision') as Division,
ApplyMap('MAP_PILLAR',SubDivision,'NoDivision') as Pillar,
QualysHostPublicProvate,
QualysHostType,
QualysHostEOL
FROM [lib://QVD_S1/InfoSec/Qualys_Extract_Host_ALL.qvd]
(qvd);

Concatenate(Transactions)
LOAD
'Daily' As DataType,
'QUALYS_HOST_Daily' as TransactionSource,
QualysID,
QualysDetectType,
QualysDetectSeverity,
Pick(WildMatch(QualysDetectSeverity, '5', '4', '3', '2', '1'),'14', '30', '60', '90', '90') as QualysDetectSevDaysToFixReq,
Pick(WildMatch(QualysDetectSeverity,'5','4','3','2','1'),'Critical','High','Moderate','Low','Informational') As QualysDetectSeverityDesc,
QualysDetectPort,
QualysDetectProtocol,
QualysDetectSSL,
QualysDetectResults,
QualysDetectStatus,
QualysDetectFirstFoundDateTime,
QualysDetectFirstFoundTimeStamp,
QualysDetectFirstFoundShortDate,
QualysDetectLastFoundDateTime,
QualysDetectTimesFound,
QualysDetectLastTestDateTime,
QualysDetectLastUpdateDateTime,
QualysDetectIsIgnored,
QualysDetectIsDisabled,
QualysDetectLastProcessedDateTime,
QualysDetectLastFixedDateTime,
QualysDetectLastFixedDateTime - QualysDetectFirstFoundDateTime As QualysDetectDaysToFix,
if(IsNull(QualysDetectLastFixedShortDate),
if((today() - QualysDetectFirstFoundDateTime) > Pick(WildMatch(QualysDetectSeverity, '5', '4', '3', '2', '1'), '14', '30', '60', '90', '90'), 0, 1),
if(QualysDetectLastFixedDateTime - QualysDetectFirstFoundDateTime > Pick(WildMatch(QualysDetectSeverity, '5', '4', '3', '2', '1'),'14', '30', '60', '90', '90') ,0,1) ) As QualysDetectWithinPolicy,
QualysDetectLastFixedShortDate,
QualysDetectFQND,
%Key_HOST_312E40B6043C9FFE As QualysHostKey
FROM [lib://QVD_S1/InfoSec/Qualys_Extract_Detection.qvd]
(qvd);

1 Solution

Accepted Solutions
martynwb
Contributor III
Contributor III
Author

For anybody reading this and having similar issues. I'll summarise how I resolved mine.

Essentially I wanted to join two tables together (concatenate wouldn't work) and then concatenate them to my Transactions table (where all of my data is stored). 

So I used NoConcatenate on the table to join, then fed the resulting table into Transactions via Concatenate, before dropping the originally joined table. This removed the synthetic keys resulting from the joins.

View solution in original post

12 Replies
marcus_sommer

On one side the values may be a string and on the other side numeric. Also there might be any invisible chars included. Helpful in such cases could be functions like num(num#()) or text() to convert/format the values appropriate and trim() or keepchar() / purgechar() / replace() to include/exclude the proper chars.

- Marcus 

martynwb
Contributor III
Contributor III
Author

Thanks for your reply. It doesn't seem to work though. 

Oddly, I can get the above to work outside of the transaction table. But I'm also loading a 'monthly snapshot' using the same fields. As soon as I do that, the dataset breaks down. 

The above only works if I don't load the monthly snapshot and I join the second table rather than concatenate it...

marcus_sommer

You may need to do it on all sides and depending on the values there you might need (slightly) different logics. Usually such things should be already done by loading the source-data and not afterwards by loading from the qvd's.

- Marcus

martynwb
Contributor III
Contributor III
Author

That's sort of the issue though, this is done pre-load and it works. The data can be downloaded and associated and it works. It works if I load it into a separate table and join the second table. But all data in the entire data model is lost when I create another table with the same data for the purposes of monthly snapshotting.

marcus_sommer

If I understand it right you want to load a dynamic fact-table and also a static (snapshot) fact-table and linking them? If both tables have the identical data-structure they will be auto-concatenated (unless you used a noconcatenate statement) so that you will get only one table. Depending on your logic you may delete this table in the end and then getting an empty datamodel.

Like mentioned you could force it to separate tables but without an appropriate re-naming of the fields you will create multiple synthetic keys and/or circular loops and the application won't work properly anymore. If you do all the renaming you will avoid the above issue but handling all those different fields could become quite nasty.

I assume you want to compare the dynamic and the static data to detect any changes between them and for this I would concatenate the tables and adding an appropriate source-information like: 'dynamic' as Source. This flag-field could be then used to select the different subsets of data and/or to use it as set analysis condition like:

sum({< Source = {'dynamic'}>} value)

and 

sum({< Source = {'static'}>} value)

and comparing both expressions side by side or

sum({< Source = {'dynamic'}>} value) - sum({< Source = {'static'}>} value)

calculating directly a difference with it.

- Marcus

martynwb
Contributor III
Contributor III
Author

That's the problem though, that is exactly what I'm doing for a number of fact tables (including this one). It's for trending purposes. It works in all other cases but when I load the dataset above (twice), the dataset for the whole app is broken and nothing will load. I think it's something to do with the join (concatenate doesn't work).

The data works fine if I load just the script above, but fails when I load it twice (once for trending and once for 'live'). Even if I rename all of the fields, putting the word 'monthly' in, it still creates 12 synthetic keys and breaks everything.

Any ideas? I can post both load statements if you prefer?

 

marcus_sommer

I don't want to say that's not possible is to build datamodels with multiple fact-tables but it's far away from trivial and has often massiv disadvantages in regard to the performance + usability.

Therefore again my suggestion to concatenate those fact-tables and developing the datamodel in the official recommended direction of a star-scheme.

- Marcus

martynwb
Contributor III
Contributor III
Author

Thanks Marcus. I can't do that in this situation unfortunately.

Also, I may have misled. I don't compare the 'live' table with the 'trend tables'. I just compare the trend tables with each other. I.e. a QVD is created every month, which are then used for trend analysis. The operational table is used to show the 'current truth'.

I'm sure I've just done something stupid to prevent this from loading but I think it's too complex to articulate over a forum, unfortunately.

marcus_sommer

So this sounds as if you had just a normal dataset which is by the facts sliced into monthly data. This means the data-structure is mainly the same (a few differences over the time don't matter) and so all data belong together into a single fact-table.

There is no need to create separate tables. With the single table you have everything what you need to create your needed views - just pull the year-month as dimension into your objects and/or use it as set analysis conditions. It's really very simple. Everything else with multiple tables leads in the wrong direction and making it much more complex as necessary.

- Marcus