rustyfishbones
Master II

Extract over Different DBs: best practice

Hi All,

I am taking data from a Summarized table.

I need to add additional fields to the Summarized table from other DBs.

The other DBs contain millions of rows of data compared to the Summarized table.

I reload the Summarized table every minute with no problem. What is the best approach to add the additional fields from the other DBs, considering I only need the trx numbers that are already in my Summarized table?

The Summarized table only contains 300k trx numbers.

The other tables contain millions of trx numbers.

What's the best approach?

Regards

Alan

7 Replies
maxgro
MVP

Maybe: filter the other DB with the min/max of the Qlik trx numbers.
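In load-script terms, the min/max idea could be sketched roughly like this (table and field names such as Summarized, trx_number and other_db.big_table are illustrative assumptions, not from the thread):

```
// Sketch only: read the key range from the already-loaded Summarized table,
// then push it down to the other database so it scans only that range.
Bounds:
LOAD Min(trx_number) AS MinTrx,
     Max(trx_number) AS MaxTrx
RESIDENT Summarized;

LET vMinTrx = Peek('MinTrx', 0, 'Bounds');
LET vMaxTrx = Peek('MaxTrx', 0, 'Bounds');
DROP TABLE Bounds;

Extra:
SQL SELECT trx_number, extra_field
FROM other_db.big_table
WHERE trx_number BETWEEN $(vMinTrx) AND $(vMaxTrx);
```

How much this helps depends on how densely the 300k keys sit inside the min/max range; a sparse range still makes the database scan millions of rows.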

rustyfishbones
Master II
Author

If I do that on the SQL SELECT, it takes too long.

I could extract all the data into a QVD layer and then use WHERE EXISTS, but I don't really want to extract 30 million records just to get the 300k that I need.
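For reference, that QVD-then-WHERE-EXISTS pattern would look roughly like this sketch (file and field names are assumed):

```
// Sketch: the 30M-row table has been extracted to a QVD elsewhere;
// here only the keys already present in the Summarized table are kept.
// A single EXISTS clause on an already-loaded field keeps the QVD load optimised.
Extra:
LOAD trx_number,
     extra_field
FROM big_table.qvd (qvd)
WHERE EXISTS(trx_number);   // trx_number must already exist in the loaded data model
```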

I want the option that provides the best performance. Any ideas?

Regards

Alan

Anonymous
Not applicable

For the 30 million row table, could you do an incremental load into a QVD?

maxgro
MVP

From what I understand, you want to add some fields to a Qlik table you load every minute.

These fields come from a different DB with some millions of rows, so you also have to query this other DB table every minute; you only need 300k rows (maybe a GROUP BY or something else), but you have to query some millions of rows to get the 300k rows you need.

Can you use an incremental load on the other DB?

Every minute, query the other DB for the changed records only.
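A minimal incremental sketch, assuming the other DB has a ModifiedTime audit column and that vLastExecTime holds the previous run's timestamp (both are assumptions, not from the thread):

```
// Sketch only: pull just the rows changed since the last run,
// then top up with the unchanged rows from the existing QVD and store it back.
Changes:
SQL SELECT trx_number, extra_field, ModifiedTime
FROM other_db.big_table
WHERE ModifiedTime >= '$(vLastExecTime)';

CONCATENATE (Changes)
LOAD trx_number, extra_field, ModifiedTime
FROM big_table.qvd (qvd)
WHERE NOT EXISTS(trx_number);   // keep old rows that were not re-sent this run

STORE Changes INTO big_table.qvd (qvd);
```

This way only the delta crosses the wire each minute, while the QVD keeps the full history.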

rustyfishbones
Master II
Author

Hi Bill,

Yes I can, but is there a solution where I don't need to take the full 30 million?

Is that the best solution, or is there another possibility?

Regards

Alan

maxgro
MVP

Another option:

How many transactions change every minute (on average)? For a small number of changes,

you can build a dynamic SQL query for the other DB, asking only for the changed transactions.

You have to build the WHERE clause for the other DB, something like

where trx_number in (trx1, trx2, trx3, .....)

where trx1, ..... are the changed ones in the Summarized table.
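In the load script that IN list could be built with Concat(), roughly like this (ChangedTrx is an assumed resident table holding only the changed keys):

```
// Sketch: turn the changed keys into a comma-separated list,
// then inject it into the passthrough SELECT.
List:
LOAD Concat(trx_number, ',') AS TrxList
RESIDENT ChangedTrx;

LET vTrxList = Peek('TrxList', 0, 'List');
DROP TABLE List;

Extra:
SQL SELECT trx_number, extra_field
FROM other_db.big_table
WHERE trx_number IN ($(vTrxList));
```

For non-numeric keys each value would also need quoting, e.g. Concat(chr(39) & trx_number & chr(39), ','), and very long IN lists can hit the database's statement-size limit.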

stevedark
Partner Ambassador/MVP

Hi Alan,

Can you transfer the list of 300K IDs from the one database to the other?  QlikView would allow you to create a text file of the numbers.  SSIS would be able to pull this list into a table in the database with the 30M rows.  Once loaded, you could do an INNER JOIN between the table with the millions of rows and the table with the 300K IDs.  It's not a pure QlikView solution, but it should work.
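Once the IDs have been landed in a staging table, the join can be pushed down in the passthrough SELECT, something like this sketch (table names are assumed):

```
// Sketch: the database does the filtering, so only ~300k rows cross the wire.
Extra:
SQL SELECT b.trx_number, b.extra_field
FROM big_table b
INNER JOIN trx_staging s
    ON s.trx_number = b.trx_number;
```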

If you can't write back to the database, you would have to craft a load script with a WHERE clause that limits the rows correctly.  You could loop over the IDs you want to build a string that could be passed in a WHERE statement.  I'm not sure how many you would be able to pass in one go, though; at some point the SQL statement would break.

Do the rows in the 30M row table change once created, or are they static?  If they are only appended to, it would certainly be possible to load them all once as you suggest (a QVD of everything, then a WHERE EXISTS) and then craft a WHERE statement for only the rows that are new on each run, appending them to the main data.

With anything involving QVD loads, it is imperative that the loads are optimised.  Anywhere you have SQL JOINs, it is critical that you have sensible indexes.

Hope that helps point you in the right direction.

Steve