Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello All,
I have a very big fact table in Teradata (data lab) which i need to pull in my Qlik sense application. This table has approx 30 Million records for each month and i have pull 24 months. Number of columns are about 25 and all of them are necessary (Most of them are flags and IDs which connect to dimension tables).
I am using 2 tier structure so i have one QVD generator which creates all QVDs per month and one application app which pulls all of these QVDs in app including dimension QVDs . (All of my dimensions tables are already stored in QVDs)
Pulling data from DB is taking around 2 hour for one month and hence it takes around 2 days to pull all the data (Can't do incremental load as they change history data as well).Now this process runs once in a month but i want to know is there any other way i can optimize this process ?
Idea (not sure if it is correct) -: Can i have multi QVD generator running in parallel (on weekends only so no load on server)?
Any other suggestion ?
Regards,
Sajal
Two hours for 30 million records seems somewhat slow loading times from your source. Are there any aggregations performed in the extract SQL statement? Are you loading from a database view (with on the fly joins) and not a table?
Normally you may run several SELECT statements parallel in QlikView loading from the same table in a SQL database. But be careful with aggregations in the SQL statements (as they are performed in the source as opposed to Load aggregations) since that may cause problems on the source side.
One possible solution is loading only changed records and update or delete affected rows in your QVD files on a regular basis. That would require a change date or a flag to identify the latest record for each row in the source table.
Two hours for 30 million records seems somewhat slow loading times from your source. Are there any aggregations performed in the extract SQL statement? Are you loading from a database view (with on the fly joins) and not a table?
Normally you may run several SELECT statements parallel in QlikView loading from the same table in a SQL database. But be careful with aggregations in the SQL statements (as they are performed in the source as opposed to Load aggregations) since that may cause problems on the source side.
One possible solution is loading only changed records and update or delete affected rows in your QVD files on a regular basis. That would require a change date or a flag to identify the latest record for each row in the source table.
Thanks for the reply.
I am not using view, i have a simple fact table and i only do Select * from Table, store it in QVD for each month.
No where condition or join nothing..
In future i am going to use incremental load (Once every thing is fixed in historical data). I have also created index on table to improve the extraction process.
Meanwhile i am using parallel sel * from table and it is taking 6 hours to load all data (24 Months). (4 Apps running in parallel )
Sajal