I'm confused that how long it will take to open a 300 million rows data in one qvw file?
Could be opened within 6s without caching? and what is the normal time if not?
Now we have 1 fact table and 24 dimension tables in the qvw file, about 300 million rows data.
And we need to achieve the functions that switch 11 currencies and 3 languages, contain detail tables with invoice numbers and invoice date, the important function is that select invoice date by using two calendars in each sheet.
For hardware, we have upgrade to twelve 2.4 GHz core CPU and 512G RAM, and added two 150G SSD in server.
First I tried to aggr the data in HIVE so that I can reduce the rows of data, but it only reduce 1/5 of 300 million, when I concatenated to the details data, we got about 550 million data, so I given up the point.
And for application, I've tried lots of optimization:
Synthetic keys removed from data model
Remove system keys/timestamps from data model
Unused fields removed from data model
Remove unneeded snow flaked tables (consolidate)
Break concatenated dim. fields into distinct fields
Remove link tables from very large data models, table concatenation is a possible alternative
Use integers to join tables where possible
Use Auto number to replace large concatenated keys