bgerchikov
Partner - Creator III

Open big document issue

Hello,

We have a big document of about 50 GB.

Opening this document in a "Clear" state or via binary reload takes about 3-5 minutes. However, if any selection was made before the document was closed, opening it might take an hour or even more. While the document is coming up, Task Manager shows up to 100% CPU utilization for 1-2 minutes and 48 GB of memory in use (out of 384 GB); then CPU utilization falls to 3-4% and the application just sits waiting.

The CPU is an Intel(R) Xeon(R) E5-2690 0 @ 2.90GHz.

The application itself has 8 charts, with an average calculation time of about 30 seconds.

Could you offer any suggestions?

Thanks!

Boris

14 Replies
mrybalko
Creator II

Hello, Boris

I suggest splitting this document using document chaining.

More information: Best Practices for Data Modelling (page 19).
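
For example, a chained document can reuse the big document's data model via a binary load and then drop what it doesn't need. A minimal sketch, assuming the source file is BigDocument.qvw and the field names are illustrative:

// Binary must be the first statement in the chained app's script
Binary [..\Source\BigDocument.qvw];
DROP FIELDS LongComment, RawText;  // fields this chained app does not need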

giakoum
Partner - Master II

This is a really big app. What is the app size on disk?

DavidFoster1
Specialist

That is an impressive document.

To improve first-page rendering, try:

  1. Avoid tables that show more than 1000 rows without a calculation condition (see the sketch after this list).
  2. Have a filter condition (e.g. current year) triggered OnOpen.
  3. Run the document through Rob Wunderlich's Document Analyzer to see if you can remove any unneeded data.
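
A minimal sketch of a calculation condition for point 1, assuming a key field named OrderID (substitute one from your own model) - the object only renders once the selection is narrow enough:

// Calculation Condition expression in the object's properties
Count(DISTINCT OrderID) <= 1000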


You didn't mention how many CPUs and cores your server has. Using a very rough rule of thumb of 8 GB of RAM per core, 384 GB of memory would call for about 48 cores (e.g. eight 6-core CPUs, or something like that).

bgerchikov
Partner - Creator III
Author

Thanks!

Here are some answers:

The document has been analyzed and redundant fields have been removed... all data is in one fact table (550 million records, 180 columns) plus one small dimension table (60 records). Size on disk is 44 GB.

The server has 16 cores - do you mean that's not enough? A filter on open doesn't help.

giakoum
Partner - Master II

Both the row count and the column count are a lot. 180 columns - do you use them all? What do they contain? Text? How many distinct values per column? If you cannot reduce the columns and the distinct values, you should seriously consider splitting the application.
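
One quick way to check the distinct values per column is a small script loop - a sketch, assuming the fact table is named Facts:

// Build a table of distinct-value counts per field of 'Facts'
FOR i = 1 TO NoOfFields('Facts')
    LET vField = FieldName($(i), 'Facts');
    Cardinality:
    LOAD '$(vField)' as Field,
         FieldValueCount('$(vField)') as DistinctValues
    AutoGenerate 1;
NEXT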

marcus_sommer

If there are high-cardinality fields in your app, like row IDs or timestamps, you could reduce the app size (and open time) a lot by splitting these fields into two or more fields, following patterns like these:

date(floor(Timestamp), FORMAT) as Date,

time(frac(Timestamp), FORMAT) as Time

This logic can be applied to other numeric or string fields, too.
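
A minimal sketch of the pattern inside a full LOAD, with illustrative format strings and source file:

Facts:
LOAD
    Date(Floor(Timestamp), 'DD.MM.YYYY') as Date,
    Time(Frac(Timestamp), 'hh:mm:ss') as Time,
    Amount
FROM FactData.qvd (qvd);  // hypothetical source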

- Marcus

giakoum
Partner - Master II

By the way, it is 44 GB on disk and nearly 50 GB in memory? What compression format are you using?

jfkinspari
Partner - Specialist

Do you use Section Access in the document?

Section Access can increase the document open time, while binary reload times are unaffected.
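
For reference, a minimal Section Access block looks like this (the users are illustrative); if the script contains something like it, it is worth testing the open time without it:

Section Access;
LOAD * INLINE [
    ACCESS, NTNAME
    ADMIN, MYDOMAIN\ADMIN1
    USER, MYDOMAIN\USER1
];
Section Application;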

DavidFoster1
Specialist

180 columns! Ouch!

When you are dealing with those kinds of row volumes, you should be looking at 20-30 columns, no more - especially if you have lots of text values.

Do you have a table/chart that tries to show more than 1000 rows/bars/lines/dots on open?

Is this a VM or a physical server?