Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Extremly bad performance on load of wide table when making it one column wider.

When I add one more column to a quite wide (ca 60 columns) table performance just goes down the drain.

I have a table with approx 60 columns. I load this from SQL and save as a QVD.

For some reason when I ad one more column perfomance goes from acceptable to impossible.

Suddenly it would take days instead of an hour to complete and the CPU-usage goes to max on all cores and basically makes the machine unusable.

It seems quite deterministic. If I ad one column, but remove another one, it is ok, but increasing the total number of columns, and perfomance goes bad.

I do realize that there might be better ways to organize the data in a snowflake pattern to reduce the number of columns in the table.

But the behavoiur with sudden extreme drop of perfomance is strange. Do you know what the reason for this is? If there is a magic number around 60 columns. Or what can be done for this not to happen?

The table is mostly just fetching data from the sql and storing, but there is also around 15 flags and processed parameters like sums and date-extractions.

(I am running on a big server with lots of ram and CPUs)

events:

LOAD "field1",

    "field2",

    "field3",

    Date(DayStart(timestamp1),'YYYY-MM-DD') as at_date,   //some generated fields ...

    Date(MonthStart(timestamp1), 'YYYY-MMM') as at_year_month,

    …

    …

    …

    "field60";

SQL SELECT "field1",

    "field2",

    "field3",

    "timestamp1",

   …

   …

   …

    "field60"

FROM some_table"

;

store events into events.qvd;

Advice would be greatly appreciated.

/Andreas

2 Replies
Alexander_Thor
Employee
Employee

Hey Andreas,

You don't happend to load from the QVD file before or after the SQL query so QV creates syn tables for you?

Afaik there is no column resitriction as long as your machine can handle the volume.

I'm sitting here with a 100+ column wide query and it works fine even though I store straight into a QVD, dropping my sql connection and reading back the QVD to do transformations.

PS. You can floor() your timestamps instead of doing DayStart which should in theory be faster.

Not applicable
Author

Alexander, thanks for your reply.

I'm not quite sure I understand your question " load from the QVD file before or after the SQL query"

This document basically just load s couple of tables and saves them to qvd files.

How do you drop the connection? Are you saying its better not to do any transformations like Date and such on the intial load to qvd?