Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
mp802377
Creator II
Creator II

Dimensions with multiple columns - hundreds of millions of rows

I have a table that is about 200 million rows. 60 fields are in that table.  I have several fields that I want to link to one dimension. I have done the whole create crosstab, and created the tables. The problem is I need to do it to several fields

Market1 Market2 Market3 Nextfield1 Nextfield2 Nextfield3 Vendor1 Vendor2 Vendor3 dropDate1 dropDate2 dropDate3
row1 Colorado Utah NewMexico BV MN EL AA BB CC 5/6/2020 9/20/2021 12/20/2021
row2 Montana Nevada Wyoming MS IE LW WN PO SN 1/15/2021 10/25/2020 2/25/2021

Each table was set up like below. I have done this with other projects and it worked fine, but those projects pulled in millions of rows. Not hundreds of millions. Its star schema is one main table, 200 million rows, approx 50 fields with four other tables around it with 200 million rows each. It takes about 10 minutes just to open it.

I want one table. Having one main table with 200 million rows bumped up in a star schema with 4 or 5 other tables with 200 million rows each is killing my resources. This takes hours upon hours to run. I am looking to find a way to keep everything on one table. 

I would like the dimension Market to link someway to Market1 Market2 Market3 without it being in a separate table. 

 

This query works. I have used it many times. But here, it is a problem, because I have a star schema with 4 or 5 tables, each with 200 million rows

Market:

LOAD
[Trans Id], Market1 as [MarketCarrier]
Resident FactTable
WHERE not Market1 '-' or Len(Market1) > 1;

Join(Market)
LOAD
[Trans Id], Market2 as [MarketCarrier]
Resident FactTable
WHERE not Market2 like '-' or Len(Market2) > 1;

Join(Market)
LOAD
[Trans Id], Market3 as [MarketCarrier]
Resident FactTable
WHERE not Market3 like '-' or Len(Market3 ) > 1;

I had created separate tables from what others had done. The market had its own table, Nextfield1 had its own table, Vendor had its own table like shown below. But the problem is, there are 200 million rows in the dataset. And each (one time for main, one time for Market, one time for Nextfield1, one time for  Vendor1,  and one time for dropDate1 ) it has to pull in the 200 million rows. Just to pull in the 200 million rows, it takes Qlikview 2 hours. Last night this ran all night and barely finished by morning.

Is there a better way to do this that doesn't take up all of my resources? Also, this is only 10 days of data. I can't go less than that on the number of days. 

Crosstables do not work for this. Again. Main table has 60+ fields. Matching up against a cross table will cause duplication of data. I cannot concatenate a crosstable to the main table, again it will cause duplication of data in the main fields I am counting or summing. It will count 1 transaction as 4. And that would be bad. 

How do I shut this thread down? It is clear that this isn't something Qlikview can do, and it shouldn't stay open.

Thank you,

Martha

 

Labels (2)
9 Replies
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

What is this Where clause supposed to do?

WHERE not Market1 '-' or Len(Market1) > 1;

-Rob

mp802377
Creator II
Creator II
Author

I don't want to pull in records with a - in it. And I don't want null values.  Sometimes the data comes in like -. The suppress - in the chart rarely works for me.

Vegar
MVP
MVP

What are you trying to achieve by the Joins in the market table? You have identical field names in all joins. Could it be that you could do CONCATENATE instead of Join?

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

The "not Market '-'" does not look like correct syntax to me. 

As @Vegar suggested, I would use Concatenate instead of Join.  Then I would remove the Where clause temporarily to see what impact the Where filtering has.  There could be more efficient ways to filter out those rows, like when you create the Resident table.  What source is the the Resident table created from?

mp802377
Creator II
Creator II
Author

No. they are not identical field names. The field appear in the database like this: Market1 Market2 Market3 Nextfield1 Nextfield2 Nextfield3 Vendor1 Vendor2 Vendor3 dropDate1 dropDate2 dropDate3. Very different field names. 

The query I gave worked before. As I said, I have used it many times. But the issue is I have 200 million rows. Each new table I am creating has 200 million rows. I am trying to find another way to do this. And I really don't think Qlikview can. Having one main table with 200 million rows in a star schema that is bumped up against 4 other tables with 200 million rows is killing my resources. 

I was looking for a way to keep everything on one table. I guess I wasn't clear on that. I want everything in one table instead of 5 tables that contain 200 million rows each. It is killing resources and takes 8 hours to fully run. I am not looking for a way to modify the each table (market, vendor, nextField, and dropDate). I would like to keep it all in one table like it is now and have the chart dimension recognize that the fields Market1 Market2 Market3 are all the same dimension.

mp802377
Creator II
Creator II
Author

The query I gave worked before. As I said, I have used it many times. But the issue is I have 200 million rows. Each new table I am creating has 200 million rows. I am trying to find another way to do this. And I really don't think Qlikview can. Having one main table with 200 million rows in a star schema that is bumped up against 4 other tables with 200 million rows is killing my resources. I need to have it filter out the - and the null because the suppress when null never seems to work in Qlikview charts. I can't have those values in the charts.

I was looking for a way to keep everything on one table. I guess I wasn't clear on that. I want everything in one table instead of 5 tables that contain 200 million rows each. It is killing resources and takes 8 hours to fully run. I am not looking for a way to modify the each table (market, vendor, nextField, and dropDate). I would like to keep it all in one table like it is now and have the chart dimension recognize that the fields Market1 Market2 Market3 are all the same dimension.

Vegar
MVP
MVP

I think we misunderstand each other, I see that you have different Field names,  but you are renaming them  before doing your joins.

Screenshot_20210305-183953_Edge.png

Is it possible to give us a few sample rows as an example and then explain what the desired output of these row would be.

...and as Row was saying. "Market3 like '-' " is not Qlik syntax.

mp802377
Creator II
Creator II
Author

The output of the query above would create below (I didn't add the transID field). But this is not what I am asking. I want one table. What I did below, I did with other Qlikview apps, but that was with millions of lines and that worked fine. But I am bumping up 200 million row table with 4 or 5 other tables like below (each having 200 million rows). It is killing my resources. I just want one table. 

 

Market:
transID MarketCarrier
1 Colorado
1 Utah
1 NewMexico
2 Montana
2 Nevada
2 Wyoming


NextField:

transID NextField
1 BV
1 MN
1 EL
2 AA
2 BB
2 CC

I am looking a way to have one table. I want to have fields Market1, Market2, Market3 match with the dimension in the chart MarketCarrier. I don't want 4 or 5 separate tables. I want one. And as I said, since the comments keep coming back to me creating separate tables, I don't think this is something Qlikview can handle. I can't have a report that takes 8-10 hours to run. And our datasets are larger than this. As I said, 200 million rows is only 10 days. 

Market1 Market2 Market3 Nextfield1 Nextfield2 Nextfield3 Vendor1 Vendor2 Vendor3 dropDate1 dropDate2 dropDate3
row1 Colorado Utah NewMexico BV MN EL AA BB CC 5/6/2020 9/20/2021 12/20/2021
row2 Montana Nevada Wyoming MS IE LW WN PO SN 1/15/2021 10/25/2020 2/25/2021

 

 

Vegar
MVP
MVP

What about this approach?

RAW:
LOAD * INLINE [
RowID, Market1,Market2,Market3,Nextfield1,Nextfield2,Nextfield3,Vendor1,Vendor2,Vendor3,dropDate1,dropDate2,dropDate3
1,Colorado,Utah,NewMexico,BV,MN,EL,AA,BB,CC,5/6/2020,9/20/2021,12/20/2021
2,Montana,Nevada,Wyoming,MS,IE,LW,WN,PO,SN,1/15/2021,10/25/2020,2/25/2021
];

Market:
CrossTable ('Market type','Market')
LOAD RowID, Market1,Market2,Market3
Resident RAW;
DROP Field [Market type]; //Drop if you don't want to keep that info.


Nextfield:
CrossTable ('Nextfield type','Nextfield')
LOAD RowID, Nextfield1,Nextfield2,Nextfield3
Resident RAW;
DROP Field [Nextfield type]; //Drop if you don't want to keep that info.


Nextfield:
CrossTable ('Vendor type','Vendor')
LOAD RowID, Vendor1,Vendor2,Vendor3
Resident RAW;
DROP Field [Vendor type]; //Drop if you don't want to keep that info.


Nextfield:
CrossTable ('dropDate type','dropDate')
LOAD RowID, dropDate1,dropDate2,dropDate3
Resident RAW;
DROP Field [dropDate type]; //Drop if you don't want to keep that info.

DROP TABLE RAW;