Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Mastering Change Data Capture: Read Now
cancel
Showing results for 
Search instead for 
Did you mean: 
jasmina_karcic
Creator III
Creator III

Left join (big tables)

Hi experts,

I am loading one table (cca 600 000 000 rows, 20 columns)... It is going pretty quickly. But, I need a new column add to this, from another table. Another table have like 5 000 000 rows. 

It is very slowly if I use left join, what can make it faster?

 

Thank you

 

Jasmina

Labels (1)
  • SaaS

2 Solutions

Accepted Solutions
marcus_sommer

Personally I would use mapping like the others already suggested. It avoids all the risks which could occur with joins (adding or removing records depending on the join-type) and it will be surely faster as a join. Further it provides the possibility to define any default-value if there is no match and also to do further calculations at once - and not like by join which would require another load for any further checks/calculations. Also I would look if the mapping-table might not be reduced to a lesser number of records with some kind of advanced filtering. This may look like:

TableForExists: load article from ArticleRawdata where category = 'very important';

/* it's just an example - it may also some generic approach like the recent one from yours (this with the concatenating) with creating the dates from the current year */

ExtractMappingData: load lookup, return from SecondTable where exists(article);

Map: mapping load * resident ExtractMappingData;

drop tables ExtractMappingData, TableForExists;

It may look like expensive efforts but nothing worked better to optimize the load-performance by large datasets - unless you used incremental approaches. In your case with such large tables I would try to implement incremental loadings at first and if it's then further necessary to reduce the load-times the above mentioned approach could be also implemented on top.

- Marcus

View solution in original post

marcus_sommer

I think you could do the second step within the first one, like:

if(column2=0, applymap(), column2)

- Marcus

View solution in original post

14 Replies
Anil_Babu_Samineni

Can't you add directly into the fact table for new column also?

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
jasmina_karcic
Creator III
Creator III
Author

In this moment no.

What is fact, I have something like this:

In big table ( more than half billion rows) i have something like this

column1                           column2

1                                           0

2                                           0

3                                           0

4                                          40

5                                          60

In other table i have 

column1                      column2

1                                     30

2                                     20

3                                     20

 

This column2 from second table is from big importance to me, I need that data...But how...

I was thinking like:

table1:

load * from table1.

left join (table1)

load column1,

          column2 as different_column

from table2;

 

From this I will get a new column, but I would have data. I need it in this table1, because this table1 is connected to other data that I need.

 

Jasmina

Mauritz_SA
Partner - Specialist
Partner - Specialist

Hi @jasmina_karcic 

Have you looked at using ApplyMap? Here is an article by Henric Cronström explaining why it is better than joining if you just want to get a single value from a table.

Hope this helps.

Regards,

Mauritz

Anil_Babu_Samineni

Since there was same column, Simply do concatenate on both

Load column1, column2 From T1:

Concatenate

Load column1, column2 From T2:

Or, you can define using ApplyMap concept

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
jasmina_karcic
Creator III
Creator III
Author

If I use concatenate and get this:

 

column1  column2           column3             column4

1                      0                         A                              B

2                      0                         A                              B

3                      0                         A                              B

1                      20                       null                            null

2                      30                        null                            null

3                      20                        null                            null

 

then how can i make table:

column1  column2           column3             column4

1                      20                        A                              B

2                     30                       A                              B

3                      20                        A                              B

 

Jasmina

 

p.s. now I will look applying map

petter
Partner - Champion III
Partner - Champion III

You will get the requested table iIf column1 is a unique key then having:

1) column1 as a dimension

2) column2 as a measure with the expression Sum(column2)

3) column3 as a measure with the expression Concat(column3) or Maxstring(column3) or Minstring(column3)

4) Same as for column3 but with column4 instead.

 

Mauritz_SA
Partner - Specialist
Partner - Specialist

Hi @jasmina_karcic 

Below should explain the ApplyMap principle.

Table2:
Mapping
LOAD * INLINE [
column1,column2
1,30
2,20
3,20
];

Table1:
LOAD * INLINE [
column1,column2
1,0
2,0
3,0
];

Rename Table Table1 to Temp;
NoConcatenate
Table1:
LOAD
column1,
ApplyMap('Table2', column1, Null()) as column2
Resident Temp;

Drop Table Temp;

In your case you won't have to do the rename and drop as you'll already have the table. You can obviously use logic to only use ApplyMap if the original table didn't have a value, or if it was 0, etc.

Regards,

Mauritz

jasmina_karcic
Creator III
Creator III
Author

columns that i present are not measures, just dimensions....so i cannot use sum

petter
Partner - Champion III
Partner - Champion III

Then you can have a preceding LOAD with a GROUP BY clause to contract the table into what you want. It doesnt matter whether the fields are dimensions or not in the load script. What you want to do is get rid of the null values and merge the rows that belong together....

But you might have to do a RESIDENT on the big concatenated table so I don't know if the performance or memory consumption will be good... ApplyMap might work better for you here. You'll have to give it a try.