16 Replies Latest reply: Jun 3, 2015 12:17 PM by Ralf Becher RSS

    Concatenate big tables

    Boris Gerchikov

      Hello,

      I'm trying to concatenate two very big tables to implement star schema in Data model. Having these tables linked makes chart performance very slow.

      The first one has 50 fields and 440 millions records

      The second one has about 160 fields (80% of them are flags) and about 100 million records.

       

      The problem is the Concatenate operation uses one core only, so this process might take couple days to be accomplished.

       

      Has anybody met such problem?

       

      Thanks!

      Boris

        • Re: Concatenate big tables
          Fernando Keuroglian

          Hi boris,

           

          in the two tables all the fields are different? you have any similar field?

           

          you will have to concatenate one time only and the you will use it from a qvd , could be an option

          • Re: Concatenate big tables
            Ralf Becher

            Hi Boris,

             

            can you post your script? Is it an un-optimized load from QVD due to some transformations or additional functions you use? Usually, an optimized QVD load should work smoothly, as long as you have enough memory available..

             

            - Ralf

              • Re: Concatenate big tables
                Boris Gerchikov

                Hi Ralf,

                 

                I cannot post the script, but I can say that the qvd load is optimized. The problem is that Concatinate operation uses 1 thread only while the server has 16 CPUs. It takes approximately the same time eather for the preprocessed in the same script tables or loaded from QVD.

                 

                Thanks!

                  • Re: Concatenate big tables
                    Ralf Becher

                    It's still not clear to me what's happen in your script. However, sometimes load from resident is slower than from QVD. So, it could help to store a processed resident table (esp. when loaded distinct) into QVD and then load again from it, and then concatenate it..

                • Re: Concatenate big tables
                  Boris Gerchikov

                  Ralf,

                   

                  I'm trying to concatenate two large tables. I was trying to do this in several ways:

                   

                  Both tables are residential

                  One residential, second loaded from QVD (optimized load)

                  Both tables are from QVD (optimized load)

                   

                  The problem is this process is extremely slow because only one CPU thread is enabled, When I join these tables QV is using several cores. It looks like Concatenate operation has the same problem as Group By or Inner Join. So, the question is has anybody mitigated similar issue in some way.

                   

                  Thanks!  

                  • Re: Concatenate big tables
                    Mikael Kristensen

                    What is wrong with the standard outer join in this case?

                    "Both tables have common key field used to link them. All other fields are different."

                    If your data are solid it should work?

                     

                    Table_test:

                    Load * Inline

                    [key, testfield, flag1, flag2, flag3

                    A, 125, 1,0,1

                    B, 123, 1,0,1

                    C, 155, 0,1,0

                    D, 110, 0,0,1];

                     

                    Join

                    //Concatenate

                     

                    Table_test2:

                    Load * Inline

                    [key, testfield2 ,flag4, flag5, flag6

                    A, 125, 0,1,0

                    B, What, 1,0,1

                    C, Why, 0,1,0

                    D, How, 1,1,1];

                     

                    Then you have:

                     

                    With concatenate activated instead you have:

                     

                    So why do you need the concatenate version?

                      • Re: Concatenate big tables
                        Martin Pohl

                        I've got the same problem.

                        800 Mio datas from table 1.

                        then concatenate datas from table 2.

                        There are common fields and different fields.

                        While concatenate load the memory grows by 8 Gb - why, because there are no datas in, only a null value....

                         

                        So I load a fact table at the beginning, including all fields from all tables I want to concatenate.

                        The script is not freezing to generate the null values but the memory size is still too large.

                         

                        to join the facts is not the answer.

                        Maybe the material number is the field you want to join.

                        You would load for example 10 datas for one material with sales information.

                        When now joining the facts with storage information (also 10 datasets) you would have 100 records at all, each record is duplicated 10 times.

                        Even if you join by material number and year/month, it could be possible to duplicate datas if there are more than one record in one case.

                        When using all dimesion fields as join fields there are no datas to join, the outer join would create the same result as a concatenate load.

                         

                        To load the datas in different fact tables and using a link table is also a problem.

                        One fact is using material customer year month,

                        the other material, supplier, year month,

                        the next material workplace, year month

                        and mybe another workplace and yearmonth.

                        So a multible number of link fields what also generates a huge link table.

                         

                        I'm thinking there is not THE answer at all

                      • Re: Concatenate big tables
                        yusuuf rawat

                        Concatenate will be processed via a single  CPU core  and therefore is slower 

                        Use  Outer Join which is faster , which will use all available CPU Cores of the server