4 Replies Latest reply: Mar 16, 2012 12:32 PM by Henric Cronström RSS

Script conflicts - exponential computing

lollo figo

Hi all,

 

I'm having a little trouble solving the following issue. I loading data from 2 separate files. Once the files have been loaded I have to calculate a series of new fields based on the data that has just been loaded. Nothing easier...I thought.

Well, I get to the result somehow but not in the right way I'm guessing. In other words the script does make all the calculations but creates for each step separate tables, each with the extension -x. On the reference manual I read this is done when tables go in conflict.

 

Well then, I re-designed the script as follows:

 

[MAIN]: 
 LOAD [n contract], 
     [n Master],
     [carrier] as NUMCIE, 
     [Year]
     Polizza, 
     mid(Polizza,findoneof(Polizza, '123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\/')) as polizza_short
FROM
[Z:\Query Master 2012.csv]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);

[MAIN]:
concatenate LOAD [n contract], 
     [n Master],
     [carrier] as NUMCIE, 
     [Year]
     Polizza, 
     mid(Polizza,findoneof(Polizza, '123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\/')) as polizza_short
FROM
[Z:\Query Master 2011.csv]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);

join LOAD PURGECHAR (NUMCON, '="') AS [n contract], 
        PURGECHAR (REGCON, '="') AS [Facility]
FROM
Z:\FCLTY1.CSV
(txt, codepage is 1252, embedded labels, delimiter is ';', msq);

join LOAD [n contract], [n Master], Facility, (if(isnull([n Master]), if(not isnull(Facility), Facility), if(isnull(Facility), [n Master], Facility))) as [Desc_Master] resident [MAIN];

join LOAD [Desc_Master],  (if(Desc_Master = '11421086', 'RCO', if(Desc_Master = '11668620', LIAB', **NOTHING**)) as description_master resident [MAIN];

join LOAD [Polizza], if(left([polizza_short], 3)='920', 'Global',   if(left([polizza_short], 3)='609', 'Global', 'PRIV')) as [3_cifre_polizza_short] resident [MAIN];

 

The script above works, as long as in the two Query Master files I import have just a very limited number of rows. What I noticed is the following: with the first two loads I import some 482 rows. The third load imports another 370 rows. No prob here.

What happens next is that in the last 3 steps (the last 3 join LOAD...) the number of loaded rows is absurd: 849, 1.696, 104.515 respectively.

 

I can't understand where the mistake is. Seems as if there is a wrong join somewhere that's making row expand exponentially. But where?

 

Any help will be greatly appreciated. Thanks

  • Script conflicts - exponential computing
    Jagan Mohan

    Hi,

     

    Try using Left/Inner join.

     

    Regards,

    Jagan.

  • Re: Script conflicts - exponential computing
    Henric Cronström

    Your three last LOAD statements are basically statements where you take the "Main" table and join it onto itsef. Which of course will lead to exponential growth of the "Main" table as soon as you have multiple records with the same value.

     

    It will probably work better if you instead use temporary tables. Leave the first three LOAD statements as they are, then replace the last three with:

     

    Main2:

    LOAD [n contract], [n Master], Facility, (if(isnull([n Master]), if(not isnull(Facility), Facility), if(isnull(Facility), [n Master], Facility))) as [Desc_Master] resident [MAIN];

    Drop table MAIN;

     

    Main3:

    LOAD [Desc_Master],  (if(Desc_Master = '11421086', 'RCO', if(Desc_Master = '11668620', LIAB', **NOTHING**)) as description_master resident [Main2];

    Drop table Main2;

     

    FinalMain:

    LOAD [Polizza], if(left([polizza_short], 3)='920', 'Global',   if(left([polizza_short], 3)='609', 'Global', 'PRIV')) as [3_cifre_polizza_short] resident [Main3];Load

    Drop table Main3;

    • Script conflicts - exponential computing
      Oleg Troyansky

      Henric is correct, of course, about the reason for growing volumes. Using Join in those cases is very troublesome. Let me add a slight correction to the suggested code:

       

      1. All 3 transformations can be performed using preceeding loads.

      2. Only 2 of the three transofrmations need to be performed sequentially, so we don't have to stack up all three of them.

      3. Don't forget to include "*" in order to carry over all other fields that don't require transformation, or you'll lose them on the way. So, the final result could look like this:

       

      ...

      Main2:

      load

           *,

      (if(Desc_Master = '11421086', 'RCO', if(Desc_Master = '11668620', LIAB', **NOTHING**)) as description_master,

      if(left([polizza_short], 3)='920', 'Global',   if(left([polizza_short], 3)='609', 'Global', 'PRIV')) as [3_cifre_polizza_short]

      ;

      LOAD

           *,

           (if(isnull([n Master]), if(not isnull(Facility), Facility), if(isnull(Facility), [n Master], Facility))) as [Desc_Master]

       

      resident [MAIN];

      Drop table MAIN;

      rename table Main2 to MAIN;