15 Replies Latest reply: Dec 19, 2012 8:28 AM by Michael Ionkin RSS

    Splitting the data - On the way to granularity

    Michael Ionkin

      Hello community,

       

      I have a really tricky problem:

       

      One Table has State-Information (Field lState)

      This state has a start time and a end time.

      lStatetStarttEndDuration
      122345420121126 11:04:3320121126 11:10:406 min
      122345520121126 06:00:0020121127 13:00:001 d 7 h

       

      I would like to achive a more detailed granularity:

       

      I have three time zones per day:

      TZ1: 06:00:00 - 14:00:00

      TZ2: 14:00:00 - 22:00:00

      TZ3: 22:00:00 - 06:00:00 (on the next day)

       

      The split should lead to the following output:

       

      IDlStatetStarttEndDurationDayTimeZone
      1122345420121126 11:04:3320121126 11:10:406 min20121126TZ1
      2122345520121126 06:00:0020121126 14:00:008 h20121126TZ1
      3122345520121126 14:00:0020121126 22:00:008 h20121126TZ2
      4122345520121126 22:00:0020121127 06:00:008 h20121126TZ3
      5122345520121127 06:00:0020121127 13:00:007 h20121127TZ1

       

      For sure I'll need some for next loops , but as I'm quite new to QlikView this is a too big fish for me.

       

      Maybe somebody is interested in solving this nice problem or has already solved a similar problem.

       

       

      Any help is welcome

       

      Cheers

       

      //chesterluck

        • Re: Splitting the data - On the way to granularity
          Gysbert Wassenaar

          See attached qvw. Somebody can probably do this more efficiently, but this kludge seems to work at least.

            • Re: Splitting the data - On the way to granularity
              Michael Ionkin

              Hey it looks already not bad, but still there is is a logical error.

               

              I loaded new record to test the split:

               

              lState,tStart,tEnd,Duration

              1223456,20121128 02:00:33, 20121130 01:02:02, 1 d 4 h

              Here is the output:

              ScreenSplit.jpg

              The target-output should look like this:

               


              Original dataset: 1223456, 20121128 02:00:33, 20121130 01:02:02

              New_IdlStatetBegintEndDuration (Calculated)TimeZoneDate
              1223456_1122345620121128 02:00:3320121128 05:59:59tEnd-tBeginTZ320121128
              1223456_2122345620121128 06:00:0020121128 13:59:59tEnd-tBeginTZ120121128
              1223456_3122345620121128 14:00:0020121128 21:59:59tEnd-tBeginTZ220121128
              1223456_4122345620121128 22:00:0020121129 05:59:59tEnd-tBeginTZ320121129
              1223456_5122345620121129 06:00:0020121129 13:59:59tEnd-tBeginTZ120121129
              1223456_6122345620121129 14:00:0020121129 21:59:59tEnd-tBeginTZ320121129
              1223456_7122345620121129 22:00:0020121130 01:02:02tEnd-tBeginTZ320121130

              I ll try to change the code, but it would be great if you also would have a look

              Thanks a lot

              //chesterluck

                • Re: Splitting the data - On the way to granularity
                  Michael Ionkin

                  Hi all,

                   

                  I elaborated a bit and wrote soe code, maybe it is helpful to answer the question. As I said, Im quite new to QlikView

                  and please you to exuse my mistakes. Here the code:

                   

                   

                  ps Thanks a lot

                  Inline_Load:


                  load
                  lState,
                  Timestamp#(tStart,'YYYYMMDD hh:mm:ss') as tStart,
                  Timestamp#(tEnd,'YYYYMMDD hh:mm:ss') as tEnd,
                  hour(Timestamp#(tStart,'YYYYMMDD hh:mm:ss')) as hStart,
                  hour(Timestamp#(tEnd,'YYYYMMDD hh:mm:ss')) as hEnd, 
                  tEnd - tStart as Duration

                  Inline [
                  lState,tStart,tEnd
                  1223454,20121126 11:04:33, 20121128 13:10:40
                  ];


                  //Variables
                  Set tSplitEnd = 'YYYYMMDD hh:mm:ss';
                  Set tSplitStart = 'YYYYMMDD hh:mm:ss';


                  If ((hStart >= 22 and hEnd < 6 and Duration <= 8) OR (hStart >= 6 and hEnd < 14 and Duration <= 8) OR (hStart >= 14 and hEnd < 22 and Duration <= 8)) Then
                  //MsgBox("Do Nothing")
                  ELSE

                   

                  Let tSplitStart = tStart;



                  If hStart > 22 Then
                  Let tSplitEnd = Timestamp#(Date#(tBegin, 'YYYYMMDD') + 1 + MakeTime(5,59,59), 'YYYYMMDD hh:mm:ss');

                  ELSEIF hStart < 6 Then
                  Let tSplitEnd = Timestamp#(Date#(tBegin, 'YYYYMMDD') + MakeTime(5,59,59), 'YYYYMMDD hh:mm:ss');

                  ELSEIF hStart < 14 Then
                  Let tSplitEnd = Timestamp#(Date#(tBegin, 'YYYYMMDD') + MakeTime(13,59,59), 'YYYYMMDD hh:mm:ss');

                  ELSE
                  Let tSplitEnd = Timestamp#(Date#(tBegin, 'YYYYMMDD') + MakeTime(21,59,59), 'YYYYMMDD hh:mm:ss');

                   

                  End If



                  For i = 1 to Duration/8 + 2 //2 as appx add. quant
                    If tSplitBegin < tEnd Then
                   
                     tStart = tSplitStart;
                    
                     If tSplitEnd <= tEnd Then
                     Let tEnd = tSplitEnd;
                     Else
                     Let tEnd = tEnd;
                     End If
                    
                     tSplitBegin = Timestamp#(Date#(tSplitBegin, 'YYYYMMDD') + MakeTime(8,0,0), 'YYYYMMDD hh:mm:ss');
                        tSplitEnd = Timestamp#(Date#(tSplitEnd, 'YYYYMMDD') + MakeTime(8,0,0), 'YYYYMMDD hh:mm:ss');
                     
                          ELSE
                       End If
                       
                      Next
                  //Should be used somewhere in the code
                  //concatenate (Inline_Load) load

                  • Re: Splitting the data - On the way to granularity
                    Gysbert Wassenaar

                    Ok, so you want Day to be the day part of tEnd and not of tStart. That can't be determined from your first example. But it's easy enough to do. Copy the calculation for tEnd, change the date format to 'YYYYMMDD' (leaving out th hh:mm:ss) and name it as Day.

                      • Re: Splitting the data - On the way to granularity
                        Michael Ionkin

                        ScreenSplit.jpg

                         

                        Im a bit unsure whether you understood me right.

                        I also figured it out with the end date , so thats fine.

                         

                        But in your solution, as you can see (red color), its counting 8 hrs to the start timestamp.

                        As you can see below, there is exactly one data set with the original tBegin and exactly one with the original tEnd (Red color). Depending on the TimeZone (defined in the original question) all the added tBegin can have three possible characteristics (06:00:00;14:00:00;22:00:00); all the added tEnd (05:59:59;13:59:59;21:59:59) :

                         

                        Unbenannt.jpg

                          • Re: Splitting the data - On the way to granularity
                            Stefan Wühl

                            Try like this:

                             

                            INPUT:

                            load

                            lState,

                            Timestamp#(tStart,'YYYYMMDD hh:mm:ss') as tStart,

                            Timestamp#(tEnd,'YYYYMMDD hh:mm:ss') as tEnd

                            Inline [

                            lState,tStart,tEnd,Duration

                            1223454,20121126 11:04:33,    20121126 11:10:40,6 min

                            1223455,20121126 06:00:00, 20121127 13:00:00,1 d 7 h

                            1223456,20121128 02:00:33, 20121130 01:02:02, 1 d 4 h

                            1223457,20121128 19:00:33, 20121130 01:02:02, 1 d 4 h

                            ];

                             

                            RESULT:

                            LOAD *, interval(End-Start) as Duration;

                            LOAD *, if(tStart > TZStart, tStart, TZStart) as Start, if(tEnd < TZEnd, tEnd, TZEnd) as End ;

                            LOAD iterno() as ShiftID, lState, tStart, tEnd,

                            timestamp(floor(tStart,1/3,1/4)+(iterno()-1)*1/3) as TZStart,

                            timestamp(floor(tStart,1/3,1/4)+iterno()*1/3-interval#(1,'fff') ) as TZEnd,

                            mod(round((floor(tStart,1/3,1/4)+(iterno()-1)*1/3)*12),3)+1 as TZ

                            Resident INPUT

                            while floor(tStart,1/3,1/4) + (iterno()-1)/3 <= floor(tEnd,1/3,1/4);

                             

                            drop table INPUT;

                              • Re: Splitting the data - On the way to granularity
                                Michael Ionkin

                                Guys you're the best. Thanks a lot!!! In the meantime I'll try to understand the code

                                 

                                //chesterluck

                                • Re: Splitting the data - On the way to granularity
                                  Michael Ionkin

                                  Hi swuehl,

                                   

                                  the splitt works fine ..almost: I figured out one example where it doesnt work :

                                   

                                   

                                  I have no clue where the mistake in the code is.. Normally it should produce 2 raws - 2nd raw ShiftID 2 from 14:00:00 until 21:55:18...Thank you in advance

                                   

                                  error.png

                                    • Re: Splitting the data - On the way to granularity
                                      Stefan Wühl

                                      I can't open the attached image. Could you post the problematic input data:

                                       

                                      LOAD * Inline [

                                      lState,tStart,tEnd

                                      ....

                                      ]

                                       


                                        • Re: Splitting the data - On the way to granularity
                                          Michael Ionkin
                                          lMacStaRefShiftIDStartEndDurationtBegintEndTZTZStartTZEnd
                                          15687141102.06.2012 11:00:0002.06.2012 14:00:003:00:0002.06.2012 11:00:0002.06.2012 21:55:18102.06.2012 06:00:0002.06.2012 14:00:00

                                           

                                          End and TZEnd sould be 13:59:59

                                          The second shift is missing and should be

                                           

                                          15687141 - 2 -  02.06.2012 14:00:00   -  02.06.2012 21:55:18 - 9:55:18 - 02.06.2012 11:00:00 - 02.06.2012 21:55:18 - 2 - 02.06.2012 14:00:00 - 02.06.2012 21:59:59

                                            • Re: Splitting the data - On the way to granularity
                                              Stefan Wühl

                                              Yep, it's never a good idea to compare two floating point values ;-)

                                               

                                              To make my approach stable, we need to add a small amount of time (say 1 ms) to the one end of the WHILE comparison:

                                               

                                              ...

                                              while floor(tStart,1/3,1/4) + (iterno()-1)/3 < floor(tEnd,1/3,1/4)+interval#(1,'fff');

                                                • Re: Splitting the data - On the way to granularity
                                                  Michael Ionkin

                                                  So the problem is that at the end of a while loop the comp might still calc and the while already goes further?? was that the problem??

                                                   

                                                  I let you know whether its running smooth now

                                                   

                                                  thx a lot already

                                                  • Re: Splitting the data - On the way to granularity
                                                    Michael Ionkin

                                                    Ok, it works fine..

                                                     

                                                    I have a last question:

                                                     

                                                    Where should I split a quantity field (Number of parts) - during the split or can i do it after.

                                                     

                                                    I started to do it and im basically just counting the persentage of a Timezonepart to the complete StateTime and then i multiply it with the Quantity nr.

                                                     

                                                    round((DurationStatePart/DurationState)*ShotsAll) as NumberOfShotsInTheStatePart

                                                     

                                                    sth like this.

                                                     

                                                    The problem iswhen the numbers are like this:

                                                    20

                                                    20,5

                                                    19,5

                                                     

                                                    While rounding I would get one part more....

                                                     

                                                    any idea?

                                                      • Re: Splitting the data - On the way to granularity
                                                        Stefan Wühl

                                                        >So the problem is that at the end of a while loop the comp might still calc and the while already goes further?? was that the problem??

                                                         

                                                        Hm, no, I don't think so. The problem is that floating point calculations might be a bit off due to the internal representation even when the calculation is logically / mathematically correct.

                                                         

                                                        So something like

                                                         

                                                        floor(tStart,1/3,1/4) + 1/3 <= floor(tEnd,1/3,1/4)

                                                         

                                                        might return false even (with your numbers for tStart and tEnd entered), even when it should return true, because floating point arithmetic calculations were not 100% correct, just a little off (for example, 1/3 cannot be represented correctly in the used base system).

                                                         

                                                        [edit]: Even this should be more stable than my first version, though in a mathmetical sense, both should be equal:

                                                        ...

                                                        while floor(tStart+ (iterno()-1)/3 ,1/3,1/4)  <= floor(tEnd,1/3,1/4);

                                                         

                                                         

                                                        >While rounding I would get one part more.... any idea?

                                                         

                                                        So your requirement is that the sum of your rounded numbers equals ShotsAll, right? So you need an unbiased tie-breaking when rounding.

                                                         

                                                        I don't think there is the one and only one correct answer to this, maybe you can try something like

                                                         

                                                        TEST:

                                                        LOAD recno() as ID,

                                                            TEST,

                                                            floor(TEST) as FLOOR,

                                                            TEST - floor(TEST) as DIFF

                                                        INLINE [

                                                        TEST

                                                        20

                                                        20.5

                                                        19.5

                                                        18.5

                                                        21.5

                                                        18.6

                                                        19.4

                                                        ];

                                                         

                                                        JOIN LOAD

                                                            round(sum(TEST)-SUM(FLOOR)) as TOSHARE

                                                        Resident TEST;

                                                         

                                                        LOAD ID,

                                                            RANGESUM(FLOOR, if(recno()<=TOSHARE,1,0)) as RESULT 

                                                        Resident TEST order by DIFF desc, FLOOR desc;