Skip to main content
Announcements
Live today at 11 AM ET. Get your questions about Qlik Connect answered, or just listen in. SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to reduce three records to one?

Hello All,

I have been trying to do this for a while now but have been unable to do so.

Hopefully one of you would be able to help if you have the time.

I have a table that is used to store timecard information which is gathered from biometric terminals for when employees turn up for work.

The table has the following configuration:

ID (INT & UNIQUE)

RECDATE,

RECTIME,

CANCELLUNCH (BOOL),

EMPLOYEENUM (INT).

Every day an employee will turn up to work and sign in biometrically. This will add a record to the table. At the end of their shift they swipe out, again adding another record. They can then swipe again and cancel the lunch deduction if they did not take a lunch. So at a minimum each employee will have two records per shift with an optional third should they cancel the lunch deduction (shown below in table).

Data in the table would look as follows (uk date format):

ID RECDATE  RECTIME   CANCELLUNCH  EMPLOYEENUM

1  23/07/12     13:00:00     0                        200

2  24/07/12     01:00:00     0                        200

3  24/07/12     01:00:00     1                        200

4  23/07/12     09:00:00     0                        201

5  23/07/12     18:30:00     0                        201

The ID column does not show the exact order of the scans as this is just an example of such data. Aka in real life ID 4 would be in the system before 3.

With record 3 this would be done physically straight after record 2 but in the database they have the exact sme time (01:00:00) for some reason. Possibly the system relating that employees punch out to the cancel lunch record via employee-date composite?

Now ID 1, 2 and 3 relate in business terms yet in the table other than by EMPLOYEENUM they do not. They are one shift for employee 200. I can see this from the table by looking but qlikview cannot see this as the records do not relate via the data (difficult to explain what I mean but hope I have done so).

What I am looking to have is a new table which takes each related series of records down to one as seen with the below table:

ID  DATEIN     TIMEIN       DATEOUT    TIMEOUT  CANCELLUNCH   EMPLOYEENUM

1   23/07/12    13:00:00     24/07/12      01:00:00    1                         200

2   23/07/12    09:00:00     23/07/12      18:30:00    0                         201

As seen this has now made two related records based on the data available.

The IDs may not follow in order for each employees visit clocking out at we have 12+devices feeding the information to the database.

Any idea on how this can be done would be very helpful as this is proving to be quite difficult for me to accomplish.

The reason why I am trying to get these onto one record is that they can then relate. I can add another field called ShiftHours that will take for that shift the number of hours between the time in and time out. This is not possible from what I understand with the data in seperate records as they are now (as they do not relate).

Thanks in advance for any help...

Regards,

Jim

1 Solution

Accepted Solutions
Not applicable
Author

Hi Jim,

I think I got it. May be its not the very final solution but it points into the right direction: Try the following code snippet:

RawData:

LOAD

     EMPLOYEENUM,

     ID,

     date(floor(RECDATE))    AS RECDATE,

     Time( RECTIME)            AS RECTIME,

     floor(CANCELLUNCH)    AS CANCELLUNCH

FROM

(ooxml, embedded labels, table is Sheet1)

Where (EMPLOYEENUM ) > 0

;

LEFT Join (RawData)

LOAD

     EMPLOYEENUM,

     RECDATE,

     Max(RecNo())    AS Shift_ID

Resident RawData

Group By

     EMPLOYEENUM,

     RECDATE

Order BY ID desc

;

   

FinalData:

LOAD

    EMPLOYEENUM                         AS EmpNo,

//    Shift_ID,

    FirstSortedValue(RECTIME, ID, 1)    AS TIMEIN,

    FirstSortedValue(RECDATE, ID, 1)    AS DATEIN,

    FirstSortedValue(RECTIME, ID, 2)    AS TIMEOUT,

    FirstSortedValue(RECDATE, ID, 2)    AS DATEOUT,

    Max(CANCELLUNCH)                    AS CANCEL

Resident RawData

Group By EMPLOYEENUM, Shift_ID

;

//drop Table RawData;

RR

View solution in original post

12 Replies
perumal_41
Partner - Specialist II
Partner - Specialist II

Hi

Use this script .

Load  min(ID ) as id, min(RECTIME)  as TIMEIN , max(RECDATE) as DATEOUT, max(RECTIME)  as TIMEOUT,

max(CANCELLUNCH) as CANCELLUNCH, EMPLOYEENUM

From Table1 group by EMPLOYEENUM;

Regards

Perumal A

Not applicable
Author

Hello Perumal,

This looks promising.

I have added the code however now get an error of:

"SQL error: Column 'PUNCHEVENT.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Script line: SELECT * FROM PUNCHEVENT group by EMPLOYEEID

SQL SELECT * PUNCHEVENT group by EMPLOYEEID"

Looking at what your code does I would like to clarify a point.

From what I understand this will look at an employees minimum and maximum records and combine them... but what about the ones between?

The example table I added only shows a subset of data but each employee will have hundreds of shifts.

If I started in 2002 and its now 2012 wouldnt the code give me one records as such:

ID  DATEIN     TIMEIN       DATEOUT    TIMEOUT  CANCELLUNCH   EMPLOYEENUM

1   23/07/02    13:00:00     25/07/12      01:00:00    1                         200

This would be incorrect as it needs to be one record for each related shift... If that makes sense.

So the result of:

ID  DATEIN     TIMEIN       DATEOUT    TIMEOUT  CANCELLUNCH   EMPLOYEENUM

1   23/07/12    13:00:00     24/07/12      01:00:00    1                         200

2   24/07/12    09:00:00     24/07/12      18:00:00    0                         200

..n ..n             ..n             ..n               ..n             ..n                       ..n

Regards,

Jim

perumal_41
Partner - Specialist II
Partner - Specialist II

Hi.

Now use this script .

Load  min(ID ) as id, min(RECTIME)  as TIMEIN , max(RECDATE) as DATEOUT, max(RECTIME)  as TIMEOUT,

max(CANCELLUNCH) as CANCELLUNCH, EMPLOYEENUM

group by EMPLOYEENUM ;

SQL SELECT * PUNCHEVENT;

if any problem will come upload sample application and sample data in excel sheet

Regards

Perumal A

Not applicable
Author

Hello Perumal,

That script now loads however the issue raised does happen, so this solution does not work.

A person does a shift daily. Some people can have had hundreds of shifts. Each of these have a clock in/out and possibly a cancel lunch.

So when you select the min date and max date it will select the date of their first shift and latest shift. This is the whole range of thir employemnt... not the range of each shift.

Im looking for many records in the end table for each person, but each record to only show the information for that one shift.

Any ideas on this?

Regards,

perumal_41
Partner - Specialist II
Partner - Specialist II

Hi

able upload ur Sample Application for reference.

Not applicable
Author

Hello,

Sample data attached in excel file. Originally sql server but cannot upload that so in excel.

Load statement is below:

PUNCH:

LOAD

    min(ID),

    min(time(RECTIME,'hh:mm:ss')) as TIMEIN,

    max(time(RECTIME,'hh:mm:ss')) as TIMEOUT,

    min(date(RECDATE,'DD/MM/YYYY')) as DATEIN,

    max(date(RECDATE,'DD/MM/YYYY')) as DATEOUT,   

    max(CANCELLUNCH), EMPLOYEEID

group by EMPLOYEEID;

SQL SELECT * FROM PUNCHEVENT;

Table was called PUNCHEVENT in SQL server but now would need to be linked to excel unless you put it onto SQL server to match.

The qv dashboard just has a table showing the data ordered by datein desc.

This is the shifts for one person. On tab 2 of the excel document I have added an example of what I am trying to transform this data into. I have the original data on the right of tab 2, and on the left by colour what the records should equate to...

I am greatful for the help so far!

Look forward to your thoughts.

Thank you,

Jim

Not applicable
Author

Hi Jimmy,

as far as I understood you, the main topic of your post is to find out a way to tell QV which records belong together, means belong to one shift. As you said you are able to find that "link(s)" by looking at the table. But there seems to be no relation between the data itself. And this is independent of the tool or prog-language you want to use.

So,if you are able to tell me (the guys in the forum) how to group your records, the rest within QV should be easy. May be there exists a possibility to create automatically a new field with a shift ID; in your exam: RecNo 1,2,3 with shift_ID = 1, RecNo 4,5 with shift_ID = 2 and so on.

(Partly) correct?

Roland

edit: OK, meanwhile an new post with data. Lets have a look....

Not applicable
Author

Hello Roland,

Thank you for your reply.

Yes, that is correct the data has no unique or group relation. But as human 'visually' it is simple to see by comparing the rows.

Im trying to get the related records on one row per shift as then I can do other interesting things with the data.

I have tried quite a few ways of doing this but so far unsuccessful. Hence being here.

I believe the solution would be something like this, but have so far been unable to solve:

1) Create the table in qv load statement and group by employeeid and order by date, time asc (not sure exact syntax). This will put for each employee their shift data in order and together from farthers record to most up to date.

2) Create another table based on this one then with a loop of some sort do the following. Starting from the first record take the employeeid, date, time as variables. (due to step 1 this will be their first ever shift and time of clocking in). Go to next record. Take the time, date and cancellunch status as variables. These will be the time they clocked out, date of clock out, and cancellunch status. [always 0 unless the next record relates].  Go to next record. If the third record is not a cancel lunch record (cancel lunch = 1 and date/time from previous record = date/time of this record) then insert to the next table, otherwise make cancellunch 1 and insert... and repeat.

The end table will have all related shifts on one line.

Does this make sense or am I way off? - just not sure how to bring this all together.

Another idea I have been looking at is to load the cancellunch status records of 0 into one table. And the 1ns into another. Then to create a composite key in both tables of date & employeeid. This will then show that in table A:

ID '1/1/11 19:00 200' LUNCH: 0 is related to table B:

ID '1/1/11 19:00 200' LUNCH: 1

By doing a join I would then have on one record the composite key followed by 0 1  if they cancelled lunch or  00 if they didnt. This would then need to be joined to their start time...

As im sure you can see im quite confused but I am sure the answer is somewhere in the above of flying around and any help you give would be fantastic!

Thanks so far,

Jim

Regards,

Jim

Not applicable
Author

Hi Jim,

I think I got it. May be its not the very final solution but it points into the right direction: Try the following code snippet:

RawData:

LOAD

     EMPLOYEENUM,

     ID,

     date(floor(RECDATE))    AS RECDATE,

     Time( RECTIME)            AS RECTIME,

     floor(CANCELLUNCH)    AS CANCELLUNCH

FROM

(ooxml, embedded labels, table is Sheet1)

Where (EMPLOYEENUM ) > 0

;

LEFT Join (RawData)

LOAD

     EMPLOYEENUM,

     RECDATE,

     Max(RecNo())    AS Shift_ID

Resident RawData

Group By

     EMPLOYEENUM,

     RECDATE

Order BY ID desc

;

   

FinalData:

LOAD

    EMPLOYEENUM                         AS EmpNo,

//    Shift_ID,

    FirstSortedValue(RECTIME, ID, 1)    AS TIMEIN,

    FirstSortedValue(RECDATE, ID, 1)    AS DATEIN,

    FirstSortedValue(RECTIME, ID, 2)    AS TIMEOUT,

    FirstSortedValue(RECDATE, ID, 2)    AS DATEOUT,

    Max(CANCELLUNCH)                    AS CANCEL

Resident RawData

Group By EMPLOYEENUM, Shift_ID

;

//drop Table RawData;

RR