Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Problem with multiple records when spliting data

Hello,

I need to split data depending on the time after the in service date of an employee. The problem I encounter when spliting the data is that I have more than one record with the in service date of an employee available and the split result is a multiple of the actual total data.

The data I want to split is available per employee and date. The total data is available with the simply expression =SUM(Value.Valuefield). When I use the expression =SUM(IF(Date.Valuefield-Employee.InServiceDate>=29,Value.Valuefield,0)) the total value per employee gets multiplier by the number of records with in service dates available for that employee, which is logical, but unwanted. Is there a way to use only one of the employee in service date per employee?

Example:

Available value record (Date.Valuefield, Value.Valuefield, Employee):

26-08-2013, 8, employee1

27-08-2013, 8, employee1

Available employee records with in service date (Employee, Employee.InServiceDate, Month):

employee1, 01-07-2013, July

employee1, 01-07-2013, August

The simple sum expression results in 16

The expression which checks the in service date results in 32

Has anyone an idea? Many thanks!

5 Replies
rajeshvaswani77
Specialist III
Specialist III

Hi Raimund,

Please check the attached QVW.

thanks,

Rajesh Vaswani

Not applicable
Author

Hi Rajesh,

Thanks for your reaction. I noticed my description is not clear enough. I have a lot of different employees, lots of Values I like to add up and also a lot of records per employee with the Inservice date.

I adjusted the qvw a bit, perhaps that clearifies it.

Not applicable
Author

Hello,

I found a solution to the problem. Thanks for your time.

Solution I found: by adding a column "employed" in the employee records, with a 1 in the month in which the employee starts and a 0 in the other months. Now I added an IF to the expression regarding the Employee.InServiceDate.

sum(if(Date.Valuefield-(IF(Employed=1,Employee.InServiceDate,))<29,Value.Valuefield,0))

Not applicable
Author

Hi Raimund,

When I have had similar issues in that I have a lower level of granularity in my data than I need for a particular expression I have used peek and order by to give me an identifier for the distinct records that I want.

1) First define your compound key i.e. employee & date in your case I think

2) Perform a resident load and order by your compound key

3) Use the peek function to check for each distinct occurrence of your compound key during the resident load

e.g.

Table 1:

LOAD

     Field 1 & '-' & Field 2               as [%Compound Key],

     Field 1,

     Field 2,

     Field 3,

From xyz.qvd

Table 2:

LOAD

     [%Compound Key],

     Field 1,

     Field 2,

     Field 3,

     If([%Compound Key] = Peek('%Compound Key'),

          0,

          1)                                   as _DistinctCompoundKey

Resident Table 1

Order By [%Compound Key];

This way you can use set analysis expression to sum up and field only for the distinct occurrences of your compound key. i.e

sum({$<_DistinctCompoundKey = {1}>}Field 3) or even

sum(_DistinctCompoundKey) depending on what you want to measure.

I believe this is a more efficient way of accomplishing what you need as it won't consume as much resource when calculating.

Andy

Not applicable
Author

Hello Andy,

Many thanks for your extra information!

Raimund