Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us to spark ideas for how to put the latest capabilities into action. Register here!
cancel
Showing results for 
Search instead for 
Did you mean: 
rob_vander
Creator
Creator

Keep only 4 consecutive records per category

Hi All,

I have below dataset. 

Load * Inline [
ID,Category,Date,Flag

11a1,a,01/01/2024,1
2a11,a,10/01/2024,1
3b23,a,15/01/2024,1
4a5b,a,16/01/2024,1
4j5b,a,17/01/2024,0

1aa1,b,01/01/2024,1
2ab1,b,02/01/2024,1
3bbc,b,03/01/2024,0

11a1,c,01/01/2024,1
2211,c,10/01/2024,1
3bd3,c,15/01/2024,1
aa5b,c,16/01/2024,1
ah5b,c,17/01/2024,1


1ca1,d,01/01/2024,1
2b11,d,10/01/2024,1
3g23,d,15/01/2024,1
4b5b,d,16/01/2024,1

110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11,01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1 ]

I want to keep only highlighted records from above table where Flag =1 is consecutively repeating only 4 times, it might have 0 afterwards which should be fine. If Flag=1 is repeated less than 4 or greater than 4 then discard those category records.

Labels (1)
1 Solution

Accepted Solutions
Kushal_Chawda

@rob_vander  try below

Data:
Load * Inline [
ID,Category,Date,Flag
11a1,a,01/01/2024,1
2a11,a,10/01/2024,1
3b23,a,15/01/2024,1
4a5b,a,16/01/2024,1
4j5b,a,16/01/2024,0

1aa1,b,01/01/2024,1
2ab1,b,02/01/2024,1
3bbc,b,03/01/2024,0

11a1,c,01/01/2024,1
2211,c,10/01/2024,1
3bd3,c,15/01/2024,1
aa5b,c,16/01/2024,1
ah5b,c,17/01/2024,1

110c,d,14/12/2023,0
1ca1,d,01/01/2024,1
2b11,d,10/01/2024,1
3g23,d,15/01/2024,1
4b5b,d,16/01/2024,1

110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11/01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1

a10c,f,14/12/2023,1
1aa1,f,01/01/2024,1
2v11,f,10/01/2024,1
3sba,f,11/01/2024,0
3a23,f,15/01/2024,1
4b4b,f,16/01/2024,1
3g13,f,17/01/2024,1
4b1b,f,18/01/2024,1];

New:
Load *,
if(Category<>Previous(Category) and Flag=1 or (Category=Previous(Category) and Previous(Flag)=0 and Flag=1),1,
if(Category=Previous(Category) and (Flag=1 and Previous(Flag)=1), rangesum(Peek('sequence'),Flag))) as sequence
Resident Data
Order by Category,Date;

Drop Table Data;

Left Join(New)
Load Category,
max(sequence) as max_sequence
Resident New
Group by Category;

Final:
NoConcatenate
Load *
Resident New
where max_sequence=4;

drop Table New;

View solution in original post

7 Replies
Kushal_Chawda

@rob_vander  what if we have below data, Last 4 records repeated 4 times, do you still want all the records for that category?

110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11,01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1

ff22,e,17/01/2024,1

22ds,3,18/01/2024,1

igoralcantara
Partner Ambassador/MVP
Partner Ambassador/MVP

This is an interesting challenge. Here is a script I quickly wrote that might help you:

TempTable:
Load * Inline [
ID,Category,Date,Flag
 
11a1,a,01/01/2024,1
2a11,a,10/01/2024,1
3b23,a,15/01/2024,1
4a5b,a,16/01/2024,1
4j5b,a,17/01/2024,0
 
1aa1,b,01/01/2024,1
2ab1,b,02/01/2024,1
3bbc,b,03/01/2024,0
 
11a1,c,01/01/2024,1
2211,c,10/01/2024,1
3bd3,c,15/01/2024,1
aa5b,c,16/01/2024,1
ah5b,c,17/01/2024,1
 
1ca1,d,01/01/2024,1
2b11,d,10/01/2024,1
3g23,d,15/01/2024,1
4b5b,d,16/01/2024,1
 
110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11,01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1
];
 
SortedTable:
Load *,
     If(Flag = 1, Peek('FlagGroup') + 1, 0) as FlagGroup
Resident TempTable
Order by Category, Date;
 
Drop Table TempTable;
 
GroupCheck:
Load Category,
     FlagGroup,
     Count(Flag) as FlagCount
Resident SortedTable
Where Flag = 1
Group by Category, FlagGroup;
 
FinalFlagGroup:
Load
Category,
    FlagGroup
Where CountFlagGroup = 1
;
Load Category,  
     Max(FlagGroup) as FlagGroup,
     Count(FlagGroup) As CountFlagGroup
Resident GroupCheck
Where FlagCount = 4
Group by Category
;
 
FinalRecords:
NoConcatenate
Load *
Resident SortedTable
Where Exists(FlagGroup, Category & '-' & FlagGroup);
 
Drop Table SortedTable, GroupCheck, FinalFlagGroup;
 
FinalTable:
Load ID,
     Category,
     Date,
     Flag
Resident FinalRecords;
 
Drop Table FinalRecords;

 

Check out my latest posts at datavoyagers.net
rob_vander
Creator
Creator
Author

@Kushal_Chawda  In this case we need to keep all the records for that category. There won't be any scenario where  Flag is repeating 4 times and more than 4 time at same time for that category

QFabian
MVP
MVP

Hi @rob_vander , here another option, creating 2 filters fields :

 QFabian_0-1723138257821.png

Aux:
Load * Inline [
ID, Category, Date, Flag
11a1, a, 01/01/2024, 1
2a11, a, 10/01/2024, 1
3b23, a, 15/01/2024, 1
4a5b, a, 16/01/2024, 1
4j5b, a, 17/01/2024, 0
1aa1, b, 01/01/2024, 1
2ab1, b, 02/01/2024, 1
3bbc, b, 03/01/2024, 0
11a1, c, 01/01/2024, 1
2211, c, 10/01/2024, 1
3bd3, c, 15/01/2024, 1
aa5b, c, 16/01/2024, 1
ah5b, c, 17/01/2024, 1
1ca1, d, 01/01/2024, 1
2b11, d, 10/01/2024, 1
3g23, d, 15/01/2024, 1
4b5b, d, 16/01/2024, 1
110c, e, 14/12/2023, 0
1ca1, e, 01/01/2024, 1
2b11, e, 10/01/2024, 1
33ba, e, 11,01/2024, 0
3g23, e, 15/01/2024, 1
4b5b, e, 16/01/2024, 1
];
 
//qualify *;
 
Data:
Load
recno(),
    rowno(),
if(recno() = 1, 
    Flag, 
    if(Category = previous(Category),
    Flag + peek('Acum'),
        Flag
        ))  as Acum,       
        
ID,
Category,
Date,
Flag
Resident Aux;
drop table Aux;
 
Filter:
Load
Category,
    sum(Flag) as Filter
Resident Data
Where
Acum <= 4
Group By
Category;
 
 
Filter2:
Load
Category,
    sum(Flag) as Filter2
Resident Data
// Where
// Acum <= 4
Group By
Category;
 
 
//drop table Data;
 
exit script; /*
Greetings!! Fabián Quezada (QFabian)
did it work for you? give like and mark the solution as accepted.
rob_vander
Creator
Creator
Author

@igoralcantara  It is  not working. Giving 0 records at end of the script

rob_vander
Creator
Creator
Author

@QFabian  It's not giving expected output.

Kushal_Chawda

@rob_vander  try below

Data:
Load * Inline [
ID,Category,Date,Flag
11a1,a,01/01/2024,1
2a11,a,10/01/2024,1
3b23,a,15/01/2024,1
4a5b,a,16/01/2024,1
4j5b,a,16/01/2024,0

1aa1,b,01/01/2024,1
2ab1,b,02/01/2024,1
3bbc,b,03/01/2024,0

11a1,c,01/01/2024,1
2211,c,10/01/2024,1
3bd3,c,15/01/2024,1
aa5b,c,16/01/2024,1
ah5b,c,17/01/2024,1

110c,d,14/12/2023,0
1ca1,d,01/01/2024,1
2b11,d,10/01/2024,1
3g23,d,15/01/2024,1
4b5b,d,16/01/2024,1

110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11/01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1

a10c,f,14/12/2023,1
1aa1,f,01/01/2024,1
2v11,f,10/01/2024,1
3sba,f,11/01/2024,0
3a23,f,15/01/2024,1
4b4b,f,16/01/2024,1
3g13,f,17/01/2024,1
4b1b,f,18/01/2024,1];

New:
Load *,
if(Category<>Previous(Category) and Flag=1 or (Category=Previous(Category) and Previous(Flag)=0 and Flag=1),1,
if(Category=Previous(Category) and (Flag=1 and Previous(Flag)=1), rangesum(Peek('sequence'),Flag))) as sequence
Resident Data
Order by Category,Date;

Drop Table Data;

Left Join(New)
Load Category,
max(sequence) as max_sequence
Resident New
Group by Category;

Final:
NoConcatenate
Load *
Resident New
where max_sequence=4;

drop Table New;