Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
rob_vander
Contributor III
Contributor III

Keep only 4 consecutive records per category

Hi All,

I have below dataset. 

Load * Inline [
ID,Category,Date,Flag

11a1,a,01/01/2024,1
2a11,a,10/01/2024,1
3b23,a,15/01/2024,1
4a5b,a,16/01/2024,1
4j5b,a,17/01/2024,0

1aa1,b,01/01/2024,1
2ab1,b,02/01/2024,1
3bbc,b,03/01/2024,0

11a1,c,01/01/2024,1
2211,c,10/01/2024,1
3bd3,c,15/01/2024,1
aa5b,c,16/01/2024,1
ah5b,c,17/01/2024,1


1ca1,d,01/01/2024,1
2b11,d,10/01/2024,1
3g23,d,15/01/2024,1
4b5b,d,16/01/2024,1

110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11,01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1 ]

I want to keep only highlighted records from above table where Flag =1 is consecutively repeating only 4 times, it might have 0 afterwards which should be fine. If Flag=1 is repeated less than 4 or greater than 4 then discard those category records.

Labels (1)
7 Replies
Kushal_Chawda

@rob_vander  what if we have below data, Last 4 records repeated 4 times, do you still want all the records for that category?

110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11,01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1

ff22,e,17/01/2024,1

22ds,3,18/01/2024,1

igoralcantara
Partner - Specialist
Partner - Specialist

This is an interesting challenge. Here is a script I quickly wrote that might help you:

TempTable:
Load * Inline [
ID,Category,Date,Flag
 
11a1,a,01/01/2024,1
2a11,a,10/01/2024,1
3b23,a,15/01/2024,1
4a5b,a,16/01/2024,1
4j5b,a,17/01/2024,0
 
1aa1,b,01/01/2024,1
2ab1,b,02/01/2024,1
3bbc,b,03/01/2024,0
 
11a1,c,01/01/2024,1
2211,c,10/01/2024,1
3bd3,c,15/01/2024,1
aa5b,c,16/01/2024,1
ah5b,c,17/01/2024,1
 
1ca1,d,01/01/2024,1
2b11,d,10/01/2024,1
3g23,d,15/01/2024,1
4b5b,d,16/01/2024,1
 
110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11,01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1
];
 
SortedTable:
Load *,
     If(Flag = 1, Peek('FlagGroup') + 1, 0) as FlagGroup
Resident TempTable
Order by Category, Date;
 
Drop Table TempTable;
 
GroupCheck:
Load Category,
     FlagGroup,
     Count(Flag) as FlagCount
Resident SortedTable
Where Flag = 1
Group by Category, FlagGroup;
 
FinalFlagGroup:
Load
Category,
    FlagGroup
Where CountFlagGroup = 1
;
Load Category,  
     Max(FlagGroup) as FlagGroup,
     Count(FlagGroup) As CountFlagGroup
Resident GroupCheck
Where FlagCount = 4
Group by Category
;
 
FinalRecords:
NoConcatenate
Load *
Resident SortedTable
Where Exists(FlagGroup, Category & '-' & FlagGroup);
 
Drop Table SortedTable, GroupCheck, FinalFlagGroup;
 
FinalTable:
Load ID,
     Category,
     Date,
     Flag
Resident FinalRecords;
 
Drop Table FinalRecords;

 

Check out my latest posts at datavoyagers.net
rob_vander
Contributor III
Contributor III
Author

@Kushal_Chawda  In this case we need to keep all the records for that category. There won't be any scenario where  Flag is repeating 4 times and more than 4 time at same time for that category

QFabian
Specialist III
Specialist III

Hi @rob_vander , here another option, creating 2 filters fields :

 QFabian_0-1723138257821.png

Aux:
Load * Inline [
ID, Category, Date, Flag
11a1, a, 01/01/2024, 1
2a11, a, 10/01/2024, 1
3b23, a, 15/01/2024, 1
4a5b, a, 16/01/2024, 1
4j5b, a, 17/01/2024, 0
1aa1, b, 01/01/2024, 1
2ab1, b, 02/01/2024, 1
3bbc, b, 03/01/2024, 0
11a1, c, 01/01/2024, 1
2211, c, 10/01/2024, 1
3bd3, c, 15/01/2024, 1
aa5b, c, 16/01/2024, 1
ah5b, c, 17/01/2024, 1
1ca1, d, 01/01/2024, 1
2b11, d, 10/01/2024, 1
3g23, d, 15/01/2024, 1
4b5b, d, 16/01/2024, 1
110c, e, 14/12/2023, 0
1ca1, e, 01/01/2024, 1
2b11, e, 10/01/2024, 1
33ba, e, 11,01/2024, 0
3g23, e, 15/01/2024, 1
4b5b, e, 16/01/2024, 1
];
 
//qualify *;
 
Data:
Load
recno(),
    rowno(),
if(recno() = 1, 
    Flag, 
    if(Category = previous(Category),
    Flag + peek('Acum'),
        Flag
        ))  as Acum,       
        
ID,
Category,
Date,
Flag
Resident Aux;
drop table Aux;
 
Filter:
Load
Category,
    sum(Flag) as Filter
Resident Data
Where
Acum <= 4
Group By
Category;
 
 
Filter2:
Load
Category,
    sum(Flag) as Filter2
Resident Data
// Where
// Acum <= 4
Group By
Category;
 
 
//drop table Data;
 
exit script; /*
QFabian
rob_vander
Contributor III
Contributor III
Author

@igoralcantara  It is  not working. Giving 0 records at end of the script

rob_vander
Contributor III
Contributor III
Author

@QFabian  It's not giving expected output.

Kushal_Chawda

@rob_vander  try below

Data:
Load * Inline [
ID,Category,Date,Flag
11a1,a,01/01/2024,1
2a11,a,10/01/2024,1
3b23,a,15/01/2024,1
4a5b,a,16/01/2024,1
4j5b,a,16/01/2024,0

1aa1,b,01/01/2024,1
2ab1,b,02/01/2024,1
3bbc,b,03/01/2024,0

11a1,c,01/01/2024,1
2211,c,10/01/2024,1
3bd3,c,15/01/2024,1
aa5b,c,16/01/2024,1
ah5b,c,17/01/2024,1

110c,d,14/12/2023,0
1ca1,d,01/01/2024,1
2b11,d,10/01/2024,1
3g23,d,15/01/2024,1
4b5b,d,16/01/2024,1

110c,e,14/12/2023,0
1ca1,e,01/01/2024,1
2b11,e,10/01/2024,1
33ba,e,11/01/2024,0
3g23,e,15/01/2024,1
4b5b,e,16/01/2024,1

a10c,f,14/12/2023,1
1aa1,f,01/01/2024,1
2v11,f,10/01/2024,1
3sba,f,11/01/2024,0
3a23,f,15/01/2024,1
4b4b,f,16/01/2024,1
3g13,f,17/01/2024,1
4b1b,f,18/01/2024,1];

New:
Load *,
if(Category<>Previous(Category) and Flag=1 or (Category=Previous(Category) and Previous(Flag)=0 and Flag=1),1,
if(Category=Previous(Category) and (Flag=1 and Previous(Flag)=1), rangesum(Peek('sequence'),Flag))) as sequence
Resident Data
Order by Category,Date;

Drop Table Data;

Left Join(New)
Load Category,
max(sequence) as max_sequence
Resident New
Group by Category;

Final:
NoConcatenate
Load *
Resident New
where max_sequence=4;

drop Table New;