- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
conditional row merge, can it be done?
Hi everybody,
I have this problem and i can´t find a solucion. I have a table like this one:
Animal | Name | Diagnoses | Visit Date | Next visit |
DOG | Rocky | cough | 12-oct | 22-oct |
DOG | Rocky | cough | 14-oct | 17-oct |
DOG | Rocky | cough | 15-nov | 20-nov |
DOG | Rocky | rash | 18-nov | 19-nov |
DOG | Duke | rash | 20-nov | 25-nov |
DOG | Duke | rash | 26-nov | 28-nov |
DOG | Duke | cough | 27-nov | 30-nov |
And I need to MERGE or JOIN the rows that are the same animal, name and diagnosis within in a short period of time (maybe two days) and get the first visit and last visit
For example, for the first two rows I can assume that is the same illness, so I can merge both and get the first and max date, in this case, the first row dates. But the third row is a new one even though it has the same illness, but is a month later.
And for the dog duke, it came back with a rash a the next day of his last visit, so I can assume that is the same illness, so is the first visit date (fifth row) and the last date on the sixth row.
The result should be something like this
Animal | Name | Diagnoses | Start | End |
DOG | Rocky | cough | 12-oct | 22-oct |
DOG | Rocky | cough | 15-nov | 20-nov |
DOG | Rocky | rash | 18-nov | 19-nov |
DOG | Duke | rash | 20-nov | 28-nov |
DOG | Duke | cough | 27-nov | 30-nov |
The first table is in order, so there is no need to check the whole table looking for rows to join.
I hope i made myself clear and I thank you all for your time. Any question i'll try to answer it as fast as i can
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
pseudocode
1) make a new table (load resident) in this way
- load the table ordered by Animal, ........Next visit
- if the Animal, Name, Diagnoses of current row are the same of prevous row
(if(peek(Animal)=Animal and .......)
check (always using peek or previous to get the values of the field in previous record) date difference
calculate the 2 dates (2 new fields)
flag the record
2) read the flagged records of the new table (use the 2 new date fields)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
pseudocode
1) make a new table (load resident) in this way
- load the table ordered by Animal, ........Next visit
- if the Animal, Name, Diagnoses of current row are the same of prevous row
(if(peek(Animal)=Animal and .......)
check (always using peek or previous to get the values of the field in previous record) date difference
calculate the 2 dates (2 new fields)
flag the record
2) read the flagged records of the new table (use the 2 new date fields)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thank you, That's a very good starting point. I didn't know about peek and previous so let's give it a try
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thank you again for your guidance. I work it out and got this solution
Base_TB:
Load Animal,
Name,
Diagnoses,
´Visit Date´,
´Observation days´,
RowNo()as ID,
if( Name=previous(Name) and
Diagnoses=previous(Diagnoses) and
[Visit Date] <= (previous([Observation days])+2), peek(ID), RowNo()) as Flag;
LOAD * INLINE [
Animal,Name,Diagnoses,Visit Date,Observation days
DOG,Rocky,cough,12-10-2015,22-10-2015
DOG,Rocky,cough,14-10-2015,17-10-2015
DOG,Rocky,cough,15-11-2015,20-11-2015
DOG,Rocky,rash,18-11-2015,19-11-2015
DOG,Duke,rash,20-11-2015,25-11-2015
DOG,Duke,rash,26-11-2015,28-11-2015
DOG,Duke,cough,27-11-2015,30-11-2015
];
Final:
NoConcatenate
Load Animal,
Name,
Diagnoses,
date(min(´Visit Date´)) as Min_Date,
date(max(´Observation days´)) as max_Date,
Flag
resident Base_TB Group by Flag,Animal,Name,Diagnoses;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Remember that usually, when you use peek or previous the order is important and if you change the order of the rows in your inline table maybe you get different result. Also, in Qlik, you can only order by in a resident load.
So I suggest to (this could be a load from a qvd, txt, database, etc....)
a:
load * inline [
.........
];
and then do resident load (using peek, previous) with a order by
b:
noconcatenate lod
..............
resident a
order by ............;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Yes, the order in this case is very important, but it is done in previous tables. This is one of the steps of a bigger project and the input table for this part is loaded in order.
Thanks