Skip to main content
Announcements
Accelerate Your Success: Fuel your data and AI journey with the right services, delivered by our experts. Learn More
cancel
Showing results for 
Search instead for 
Did you mean: 
juliakhaa
Creator
Creator

combine many field with overlapping dates into one field

Hello to the whole community!

There is a code that combine many field with overlapping dates into one field.

The code is below

Table_one:

load *

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 05.10.2022 | 30.10.2022

YZIYR00R | 14786543 | 26.10.2022 | 27.10.2022

YZIYR00R | 87634957 | 27.10.2022 | 28.10.2022

YZIYR00R | 12398750 | 27.10.2022 | 28.10.2022

YZIYR00R | 57548555 | 27.10.2022 | 28.10.2022

YZIYR00R | 36485023 | 29.10.2022 | 30.10.2022

] (delimiter is '|');



Tmp:

LOAD *, if(previous(date_close) >= date_open and previous(id_client) = id_client,

peek(question_group), id_question) as question_group

Resident Table_one

ORDER BY id_client, date_open, date_close;

drop table Table_one;



SampleData:

LOAD id_client, question_group as id_question,

FirstSortedValue(date_open, recno()) as date_open,

FirstSortedValue(date_close, -recno()) as date_close

Resident Tmp

GROUP BY id_client, question_group;

drop table Tmp;

So the last line doesn’t fall in the interval. The problem is that the code only checks the previous and next lines, if you add the filed (my last line in code), it will not fall into the interval because is compared to the previous one.
And I would like the intersecting intervals to join one, where the start date would be the minimum number of the intersecting group and the end date would be the maximum.

 

ps. I had some help with this code, although I took it apart and understood it, but it’s hard for me to figure out.

1 Reply
juliakhaa
Creator
Creator
Author

I wrote the following code:

Table_one:

load *

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 11.01.2022 | 11.01.2022

YZIYR00R | 14786543 | 11.01.2022 | 11.01.2022

YZIYR00R | 87634957 | 11.01.2022 | 11.01.2022

YZIYR00R | 12398750 | 11.01.2022 | 12.01.2022

YZIYR00R | 57548555 | 13.01.2022 | 13.01.2022

YZIYR00R | 36485023 | 13.01.2022 | 14.01.2022

YZIYR00R | 09748361 | 13.01.2022 | 13.01.2022

YZIYR00R | 56419453 | 13.01.2022 | 15.01.2022





] (delimiter is '|');




Tmp:

LOAD *, if(previous(date_close) >= date_open and previous(id_client) = id_client,

peek(question_group), id_question) as question_group

Resident Table_one

ORDER BY id_client,date_close,date_open;

drop table Table_one;





next:

load id_client, question_group as id_question, min(date_open) as date_open, max(date_close) as date_close

Resident Tmp

Group by id_client, question_group;

drop table Tmp;



but with the data below it will no longer work correctly:

 

Table_one:

load *

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 03.10.2022 | 03.10.2022

YZIYR00R | 14786543 | 04.10.2022 | 04.10.2022

YZIYR00R | 87634957 | 05.10.2022 | 02.12.2022

YZIYR00R | 12398750 | 06.10.2022 | 05.10.2022

YZIYR00R | 57548555 | 08.10.2022 | 06.10.2022

YZIYR00R | 36485023 | 17.10.2022 | 11.10.2022

YZIYR00R | 09748361 | 19.10.2022 | 18.10.2022

YZIYR00R | 56419453 | 20.10.2022 | 19.10.2022

YZIYR00R | 64324123 | 31.10.2022 | 26.10.2022

YZIYR00R | 53634322 | 01.11.2022 | 31.10.2022

YZIYR00R | 56787656 | 03.11.2022 | 03.11.2022

YZIYR00R | 78946487 | 09.11.2022 | 03.11.2022

YZIYR00R | 11111111 | 09.11.2022 | 09.11.2022

YZIYR00R | 98541484 | 10.11.2022 | 11.11.2022

YZIYR00R | 45487874 | 29.11.2022 | 23.11.2022

YZIYR00R | 26548459 | 02.12.2022 | 29.11.2022

] (delimiter is '|');