Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Independent validation for trusted, AI-ready data integration. See why IDC named Qlik a Leader: Read the Excerpt!
cancel
Showing results for 
Search instead for 
Did you mean: 
juliakhaa
Creator
Creator

How to combine fields with overlapping dates into one field

Hello, everyone.
Since I am new to Qlik Sense, I am not sure, but as I understood it does not work regular functions and some methods as in postgresql.

That's why this question, maybe someone has encountered this.
Is it possible in Qlik Sense to combine (compact) many fields with overlapping dates into one field?

Let me explain with an example, let's say I have the following fields: id_client, id_question, date_open, date_close.
The table looks like this, taking the id of one client:

id_client | id_question| date_open | date_close
YZIYR00R | 14534534 | 11.01.2022 | 11.01.2022
YZIYR00R | 14786543 | 11.01.2022 | 11.01.2022
YZIYR00R | 87634957 | 11.01.2022 | 11.01.2022
YZIYR00R | 12398750 | 11.01.2022 | 12.01.2022
YZIYR00R | 36485023 | 13.01.2022 | 13.01.2022
YZIYR00R | 09748361 | 13.01.2022 | 14.01.2022
YZIYR00R | 56419453 | 13.01.2022 | 13.01.2022
YZIYR00R | 64324123 | 13.01.2022 | 15.01.2022


the logic is, if the dates of the questions overlap, they should connect and count as one question, but it is considered the next question and does not compact. Tried doing a window function, and then comparing that if the next date_open <= the date close in the first filed, then the dates overlap and output date_close of the second line, but then stalemate and stupor.

The output should be
id_client. | id_question| date_open | date_close
YZIYR00R | 14534534 | 11.01.2022 | 12.01.2022
YZIYR00R | 36485023 | 13.01.2022 | 15.01.2022

i'd be very grateful for your help

17 Replies
juliakhaa
Creator
Creator
Author

This one returns two field:

date_open 11.01 and date_close 12.01

date_open 11.01 and date_close 15.01

It was supposed to be:

date_open 11.01 and date_close 12.01

date_open 13.01 and date_close 15.01

SerhanKaraer
Creator III
Creator III

Yes, you can use QVD file as a source. The point is to sort accordingly for comparing row by row.

When you copy and pasted the script I sent, the result is just like it supposed to be.

image.png

Please check your process again.

juliakhaa
Creator
Creator
Author

Everything works correctly until the last line is added. The first picture is without sorting, the second with sorting by id_client, date_open, date_close.

Tmp:

LOAD *, if(previous(date_close) >= date_open and previous(id_client) = id_client, peek(question_group), id_question) as question_group

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 11.01.2022 | 11.01.2022

YZIYR00R | 14786543 | 11.01.2022 | 11.01.2022

YZIYR00R | 87634957 | 11.01.2022 | 11.01.2022

YZIYR00R | 12398750 | 11.01.2022 | 12.01.2022

YZIYR00R | 36485023 | 13.01.2022 | 13.01.2022

YZIYR00R | 09748361 | 13.01.2022 | 14.01.2022

YZIYR00R | 56419453 | 13.01.2022 | 13.01.2022

YZIYR00R | 64324123 | 13.01.2022 | 15.01.2022

YZIYR00R | 64367893 | 11.01.2022 | 12.01.2022

] (delimiter is '|');

SampleData:

LOAD id_client, question_group as id_question,

FirstSortedValue(date_open, recno()) as date_open,
FirstSortedValue(date_close, -recno()) as date_close
Resident Tmp

GROUP BY id_client, question_group

ORDER BY id_client, date_open, date_close;

drop table Tmp;

First photo:

6D3B2C9F-FAD7-4197-A1E7-4826BB59DA27.jpeg

Second photo:

80A6AE69-5049-4805-8F2A-70887D13D674.jpeg

SerhanKaraer
Creator III
Creator III

As I stated, you have to sort your input data before collapsing the records by interval.

Tmp:
LOAD * INLINE [
id_client | id_question| date_open | date_close
YZIYR00R | 14534534 | 11.01.2022 | 11.01.2022
YZIYR00R | 14786543 | 11.01.2022 | 11.01.2022
YZIYR00R | 87634957 | 11.01.2022 | 11.01.2022
YZIYR00R | 12398750 | 11.01.2022 | 12.01.2022
YZIYR00R | 36485023 | 13.01.2022 | 13.01.2022
YZIYR00R | 09748361 | 13.01.2022 | 14.01.2022
YZIYR00R | 56419453 | 13.01.2022 | 13.01.2022
YZIYR00R | 64324123 | 13.01.2022 | 15.01.2022
YZIYR00R | 64367893 | 11.01.2022 | 12.01.2022
] (delimiter is '|');

TmpSorted:
NoConcatenate
LOAD *, if(previous(date_close) >= date_open and previous(id_client) = id_client, peek(question_group), id_question) as question_group
Resident Tmp
ORDER BY id_client, date_open, date_close;

drop table Tmp;

SampleData:
LOAD id_client, question_group as id_question,
	 FirstSortedValue(date_open, recno()) as date_open,
	 FirstSortedValue(date_close, -recno()) as date_close
Resident TmpSorted
GROUP BY id_client, question_group;

drop table TmpSorted;

juliakhaa
Creator
Creator
Author

Yes, I did in the end by the same logic, thank you for that.

But when I started playing with the code, I realized that the problem is in the function (previous), that is, the code only checks the previous and next line. That is, even if the date is in the interval, the code will not work correctly.
For understanding the code below->

Table_one:

load *

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 26.10.2022 | 01.11.2022

YZIYR00R | 14786543 | 27.10.2022 | 28.10.2022

YZIYR00R | 87634957 | 27.09.2022 | 02.10.2022

YZIYR00R | 12398750 | 27.09.2022 | 28.09.2022

YZIYR00R | 57548555 | 30.11.2022 | 05.12.2022

YZIYR00R | 36485023 | 29.11.2022 | 30.11.2022

] (delimiter is '|');

Tmp:

LOAD *, if(previous(date_close) >= date_open and previous(id_client) = id_client,

peek(question_group), id_question) as question_group

Resident Table_one

ORDER BY id_client, date_open, date_close;

drop table Table_one;

SampleData:

LOAD id_client, question_group as id_question,

FirstSortedValue(date_open, recno()) as date_open,

FirstSortedValue(date_close, -recno()) as date_close

Resident Tmp

GROUP BY id_client, question_group;

drop table Tmp;

the outputs will be like

1) 27.09-02.10

2) 26.10-28.10

3) 29.11-05.12

But there are must be:

1) 27.09-02.10

2) 26.10-01.11

3) 29.11-05.12

Also the code will not work

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 05.10.2022 | 30.10.2022

YZIYR00R | 14786543 | 26.10.2022 | 27.10.2022

YZIYR00R | 87634957 | 27.10.2022 | 28.10.2022

YZIYR00R | 12398750 | 27.10.2022 | 28.10.2022

YZIYR00R | 57548555 | 27.10.2022 | 28.10.2022

YZIYR00R | 36485023 | 29.10.2022 | 30.10.2022

the last line doesn’t fall in the interval

juliakhaa
Creator
Creator
Author

I don’t know, maybe you can help me.

I wrote the following code:

 

Table_one:

load *

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 11.01.2022 | 11.01.2022

YZIYR00R | 14786543 | 11.01.2022 | 11.01.2022

YZIYR00R | 87634957 | 11.01.2022 | 11.01.2022

YZIYR00R | 12398750 | 11.01.2022 | 12.01.2022

YZIYR00R | 57548555 | 13.01.2022 | 13.01.2022

YZIYR00R | 36485023 | 13.01.2022 | 14.01.2022

YZIYR00R | 09748361 | 13.01.2022 | 13.01.2022

YZIYR00R | 56419453 | 13.01.2022 | 15.01.2022

 

 

] (delimiter is '|');



Tmp:

LOAD *, if(previous(date_close) >= date_open and previous(id_client) = id_client,

peek(question_group), id_question) as question_group

Resident Table_one

ORDER BY id_client,date_close,date_open;

drop table Table_one;

 

 

next:

load id_client, question_group as id_question, min(date_open) as date_open, max(date_close) as date_close

Resident Tmp

Group by id_client, question_group;

drop table Tmp;



but with the data below it will no longer work correctly:

 

Table_one:

load *

INLINE [

id_client | id_question| date_open | date_close

YZIYR00R | 14534534 | 03.10.2022 | 03.10.2022

YZIYR00R | 14786543 | 04.10.2022 | 04.10.2022

YZIYR00R | 87634957 | 05.10.2022 | 02.12.2022

YZIYR00R | 12398750 | 06.10.2022 | 05.10.2022

YZIYR00R | 57548555 | 08.10.2022 | 06.10.2022

YZIYR00R | 36485023 | 17.10.2022 | 11.10.2022

YZIYR00R | 09748361 | 19.10.2022 | 18.10.2022

YZIYR00R | 56419453 | 20.10.2022 | 19.10.2022

YZIYR00R | 64324123 | 31.10.2022 | 26.10.2022

YZIYR00R | 53634322 | 01.11.2022 | 31.10.2022

YZIYR00R | 56787656 | 03.11.2022 | 03.11.2022

YZIYR00R | 78946487 | 09.11.2022 | 03.11.2022

YZIYR00R | 11111111 | 09.11.2022 | 09.11.2022

YZIYR00R | 98541484 | 10.11.2022 | 11.11.2022

YZIYR00R | 45487874 | 29.11.2022 | 23.11.2022

YZIYR00R | 26548459 | 02.12.2022 | 29.11.2022

] (delimiter is '|');

SerhanKaraer
Creator III
Creator III

Hello,

You seem to have some data quality issues in the second dataset. For instance, date close is earlier than date open. My advice would be first solving the quality problems and then going on with forming the data model.

juliakhaa
Creator
Creator
Author

It’s ok right now 

Table_one:
load *
    INLINE [
id_client | id_question| date_open | date_close
YZIYR00R | 14534534 | 03.10.2022 | 03.10.2022
YZIYR00R | 14786543 | 04.10.2022 | 04.10.2022
YZIYR00R | 87634957 | 05.10.2022 | 02.12.2022
YZIYR00R | 12398750 | 05.10.2022 | 05.10.2022
YZIYR00R | 57548555 | 06.10.2022 | 06.10.2022
YZIYR00R | 36485023 | 07.10.2022 | 11.10.2022
YZIYR00R | 09748361 | 17.10.2022 | 18.10.2022
YZIYR00R | 56419453 | 19.10.2022 | 19.10.2022
YZIYR00R | 64324123 | 20.10.2022 | 26.10.2022
YZIYR00R | 53634322 | 31.10.2022 | 31.10.2022
YZIYR00R | 56787656 | 01.11.2022 | 03.11.2022
YZIYR00R | 78946487 | 03.11.2022 | 03.11.2022
YZIYR00R | 11111111 | 09.11.2022 | 09.11.2022
YZIYR00R | 98541484 | 09.11.2022 | 11.11.2022
YZIYR00R | 45487874 | 10.11.2022 | 23.11.2022
YZIYR00R | 26548459 | 29.11.2022 | 29.11.2022
YZIYR00R | 13452352 | 02.12.2022 |
] (delimiter is '|');