Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Connect 2026 Agenda Now Available: Explore Sessions
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Identifying a subset of data

Dear all,

I need some help for a script:

I have allready two tables with patients and diagnoses which are 1:n related via a patient ID. They are loaded from another dataset via the binary command.

Now I have a third table with let say 30 diagnoses (loaded from Excel) and want to classify all patients whether they have one or more of these diagnoses (1) or not (2).

As a result I want to have a new table with two fields the distinct patient ID and the class marker.

Thanks,

Rolf

6 Replies
Not applicable
Author

Hello Rolf,

please provide a short example we can ply arround with.

Thanks!

Rainer

Not applicable
Author

Hello Rainer,


I've attached an excel-file with three tables. The first two are 1:n related via PAT_ID.

I want now in my QV-file the first and the second table (Patients and their diagnoses) and a third table that indicates the occurrence or non-occurrence of one or more diagnoses from the third table 'TBL_Markerdiag' in the 'TBL_DIAG' for every single patient.

The resulting new table from this sample should be:

PAT_ID | Occurence

1 | yes

2 | yes

3 | no

4 | yes

The table TBL_Markerdiag can be dropped after that.

This a very small sample. Marker diagnoses can number up to 100 and TBL_PAT and TBL_DIAG can comprise up to 100.000 and 300.000 rows respectively.

Regards,

Rolf

Not applicable
Author

Load the tables using the following load statements. Note the Diagnose has an extra field, not in the spreadsheet but added during the load, that marks those Diagnoses that you want to mark.

Directory;
LOAD
Diagnose,
1 as DiagnoseCount
FROM Test.xls (biff, embedded labels, table is TBL_Markerdiag$);

Directory;
LOAD
PAT_ID
FROM Test.xls (biff, embedded labels, table is TBL_PAT$);

Directory;
LOAD
PAT_ID,
Diagnose
FROM Test.xls (biff, embedded labels, table is TBL_DIAG$);

Then, somewhere in your displayed objects, add an expression

if(Sum (DiagnoseCount)>0, 'Yes', 'No')

which you can display with the patient ID to show if they are marked (or did you really want it in the script?)

Not applicable
Author

I want it in the script because I need it as a dimension so I can distinguish patients that have one or more marker diagnoses from patients that don't have any at all.

Rolf

Not applicable
Author

Hi Rolf,

If I understood you problem correctly, then probably you can do something like this

Pat:
LOAD
PAT_ID
FROM Path;

tmpDig:
LOAD
PAT_ID,
Diagnose
FROM Path;
//Data you are fetching from Excel
LEFT JOIN
LOAD
PAT_ID
Diagnose,
1 as ExcelFileFlag
FROM Test.xls;

GetDigCount:
LOAD PAT_ID,
COUNT(Diagnose) as DigCount
RESIDENT tmpDig
GROUP BY PAT_ID;

DIG:
LOAD
PAT_ID,
Diagnose
RESIDENT tmpDig;
INNER JOIN
LOAD PAT_ID,
DigCount
RESIDENT GetDigCount;

DROP TABLE tmpDig,GetDigCount;

Now Use your DIG table to get the information

Regards,

Amit

Not applicable
Author

Hi Rolf,

Hope this is what you're looking for:

diag:
LOAD Diagnose as diag
FROM Test.xls (biff, embedded labels, table is TBL_Markerdiag$);


Directory;
LOAD PAT_ID
FROM Test.xls (biff, embedded labels, table is TBL_PAT$);

Test:
LOAD PAT_ID,
PAT_ID as PAT_ID_Yes,
Diagnose,
'Yes' as Occurence
FROM Test.xls (biff, embedded labels, table is TBL_DIAG$)
where exists (diag, Diagnose) ;


concatenate ( Test )
LOAD PAT_ID,
Diagnose,
'No' as Occurence
FROM Test.xls (biff, embedded labels, table is TBL_DIAG$)
where not exists (diag, Diagnose) and not exists( PAT_ID_Yes, PAT_ID) ;

drop table diag;
drop field PAT_ID_Yes;