Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
vardhancse
Specialist III
Specialist III

QlikSense check trade exist or not in next day csv extract

Hi,

We have one requirement like

Data Format: .csv file (in multiple folders)

Frequency: Daily

Load Folder structure: will have all legacy data from past 2 years

Loaded data from all multiple folders and created one final QVD using below condition as reference:

FOR Each folder in 'ABC','BCD','DEF','EFG','FGH','GHI','HIK',
FOR Each file in FileList ('lib://Sample\$(folder)\*20201006*.csv')

Fact:

Load *

FROM [$(file)]
(txt, codepage is 28591, embedded labels, delimiter is ',', msq);

NEXT file
NEXT folder

Output: all the folders data was now merged into one single QVD

now issue is like data was in below mentioned format:

Day 1
IDStatus (calculated column)
111Unmatched
222Unmatched
333Unmatched
  
  
Day 2
IDStatus (Calculated Column)
222Unmatched
333Unmatched
444Unmatched
  
  
Day 3
IDStatus (Calculated Column)
333Unmatched
444Unmatched
555Unmatched

 

now issue is like the "ID" present in "Day 1" extract is "UnMatched", and just in case if its not available in "Day 2" then "Status" is "Matched"(calculated)

 

Like wise:

1. We need to load complete .CSV files and check with other day file whether that ID is present or not in next day file

2. IF yes then Status : Unmatched

IF no then Status : Matched and stamp Day 2 date in another calculated column date

Hope I have managed to explain the issue well.

Thanks in advance for your help

@sunny_talwar , @jagan 

7 Replies
vardhancse
Specialist III
Specialist III
Author

need to merge data from all the folders and then need to identify max date (the date when it was missing)

and concurrently I need to maintain history as well into Qlik in QVD format.

As per date/month/year selection need to display the no of matched dates

vardhancse
Specialist III
Specialist III
Author

@Gysbert_Wassenaar  can please let me know any solution for this

Anil_Babu_Samineni

This is not that easy, but let's see if @marcus_sommer can help on this? I see this is required permutation and combination.

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
vardhancse
Specialist III
Specialist III
Author

Attached sample data for reference, unable to upload into CSV format and so attached .xlsx

Can please save as to 3 multiple CSV files.

Here is the sample expected result:

**Status and Date are calculated columns

marcus_sommer

If I understand it right you want to create additionally records which then contain the state of "matched"? Does it always relate to the previous day or to all available days or anyhow different to it? How many records respectively unique field-values of ID and DAY exists?

- Marcus

vardhancse
Specialist III
Specialist III
Author

Hi @marcus_sommer ,

Thank you for your reply.

We have data in CSV format from past 4 years, from day 1 we need to search the "ID" is present in next consecutive day report(it can be any day report say like can be until 20th day or so) or not.

**At any point of time, we should be able to do full reload, just in case QVD got corrupted or so and so irrespective of any day the ID checking in new files is common

1. Daily base report is always unmatched report , so by default Status(calculated column can be stamped as "Unmatched")

2. ID in day 1 can be present in any day report say like it can repeat for next 10 or N days 

3. the day it got missed in the daily report is the one we need to pick up and stamp the date in a new calculated column

4. In total we required 2 new additional columns:

      a. Status : Matched/Unmatched

     b. Date In: The day any particular ID is missing from the next consecutive report

5. Coming to Count, in front end if we are selecting

  a. Day 1 in master calendar it should show total un matched records as : 4 (as per sample file provided)

  b. Likewise for day 2 and day 3

  c. concurrently it should show matched or not

In a summary:

Highlighted in bold: new calculated columns

font color (Red) : Total Records

Font color(Green): The one matched(missing from that day particular report)

COB Feb 06th – 111 Matched Trades from 131 - Date In and Date Complete to be marked as 07/02/2021 

COB Feb7th 30th – 31 Matched Trades from 74 - Date In and Date Complete to be marked as 08/02/2021 

COB Feb 9th 1st – 55 Matched Trades from 112 - Date In and Date Complete to be marked as 09/02/2021 

COB Feb 10nd – 125 Matched Trades from 179 - Date In and Date Complete to be marked as 10/02/2021 

Tried my level best to give as much as possible information/requirement, can please let me know if required any further info

Thanks in advance for your time.

marcus_sommer

I think there are various approaches possible.

Quite obvious would be to use interrecord-functions like previous() or peek() to look within other records. It requires a sorted resident-load and maybe more as one load because you need a logic to pick the right record for detecting/comparing the values. If the record could be anywhere you may need the mentioned extra pre-loads to create appropriate help-columns which are later used to determine the right record. That you didn't get directly extra records with interrecord-functions doesn't mandatory mean that such an method isn't applicable because you could apply your information in extra fields or concatenating them within a string which are later transformed with a crosstable/subfield-load in multiple records again.

Another approach would be to apply the logic already within your file-list loop. For it you may also need n pre-loads and some renaming of the key-field which would be the combination of the ID + DAY + ??? as KEY and with exists() you checked if the key exists or not.

Logically simpler is the following:

m: mapping load ID & '|' DAY, 'unmatched' resident t0;
t1: load distinct ID resident t0; join(t1) load distinct DAY resident t0;
t2: load *, applymap('m', ID & '|' DAY, 'matched') as STATE resident t1;

It creates a cartesian product which could have some benefits /alle possible data-combinations exists and real NULL analysis is possible) but also some disadvantages because the dataset could become quite huge and therefore too heavy for for your environment. But you don't need to keep it in this way because afterwards you could drop unneeded records whereby with some additionally logics you may already restrict the data in beforehand.

- Marcus