Anonymous
Not applicable

Duplicate check between files while using tFileList

Hi Team,

I am trying to load files from a directory into a MySQL output table.
I used a tFileList > tFileInputDelimited > tMap > tMysqlOutput design to iterate through the files.
Now I want to remove duplicate data between files, i.e. compare the data across files based on one column or a combination of 2-3 columns.
For example: if the month column of the first file contains NOV and the second file contains the same month, NOV, the job should skip loading the second file.
Please help me implement this in my job.

4 Replies
TRF
Champion II

I suppose all your input files are based on the same schema. In that case, you can read all the input files, append the rows to a single temporary file, and then eliminate the duplicate records before loading into MySQL.

The design should look like this:

tFileList--(iterate)-->tFileInputDelimited-->tFileOutputDelimited (with the Append option ticked)
|
+(OnSubjobOK)
|
tFileInputDelimited-->tUniqRow-->tMysqlOutput
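
To make the de-duplication step concrete, here is a minimal plain-Java sketch of what a tUniqRow-style step does on the appended temporary file. It assumes the first row seen for a key is kept and later rows with the same key are dropped. The class name, method name, and sample rows are hypothetical and are not Talend-generated code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqRowSketch {

    /** Keeps the first row seen for each key; later rows with the same key are dropped. */
    static List<String[]> keepFirstPerKey(List<String[]> rows, int... keyColumns) {
        Set<List<String>> seenKeys = new HashSet<>();
        List<String[]> uniques = new ArrayList<>();
        for (String[] row : rows) {
            // Build the key from the chosen columns only (one column or a combination).
            List<String> key = new ArrayList<>();
            for (int col : keyColumns) {
                key.add(row[col]);
            }
            // Set.add returns false when the key was already present.
            if (seenKeys.add(key)) {
                uniques.add(row);
            }
        }
        return uniques;
    }

    public static void main(String[] args) {
        // Hypothetical rows from the appended temporary file: month, region, amount.
        List<String[]> rows = Arrays.asList(
            new String[] {"NOV", "EAST", "100"},
            new String[] {"DEC", "EAST", "250"},
            new String[] {"NOV", "EAST", "175"}); // same key as the first row
        for (String[] row : keepFirstPerKey(rows, 0, 1)) {
            System.out.println(String.join(";", row));
        }
    }
}
```

With columns 0 and 1 (month and region) as the key, the third sample row is dropped because its key matches the first row.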

Anonymous
Not applicable
Author

Thanks TRF for providing the job design and concept. Can you please tell me how I can identify and remove the duplicates in the temporary file, and how I can tell whether a row came from the first file or the second file so that I keep the correct data?

TRF
Champion II
Accepted Solution

You need to add a new field to the temporary file.

Change the design like this:

tFileList--(iterate)-->tFileInputDelimited-->tMap-->tFileOutputDelimited (with the Append option ticked)
|
+(OnSubjobOK)
|
tFileInputDelimited-->tUniqRow-->tMysqlOutput

In the tMap, add a field to the output flow (let's say filename) and use this expression to populate it:

((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))

Replace "tFileList_1" with the actual name of your tFileList component.

Is that what you were looking for?
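
As an illustration of the filename field, the following plain-Java sketch simulates the iteration outside Talend. A HashMap stands in for globalMap, and the file paths, class name, and method names are hypothetical. It assumes the de-duplication key is the month column only: if the filename field were part of the key, rows from different files would never compare as duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FilenameTagSketch {

    /** Appends rows from each file, tagging every row with the path of its source file. */
    static List<String[]> tagRows(Map<String, List<String>> monthsByFile) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        List<String[]> tagged = new ArrayList<>();
        for (Map.Entry<String, List<String>> file : monthsByFile.entrySet()) {
            // In a real job, tFileList sets this entry on each iteration.
            globalMap.put("tFileList_1_CURRENT_FILEPATH", file.getKey());
            for (String month : file.getValue()) {
                // Same expression as in the tMap output field.
                String filename = ((String) globalMap.get("tFileList_1_CURRENT_FILEPATH"));
                tagged.add(new String[] {month, filename});
            }
        }
        return tagged;
    }

    /** First-wins de-duplication on the month column only (column 0). */
    static List<String[]> uniqueByMonth(List<String[]> tagged) {
        Set<String> seenMonths = new HashSet<>();
        List<String[]> uniques = new ArrayList<>();
        for (String[] row : tagged) {
            if (seenMonths.add(row[0])) {
                uniques.add(row);
            }
        }
        return uniques;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so first.csv is processed first.
        Map<String, List<String>> files = new LinkedHashMap<>();
        files.put("/in/first.csv", Arrays.asList("NOV"));
        files.put("/in/second.csv", Arrays.asList("NOV", "DEC"));
        for (String[] row : uniqueByMonth(tagRows(files))) {
            System.out.println(row[0] + " <- " + row[1]);
        }
    }
}
```

In this sketch the NOV row from the second file is dropped, and the filename field shows that the surviving NOV row came from the first file. Note that this works row by row; it does not skip the second file as a whole.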

Anonymous
Not applicable
Author

Thanks TRF, I have tried this approach and it works, following the order in which the files are placed in the directory. The order of the files in tFileList is from the last file in the directory to the first, right? Can we specify the order in which the files are loaded, either in tFileList or with another component? Also, how do I specify the file names in tFileInputDelimited and tFileOutputDelimited in the main job, and in tFileInputDelimited in the subjob? By using ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))?
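
The ordering question matters because a first-wins duplicate check keeps whichever file is appended first. As a minimal sketch in plain Java (not Talend configuration), the order can be made explicit by sorting the file list by name before iterating. The class name, method name, and file names are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class FileOrderSketch {

    /** Returns a copy of the list sorted by file name, ascending. */
    static List<File> sortByName(List<File> files) {
        List<File> sorted = new ArrayList<>(files);
        // getName() only inspects the path string; it does not touch the file system.
        sorted.sort(Comparator.comparing(File::getName));
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical listing returned in an arbitrary order.
        List<File> listing = Arrays.asList(
            new File("/in/sales_03.csv"),
            new File("/in/sales_01.csv"),
            new File("/in/sales_02.csv"));
        for (File f : sortByName(listing)) {
            System.out.println(f.getName());
        }
    }
}
```

With the list sorted this way, sales_01.csv is processed first, so its rows are the ones a first-wins duplicate check would keep.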