miteshkhatri80
Contributor II

Collating row counts in multiple subjobs.

I have a job similar to the one below, with two inputs and a filter on each input. I want to write the row counts from the filters to an output file that looks something like:

row2|0 rows
row3|100 rows
row5|5 rows
row6|5 rows

The file layout doesn't matter too much; I can tweak that as necessary. I just need the values.

0683p000009Lt6s.jpg

If I have a separate tFlowMeterCatcher that outputs to a CSV file, only the second set of values is output. I believe the first set IS output, but is then overwritten regardless of the "append to file" setting. In any case, I do not want to append to file, because I want the stats to be refreshed entirely each time I run the job.

I have also tried to output the results via the Stats & Logs option in the Job tab, but this too outputs only the results from the latest subjob, overwriting those from the first one.

How can I make all four stats appear in the same file? Is it possible to have more than one tFlowMeterCatcher, and specify which tFlowMeter(s) it refers to?
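The goal described above can be sketched outside Talend. This is a minimal sketch in plain Java, not Talend-generated code, and it assumes the four counts are available as integers by the end of the run. The class `RowCountReport`, its methods and the file name `row_counts.txt` are hypothetical. It collects every count first and writes the file once, so the file is fully refreshed on each run yet still holds all four lines.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Talend: gathers row counts from several
// flows and writes them out in a single pass.
class RowCountReport {
    // LinkedHashMap keeps the counts in the order they were recorded.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    void record(String flowLabel, int rowCount) {
        counts.put(flowLabel, rowCount);
    }

    // One "label|N rows" line per recorded flow.
    List<String> toLines() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            lines.add(e.getKey() + "|" + e.getValue() + " rows");
        }
        return lines;
    }

    // Files.write without extra options creates the file or truncates an
    // existing one, so each run replaces the previous report.
    void writeTo(Path target) throws IOException {
        Files.write(target, toLines());
    }

    public static void main(String[] args) throws IOException {
        RowCountReport report = new RowCountReport();
        report.record("row2", 0);
        report.record("row3", 100);
        report.record("row5", 5);
        report.record("row6", 5);
        report.writeTo(Paths.get("row_counts.txt")); // hypothetical file name
    }
}
```

Writing once at the end sidesteps the append-versus-overwrite question, because no later write can replace an earlier line.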

1 Solution

Accepted Solutions
miteshkhatri80
Contributor II
Author

I have managed to resolve it, but I am surprised at the complexity of my solution, and I would love to know if there is a better way.

Since tFlowMeterCatcher overwrites the output from the first subjob with that of the second, my aim was to merge the two subjobs into one. The only way I found to do this was to add a tMap immediately after each tRowGenerator and create a new field that identifies the source. This is just a simple string field containing "Source1" or "Source2" (or anything more meaningful).

I then used a tUnite to bring the two tMaps together, immediately followed by a tReplicate. Each output from the tReplicate connects to a tFilter that filters on the source field, and from there I used the original tFilter, which gives me the values I need, all within the same subjob.
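The flow described above (tag each input, unite, replicate, filter on source, then apply the original filter) can be pictured with a minimal plain-Java sketch. It is not Talend-generated code: the class `TaggedFlowCounts`, the sample values and the two filter conditions are all made up for illustration, and it assumes the four counts of interest are the kept and rejected rows of each input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of the merged subjob; names and data are illustrative.
class TaggedFlowCounts {
    // A row plus the extra "source" field added by the tagging step.
    static final class TaggedRow {
        final String source;
        final int value;
        TaggedRow(String source, int value) { this.source = source; this.value = value; }
    }

    // Tagging step: attach a source label to every value of one input.
    static List<TaggedRow> tag(String source, List<Integer> values) {
        List<TaggedRow> out = new ArrayList<>();
        for (int v : values) out.add(new TaggedRow(source, v));
        return out;
    }

    // Union step: concatenate the two tagged inputs into one flow.
    static List<TaggedRow> unite(List<TaggedRow> a, List<TaggedRow> b) {
        List<TaggedRow> all = new ArrayList<>(a);
        all.addAll(b);
        return all;
    }

    // Replicate + filter step: each branch reads the whole merged flow,
    // keeps only its own source, then applies the original filter.
    static int countMatching(List<TaggedRow> merged, String source, Predicate<Integer> filter) {
        int n = 0;
        for (TaggedRow r : merged) {
            if (r.source.equals(source) && filter.test(r.value)) n++;
        }
        return n;
    }

    static Map<String, Integer> run() {
        List<TaggedRow> merged = unite(
                tag("Source1", Arrays.asList(1, 12, 25, 7)),   // made-up data
                tag("Source2", Arrays.asList(3, 40, 8)));      // made-up data
        Predicate<Integer> filter1 = v -> v > 10;      // made-up condition
        Predicate<Integer> filter2 = v -> v % 2 == 0;  // made-up condition

        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Source1 kept", countMatching(merged, "Source1", filter1));
        counts.put("Source1 rejected", countMatching(merged, "Source1", filter1.negate()));
        counts.put("Source2 kept", countMatching(merged, "Source2", filter2));
        counts.put("Source2 rejected", countMatching(merged, "Source2", filter2.negate()));
        return counts;
    }

    public static void main(String[] args) {
        run().forEach((label, n) -> System.out.println(label + "|" + n + " rows"));
    }
}
```

Because all four counts come from one merged flow, they are available together at the end of a single run rather than being produced by separate subjobs.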

Incidentally, this initially caused an error because the size of the Java code was too large, so I split the job into two sections using a tHashOutput and a tHashInput.

To me this whole thing seems unnecessarily complex, but I could find no other way to resolve it. Does anyone have a better, more efficient solution?


2 Replies
miteshkhatri80
Contributor II
Author

Here is an image of my solution, with a small refinement: a tMap filters on the sources, replacing the tReplicate and multiple tFilters.

0683p000009Lt6u.jpg
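The refinement in this reply can be pictured as routing each row once, rather than copying the whole flow to every branch and filtering each copy. The sketch below is plain Java under that assumption, not Talend-generated code; the class `SourceRouter`, the `Row` type and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a single component with one output per source.
class SourceRouter {
    static final class Row {
        final String source;
        final int value;
        Row(String source, int value) { this.source = source; this.value = value; }
    }

    // Single pass: each row is inspected once and sent to the output
    // that corresponds to its source label.
    static Map<String, List<Integer>> route(List<Row> merged) {
        Map<String, List<Integer>> outputs = new LinkedHashMap<>();
        for (Row r : merged) {
            outputs.computeIfAbsent(r.source, k -> new ArrayList<>()).add(r.value);
        }
        return outputs;
    }

    // Made-up sample rows for illustration.
    static Map<String, List<Integer>> demo() {
        return route(Arrays.asList(
                new Row("Source1", 12),
                new Row("Source2", 3),
                new Row("Source1", 7)));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each per-source output can then feed its original filter, as in the earlier design, with one component doing the split instead of several.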