Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Implementation of SQL Analytical function like Row_Number(), Rank()

Hi
I want to implement SQL analytical function using Talend like 
ROW_NUMBER() OVER(PARTITION BY Emp_ID ORDER BY Emp_ID)

RANK() OVER(PARTITION BY Emp_ID ORDER BY Emp_ID).

I tried to use Numeric.Sequence but it has some issues and it is there with Sequence(Date, int, int) signature.
I have a pipe delimited flat file and created metadata and want to use metadata while working with analytical functions
How we can implement these Analytical function using Talend?
Thanks
Labels (2)
8 Replies
Anonymous
Not applicable
Author

This can be done using a tSortRow, a tMap and Numeric.sequence. 
First of all, Numeric.sequence should have the signature "sequence(String seqName, int startValue, int step)". You will need to set the seqName value as a variable that changes according to your grouping (see "Partition By" in your description). 
To get this working, first you will need to use tSortRow to order your data by your "Partition By" requirements. Then (in the next component, the tMap) you will need to keep a track of changes to "Partition By" columns. When the group changes you will need to change the "seqName" value. This will start a new sequence.
You can check for changes between rows in a tMap following the tutorial below....
https://www.rilhia.com/quicktips/quick-tip-compare-row-value-against-value-previous-row
Anonymous
Not applicable
Author

Hi
I was trying to use sequence function but the signature for me its (date, int, int). PF screen shot below

0683p000009MDWM.jpg
I also tried to change it to (string, int, int) but it does not allow any editing. 
This sequence function generates sequence based on string. Can we generate sequence based on Date or Int (Apart from string datatype)?
Thanks..
Anonymous
Not applicable
Author

The code for sequence should be....
    /**
* return an incremented numeric id
*
* {talendTypes} int | Integer
*
* {Category} Numeric
*
* {param} string("s1") sequence identifier
*
* {param} int(1) start value
*
* {param} int(1) step
*
* {example} sequence("s1", 1, 1) # 1, 2, 3, ...
*
* {example} sequence("s2", 100, -2) # 100, 98, 96, ...
*
*/
public static Integer sequence(String seqName, int startValue, int step) {
if (seq_Hash.containsKey(seqName)) {
seq_Hash.put(seqName, seq_Hash.get(seqName) + step);
return seq_Hash.get(seqName);
} else {
seq_Hash.put(seqName, startValue);
return startValue;
}
}

....I don't have 6.3.1 on my machine at the moment, but can only suggest that either this is a bug or a new overloaded version of sequence. Check to see if the old version still exists. If it does not, you can copy the code above and put it into your own routine.
Anonymous
Not applicable
Author

Thanks, it worked. Can we create sequence for other data type as well?
Regards..
Anonymous
Not applicable
Author

The beauty of Talend is that you can create whatever method/function you want in Java. You can also make use of other people's functionality in third party APIs. So the answer is "yes". 
Anonymous
Not applicable
Author

True. We can write our own libraries based on the requirements. I stuck in one scenario, I want to apply row_number once the data get loaded into the component and based on the row_number, I have to do some filtering. Is it possible? Attaching workflow with this...

0683p000009MDOD.jpg
Anonymous
Not applicable
Author

You can do this using the tMap. Do it either in an "out" table column or using a tMap variable.
bradsheridan
Contributor III
Contributor III

Morning Rhall_2_0.  the link you provided below is a dead link.  Do you have one that works?  I'd like to see the solution that you provided.  I'm almost there but not quite...I'm trying to do:

partition data by Field1

order the data within partition by Field2

get the row_num()

 

thanks

Brad