Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
Could someone please help me with the below scenario. I have to find the genealogy for the below input table. I achieved this using Recursive CTE functionality during discovery phase of my project . But now the architecture of my project got changed for Alpha phase.
We are using Azure Data lake storage , this Azure data lake will be mounted on Azure Data bricks. we are using Azure delta lake tables to access the files in ADLS. Recursive CTE functionality is not supported in Azure Delta lake.
I have to achieve this functionality in Talend Data fabric cloud version only. Could someone please tell me how to achieve this in Talend itself without using database. Please find the below input table structure and the expected output as well.
Input table
Parent_ID ID
NULL 001
001 002
002 003
003 004
004 005
Expected Output
Lvl ID Geneology
1 001 001
2 002 001/002
3 003 001/002/003
4 004 001/002/003/004
5 005 001/002/003/004/005
Kind Regards,
Gopinath P
Recursion is a really useful tool, but it is also not the most efficient tool in the box when it comes to memory. So you must be careful when using it. Talend does not support recursion within Jobs, but you can use recursive routines to solve problems like this. Below is a routine that I have edited to suit your requirement....
*********************************************************************
package routines;
import java.util.HashMap;
import java.util.Map;
public class RecursiveRelationship {
private static Map<String,RecursiveRelationship> rrMap= new HashMap<>();
private String child;
private String parent;
public String getParent() {
return parent;
}
public static RecursiveRelationship createChild(String child, String parent){
RecursiveRelationship e= new RecursiveRelationship(child, parent);
rrMap.put(child,e);
return e;
}
private RecursiveRelationship(String child, String parent) {
this.child = child;
this.parent = parent;
}
private String getRelationship(String child){
if (RecursiveRelationship.rrMap.get(child).getParent() == null) {
return child;
}else {
return getRelationship(RecursiveRelationship.rrMap.get(child).getParent()) + "/" +
child;
}
}
@Override
public String toString() {
String relationship = getRelationship(child).equals(parent)?"":getRelationship(child);
return child + "," + relationship;
}
}
*********************************************************************
To use this, I set up a quick job with your data in a tFixedFlowInput component linking to a tJavaFlex. In the START CODE section of the tJavaFlex, I used this code......
java.util.List<RecursiveRelationship> emps = new java.util.ArrayList<>();
In the MAIN CODE section, I used this code......
emps.add(RecursiveRelationship.createChild(row1.ID,row1.Parent_ID));
In the END CODE section, I used this code.....
for(RecursiveRelationship emp:emps){
System.out.println(emp.toString());
}
Obviously if you cant to pass this data to other components you will need to put the for loop in a new tJavaFlex that starts a SubJob (the "for" in the START CODE, the "emp.toString()" in the MAIN CODE and the closing parenthesis in the END CODE) and that will pass the output one row at a time.
The output I got was as follows. I added my own parent child data as well to test it....
Parent ID| Child
null | 001
001 | 002
002 | 003
003 | 004
004 | 005
null | a
a | b
c | d
b | c
The output was this.........
001,001
002,001/002
003,001/002/003
004,001/002/003/004
005,001/002/003/004/005
a,a
b,a/b
d,a/b/c/d
c,a/b/c