Google Drive Folder Path Extraction in Talend

abarpdata_18 · ‎2025-03-12

Hi all,

I'm trying to resolve full folder paths for files in Google Drive using Talend. Given a dataset with file ID, name, parent ID, I need to reconstruct the hierarchical path by tracing parent-child relationships.

Logic I Tried (Using tJavaFlex):

Load data (id, name, parent) into a HashMap for lookup.
Iterate over each row, recursively tracing the parent ID until no parent is found.
Concatenate folder names to build the path.

🚨 Issue: tJavaFlex isn't working properly: it either causes performance issues or doesn't return full paths correctly.

Sample Input (Extracted Google Drive Metadata - Excel/CSV format)

Expected Output (After Full Path Resolution)

Question:

📌 Is there a better way to handle recursive folder paths in Talend? Could tMap, tLoop, or a different approach work better? Any suggestions would be greatly appreciated!

Thanks

quentin-vigne · ‎2025-03-12

Hello @abarpdata_18

Maybe try tJavaRow with a while loop that builds the full path iteratively. In my opinion tJavaRow is better suited to this, since tJavaFlex doesn't handle recursion very well and tends to cause stack overflows when looping too much.

As you did before, create a HashMap keyed by id that holds the name and parent of each entry:

String buildPath(String id, HashMap<String, String[]> map) {
    StringBuilder fullPath = new StringBuilder();
    
    while (id != null && !id.isEmpty() && map.containsKey(id)) {
        String[] data = map.get(id);
        fullPath.insert(0, "/" + data[0]);
        id = data[1]; 
    }
    
    return fullPath.toString();
}

Tell me if it works (or doesn't)

- Quentin
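
For reference, a method declaration like buildPath has to sit inside a class to compile. The following is a minimal standalone sketch, assuming a hypothetical class name PathRoutine and made-up sample rows (not the data from the post). In Talend, a user routine under Code > Routines is one place such a static method could live. The hop counter is an added guard against circular parent references and is not part of the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical routine-style class; the name PathRoutine is an assumption.
// As a Talend routine the file would also start with "package routines;".
final class PathRoutine {

    // map: id -> {name, parentId}. Walks up from id until no parent is found.
    static String buildPath(String id, Map<String, String[]> map) {
        StringBuilder fullPath = new StringBuilder();
        int hops = 0; // added guard: stop if the data contains a parent cycle
        while (id != null && !id.isEmpty() && map.containsKey(id)
                && hops++ <= map.size()) {
            String[] data = map.get(id);
            fullPath.insert(0, "/" + data[0]); // prepend this level's name
            id = data[1];                      // move up to the parent
        }
        return fullPath.toString();
    }

    // Made-up sample rows for illustration only.
    static Map<String, String[]> sampleMap() {
        Map<String, String[]> map = new HashMap<>();
        map.put("f1", new String[] {"Reports", null});
        map.put("f2", new String[] {"2025", "f1"});
        map.put("d1", new String[] {"summary.xlsx", "f2"});
        return map;
    }

    public static void main(String[] args) {
        // prints /Reports/2025/summary.xlsx
        System.out.println(buildPath("d1", sampleMap()));
    }
}
```

From a tJavaRow, a routine method like this would typically be called as PathRoutine.buildPath(input_row.id, fileMap), assuming fileMap has already been filled.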

abarpdata_18 · ‎2025-03-12

Thanks for your suggestion! I tried implementing tJavaRow with the buildPath function, but I’m running into some errors:

Compilation errors related to HashMap and Map not being recognized.
Talend is throwing a syntax error on the function declaration, saying "Syntax error on token '(' , ';' expected".
"Void methods cannot return a value" error is appearing.
"FullPath cannot be resolved or is not a field" error is showing up.

I ensured that import java.util.HashMap; and import java.util.Map; are present, and that FullPath exists in the schema.

Would you have any additional suggestions or tweaks I should try? Thanks in advance!

quentin-vigne · ‎2025-03-12

EDIT : Corrected my solution, now it looks better.

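So I created a CSV file with your example data and managed to get it working. I made a mistake before and forgot that you can't declare functions inside tJavaRow (its code is inlined into a method body of the generated job, which is why the declaration triggered those syntax errors).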

This is the full setup :

Your input data feeds a first tJavaRow that fills a Map to hold your data in memory.

Inside that tJavaRow, add this code:

java.util.Map<String, String[]> fileMap = (java.util.Map<String, String[]>) globalMap.get("fileMap");

if (fileMap == null) {
    fileMap = new java.util.HashMap<>();
    globalMap.put("fileMap", fileMap);
}

fileMap.put(input_row.id, new String[]{input_row.name, input_row.parents});

output_row.id = input_row.id;
output_row.name = input_row.name;
output_row.parents = input_row.parents;

Feed the output data to a tHashOutput (needed to keep the rows in memory while also keeping the HashMap).

Then, in a new subjob, start with a tHashInput that feeds your second tJavaRow, which contains the most important part:

java.util.Map<String, String[]> fileMap = (java.util.Map<String, String[]>) globalMap.get("fileMap");

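StringBuilder fullPath = new StringBuilder();
String parentId = input_row.parents;

// Walk up the hierarchy; each fileMap value is {name, parentId}.
// The loop ends at the first parent that is null, empty, or not in the map.
while (parentId != null && !parentId.isEmpty() && fileMap.containsKey(parentId)) {
    String[] data = fileMap.get(parentId);
    fullPath.insert(0, "/" + data[0]); // prepend the folder name
    parentId = data[1];                // move up to its parent
}

// Finally, append the row's own name.
fullPath.append("/").append(input_row.name);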

output_row.id = input_row.id;
output_row.name = input_row.name;
output_row.parents = input_row.parents;
output_row.fullPath = fullPath.toString();

Result :

- Quentin
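
If the per-row walk turns out to be slow on large exports, one possible variation is to cache each resolved folder path so that shared ancestors are only walked once. The sketch below is plain Java outside Talend and is not taken from the thread: the class name CachedPathResolver, the cache, and the cycle check are all assumptions added for illustration. It expects the same id -> {name, parentId} map as the first tJavaRow builds.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of Talend or of the solution above.
final class CachedPathResolver {
    private final Map<String, String[]> fileMap;               // id -> {name, parentId}
    private final Map<String, String> cache = new HashMap<>(); // id -> resolved path

    CachedPathResolver(Map<String, String[]> fileMap) {
        this.fileMap = fileMap;
    }

    // Returns the full path of id, or "" if id is not in the map.
    String resolve(String id) {
        Deque<String> chain = new ArrayDeque<>(); // uncached ids, topmost ancestor on top
        Set<String> seen = new HashSet<>();       // ids visited in this walk (cycle check)
        String path = "";
        String current = id;

        // Walk up until the root, an unknown id, a repeated id, or a cached ancestor.
        while (current != null && !current.isEmpty()
                && fileMap.containsKey(current) && seen.add(current)) {
            String cached = cache.get(current);
            if (cached != null) {
                path = cached; // reuse the ancestor's already-resolved path
                break;
            }
            chain.push(current);
            current = fileMap.get(current)[1];
        }

        // Unwind from the topmost uncached ancestor down to id, caching each level.
        while (!chain.isEmpty()) {
            String node = chain.pop();
            path = path + "/" + fileMap.get(node)[0];
            cache.put(node, path);
        }
        return path;
    }
}
```

Calling resolve(input_row.id) on a single shared instance (for example one stored in globalMap) should give the same string as the loop above for any row whose id is in the map, while sibling files reuse the cached path of their common folder.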
