Hello, I am kinda new to Talend.
I am going through a list of files and using tReplace to replace a "&" with something.
Is there a way I can get a count of how many replacements occurred for each file processed?
Hi
You can hard-code this piece of Java code in a tJavaRow to count the number of occurrences, e.g.:
tFileList--iterate-->tFileInputRaw--main-->tJavaRow-->tFileOutputRaw
on tJavaRow:
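// counts how many times the pattern occurs in the current file's content
int counts = 0;
String regex = "something"; // the text being replaced, e.g. "&"
java.util.regex.Pattern pattern = java.util.regex.Pattern.compile(regex);
java.util.regex.Matcher matcher = pattern.matcher(input_row.content.toString());
while (matcher.find()) {
    counts++;
}
System.out.println("the number of total occurrences is: " + counts);
// pass the row on unchanged (assumes the output schema also has a content column)
output_row.content = input_row.content;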
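To try the counting loop outside Talend, here is a minimal standalone sketch. The class name MatchCounter and the method countLiteral are made up for illustration; they are not part of Talend. It assumes the search text should be treated literally, so it wraps it in Pattern.quote, which matters if the text contains regex metacharacters such as "." or "$".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, only for trying the loop outside a Talend job.
class MatchCounter {
    // Counts non-overlapping occurrences of a literal string in the text.
    static int countLiteral(String text, String literal) {
        // Pattern.quote makes the search string plain text instead of a regex.
        Matcher matcher = Pattern.compile(Pattern.quote(literal)).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("the number of total occurrences is: "
                + countLiteral("Tom & Jerry & Spike", "&"));
    }
}
```

In a tJavaRow the text argument would be input_row.content.toString(), as in the snippet above.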
Regards
Shong
Yes, you can keep track of the count of replacements for each file processed by maintaining a dictionary where the keys are the file names and the values are the number of replacements that occurred in each file. Here's some sample code to illustrate the idea:
import os

replacements = {}  # initialize the dictionary to keep track of replacements
for filename in os.listdir('/path/to/files'):
    with open(os.path.join('/path/to/files', filename), 'r') as f:
        contents = f.read()
    num_replacements = contents.count('&')  # count the number of replacements in the file
    contents = contents.replace('&', 'replacement_string')  # replace '&' with 'replacement_string'
    with open(os.path.join('/path/to/files', filename), 'w') as f:
        f.write(contents)
    replacements[filename] = num_replacements  # add the number of replacements to the dictionary
print(replacements)  # print the dictionary of filename to number of replacements
In this example, we use the `os` module to iterate through all the files in the specified directory. For each file, we open it and read its contents, counting the number of '&' characters and replacing them with the desired replacement string. We then write the modified contents back to the file and record the number of replacements in the `replacements` dictionary using the filename as the key. Finally, we print out the `replacements` dictionary to see the number of replacements for each file.
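Since Talend components run Java, the same idea can be sketched in Java, for example as a starting point for a custom routine. This is only an illustration: the class name FileReplacer and its methods are hypothetical, and the sketch assumes UTF-8 text files and a non-empty search string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: replaces a literal string in every file of a directory
// and reports how many replacements were made per file.
class FileReplacer {
    static Map<String, Integer> replaceInDirectory(Path dir, String target, String replacement)
            throws IOException {
        if (target.isEmpty()) {
            throw new IllegalArgumentException("target must not be empty");
        }
        Map<String, Integer> counts = new TreeMap<>(); // file name -> replacements
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories
                }
                // Assumption: the files are UTF-8 text.
                String contents = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                int count = countOccurrences(contents, target);
                // String.replace works on literal text, so "&" needs no escaping.
                String updated = contents.replace(target, replacement);
                Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
                counts.put(file.getFileName().toString(), count);
            }
        }
        return counts;
    }

    // Counts non-overlapping occurrences, matching what String.replace will replace.
    static int countOccurrences(String text, String target) {
        int count = 0;
        int from = text.indexOf(target);
        while (from >= 0) {
            count++;
            from = text.indexOf(target, from + target.length());
        }
        return count;
    }
}
```

Calling FileReplacer.replaceInDirectory(dir, "&", "and") returns a map from file name to replacement count, which plays the role of the `replacements` dictionary above.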
Thanks for the responses! I will try these and let you know.
How can I write the number of replacements to a log file? Something like the file name and its number of replacements?
You can put the number of replacements into a global variable for later use, e.g.:
tFileList--iterate-->tFileInputRaw--main-->tJavaRow-->tFileOutputRaw--oncomponentok-->tFixedFlowInput--main-->tHashOutput1
|onsubjobok
tHashInput1--main-->tFileOutputDelimited
on tJavaRow:
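// counts how many times the pattern occurs in the current file's content
int counts = 0;
String regex = "something"; // the text being replaced, e.g. "&"
java.util.regex.Pattern pattern = java.util.regex.Pattern.compile(regex);
java.util.regex.Matcher matcher = pattern.matcher(input_row.content.toString());
while (matcher.find()) {
    counts++;
}
System.out.println("the number of total occurrences is: " + counts);
// store the count so that tFixedFlowInput can read it back
globalMap.put("key", counts);
// pass the row on unchanged (assumes the output schema also has a content column)
output_row.content = input_row.content;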
tFixedFlowInput: generates a row that contains the current file name and the number of replacements. Add two columns to the schema:
column name : value expression
filename : ((String)globalMap.get("tFileList_1_CURRENT_FILE"))
counts : ((Integer)globalMap.get("key"))
tHashOutput: caches the data in memory; check the 'append' box.
tHashInput: reads all the data back from memory so that it can be written to the log file.
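To see how the values travel through this job, here is a rough plain-Java sketch of the same handoff. It is not Talend-generated code. The class ReplacementLog is hypothetical, its globalMap field is an ordinary HashMap standing in for Talend's globalMap, the list stands in for the tHashOutput cache, and the ";" separator is an assumption about the delimited log file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the job: one call to processFile is one tFileList iteration.
class ReplacementLog {
    // Stand-in for Talend's globalMap (assumed to behave like a Map<String, Object>).
    private final Map<String, Object> globalMap = new HashMap<>();
    // Stand-in for the tHashOutput cache with the 'append' box checked.
    private final List<String> cachedRows = new ArrayList<>();

    void processFile(String fileName, int counts) {
        // What tFileList and tJavaRow put into the map during the iteration.
        globalMap.put("tFileList_1_CURRENT_FILE", fileName);
        globalMap.put("key", counts);

        // What tFixedFlowInput reads back, using the same casts as the schema expressions.
        String filename = (String) globalMap.get("tFileList_1_CURRENT_FILE");
        Integer stored = (Integer) globalMap.get("key");

        // One row per file is appended to the cache.
        cachedRows.add(filename + ";" + stored);
    }

    // What tHashInput hands to tFileOutputDelimited once all files are processed.
    List<String> rows() {
        return new ArrayList<>(cachedRows);
    }
}
```

Because the key is overwritten on every iteration, the row has to be cached inside the iteration (hence oncomponentok), while the log file is written once at the end (hence onsubjobok).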
Hope it helps!
Regards
Shong
This is working great! Thanks for your help!!!!
Sorry, but one more question: can I perform a tReplace after the tJavaRow?
tJavaRow--main-->tReplace