Anonymous
Not applicable

Split up string and add a new line (\n) every 100 characters

Hi,

I thought this would be simple, but I haven't been able to find a method to do this. 

 

I have a job that reads a raw file containing a string 4900 characters long, but the length will vary and could be much longer or much shorter.

I need to split the string into 100-character strings, each followed by a \n, so that in the output file each 100-character record is on a new line.

 

Can anyone suggest a way to tackle this please?

 

Many thanks, Jerry Wilson 

 

 


Accepted Solutions
Anonymous
Not applicable
Author

The easiest way to do this is with a Talend Routine (essentially a Java class). Create a Routine and add this code to it:

    public static String addPeriodicToken(String data, String token, int period) {
        // Return the input unchanged if any argument is unusable.
        if (data == null || token == null || period <= 0) {
            return data;
        }
        StringBuilder result = new StringBuilder(data.length());
        for (int start = 0; start < data.length(); start += period) {
            // Insert the token between chunks only, never before the first
            // chunk or after the last one.
            if (start > 0) {
                result.append(token);
            }
            // Copy the next chunk; the last chunk may be shorter than period.
            result.append(data, start, Math.min(start + period, data.length()));
        }
        return result.toString();
    }

Then you can use it anywhere in your job with the following code:

routines.YourRoutineName.addPeriodicToken(row1.column,"\n",100)

I knocked the code up quite quickly, so there may be some tweaks you want to make, but it should work.
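As an alternative to looping, the same effect can probably be had with a single regular-expression replacement. The following is only a sketch, not part of the original answer: the class name `PeriodicTokenRegex` and the method name `addPeriodicTokenRegex` are made up for illustration. It inserts the token after every `period` characters, but only when at least one more character follows, so no token is added after the final chunk.

```java
import java.util.regex.Matcher;

public class PeriodicTokenRegex {

    // Hypothetical helper (not from the original post): inserts `token` after
    // every `period` characters, except after the final chunk.
    public static String addPeriodicTokenRegex(String data, String token, int period) {
        if (data == null || token == null || period <= 0) {
            return data;
        }
        // (?s) lets '.' match line terminators as well;
        // (?=.) requires at least one character after the matched chunk.
        String pattern = "(?s)(.{" + period + "})(?=.)";
        // quoteReplacement stops '$' or '\' in the token being read as regex syntax.
        return data.replaceAll(pattern, "$1" + Matcher.quoteReplacement(token));
    }

    public static void main(String[] args) {
        // Small period so the effect is easy to see.
        System.out.println(addPeriodicTokenRegex("abcdefghij", "\n", 4));
    }
}
```

In principle the same `replaceAll` call could be written inline wherever the job accepts a Java expression, which would avoid the routine altogether. A routine is easier to reuse and test, though.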

9 Replies
Anonymous
Not applicable
Author

Spot on, many thanks.

 

I haven't created a user routine before. This has unlocked even more possibilities for Talend processing.

 

Much appreciated. 

Anonymous
Not applicable
Author

Hi rhall_2_0,

 

This code splits one column's content every 100 characters, but I have three more columns alongside the split column. After splitting the one field, the other fields' values appear only once, on the last line. Can you please help me carry the other field values through unchanged on every line when one field's value is split across multiple lines?

 

Awaiting your reply.

Anonymous
Not applicable
Author

Can you give an example of the data you are working with and what exactly you want to happen to it? Also, could you raise a new question and tag me in it? It is important that we do not clutter old, answered questions with new ones. Thanks

Anonymous
Not applicable
Author

Hi rhall_2_0

 

Source file: attr_src

Split column name: DESC25

After splitting every 256 characters, I get the result below: apart from "DESC25", the other column values appear only once, on the last row. How can I carry the other field values through unchanged when one field is split?

Result after split: attr_08

 

Please let me know what needs to be done.

Thanks

 

 


attr_08.csv
Anonymous
Not applicable
Author

This is actually a different problem, though it looks similar. I am assuming that you want the following to happen, taking a split every 10 characters as an example.

 

Input Data

a large field which needs to split into units of 10 characters; 100; 200
another large field which needs to be split; 300; 400

Output Data

a large fi; 100; 200
eld which ; 100; 200
needs to s; 100; 200
plit into ; 100; 200
units of 1; 100; 200
0 characte; 100; 200
rs; 100; 200
another la; 300; 400
rge field ; 300; 400
which need; 300; 400
s to be sp; 300; 400
lit; 300; 400

If that is the case you can still use the routine that I gave you, but you will need to do some more work. You will need to strip off the other columns and provide them in the routine as another variable. Then for every row created by the routine, append the extra columns before the line feed.
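The description above could be sketched roughly as follows. This is an illustration only, not code from the thread: the class name `SplitWithColumns`, the method name `splitKeepingColumns`, and the idea of passing the remaining columns as one pre-delimited string (for example `"; 100; 200"`) are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitWithColumns {

    // Hypothetical helper: splits `data` into chunks of `period` characters and
    // appends `otherColumns` (already delimited, e.g. "; 100; 200") to every
    // chunk, joining the resulting lines with `lineEnd`.
    public static String splitKeepingColumns(String data, String otherColumns,
                                             int period, String lineEnd) {
        if (data == null || otherColumns == null || lineEnd == null || period <= 0) {
            return data;
        }
        List<String> lines = new ArrayList<>();
        for (int start = 0; start < data.length(); start += period) {
            // The last chunk may be shorter than period.
            int end = Math.min(start + period, data.length());
            lines.add(data.substring(start, end) + otherColumns);
        }
        return String.join(lineEnd, lines);
    }

    public static void main(String[] args) {
        // Second input row from the example above, split every 10 characters.
        System.out.println(splitKeepingColumns(
                "another large field which needs to be split", "; 300; 400", 10, "\n"));
    }
}
```

Note that an empty `data` value produces an empty string here, so the other columns would be dropped for that row. Whether that is acceptable depends on the data.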

Anonymous
Not applicable
Author

Hi rhall

Can you please show the code for adding the other fields as well?

 

Thanks

Anonymous
Not applicable
Author

@swpriyad, without building this myself I do not have any code. You should be able to attempt this from the description I gave above. If you get stuck, show me what you have and I can help from there. You will learn far more from attempting this yourself and being helped than from me building it for you. As I said, I am happy to help solve problems with your attempts at doing this, but I can't really provide full solutions to every problem I am asked about, I am afraid.