TomG1
Creator

Using Custom Code in tJavarow

Hi,

 

Is it possible to write custom Java code in tJavaRow for Big Data jobs?

I am using tJavaRow inside a Spark job,

but the custom code is not executed.

The following is the custom code:

context.Flag = "YES";

System.out.println("###################### This output is from tJavaRow ###################");

The value is not assigned, and the message is not printed.

 

Thanks

 

 

11 Replies
vapukov
Master II

Not sure about anything more complex, but as described, yes, it works.

The variables are not assigned, though.

 

(screenshots attached)

TomG1
Creator
Author

Hi,

Is that a Talend Big Data Spark job?

If so, instead of running it locally, run it on a Spark cluster.

I am running my Spark job on the cluster.

If I put the custom code in tJava instead of tJavaRow, it gets executed,

and I am able to see the output in the Spark application logs.

But the custom code in tJavaRow is not executed.

 

Thanks

vapukov
Master II
Master II

Yes, I will test it, but you may be right that it will not work.

Let's wait and see what the Talend staff answer 🙂

TomG1
Creator
Creator
Author

Yeah, let's wait. 🙂

Anonymous
Not applicable

Hello,

 

Custom code components (tJava and tJavaRow) behave differently and have to be used differently depending on the type of job you are building. For instance, in Spark batch jobs you need to write Spark Java API syntax to work with the input and output RDDs (read the comments shown in the component when you first add it for help on how to do a test print of your input RDD; try that instead of your System.out call). In a Spark streaming job, you will be working with the RDDs inside a DStream. tJava and tJavaRow also behave differently from each other: tJavaRow uses the Spark DataFrames API, while tJava works purely with RDDs.
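To make the difference concrete: in a Spark batch job the tJava body receives the whole distributed dataset at once, while tJavaRow supplies logic that the framework applies per row. A plain-Java sketch of the two calling conventions (Spark itself is not on the classpath here; all names below are illustrative, not Talend-generated code):

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CallingConventions {

    // tJava-style: you receive the whole collection (in Spark, a JavaRDD)
    // and must produce the transformed output collection yourself.
    static List<String> wholeDataset(List<String> rdd) {
        return rdd.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    // tJavaRow-style: you supply a per-row function; the framework applies it.
    static <I, O> List<O> perRow(List<I> rows, Function<I, O> rowFn) {
        return rows.stream().map(rowFn).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> data = List.of("a", "b");
        System.out.println(wholeDataset(data));          // [A, B]
        System.out.println(perRow(data, s -> s + "!"));  // [a!, b!]
    }
}
```

This is also why a bare `System.out.println` in the component body may never run on the driver: the per-row code executes inside the function shipped to the executors.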

 

See the documentation for the differences between them when using across various types of jobs:

https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/y0Us7J_ukdgxhe9Jx_o_NQ?section=sect-components...

 

Hope that helps.

Anonymous
Not applicable

Hi jpmauss,

 

Could you please provide an example, like word count, of how to write custom Spark code in tJava/tJavaRow? I tried doing so by reading the description in the component but was not successful, and could not find any example in the knowledge base.

Any help will be much appreciated.

 

Best Regards,

Ojasvi Gambhir

TomG1
Creator
Creator
Author

Hi all,

If you find any materials on writing custom Java code in tJava for the Big Data version, please let me know.

 

Thanks

Anonymous
Not applicable

I can look to share some examples; however, you'd need to be familiar with the Spark Java API, which is different from the straight Java used in standard jobs. Also, with tJavaRow you'd need to be familiar with Spark SQL and the DataFrames API.

 

See this link for an intro to programming in Spark; click the Java tab to see how to work with the data using RDDs:

https://spark.apache.org/docs/1.6.2/programming-guide.html
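For the word-count example requested above: the real Spark version chains `JavaRDD.flatMap`, `mapToPair`, and `reduceByKey` (see the Java tab of the guide linked here). Since that code only runs inside a Spark job, here is the same transformation expressed with plain Java streams as a sketch, with comments naming the corresponding RDD operation:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class WordCountSketch {

    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                // JavaRDD.flatMap: split each line into individual words
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                // mapToPair + reduceByKey: group identical words and count them
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("to be or", "not to be")));
    }
}
```

In the Spark version, each of these steps returns a new RDD (or pair RDD) instead of a stream, and the work is distributed across the executors.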

 

See this link for an intro to Spark SQL and the DataFrames API:

https://spark.apache.org/docs/1.6.2/sql-programming-guide.html

 

When working in Talend, the tInput(whatever) component creates an RDD that is then used in the tJava. See the 'Code' tab in Studio for how it initializes the Spark context and loads the data into an RDD.
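The shape of that generated code can be sketched in plain Java: the per-row logic is packaged as a function object (in Spark, a serializable `org.apache.spark.api.java.function.Function`) and handed to `map`, in the spirit of `outputrdd = rdd.map(new mapInToOut(job))`. The class and field names below are made up for illustration:

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FunctionObjectPattern {

    // Stands in for Spark's org.apache.spark.api.java.function.Function:
    // the row logic lives in a named class so Spark can serialize it
    // and ship it to the executors.
    static class MapInToOut implements Function<String, String> {
        private final String prefix; // stands in for context/JobConf state

        MapInToOut(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public String apply(String row) {
            return prefix + row;
        }
    }

    public static void main(String[] args) {
        List<String> in = List.of("r1", "r2");
        // Equivalent in spirit to: outputrdd = rdd.map(new mapInToOut(job));
        List<String> out = in.stream()
                             .map(new MapInToOut("out:"))
                             .collect(Collectors.toList());
        System.out.println(out); // [out:r1, out:r2]
    }
}
```

Note how any state the function needs (here `prefix`, in the real job the context loaded from `JobConf`) is captured in the constructor, since the object itself is what gets distributed.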

Anonymous
Not applicable

Hi,

Here is example code for tJava in a Spark job; the code sample in the component is wrong (Talend 6.4.1).

In the basic settings:

outputrdd_tJava_1 = rdd_tJava_1.map(new mapInToOut(job));

In the advanced settings, in the class Java field:

public static class mapInToOut
        implements
        org.apache.spark.api.java.function.Function<inputStruct, RecordOut_tJava_1> {

    private ContextProperties context = null;
    private java.util.List<org.apache.avro.Schema.Field> fieldsList;

    public mapInToOut(JobConf job) {
        this.context = new ContextProperties(job);
    }

    @Override
    public RecordOut_tJava_1 call(inputStruct origStruct) {

        if (fieldsList == null) {
            this.fieldsList = (new inputStruct()).getSchema().getFields();
        }

        RecordOut_tJava_1 value = new RecordOut_tJava_1();

        for (org.apache.avro.Schema.Field field : fieldsList) {
            value.put(field.pos(), origStruct.get(field.pos()));
        }

        return value;
    }
}

 


tJavaExemple.zip