EMeany
Contributor III

Context based source selection for testing

I have a job which pulls records from Salesforce and, after some manipulation, writes them to an AWS Postgres instance.  I have constructed test input and output data to ensure that the job achieves the desired results.  A portion of the job is shown below.

 

0683p000009LzWJ.png

 

I am using a context variable to filter out the Test source whenever I want to use production data - along with a context variable within the SOQL query to limit Salesforce results to 0 when I want to test.  However, my method does not prevent the Test Data component from retrieving records during a production run, and my attempts to use a run-if trigger have been unsuccessful.  There has to be a better way of doing this...

 

I would like to easily be able to switch between the two inputs (Salesforce or Excel input)

or

I would like the ability to test the entire job (not a single subjob or component) using Test Cases within Talend

 

Can someone advise on how I might accomplish this?


Accepted Solutions
Anonymous
Not applicable

This can easily be achieved.  This is how.

Imagine that the job portion in the screenshot below sits above, and is linked to, the subjob shown in your post.
0683p000009M0FJ.png

1) Use a tFixedFlowInput with the schema of your Test Data component and connect it to a tHashOutput. Set the number of rows for the tFixedFlowInput component to 0. This initialises the tHashOutput, in case it is never populated further down the job.

2) Start a new subjob below the one you've just put together. Start it with a tJava. Do nothing with this.

3) Connect a RunIf conditional link to the tJava, take your "Test Data" component from the subjob below and connect it to the tJava using the RunIf. I've used a tRowGenerator for mine to make testing this easier for me. You use your Excel data.

4) In the RunIf condition, use the context variable you are currently using.

5) Now take another tHashOutput and connect it to your Test Data component. Set this tHashOutput to be linked to the one created above.

6) Now attach what you can see above to your subjob. Replace your Test Data component with a tHashInput linked to the top tHashOutput.

 

You are done.

 

What happens here is that, at the beginning of the job, the tHashOutput is initialised with zero rows. Then, if your context variable sets the Test Data to run, the tHashOutput is populated with data. When execution reaches your subjob, the tHashInput will either have no rows to add in the tUnite component, or it will have the rows from your Test Data.

 

Note: Having read your post again, I may have misunderstood whether you want to always run with the Salesforce data. I assumed you do. If not, you can adapt the solution above by simply adding a second RunIf link to your tJava and doing exactly the same with the Salesforce data. This is a technique I have used a fair amount.
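To make the data flow concrete, here is a minimal plain-Java sketch of the logic described above. It is not Talend-generated code: the class name, method names, and the `useTestData` flag (standing in for your context variable) are all hypothetical, and a simple list stands in for the linked tHashOutput components.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the hash-buffer technique; not Talend-generated code.
final class HashBufferSketch {
    // Stands in for the linked tHashOutput components: one shared in-memory buffer.
    private final List<String> buffer = new ArrayList<>();

    // The zero-row tFixedFlowInput: guarantees the buffer exists and is empty.
    void initialise() {
        buffer.clear();
    }

    // The RunIf-guarded subjob: test rows are read only when the flag is true.
    void loadTestDataIf(boolean useTestData, List<String> testRows) {
        if (useTestData) {
            buffer.addAll(testRows);
        }
    }

    // The tHashInput: returns whatever is in the buffer, possibly nothing.
    List<String> read() {
        return new ArrayList<>(buffer);
    }

    // The downstream subjob: Salesforce rows united with the buffer contents.
    static List<String> run(boolean useTestData, List<String> salesforceRows, List<String> testRows) {
        HashBufferSketch hash = new HashBufferSketch();
        hash.initialise();
        hash.loadTestDataIf(useTestData, testRows);
        List<String> united = new ArrayList<>(salesforceRows);
        united.addAll(hash.read());
        return united;
    }
}
```

The point of the sketch is that the test source is never read when the flag is false, while the downstream union still works because the buffer always exists.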


5 Replies
Jesperrekuh
Creator III

I doubt it's possible.... I do like the approach you have now...

Maybe you could set a variable to extend your Salesforce query with something like "TOP 0" or "LIMIT 0", or add " AND 1<0 " to the WHERE clause of the production query, when your context = DEV?
And for your Excel source, just filter in a tMap with context.equals( "DEV" ).
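As a rough illustration of the query-extension idea, the sketch below appends a row limit of zero when the environment is DEV. Only the "LIMIT 0" variant is shown, since LIMIT is the SOQL row-limiting clause; whether your endpoint accepts a limit of zero is an assumption you should verify. The class, method, and the `environment` parameter (standing in for a context variable) are hypothetical, and the sketch assumes the base query has no LIMIT clause of its own.

```java
// Hypothetical helper; in a Talend job the same expression could live in the query field.
final class SoqlQuerySketch {
    // Returns the query unchanged for production, or with a zero-row limit for DEV.
    static String buildQuery(String baseQuery, String environment) {
        // "DEV".equals(...) is null-safe if the context variable is unset.
        boolean testRun = "DEV".equals(environment);
        return testRun ? baseQuery + " LIMIT 0" : baseQuery;
    }
}
```

For example, `SoqlQuerySketch.buildQuery("SELECT Id FROM Account", "DEV")` yields the query with " LIMIT 0" appended, while any other environment value leaves it untouched.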

Stephen-Elves
Contributor

My advice would be to place the test data in a different Postgres table (ideally in a different database) and to switch only that source using a context variable. This lets you test the Postgres query for bugs as well, which is probably not needed unless the query is particularly complicated, but is probably still best practice.

 

In addition, if your test harness appended a UUID to the test table name, it could populate the test table with test data (including negative tests), pass the table name to the job under test via context, and then compare the output to confirm that the job is operating correctly. Finally, it could clean up after itself by dropping the test table.

 

i.e.

Prod Table name: customers

Test table name: customers_<UUID>
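A small sketch of how such a table name might be generated and resolved is shown below. The class and method names are hypothetical, and the `testTableOverride` parameter stands in for a context variable. The hyphens are stripped from the UUID because an unquoted Postgres identifier cannot contain them.

```java
import java.util.UUID;

// Hypothetical helper for a test harness; not part of Talend.
final class TestTableNameSketch {
    // Builds e.g. customers_<32 hex characters> from the production table name.
    static String testTableName(String prodTable, UUID id) {
        return prodTable + "_" + id.toString().replace("-", "");
    }

    // Uses the test table when one was passed in, otherwise the production table.
    static String resolveTable(String prodTable, String testTableOverride) {
        boolean hasOverride = testTableOverride != null && !testTableOverride.isEmpty();
        return hasOverride ? testTableOverride : prodTable;
    }
}
```

A harness would call `testTableName("customers", UUID.randomUUID())`, create and populate that table, pass the name to the job, and drop the table once the comparison is done.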

EMeany
Contributor III
Author

Excellent! This is what I was looking for!

The image below shows the design I will ultimately use.

 

0683p000009M0Fd.png

 

A few comments:

 

I had to eliminate the tUnite component because it was causing the Production and Test portions of the flow to be combined into the same subjob, which prevented the tJava's RunIf trigger from connecting correctly to the second component. Instead, I sent both flows into a pair of linked tHashOutputs.

 

I am still a bit unsure why the tJava's OnSubjobOk connection waits before triggering the tHashInput, but it does.  I am assuming that the RunIf connection somehow ties its connected subjobs together?  This goes against my understanding of how separate subjobs are visually distinguished, but oh well, it works!

 

Thank you to everyone that offered a solution.

 

E

 

Anonymous
Not applicable

The OnSubjobOk link waits for the subjob (in this case the tJava) to finish. The tJava has two RunIf links, and it is not "finished" until the linked subjobs are done. Essentially (as you said), just consider the tJava and the components linked to it by the RunIf links as one subjob.
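The ordering described here can be pictured with a small plain-Java model. This is only an illustration of the behaviour stated in this reply, not Talend's actual scheduler, and the class, method, and flag names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the trigger order; not Talend's actual runtime.
final class TriggerOrderSketch {
    // Records the order in which the parts of the job would run.
    static List<String> run(boolean useProduction, boolean useTest) {
        List<String> log = new ArrayList<>();
        log.add("tJava");
        // Each RunIf-linked subjob runs to completion before the tJava counts as finished.
        if (useProduction) {
            log.add("production subjob");
        }
        if (useTest) {
            log.add("test subjob");
        }
        // OnSubjobOk fires only after everything above is done.
        log.add("tHashInput");
        return log;
    }
}
```

In this model the tHashInput step is always last, whichever of the two RunIf branches is taken.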