DM_J
Contributor II

Understanding tMap or tJava components behavior

Suppose I have two CSV files: language.csv and languagecheck.csv.

0683p000009M6DE.png

 

Note that there is no direct relation between them.

I have two jobs, with two questions about Job1 and one question about Job2.

 

Job1:

0683p000009M6DO.png

tMap: it is a Cartesian join, with a result of 9 rows.
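As a rough illustration of what a Cartesian (cross) join does, here is a minimal plain-Java sketch, not Talend-generated code. It assumes three rows in each input, which is what the 9 result rows suggest; the class name, the method name, and the main ids are hypothetical, and only the lookup ids 11, 12, and 13 come from the post.

```java
import java.util.ArrayList;
import java.util.List;

public class CrossJoinSketch {
    /** Pairs every main row with every lookup row (there is no join key). */
    static List<String[]> crossJoin(List<String> mainRows, List<String> lookupRows) {
        List<String[]> out = new ArrayList<>();
        for (String m : mainRows) {
            for (String l : lookupRows) {
                out.add(new String[] { m, l });
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical main ids; the lookup ids are the ones named in the post.
        List<String> main = List.of("1", "2", "3");
        List<String> lookup = List.of("11", "12", "13");

        List<String[]> rows = crossJoin(main, lookup);
        System.out.println(rows.size()); // 3 x 3 = 9
        for (String[] r : rows) {
            System.out.println(r[0] + " x " + r[1]);
        }
    }
}
```

Each main row is repeated once per lookup row, so the lookup ids cycle 11, 12, 13 for every main row.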

0683p000009M6DT.png

 

I wrote a println inside the tJavaRow:

System.out.println(input_row.lookup_id);

The result should be 11, 12, and 13, but it is:

1 testtest aa
12 rest bb
13 quest cc
11 testtest aa
12 rest bb
13 quest cc ...

0683p000009M6Dd.png

 

Question 1: Why does this happen, and how can I solve it?

Question 2: When I open the result, I again see something strange. Why?

0683p000009M6Dn.png

 

Job2: in this job, I compare the value of the Name column of languagecheck.csv with the Name column of language.csv.

0683p000009M6Ds.png

tMap:

0683p000009M6Dx.png

 

The result should have two columns, id and count, and the value of count should be 0, but the result is:

0683p000009M6E2.png

 

Question 3: Where did these two extra columns come from, and why is the value of count 1?

The result should be:

id    count
1        0
2        0 ....

Note: I don't want to create a join between the two CSVs inside tMap.


Accepted Solutions
Anonymous
Not applicable

Hi,

 

For the first question: your files use a tab as the field separator, but you have most probably not changed the default field separator (semicolon) to tab in one or both of the input components. With the separator set to tab, I got the right results.
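To make the separator point concrete, here is a small plain-Java sketch of the effect. It is not how Talend parses files internally; it only assumes a tab-separated row modelled on the output in the question, and the class and method names are hypothetical. Splitting that row on a semicolon finds nothing to split on, so the first "column" is the entire line.

```java
import java.util.regex.Pattern;

public class SeparatorSketch {
    /** Returns the first field of a line when it is split on a literal separator. */
    static String firstField(String line, String separator) {
        // Pattern.quote treats the separator as plain text, not as a regex.
        return line.split(Pattern.quote(separator), -1)[0];
    }

    public static void main(String[] args) {
        // Hypothetical tab-separated row, modelled on the output in the question.
        String line = "11\ttesttest\taa";

        // Wrong separator: no ';' in the line, so the whole line comes back.
        System.out.println("[" + firstField(line, ";") + "]");

        // Matching separator: only the id comes back.
        System.out.println("[" + firstField(line, "\t") + "]"); // prints [11]
    }
}
```

This would explain why printing lookup_id showed whole rows instead of 11, 12, and 13.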

0683p000009M6E7.png

 

Coming to your second query, you will again have to check the column separator first. Here I am joining the two datasets on the name column.

I got zero matches, as shown below.

0683p000009M5vQ.png

 

0683p000009M6EH.png

 

I am wondering why you are not doing the join within tMap. The problem with your current matching method is that you are doing a Cartesian join and then using Java functions to perform the same task. Because of the Cartesian product, the number of rows to process grows quickly for bigger datasets, so your current job may suffer throughput issues in the future. Also, I personally do not like hand coding when the same functionality is provided by the ETL tool 🙂 Why reinvent the wheel 😉

 

I hope I have answered your query. Could you please spare a second to mark the topic as resolved?

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

 


DM_J
Contributor II
Author

Hi,

 

No, they are the same as each other:
0683p000009M6Bs.png
0683p000009M6Eb.png

As I mentioned, I can't use a join. This question is a simplified case of my problem, and I need the Cartesian result.

Just assume the name in CSV2 is part of the name in CSV1; then you cannot use the join, as it always gives 0 results.
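The "part of the name" case can be sketched in plain Java as follows. This is only an illustration of the matching logic, not the job's actual code: the class name, method name, and main names are hypothetical, and only the lookup names (testtest, rest, quest) are taken from the output shown earlier. Every main row is compared against every lookup row, and the count goes up when the main name contains the lookup name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContainsCountSketch {
    /** For each main row (id -> name), counts the lookup names found inside the main name. */
    static Map<String, Integer> countContained(Map<String, String> mainNamesById,
                                               List<String> lookupNames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, String> row : mainNamesById.entrySet()) {
            int count = 0;
            for (String lookupName : lookupNames) {
                // Substring test: an equality join cannot express this condition.
                if (row.getValue().contains(lookupName)) {
                    count++;
                }
            }
            counts.put(row.getKey(), count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical main rows; the real contents of language.csv are not shown in the post.
        Map<String, String> main = new LinkedHashMap<>();
        main.put("1", "java");
        main.put("2", "python");

        List<String> lookup = List.of("testtest", "rest", "quest");
        System.out.println(countContained(main, lookup)); // {1=0, 2=0}
    }
}
```

The nested loop is the Cartesian comparison itself, so the cost grows with the product of the two row counts.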

 

 

 

Anonymous
Not applicable

@DM_J 

 

Unfortunately, I am really confused about what you are trying to achieve in this use case.

You initially asked two queries, and I showed how it works properly for your sample data. Are you now saying that your lookup and main flow are the same but you are not getting any match?

Could you please rephrase your query, including the current and expected output, so that Talend community members can give informed feedback?

 

Warm Regards,
Nikhil Thampi


DM_J
Contributor II
Author

Hi @nthampi 

 

I want to thank you and accept your answer, because you pointed out that the separator configured in the input components was the likely cause.

The problem was that the old separator stayed cached even after I changed it to another one. When I ran the job in Java debug mode, it was solved. I described this issue here:

 

https://community.talend.com/t5/Design-and-Development/Old-changes-Caches-after-new-changes/m-p/1659...