Re: How to read two values in csv and compare them to display as pass or fail ? - Qlik Community

Anonymous · ‎2018-10-04

I have a tFileOutputDelimited output file that contains data like the following:

123-abc-rtey121-a1b1-0.00

141-fhf--fb32-utoot21b-b2b2-1.213

456575-21-ytuueoro-ghfyt-43-2.31

4567-abc-grth-a1b1-0.17

I need to compare the float values in the last position of each record and display the result as pass or fail in the same file. For example:

record1 == record2 (0.00 == 1.213): fail

record3 == record4 (2.31 == 0.17): fail

Is there a way to do this in Talend?

fdenis · ‎2018-10-04

You can use tMap to create a lookup.
The main source is your file.
The lookup source is the same file.

Anonymous · ‎2018-10-08

@fdenis

I actually have a problem here: the two CSV files contain different numbers of columns.

E.g. csv_1 contains 5 or more fields, while csv_2 may contain 8 or more. The last column value can always be compared, though.

If I use tMap, I may need to define a schema (i.e. a fixed set of columns), which doesn't fit my purpose.

I might have to use specific fields that are common to both files, as below, to match the records before comparing the last values:

File-1:

123-D1-abc-rtey121-a1b1-0.00

141-B1-fhf--fb32-utoot21b-b2b2-1.213

456575-C1-21-ytuueoro-ghfyt-43-2.31

4567-B1-abc-grth-a1b1-0.17

File-2:

123-D1-abc-rtey121-a1b1-1.00

141-B1-fhf--fb32-utoot21b-b2b2-1.213

456575-C1-21-ytuueoro-ghfyt-43-4.31

4567-B1-abc-grth-a1b1-0.17

i.e., something like this:

if (file_1.last_value_of[123-D1] == file_2.last_value_of[123-D1])
{
    print "pass"
}
else
{
    print "fail"
}

In this case,

0.00 != 1.00

and the result will be fail.

Can I do this with tMap, or should I use tSystem to run a script that does it?

Also, how do I read both files and compare them?
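The keyed comparison described above can be sketched in plain Java, outside Talend. This is only an illustration under stated assumptions: the class and method names are hypothetical, the key is taken to be the first two "-"-separated fields (e.g. "123-D1"), every record has at least two fields, and the last value is a non-negative decimal. The lines are passed in as lists; in practice they could come from something like Files.readAllLines.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Talend component or routine.
class KeyedLastValueCompare {

    // Key = first two '-'-separated fields, e.g. "123-D1" (assumes at least two fields).
    static String key(String record) {
        String[] parts = record.split("-", 3);
        return parts[0] + "-" + parts[1];
    }

    // Text after the final '-', e.g. "0.00" (assumes the value itself is not negative).
    static String lastValue(String record) {
        return record.substring(record.lastIndexOf('-') + 1).trim();
    }

    // Returns key -> "pass"/"fail" for every record of file 1, in file order.
    static Map<String, String> compare(List<String> file1, List<String> file2) {
        // Build the lookup from file 2 first, much as a tMap lookup flow would be loaded.
        Map<String, BigDecimal> lookup = new HashMap<>();
        for (String record : file2) {
            lookup.put(key(record), new BigDecimal(lastValue(record)));
        }

        Map<String, String> result = new LinkedHashMap<>();
        for (String record : file1) {
            BigDecimal mine = new BigDecimal(lastValue(record));
            BigDecimal other = lookup.get(key(record));
            // compareTo ignores scale, so 1.0 and 1.00 count as equal; a missing key fails.
            boolean same = other != null && mine.compareTo(other) == 0;
            result.put(key(record), same ? "pass" : "fail");
        }
        return result;
    }
}
```

With the File-1 and File-2 samples above, this yields fail for 123-D1 and 456575-C1, and pass for 141-B1 and 4567-B1.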

Anonymous · ‎2018-10-09

Hi,

tMap does not restrict the comparison to files with the same structure. You can read both files into tMap, one as main and one as lookup.

The main file may have 5 columns and the lookup file 15, but in your case you know which column needs to be compared. You will also have to decide whether to join the records on a key column or on the line number in the file. If it is the line number, you will have to add an extra column containing the line number to each flow before doing the match in tMap.

If you are only looking for a simple file comparison, you can use the tFileCompare component from the Palette.
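To illustrate the line-number option mentioned above, here is a rough plain-Java sketch of a positional join. It is not Talend code: the class and method names are hypothetical, and it behaves like a left outer join on the line number, so main lines without a counterpart get a null match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing a join on line number instead of a key column.
class LineNumberJoin {

    // Each result row is {lineNumber, mainLine, lookupLine}; lookupLine is null
    // when the lookup file has fewer lines than the main file.
    static List<String[]> joinByLineNumber(List<String> main, List<String> lookup) {
        List<String[]> joined = new ArrayList<>();
        for (int line = 1; line <= main.size(); line++) {
            String match = line <= lookup.size() ? lookup.get(line - 1) : null;
            joined.add(new String[] { String.valueOf(line), main.get(line - 1), match });
        }
        return joined;
    }
}
```

The added line-number column plays the role of the join key, which is why both flows need it before the match.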

Warm Regards,

Nikhil Thampi

Anonymous · ‎2018-10-12

Got it. Thanks @nthampi.

Btw,

I changed my job design a little so that I can read my first delimited file, split each line, and retrieve only the three values I need (i.e. the 1st, 2nd and last values).

tFileInputDelimited -> tMap -> tLogRow.

I am able to retrieve the first value of each line using the following code in the expression filter, so that I can assign it to a schema field in the output.

StringHandling.LEFT(row1.input,StringHandling.INDEX(row1.input,"-"))

123-D1-abc-rtey121-a1b1-0.00

141-B1-fhf--fb32-utoot21b-b2b2-1.213

But I am not sure how to retrieve the 2nd and last values, i.e.

123-D1-abc-rtey121-a1b1-0.00

141-B1-fhf--fb32-utoot21b-b2b2-1.213

Any help would be appreciated.

Anonymous · ‎2018-10-15

Please let me know.

Anonymous · ‎2018-10-15

Hi,

Why don't you read this file with a tFileInputDelimited component with the field separator set to "-"? My other question is why the schema fluctuates within a single file. Ideally a file should have a predefined schema, and where data is missing you add a null value.

Could you please verify these details before starting to populate the data? If the source system follows no layout convention for the CSV file, you will have to use Java regular expressions to parse the data.

But my recommendation is to insist that the source system provide a proper file layout for a simple flow like this.

Warm Regards,

Nikhil Thampi

Anonymous · ‎2018-10-16

Hi @nthampi,

Initially my tFileInputDelimited had the Field Separator set to ";". After your suggestion, I changed the delimiter to "-". I am now getting a StringIndexOutOfBoundsException:

java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(Unknown Source)
at routines.StringHandling.LEFT(StringHandling.java:229)
at my_project.new1_of_copy_of_compare_csv_0_1.New1_of_Copy_of_compare_csv.tFileInputDelimited_1Process(New1_of_Copy_of_compare_csv.java:849)
at my_project.new1_of_copy_of_compare_csv_0_1.New1_of_Copy_of_compare_csv.runJobInTOS(New1_of_Copy_of_compare_csv.java:1328)
at my_project.new1_of_copy_of_compare_csv_0_1.New1_of_Copy_of_compare_csv.main(New1_of_Copy_of_compare_csv.java:1177)

For your second question (why the schema is fluctuating for a single file):
I am generating this file as the output of another job. I cannot define a schema, as it varies with each input (i.e. the SQL query).
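On the StringIndexOutOfBoundsException above, a likely explanation (an assumption drawn from the stack trace, which shows StringHandling.LEFT calling String.substring): once the component itself splits on "-", row1.input no longer contains a "-", so the index lookup finds nothing and a negative length reaches substring. The sketch below reproduces that failure with plain String methods rather than Talend's routines, next to a guarded variant; the class and method names are hypothetical.

```java
// Hypothetical helper using only java.lang.String, not routines.StringHandling.
class SafeLeft {

    // Same shape as the failing expression: throws when "-" is absent,
    // because indexOf returns -1 and substring(0, -1) is out of range.
    static String unsafeFirstField(String input) {
        return input.substring(0, input.indexOf("-"));
    }

    // Guarded variant: if there is no "-", the whole input is the first field.
    static String firstField(String input) {
        int idx = input.indexOf("-");
        return idx < 0 ? input : input.substring(0, idx);
    }
}
```

With the delimiter set to "-", the first column already holds the first value, so the LEFT/INDEX expression may not be needed at all.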

Anonymous · ‎2018-10-16

@nthampi, @fdenis

Finally, I am able to retrieve the second and last values in the following manner:

123-D1-abc-rtey121-a1b1-0.00

To retrieve the second value, "D1":

StringUtils.split(row1.input, "-")[1]

To retrieve the last value, "0.00":

string.substring(string.lastIndexOf("-")+1).trim()
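For reference, the two expressions above can be combined into one small helper. This is a sketch, not the poster's job code: the class name is hypothetical, and it uses String.split instead of StringUtils.split so that it runs without extra libraries. If StringUtils here is the Apache Commons Lang class, it treats adjacent separators as one, whereas String.split keeps the empty field in "fhf--fb32"; neither difference affects index 1 in these records.

```java
// Hypothetical helper combining the first/second/last extraction in plain Java.
class RecordFields {

    // Returns {first, second, last} for a '-'-delimited record.
    // Assumes at least three fields and a non-negative last value.
    static String[] extract(String record) {
        String[] parts = record.split("-");
        if (parts.length < 3) {
            throw new IllegalArgumentException("expected at least 3 fields: " + record);
        }
        // Take the last value from the final '-' so the field count does not matter.
        String last = record.substring(record.lastIndexOf('-') + 1).trim();
        return new String[] { parts[0], parts[1], last };
    }
}
```

For "123-D1-abc-rtey121-a1b1-0.00" this gives "123", "D1" and "0.00". To decide pass or fail, the two last values are probably better compared as numbers (for example with BigDecimal.compareTo) than as strings, so that 1.0 and 1.00 count as equal.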
