Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

tFileInputExcel ignoring values after 000

Hello all,
I have isolated this case to tFileInputExcel component that just refuses to read values after "000".
For example: the following value reads fine: 9.001
but the following 9.0001 reads in as 9
I have tried quiet a few things to get this to work without any luck.
I'm attaching the screenshots and a link to Talend job and sample excel file.
Feel free to download entire thing from here:
http://db.tt/p7LzArAu
I'm on Ubuntu 10.10 64 Bit.
I'm using TOS-All-r63143-V4.2.2
Labels (2)
14 Replies
nestoru
Contributor
Contributor

I have to assume this is a problem with your Open Office. Real Microsoft Excel generated from Windows does work. I will attach the file but below you can see the results coming from my Windows VM:

tarting job test at 14:36 19/11/2011.
connecting to socket on port 3633
connected
Debug: format string is 0.0000
.-------------------.
| #1. tLogRow_1 |
+----------+--------+
| key | value |
+----------+--------+
| Quantity | 1 |
| Price | 1.0735 |
+----------+--------+
.-------------------.
| #2. tLogRow_1 |
+----------+--------+
| key | value |
+----------+--------+
| Quantity | 2 |
| Price | 9.0001 |
+----------+--------+
.------------------.
| #3. tLogRow_1 |
+----------+-------+
| key | value |
+----------+-------+
| Quantity | 3 |
| Price | 9.001 |
+----------+-------+
.-------------------.
| #4. tLogRow_1 |
+----------+--------+
| key | value |
+----------+--------+
| Quantity | 4 |
| Price | 1.0001 |
+----------+--------+
disconnected
Job test ended at 14:36 19/11/2011.

Best,
-Nestor
nestoru
Contributor
Contributor

See the file that works for me here: http://db.tt/3vTiWBrt
nestoru
Contributor
Contributor

And just to eliminate any doubts below is the result of running the Talend exported job in OSX (which is BSD alike so it should worn in Ubuntu as well)

nestor-krfs:test nestor$ ./test_run.sh
Debug: format string is 0.0000
.-------------------.
| #1. tLogRow_1 |
+----------+--------+
| key | value |
+----------+--------+
| Quantity | 1 |
| Price | 1.0735 |
+----------+--------+
.-------------------.
| #2. tLogRow_1 |
+----------+--------+
| key | value |
+----------+--------+
| Quantity | 2 |
| Price | 9.0001 |
+----------+--------+
.------------------.
| #3. tLogRow_1 |
+----------+-------+
| key | value |
+----------+-------+
| Quantity | 3 |
| Price | 9.001 |
+----------+-------+
.-------------------.
| #4. tLogRow_1 |
+----------+--------+
| key | value |
+----------+--------+
| Quantity | 4 |
| Price | 1.0001 |
+----------+--------+
nestoru
Contributor
Contributor

I just saw the excel inside the zip file. The cells are not formatted correctly. They are type General instead of Number. Change it to Number, 4 decimal places and it will work.
Anonymous
Not applicable
Author

=nestor.urquiza@gmail.com]I have to assume this is a problem with your Open Office. Real Microsoft Excel generated from Windows does work. I will attach the file but below you can see the results coming from my Windows VM:

There is nothing wrong wtih OO-generated file.
In my file, cell formating is Number:General. Once, I change it to Number: 4 digit rounding it works. Even with OO.
I need to read the file as is from Talend.
It's a work around, but it is not a very good one.
I need to read vales, how they are stored. Visually they also look proper.
In my Talend job, I just try to read them in as Strings.
nestoru
Contributor
Contributor

I have seen this to fail with several API's and even from SSIS before. That is why I went the extra mile debugging this problem and I found out there is a bug in the JXL API. I have reported so in the Yahoo group http://tech.groups.yahoo.com/group/JExcelApi/messages. My message is still not showing up ther eprobably because is going through moderation at the moment.
Definitely it would be great if Talend/JXL can accept the General format as text without losing any part of the content of the cell. Talend uses the JXL library. You realize that if you switch to debug mode, put a break point on the component and examine the code there.
Can you try converting with POI directly? Look into here for a quick try https://github.com/anjlab/excel2csv-java. If the resulting CSV does come correct then you can convert to CSV before feeding the file into Talend. That way you will have a temporary solution while you wait for the bug to be fixed. Below is an approximate copy of what I posted to JXL Yahoo user group:
There is a problem with precision in the jxl api. This is a continuation of this post: http://www.talendforge.org/forum/viewtopic.php?pid=69610
1. Compile source code from http://www.java-tips.org/other-api-tips/jexcel/converting-excel-documents-to-csv-files.html for example:
javac -cp jxl-2.6.12.jar ConvertCSV.java
2. Put the attached input.xls in the same directory where the compiled class is and run the program, for example:
java -cp jxl-2.6.12.jar:. ConvertCSV
3. Look at the resulting file input.csv. It should be:
1,1.0735
2,9.0001
3,9.0010
4,1.0001
But instead you get:
1,1.074
2,9
3,9.001
4,1
BTW I tested this with latest version of JXL http://mvnrepository.com/artifact/net.sourceforge.jexcelapi/jxl/2.6.12 and also with the one Talend 4.2.2 ships with (c:\TOS-Win32-r63143-V4.2.2\lib\java\jxl.jar) just in case I had the opportunity to patch Talend but they both failed.
nestoru
Contributor
Contributor

nestoru
Contributor
Contributor

The post has been responded and indeed there is a bug in Talend. Anybody from the Talend team reading this? I think we need to open a bug ticket. Here is the content of my post:
I will get to the Talend team now as clearly there are two things missing in their component:
1. A way to specify precision.
2. A correction so they stop to use getValue() instead of getContents()
In the generated code I can see:
if (true && jxl.CellType.NUMBER == cell_tFileInputExcel_2
.getType()) {
temp_row_tFileInputExcel_2 = String
.valueOf(((jxl.NumberCell) cell_tFileInputExcel_2)
.getValue());
String content = cell_tFileInputExcel_2
.getContents();
if (content != null
&& content.indexOf(separatorChar) == -1
&& (temp_row_tFileInputExcel_2
.indexOf("E") == -1)) {
temp_row_tFileInputExcel_2 = content;
} else {
String literal = temp_row_tFileInputExcel_2;
if (literal != null) {
literal = df
.format(((jxl.NumberCell) cell_tFileInputExcel_2)
.getValue());
}
temp_row_tFileInputExcel_2 = literal;
}
} else {
temp_row_tFileInputExcel_2 = cell_tFileInputExcel_2
.getContents();
}
So clearly "content" is picking a dirty value there and that is propagated later on in the code resulting in wrong parsing.
nestoru
Contributor
Contributor

I meant to say "they start to use"