
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How to read and convert correctly a column in an XLSX file into a CVS file
I'm struggling with an Excel file (.xlsx) filled manually by people : it contains a column wich should be a date format, but sometimes I have some alphabetical characters in it.
So I would like to extract this column saying it is a string column and put it in a delimited file (CSV) in 2 colimns : one which would have the same value as the one in the Excel and which type is String, and another one which would contains the date as a Date.
The original Excel file is like :
The problem is that when I'm looking at the log, I can see that the values displayed are not correct!
And when looking in the CSV file, there are modification of my values!
The result shows that the initial values read in the Excel file are not the one displayed in Excel. And I have alteration in the calculated Date_neo !
The function I used to calculate the fiel Date_neo is :
public static Date ValidDate(String pattern, String dateToValidate) {
SimpleDateFormat formatter = new SimpleDateFormat(pattern);
//To make strict date format validation
formatter.setLenient(false);
Date parsedDate = null;
try {
parsedDate = formatter.parse(dateToValidate);
System.out.println("La chaîne de caractères [" + dateToValidate + "] est une date : " + formatter.format(parsedDate));
}
catch (ParseException e) {
//Handle exception
System.out.println("La chaîne de caractères [" + dateToValidate + "] n'est pas une date : " + parsedDate);
return null;
}
return parsedDate;
}
Could you help me ?
Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Yes, but it doesn't change anything.
So I leave it unchecked.
I think I found something : I created a MetaData in the Excel file section with the file I use.
As you can see: the dates (if they are correct) are correctly read.
Then, I created 2 global routines :
public static Date ValidDate(String pattern, String dateToValidate) {
SimpleDateFormat formatter = new SimpleDateFormat(pattern);
//To make strict date format validation
formatter.setLenient(false);
Date parsedDate = null;
try {
parsedDate = formatter.parse(dateToValidate);
System.out.println("La chaîne de caractères [" + dateToValidate + "] est une date : " + formatter.format(parsedDate));
}
catch (ParseException e) {
//Handle exception
System.out.println("La chaîne de caractères [" + dateToValidate + "] n'est pas une date : " + parsedDate);
return null;
}
return parsedDate;
}
public static String StringOutputDate(String StringInputDate, String FormatDateInput, String FormatDateOut) {
DateFormat dateFormat = new SimpleDateFormat(FormatDateOut);
Date date = null;
String returnValue = "";
try {
date = new SimpleDateFormat(FormatDateInput, Locale.ENGLISH).parse(StringInputDate);
returnValue = dateFormat.format(date);
}
catch (ParseException e) {
returnValue = StringInputDate;
}
return returnValue;
}
In the tMap composant,
with :
Var.pattern
"dd/MM/yyyy"
Var.date_string_format_souhaite
!Relational.ISNULL(row13.Date) && !"".equals(row13.Date) ?
StringToDate.StringOutputDate(row13.Date, "EEE MMM dd HH:mm:ss zzz yyyy", Var.pattern)
:null
Var.Date_est_valide
!Relational.ISNULL(Var.date_string_format_souhaite) ?
StringToDate.DateIsValid(Var.pattern, Var.date_string_format_souhaite)
:false
DateFormatSouhaite_String
Var.date_string_format_souhaite
Date_neo
Var.Date_est_valide?
TalendDate.parseDate(Var.pattern, Var.date_string_format_souhaite)
:null
It works !

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Catherine Thomas can you upload this example file for testing?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello Shong : I can't. I've got this message :
Your company doesn't support the following file types: .xlsx
That's why I made some screen shots...

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, have you try to use convert date column to string option in the advanced settings of th tFileInputExcel component ?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Catherine Thomas , the community support file types .xlsx at attachment. If you can't upload it, is it possible to email it to me shong@talend.com
I want to see what's the real value in Excel file. the fist value looks like 01/02/1936, but it is read as 2/1/36.
In excel file, set the type of this column as text and read it as a string in Job.
Regards
Shong

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Yes, but it doesn't change anything.
So I leave it unchecked.
I think I found something : I created a MetaData in the Excel file section with the file I use.
As you can see: the dates (if they are correct) are correctly read.
Then, I created 2 global routines :
public static Date ValidDate(String pattern, String dateToValidate) {
SimpleDateFormat formatter = new SimpleDateFormat(pattern);
//To make strict date format validation
formatter.setLenient(false);
Date parsedDate = null;
try {
parsedDate = formatter.parse(dateToValidate);
System.out.println("La chaîne de caractères [" + dateToValidate + "] est une date : " + formatter.format(parsedDate));
}
catch (ParseException e) {
//Handle exception
System.out.println("La chaîne de caractères [" + dateToValidate + "] n'est pas une date : " + parsedDate);
return null;
}
return parsedDate;
}
public static String StringOutputDate(String StringInputDate, String FormatDateInput, String FormatDateOut) {
DateFormat dateFormat = new SimpleDateFormat(FormatDateOut);
Date date = null;
String returnValue = "";
try {
date = new SimpleDateFormat(FormatDateInput, Locale.ENGLISH).parse(StringInputDate);
returnValue = dateFormat.format(date);
}
catch (ParseException e) {
returnValue = StringInputDate;
}
return returnValue;
}
In the tMap composant,
with :
Var.pattern
"dd/MM/yyyy"
Var.date_string_format_souhaite
!Relational.ISNULL(row13.Date) && !"".equals(row13.Date) ?
StringToDate.StringOutputDate(row13.Date, "EEE MMM dd HH:mm:ss zzz yyyy", Var.pattern)
:null
Var.Date_est_valide
!Relational.ISNULL(Var.date_string_format_souhaite) ?
StringToDate.DateIsValid(Var.pattern, Var.date_string_format_souhaite)
:false
DateFormatSouhaite_String
Var.date_string_format_souhaite
Date_neo
Var.Date_est_valide?
TalendDate.parseDate(Var.pattern, Var.date_string_format_souhaite)
:null
It works !

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello Shong. Do you still need my Excel file or my below answer is sufficient ?
Thank you for you concern.
