Skip to main content
Announcements
A fresh, new look for the Data Integration & Quality forums and navigation! Read more about what's changed.
cancel
Showing results for 
Search instead for 
Did you mean: 
CatherineT
Contributor II

How to read and convert correctly a column in an XLSX file into a CVS file

I'm struggling with an Excel file (.xlsx) filled manually by people : it contains a column wich should be a date format, but sometimes I have some alphabetical characters in it.

So I would like to extract this column saying it is a string column and put it in a delimited file (CSV) in 2 colimns : one which would have the same value as the one in the Excel and which type is String, and another one which would contains the date as a Date.

The original Excel file is like :

0695b00000QEB6qAAH.png

0695b00000QEB76AAH.png

The problem is that when I'm looking at the log, I can see that the values displayed are not correct!

0695b00000QEB7oAAH.png

And when looking in the CSV file, there are modification of my values!

0695b00000QEB8cAAH.png

0695b00000QEBAJAA5.png

The result shows that the initial values read in the Excel file are not the one displayed in Excel. And I have alteration in the calculated Date_neo !

0695b00000QEBAnAAP.png

The function I used to calculate the fiel Date_neo is :

public static Date ValidDate(String pattern, String dateToValidate) {

  SimpleDateFormat formatter = new SimpleDateFormat(pattern);

   

  //To make strict date format validation

  formatter.setLenient(false);

  

  Date parsedDate = null;

   

  try {

    parsedDate = formatter.parse(dateToValidate);

    System.out.println("La chaîne de caractères [" + dateToValidate + "] est une date : " + formatter.format(parsedDate));

  } 

   

  catch (ParseException e) {

    //Handle exception

   System.out.println("La chaîne de caractères [" + dateToValidate + "] n'est pas une date : " + parsedDate);

   return null;

  }

   

  return parsedDate;

}

Could you help me ?

Labels (1)
  • Cloud

1 Solution

Accepted Solutions
CatherineT
Contributor II
Author

Yes, but it doesn't change anything.

So I leave it unchecked.

 

I think I found something : I created a MetaData in the Excel file section with the file I use.

0695b00000QEJqSAAX.pngAs you can see: the dates (if they are correct) are correctly read.

 

Then, I created 2 global routines :

public static Date ValidDate(String pattern, String dateToValidate) {

 

  SimpleDateFormat formatter = new SimpleDateFormat(pattern);

   

  //To make strict date format validation

  formatter.setLenient(false);

  

  Date parsedDate = null;

   

  try {

    parsedDate = formatter.parse(dateToValidate);

    System.out.println("La chaîne de caractères [" + dateToValidate + "] est une date : " + formatter.format(parsedDate));

  } 

   

  catch (ParseException e) {

    //Handle exception

   System.out.println("La chaîne de caractères [" + dateToValidate + "] n'est pas une date : " + parsedDate);

   return null;

  }

   

  return parsedDate;

}

 

public static String StringOutputDate(String StringInputDate, String FormatDateInput, String FormatDateOut) {

DateFormat dateFormat = new SimpleDateFormat(FormatDateOut);

  Date date = null;

   

  String returnValue = "";

   

  try {

date = new SimpleDateFormat(FormatDateInput, Locale.ENGLISH).parse(StringInputDate);

    returnValue = dateFormat.format(date);

  } 

   

  catch (ParseException e) {

    returnValue = StringInputDate;

  }

   

  return returnValue;

}

 

In the tMap composant,

0695b00000QEJsOAAX.pngwith :

Var.pattern

"dd/MM/yyyy"

 

Var.date_string_format_souhaite

!Relational.ISNULL(row13.Date) && !"".equals(row13.Date) ?

StringToDate.StringOutputDate(row13.Date, "EEE MMM dd HH:mm:ss zzz yyyy", Var.pattern)

:null 

 

Var.Date_est_valide

!Relational.ISNULL(Var.date_string_format_souhaite) ?

StringToDate.DateIsValid(Var.pattern, Var.date_string_format_souhaite) 

:false 

 

DateFormatSouhaite_String

Var.date_string_format_souhaite 

 

Date_neo

Var.Date_est_valide?

TalendDate.parseDate(Var.pattern, Var.date_string_format_souhaite)

:null 

 

It works !

0695b00000QEJsYAAX.png

View solution in original post

6 Replies
Anonymous
Not applicable

@Catherine Thomas​ can you upload this example file for testing?

CatherineT
Contributor II
Author

Hello Shong : I can't. I've got this message :

Your company doesn't support the following file types: .xlsx

 

That's why I made some screen shots...

gjeremy1617088143

Hi, have you try to use convert date column to string option in the advanced settings of th tFileInputExcel component ?

Anonymous
Not applicable

@Catherine Thomas​ , the community support file types .xlsx at attachment. If you can't upload it, is it possible to email it to me shong@talend.com

I want to see what's the real value in Excel file. the fist value looks like 01/02/1936, but it is read as 2/1/36.

In excel file, set the type of this column as text and read it as a string in Job.

 

Regards

Shong

CatherineT
Contributor II
Author

Yes, but it doesn't change anything.

So I leave it unchecked.

 

I think I found something : I created a MetaData in the Excel file section with the file I use.

0695b00000QEJqSAAX.pngAs you can see: the dates (if they are correct) are correctly read.

 

Then, I created 2 global routines :

public static Date ValidDate(String pattern, String dateToValidate) {

 

  SimpleDateFormat formatter = new SimpleDateFormat(pattern);

   

  //To make strict date format validation

  formatter.setLenient(false);

  

  Date parsedDate = null;

   

  try {

    parsedDate = formatter.parse(dateToValidate);

    System.out.println("La chaîne de caractères [" + dateToValidate + "] est une date : " + formatter.format(parsedDate));

  } 

   

  catch (ParseException e) {

    //Handle exception

   System.out.println("La chaîne de caractères [" + dateToValidate + "] n'est pas une date : " + parsedDate);

   return null;

  }

   

  return parsedDate;

}

 

public static String StringOutputDate(String StringInputDate, String FormatDateInput, String FormatDateOut) {

DateFormat dateFormat = new SimpleDateFormat(FormatDateOut);

  Date date = null;

   

  String returnValue = "";

   

  try {

date = new SimpleDateFormat(FormatDateInput, Locale.ENGLISH).parse(StringInputDate);

    returnValue = dateFormat.format(date);

  } 

   

  catch (ParseException e) {

    returnValue = StringInputDate;

  }

   

  return returnValue;

}

 

In the tMap composant,

0695b00000QEJsOAAX.pngwith :

Var.pattern

"dd/MM/yyyy"

 

Var.date_string_format_souhaite

!Relational.ISNULL(row13.Date) && !"".equals(row13.Date) ?

StringToDate.StringOutputDate(row13.Date, "EEE MMM dd HH:mm:ss zzz yyyy", Var.pattern)

:null 

 

Var.Date_est_valide

!Relational.ISNULL(Var.date_string_format_souhaite) ?

StringToDate.DateIsValid(Var.pattern, Var.date_string_format_souhaite) 

:false 

 

DateFormatSouhaite_String

Var.date_string_format_souhaite 

 

Date_neo

Var.Date_est_valide?

TalendDate.parseDate(Var.pattern, Var.date_string_format_souhaite)

:null 

 

It works !

0695b00000QEJsYAAX.png

CatherineT
Contributor II
Author

Hello Shong. Do you still need my Excel file or my below answer is sufficient ?

Thank you for you concern.