Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Load Multiple CSV files with different Structure

Hi all,
We have a project requirement where we are required to load 100+ CSV files into the oracle database. All 100 CSV files have different structure. Before loading the files into the database we need to perform basic validations like, 
- for Date column check if date is in correct format
- trim the white spaces for strings
- check if numeric data types columns actually consists of numbers
Now I have 2 queries:
- I am thinking of storing the name and schema of the CSV files in database. Read that meta data information of the file and then depending upon the meta data, load the data into the oracle tables. This way tomorrow is 100+ new CSV files are added for loading, I do not have to create a new job everytime.
Please suggest how it can be achieved. 
Labels (4)
28 Replies
Anonymous
Not applicable
Author

Hi,
Have you already checke talend dynamic schema feature which  allows you to design schema with an unknown column structure (unknown name and number of columns).
Here is the article about:TalendHelpCenter:How to process changing data structure?
Best regards
Sabrina
Anonymous
Not applicable
Author

I am aware of it but this does not allow me to perform any data quality checks on the Data. This will be a simple load from file to the database or vice versa.
Thanks,
Sankalp
Anonymous
Not applicable
Author

Hello,

         How can I import data from excel or csv file and dump data in two tables  State Name, State Code, City Name, City Code in State_City table and Sr No, Name, Address, Department, Designation, Date of Joining in Emp table? Here is the attached file.


ExcelFile.xlsx
Anonymous
Not applicable
Author

Hi Dhara,

  

        You will require two logical operations in this context.

 

      After reading the file (while reading, specify limit value as 5 so that you will read only header part), you need to push the data to a tjavarow or tmap. Since State Name is in column A and value is in column B, you can do a comparison operation. ie, if input_row.columnA is equal to "State Name:" then output_row.statname=input_row.columnB. Similar operation can help you to parse other header details.

 

     For reading the detailed content, read the data using texcelinput component again but this time specify the header as 5. In this way, you will be skipping the header components. 

 

     Hope this can resolve your issue. If it is working, please mark as accepted solution so that it will help Talend community.

 

Warm regards,

Nikhil Thampi

    

Anonymous
Not applicable
Author

How can I write if else condition in tJavaRow component?

if (input_row.A=='State Name:')
{
output_row.statename = input_row.B;
System.out.println(output_row.statename);
}
else if (input_row.A=='City Name:')
{
output_row.cityname = input_row.B;
System.out.println(output_row.cityname);
}

I wrote this code but getting error 'Invalid Character constant'.

See the screenshot

0683p000009LzKH.png

Anonymous
Not applicable
Author

Hi,

 

You have assigned the value as constant and that is the reason for error. You will have to try like below in tjavarow.

 

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}

 

If you think the suggestion has worked, please mark the topic as acceptable solution.

 

Warm Regards,

 

Nikhil Thampi

Anonymous
Not applicable
Author

I am getting duplicate rows when I dump data into Oracle table and I am not getting state code value and city code value.

This is my code.

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("State Code:"))
{
output_row.statecode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("City Code:"))
{
output_row.citycode = input_row.D;
}

Anonymous
Not applicable
Author

I am getting duplicate rows when I dump data into Oracle table and I am not getting state code value and city code value.

This is my code.

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("State Code:"))
{
output_row.statecode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("City Code:"))
{
output_row.citycode = input_row.D;
}

Can you help me to resolve this?

 

 

 


0683p000009LzLF.png

Output:

StateName  CityName StateCode CityCode

Gujarat        null             123           null

Gujarat        abc            123           234

Gujarat        abc            123           234

 

Thanks,

Anonymous
Not applicable
Author

Hi Dhara3010,

 

        You will have to denormalize the dataset to get a single row of data. I have used the same excel sheet provided by you in your previous chat and I got the result set in single line. I have attached the sample job for your reference as zip file. Please make necessary changes according to your need.

 

         0683p000009LzHs.png

And the result I got is as shown below (you may have to reformat the date column according to your choice).

 

 0683p000009LzLP.png

The code used in tjavarow is as shown below.

 

//Code generated according to input schema and output schema
output_row.merge_data="Yes";
output_row.statename = "";
output_row.cityname = "";
output_row.statecode = "";
output_row.citycode = "";
output_row.fromdate = "";
output_row.todate = "";

 

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("From Date:"))
{
output_row.fromdate = input_row.B;
}

if (!Relational.ISNULL(input_row.C) && input_row.C.equals("State Code:"))
{
output_row.statecode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("City Code:"))
{
output_row.citycode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("To Date:"))
{
output_row.todate = input_row.D;
}

 

 

Warm Regards,

 

Nikhil Thampi

 


test2.zip