I have a couple of CSV files that I load into Data Prep all at once (I only specify a directory in "Add Dataset", not individual files). So far, so good.
All files have the same structure, and the first line of each file is the header.
Is there a way to globally set the first row as header for all files? I know there is this "Row" -> "Make as header..." feature, but what happens in my case is:
file1.csv:
Firstname;Lastname;Age
Felix;Kjellberg;23
Julian;Ilett;43
file2.csv:
Firstname;Lastname;Age
Ben;Heck;58
Dave;Jones;48
The result in Data Prep is:
Firstname | Lastname  | Age
Ben       | Heck      | 58
Dave      | Jones     | 48
Firstname | Lastname  | Age
Felix     | Kjellberg | 23
Julian    | Ilett     | 43
So even if I set the first header line (blue) as the header, the second header line (green) stays in the data as a regular row. Is there a way to avoid this?
Hi,
Out of curiosity, can you confirm the following?
To answer your question: there is no dedicated data set parameter or function to remove subsequent occurrences of the header, but you can do it in a single preparation step. Set a filter on the first column with the column header as the filter value (so filter on "Firstname" in your example), then use the function "delete filtered rows".
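For illustration only, the logic of that filter-and-delete step can be sketched outside Data Prep in plain Python. This is not Data Prep's API: the helper names `read_rows` and `drop_repeated_headers` are made up for this sketch, and the sample data is simply the two files from the question, concatenated in the order shown in the result.

```python
import csv
import io


def read_rows(text, delimiter=";"):
    """Parse CSV text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def drop_repeated_headers(rows):
    """Keep the first row as the header and drop later rows that repeat it.

    Mirrors the filter described above: a row is dropped when its first
    column equals the header name of the first column ("Firstname" here).
    """
    if not rows:
        return []
    header = rows[0]
    body = [row for row in rows[1:] if row[:1] != header[:1]]
    return [header] + body


# Sample data taken from the question.
file1 = "Firstname;Lastname;Age\nFelix;Kjellberg;23\nJulian;Ilett;43\n"
file2 = "Firstname;Lastname;Age\nBen;Heck;58\nDave;Jones;48\n"

# Loading a whole directory concatenates the files, headers included.
combined = read_rows(file2) + read_rows(file1)
cleaned = drop_repeated_headers(combined)

for row in cleaned:
    print(";".join(row))
```

Note that this approach, like the filter in Data Prep, would also remove a genuine data row whose first column happens to contain the text "Firstname".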
Regards,
Gwendal