Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
I have a CSV file which contains data in columns which start with a single quote. Looks like this:
ID Description Type Name
1 '17 name Main Name1
2 '17 name Main Name2
3 '17 name Main Name3
4 '17 name Main Name4
For some reason because of the quote it creates 1 field and my Description field looks like: '17 name;Main;Name1. My other fields move a column so the type field suddenly has the Name value in it.
I now it is because of the quotes because when I removed them from the data everything went fine. I tried to remove the quote with the Purgechar or replace and tried to tweak the CSV load with the msq or no quote value. Unfortunatly nothing seems to help.
Hope you guys have an idea how to fix this. Unfortunatly because of confidential data I can't attach the example
Regards, Paul
Please provide a csv with a few example-records - both correct ones and the faulty ones and also an example how the faulty records should look like if they are corrected.
Beside them it could be more appropriate to correct these issues within the source respectively to adjust the export into the csv-files. Another approach might be to flag those invalid records and to ignore them per filter or selections within the further evaluations (not each data-quality issue could be solved respectively is it worth to build heavy workarounds and/or to adjust them manually).
- Marcus
Hi Paul,
Try this
replace(Description, chr(39), '') as Description
It is working fine for me.
Hi Sravanthi,
I tried this as well, it does replace the character for me but I still get the same outcome. I still get the field with multiple columns in it.
Regards, Paul
From here what was the expected result?
This is the result I got:
17 Odh buiten groenvoorziening;1160 OH van geb en t;1;Artikel;406071;Flora 17
Every ; is a different column, the ' in front off 17 has indeed been removed. But still it creates 1 field from multiple columns. The above should be 6 columns. This goes ok for all other fields except the one that start with a single quote
Regards Paul
You could split it with subfield() like:
subfield(FIELD, ';', 1) as F1,
subfield(FIELD, ';', 2) as F2,
....
- Marcus
Still doubt yar
May be SubField() function can help you to split 6 columns. But i am not sure how you are interpreting the values
Hi,
Thing is that for the other 50000 records I already have the different columns. so only for about 100 records it does the above because there are quotes in there.
Regards, Paul
Do I understand it right that there just a few records which are different? Then I would probably load those data twice and filter those records per: where substringcount(FIELD, ';') = 0 respectively > 0 within the (concatenated) second load and perform there the subfield-stuff.
- Marcus
Can you share the csv, or at least a snippit of it?
I can't replicate this issue which suggests to me you might have another unprintable character in there.
What about trying
keepchar(field,'abcdefghijklmnopqrstuvwxyz0123456789') as field