Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have given up finding an answer that can help me with this problem I have.
I have a .csv file I want to load, it is using a semicolon ';' as a delimiter.
The problem I have is that the semicolon can also be found in some text that is also present in the .csv file.
This is my load:
Load
numbers1,
text, //Here the text with ';' can be found
numbers2
FROM
[C:\TEST\TEST.csv]
(txt, utf8, embedded labels, delimiter is ';', msq);
I am not 100% sure that the semicolon is the only problem when dealing with text strings coming from people manually typing in the text, but I have tried removing, replacing, and purging all the untypical characters I can find, I have even tried only loading a couple of fields, but nothing works.
In similar cases I load such files with the wizard with the option of a fixed length. This means a whole record is loaded within a single field.
Now we are able to count with substringcount() the delimiter-char to differentiate between the records with the expected number of the delimiters and the others. In the wrong records we could check with index() where the delimiters are and which other chars are around to detect the wrong ones which could then removed. Depending on the data-quality it could become a quite ugly task but there is not really an easier way else to go back to repair the source-data.
- Marcus
Thank you for the suggestion @marcus_sommer, and have saved a lot of time just knowing that there is no easy way other than importing the .csv file as fixed length and then work on separating it manually,
I will try and use what you told me and share the result when I have done it, and ofc gives you an acceptance when I have done so.