Hello,
I have data in a text file, encoded in UTF-8 (without BOM), with fixed-length records.
When the file contains a special character, that character is counted as 2 characters, and all the data after it is parsed incorrectly (shifted by one position).
This file:
BRAND           MODEL              DATE    VALUE
Audi            A3                 20140101abcdefgh
Audi            A4                 20140202abcdefgh
Audi            Coupé              20140303abcdefgh
loaded with QlikView:
Data:
LOAD @1:16 AS BRAND,
@17:35 AS MODEL,
@36:43 AS DATE,
@44:n AS VALUE
FROM
test.csv
(fix, utf8,header is 1 lines);
gives me a wrong DATE for the last record: "2014030" instead of "20140303", because the "é" in "Coupé" counts as 2 characters.
And its VALUE is "3abcdefgh" (with a leading "3" that should not be there).
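To illustrate the shift outside QlikView, here is a minimal Python sketch. It assumes the third record is padded to the widths used in the LOAD statement (BRAND 1-16, MODEL 17-35, DATE 36-43, VALUE 44-n), and it only shows that cutting the UTF-8 bytes at those positions reproduces the same symptoms. The `cut` helper is hypothetical, and this is not a statement about how QlikView parses the file internally.

```python
# Rebuild the third record with the field widths from the LOAD statement
# (assumed padding: BRAND 16, MODEL 19, DATE 8, VALUE rest of line).
record = "Audi".ljust(16) + "Coupé".ljust(19) + "20140303" + "abcdefgh"


def cut(data):
    """Return positions 36-43 and 44-n (1-based) of a str or bytes value."""
    return data[35:43], data[43:]


# Counting characters: the fields line up as expected.
date_chars, value_chars = cut(record)

# Counting UTF-8 bytes: "é" occupies two bytes, so every later field
# starts one position too early.
raw = record.encode("utf-8")
date_bytes, value_bytes = (part.decode("utf-8") for part in cut(raw))

print(repr(date_chars), repr(value_chars))  # '20140303' 'abcdefgh'
print(repr(date_bytes), repr(value_bytes))  # ' 2014030' '3abcdefgh'
```

The byte-based cut yields " 2014030" (which becomes "2014030" once the leading space is trimmed) and "3abcdefgh", matching the values I see in QlikView.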
If I convert the same file to ANSI, I don't have the problem.
(Please don't just answer "so, convert the file to ANSI".)