I just found out that it is not the delimiter type that is the issue here. I changed the delimiter in the source file from , to ~ and still have the same problem. The text in the field contains some kind of character that makes QlikView read it as the next field. I think it is a TAB in the text field that causes it to be read as a new field. Any solutions for that?
This could be a carriage return character in one of the fields, making QlikView split the record into two or more lines; however, when it reaches the end of the record it should still see the start of the next record.
Is it just a handful of records being thrown out, or all of them?
Are you able to control the output of the csv file?
You could try quoting the fields (if the source system allows this) so the data in the csv looks like ...
"data1","data2","data3" etc. This may work as long as there are no " characters in the data!
I have a text file feed with this problem on one record, and it cannot be corrected in the source system, so I have to feed my file through an external batch process which replaces the CR character with an empty string and then loads as normal. It's a nuisance, but there's no other way to fix it unless you can correct it at source.
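The same "cleanse before load" step can be sketched in a few lines of Python (an illustration only; the path is hypothetical and the thread's actual solution is the VBScript below). Note that stripping every CR also normalises CRLF line endings to LF, which QlikView reads without complaint.

```python
def strip_cr(path):
    """Remove stray carriage return characters from a feed file, in place.

    Reads the file as raw bytes so no decoding issues can alter the data,
    then writes it back with every CR (0x0D) byte removed.
    """
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data.replace(b"\r", b""))
```

Run this against the csv before the QlikView load, just as the batch/VBScript approach below does.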
This isn't a QlikView issue; it's the data. Anyway, here's the script I am using, which seems to work. Copy and paste it into a .vbs file. It won't run by itself, as it needs parameters (see my notes further on).
Const ForReading = 1
Const ForWriting = 2
sCurPath = CreateObject("Scripting.FileSystemObject").GetAbsolutePathName("..\")
strFileName = sCurPath & Wscript.Arguments(0)
strOldText = Replace(Wscript.Arguments(1),"vbLf",vbLf)
'strOldText = "\" & vbLf
strNewText = Wscript.Arguments(2)
'strNewText = "\"
'msgbox strFileName & " " & strOldText & " " & strNewText
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile(strFileName, ForReading)
strText = objFile.ReadAll
objFile.Close
strNewText = Replace(strText, strOldText, strNewText)
Set objFile = objFSO.OpenTextFile(strFileName, ForWriting)
objFile.Write strNewText
objFile.Close
Then use something like this in your load script ...
// Call the vbscript to remove linefeed characters in the datafeed file
SET vbs = "..\?????.vbs"; // path to the vbs file
SET LoadFile = "..\?????.csv"; //Path to file to cleanse (the filename cannot contain spaces)
SET RemoveString = "\vbLf"; //vbLf will be replaced with correct vbLf character in vbs file
SET NewString = "\";
// Build command line and execute
LET cmdExe = 'cmd.exe /c ' & chr(34) & '$(vbs)' & chr(34) & ' ' & '$(LoadFile)' & ' ' & '$(RemoveString)' & ' ' & '$(NewString)';
EXECUTE $(cmdExe);
I had to pass "vbLf" as a text string in the command, then change it to the actual vbLf character inside the .vbs file.
Unfortunately there isn't much you can do to prevent delimiter characters appearing in a field unless this can be controlled at the point of data entry. What I have set up in some of my QV docs is a field-count check, which flags as suspicious those records that appear to have additional fields (due to extra delimiters).
First, get your top row (preferably the embedded field labels, or at least a record you know is correct) as a single text string and count the number of delimiter characters (pipe characters in this example) ...
substringcount ( @1, '|' ) AS ExpectedFields
(txt, codepage is 1252, no labels, delimiter is \x7f, msq);
Then create a table of suspicious rows ...
RecNo() as SR_MainRowRef,
' Record No: ' & RecNo() & ' has ' & substringcount (@1, '|' ) & ' fields, when ' & FieldValue('ExpectedFields',1) & ' expected.' as DQ_Comment
(txt, codepage is 1252, no labels, delimiter is \x7f, msq, header is 1 lines)
where substringcount (@1, '|' ) <> FieldValue('ExpectedFields',1);
// delimiter here is a DEL character which is very unlikely to appear
Then load your data as normal ensuring you include the following field ...
, RecNo() as MainRowRef
Finally, join the suspicious row data as follows ...
left join (data)
SR_MainRowRef as MainRowRef,
1 as DQ_Flag
... and then drop the SuspectRows table.
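The field-count check above can be sketched outside QlikView too (a Python illustration of the same logic, not a replacement for the load-script version): count the delimiters in the first row, then report any later row whose count differs.

```python
def suspect_rows(lines, delim="|"):
    """Return (record number, delimiter count) for rows whose delimiter
    count differs from the first (known-good) row."""
    expected = lines[0].count(delim)
    return [
        (i, line.count(delim))
        for i, line in enumerate(lines[1:], start=2)
        if line.count(delim) != expected
    ]

# Example: record 3 has an extra pipe, so it gains a phantom field.
lines = ["a|b|c", "1|2|3", "4|5|6|7", "8|9|10"]
print(suspect_rows(lines))  # [(3, 3)]
```

This is handy for eyeballing a problem feed before deciding whether to cleanse it or flag it with the DQ_Flag approach above.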
Hope this helps