Niqander
Contributor III

Load a CSV file with double-double quotes as escape

Hi Guys,

I'm quite experienced with Qlik, but this CSV is killing me. 

When loading this CSV in Excel, it will transform the text to columns without a problem.

When loading it into a Qlik Cloud analytics app, however, Qlik cannot figure out how to get around these two things:

  1. "Each row starts and ends with a double quote"
  2. Each field gets a ""double-double quote"" as an escape (see the made-up sample row below)
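To illustrate, suppose the file had columns id, name, regions and active (made-up names, not the real ones); a data row in this format would then look something like:

"1,""Acme Ltd"",""North, South"",""yes"""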

The CSV will load correctly in Qlik when I manually correct the file like this:

  1. remove the start and end double quote
  2. replace each double-double quote with a single double quote (see the corrected sample row below)
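Applied to the made-up sample row above, those two corrections give ordinary quoted CSV:

1,"Acme Ltd","North, South","yes"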

Now, the manual corrections above are not something I want to do for a daily refreshed CSV.

Can you solve this puzzle for me?

@PDBI 

7 Replies
Niqander
Contributor III
Author

Do I need to create an automation that picks up this file, pulls it through a PowerShell script that preps the CSV for Qlik, and then stores the prepped CSV for loading into Qlik?

Niqander
Contributor III
Author

This is the PowerShell script that will overwrite the CSV with a Qlik-prepped CSV. It basically deletes the first character of every row and replaces "" with ".
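A minimal PowerShell sketch of that kind of preprocessing (hypothetical file paths; it strips the quote wrapping each line and then collapses "" into ", per the manual corrections described in the original post) could look like this:

# Hypothetical paths - point these at the real daily file.
$inFile  = 'C:\Data\daily_export.csv'
$outFile = 'C:\Data\daily_export_prepped.csv'

Get-Content $inFile | ForEach-Object {
    # Drop the quote wrapping the whole line, then turn escaped "" into "
    $line = if ($_.Length -ge 2) { $_.Substring(1, $_.Length - 2) } else { $_ }
    $line.Replace('""', '"')
} | Set-Content $outFile  # check that the output encoding matches what the Qlik load expects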

rwunderlich
Partner Ambassador/MVP

You may find that the cleanest solution is to preprocess with something like an Automation. For what it's worth, here's a pure script solution that works only if there are no double quotes in the actual data, as it just blindly purges all double quotes.

TempRaw:
LOAD
    PurgeChar("@1:n", '"') as Raw
FROM [lib://DataFiles/Double quotes problem - only 2 rows for testing.csv]
(fix, codepage is 28591, no labels, Header is 1 line);
 
Header:
LOAD 
  Concat('@' & RecNo() & ' as [' & Fname & ']', ',', RecNo()) as FieldList
;    
LOAD 
SubField(Header, ',') as Fname
;    
// First 1 doesn't work here in a preceding load
LOAD
    "@1:n" as Header
FROM [lib://DataFiles/Double quotes problem - only 2 rows for testing.csv]
(fix, codepage is 28591, no labels)
Where RecNo() = 1
;
Let vFieldList = Peek('FieldList');
 
Data:
LOAD
   $(vFieldList)
From_Field (TempRaw, Raw)
(txt, codepage is 28591, explicit labels, delimiter is ',');
 
Drop Tables TempRaw, Header;
 
The From_Field load requires a field list like:
@1 as id,
@2 as active,
etc.

The Header Load and vFieldList logic is just my attempt to automate the generation of the fieldlist. You could hard code it in the final load if that appears clearer. 
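For example, if the header row held the made-up columns id, name, regions and active, $(vFieldList) would expand to something like @1 as [id],@2 as [name],@3 as [regions],@4 as [active].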
 
Niqander
Contributor III
Author

Hi Rob, thanks for diving into this CSV! I got your solution working, and I have attached the result.

As you can see, the file loads and the columns get filled, but there is one field, "regions", which has comma-separated values inside it. Those values spill over and mess up the fields after "regions".

rwunderlich
Partner Ambassador/MVP

Hi,

Here's an updated script that correctly handles the comma lists. 

TempRaw:
Load
// Strip the quote wrapping each line, then turn escaped "" into a single "
Replace(Mid(Raw, 2, len(Raw)-2), '""', '"') as Raw
;    
LOAD
    "@1:n" as Raw
FROM [lib://DataFiles/Double quotes problem - only 2 rows for testing.csv]
(fix, codepage is 28591, no labels, Header is 1 line);
 
Header:
LOAD 
Concat('@' & RecNo() & ' as [' & Fname & ']', ',', RecNo()) as FieldList
;    
LOAD 
SubField(Header, ',') as Fname
;    
// First 1 doesn't work here in a preceding load
LOAD
    "@1:n" as Header
FROM [lib://DataFiles/Double quotes problem - only 2 rows for testing.csv]
(fix, codepage is 28591, no labels)
Where RecNo() = 1
;
Let vFieldList = Peek('FieldList');
 
Data:
LOAD
   $(vFieldList)
From_Field (TempRaw, Raw)
(txt, codepage is 28591, explicit labels, delimiter is ',', msq);
 
Drop Tables TempRaw, Header;
 
-Rob
Niqander
Contributor III
Author

Hi Rob, many thanks for changing the code to only strip the first and last " using the Mid function; I could/should have done that myself, sorry for taking up your time on this one. I ran the code against the full dataset, and all data now appears to load into the correct fields.

I have filed a "Qlik Sense | Data Prep & Modeling" Ideation request for this kind of CSV file to be handled by the engine automatically, without needing the script above, just like MS Excel (see attached) does with the text-to-columns function, without any need to alter the text beforehand.

https://ideation.qlik.com/app/#/case/515953?currentProductId=9da99bc1-0dfb-4471-8e12-d23685578c43&cp...

rwunderlich
Partner Ambassador/MVP

Glad it works. I agree it would be good if the loader handled it as-is.

-Rob