Not applicable

Strip commas from fixed width file

Hi

I have a couple of questions. For some background, I load data from a daily data file (the data load is cumulative, i.e. I have to load all files YTD) which is a fixed-width CSV file and can be pretty big (50-60k new records every day). However, one of the fields in the file is an address field, and that field occasionally contains commas, which makes it next to impossible to load the file as CSV. The problem with loading it as fixed width is that QlikView takes a long time to load these files compared to CSV files, and I have also found some bugs whereby the data that QlikView loads gets corrupted (you get a different set of data with every reload, and it takes me a few reloads to get the correct data; I will raise that with QlikView). So my questions are:

  • Can I somehow create a hybrid script such that I can load certain character ranges as CSV and others as fixed width? e.g. treat characters 1 to 1000 as CSV, 1001 to 1300 as fixed width and 1301 to 2000 as CSV again. Failing that, can I strip out the commas from the address field in the script without impacting the rest of the commas?
  • Second question: some of the entries in the fixed-width file are corrupted (usually the supplier uses spaces to pad fields and sometimes there aren't enough spaces in the line). So if I am loading this as a fixed-width file, can I somehow create a field that returns the length of the entire line, so that I can use it as a dimension?

Thanks,

Abhi

3 Replies
Gysbert_Wassenaar

The proper solution is to have the application that creates the CSV file apply proper quoting to the address field. Another solution would be to create the CSV file with a different field separator, perhaps a semicolon (;) or a pipe character (|).
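
If the supplier can switch to, say, a semicolon, the load could then look roughly like this; the file name and format options here are only illustrative:

Data:
LOAD * FROM Data.csv (txt, codepage is 1252, embedded labels, delimiter is ';', msq);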

If you need to fix this in QlikView, then one way would be to load the whole line as one record, replace the spurious commas in the address field, store the result into another text file, and finally load that text file as CSV. You can use the TextBetween and Replace functions to get rid of the commas.

Temp:
Load Replace(TextBetween(@1, ',', ',', 15), ',', '_') as Record
From ....;   // if Address is the 15th field

Store Temp into temp.csv (txt);
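
The final step described above, loading the stored file back in as a comma-separated file, might look roughly like this once temp.csv holds the full cleaned lines. The field names are only illustrative, and the quoting options may need adjusting depending on how Store quotes the Record field:

Data:
LOAD
    @1 as Field1,
    @2 as Field2
    // ...one @n per column of the cleaned file; names are illustrative
FROM temp.csv (txt, codepage is 1252, no labels, delimiter is ',', header is 1 lines);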


talk is cheap, supply exceeds demand
Not applicable
Author

Thanks for the reply, but won't it just handle the first comma instance in the address, whereas I can have more than one comma in the address field?

Gysbert_Wassenaar

It probably won't do anything at all. Perhaps something like this works.

Temp:
// address field sits between the 7th comma from the start and the 6th comma from the end
LOAD
    @1:n,
    left(@1:n, index(@1:n, ',', 7))
      & replace(mid(@1:n, index(@1:n, ',', 7) + 1, index(mid(@1:n, index(@1:n, ',', 7) + 1), ',', -6)), ',', '')
      & right(@1:n, 1 + len(@1:n) - index(@1:n, ',', -6)) as Record
FROM Data.csv (fix, codepage is 1252);

Store Temp into DataNew.csv (txt);
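
For your second question (spotting under-padded lines), a minimal sketch along the same lines: len() on the whole line gives you something you can use as a flag or dimension. Table, field and file names here are only illustrative:

LineCheck:
LOAD
    @1:n as Record,
    len(@1:n) as LineLength   // short values indicate lines that were not padded to the full width
FROM Data.csv (fix, codepage is 1252);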


talk is cheap, supply exceeds demand