3 Replies Latest reply: Mar 31, 2014 11:12 AM by Gysbert Wassenaar RSS

    Strip commas from fixed width file



      I have a couple of questions. For some background, I load data from a daily data file (data load is cumulative i.e. I have to load all files YTD) which is a fixed width csv file and can be pretty big (50-60k new records every day). However, one of the fields in the file is an address field and that field occassionally contains commas, which makes it next to impossible to load the file as CSV. The problem in loading it as fixed width is that Qlikview takes a long time to load these files up compared to csv files and I have also found some bugs whereby the data that Qlikview loads gets corrupted (You get a different set of data with every reload - takes me a few reloads to get the correct data - I will raise that with Qlikview). So my question is:


      • Can I somehow create a hybrid script such that I can load certain character ranges as CSV and some others as fixed with? e.g. Treat character 1 to 1000 as CSV, 1001:1300 as Fixed width and 1301:2000 as CSV again? Failing that, can I strip out the commas from the address field in the script without impacting the rest of the commas?
      • Second question is that some of the entries in the fixed width file are corrupted (usually the supplier uses spaces to pad fields and sometimes there aren't enough spaces in the line). So if I am loading this file as a fixed with file, can I somehow create a dimension that returns the length of the entire line so that I can create that as a dimension?




        • Re: Strip commas from fixed width file
          Gysbert Wassenaar

          The proper solution is to have the application that creates the csv file apply proper quoting on the address field. Another solution would be to create the csv file with another field separator. Perhaps a semicolon ; or a pipe character |.


          If you need to fix this in qlikview then one way would be to load the whole line as one record, then replace the spurious comma's in the address field, then store the results into another text file and finally load that text file as csv. You can use the Textbetween and replace functions to get rid of the comma's.



          Load Replace(TextBetween(@1, ',' , ',' , 15) , ',' , '_') as Record From ....; // if Address is the 15th field


          Store Temp into temp.csv (txt);