9 Replies Latest reply: Nov 30, 2010 5:44 AM by jbeleza RSS

    Help importing ASCII file with returns

      Hi,

      I'm trying to import an ASCII file from an old database, where I have data that should be on the same line separated in two or more lines with returns, somtehing like this:

       

      field1|field2|field3

      right|right|right

      right|

      wrong|wrong

      right|right|right

       

      So when I import the data using "|" as the separator I get some problems, in this case I would have 4 lines instead of the real 3, just because the second line got separet in two lines, with the fields divided.

       

      Can you think of a way to solve the problem?

        • Help importing ASCII file with returns

          Hi,

          Is the extra line feed always after the record separator ? If yes, look for command line tool sed (there is Windows version available) . Replace the regular expression at the end of the line "|CRLF$" with "|" -> that will merge the lines.

          Something like

          sed "/|\n$/|/" file.txt > fixed.txt

          If the extra line feeds can be in random places, you need more sofisticated algorithm. Ex a Pyton / Perl script that reads one line, counts the separator characters. If those are not enough, concatenate with next line.

          -Alex

            • Help importing ASCII file with returns

              Hi Alex,

              Thanks for the reply.

              Actually I was trying to do this when I run the script on import, because it will happen to several files and I didn't want to create a new version of the file. And yes, the extra line feeds can be in random places...

              I already tried to purge and replace the chars on the load script, but I haven't managed to solve the problem. I guess the only way is to read the file creating a "purged" resident file that will later be imported as a new file without the CR and LF, dropping the first.

              If I can't find a similar solution I will have to choose a solution like the one you are proposing, which seems to solve my problem directly on the file.

                • Help importing ASCII file with returns

                  Hi,

                  You can use sed to drop all CR (or CRLF) . Than have a Perl / Python script that reads exactly as many delimiters as there should be fields, insert a CRLF, write to fixed.txt, and repeat until EOF.

                  I am not sure you can do this with QlikView script. It does not handle well output to text files, nor reading N fields instead of complete line.

                  -Alex

                    • Help importing ASCII file with returns

                      Hi,

                      QlikView is very good at reading at line level, but will got confused by line breaks in this case.

                      ----------------------------------------------------------

                      Command line bellow will remove all CR LF characters:

                      tr -s "\r\n" "|" < bad.txt > temp.txt

                      This results in

                      field1|field2|field3|right|right|right|right|wrong|wrong|right|right|right

                      Couldn't find one-liner to replace every 3rd separator with separator and CRLF . So try a Python script that produces 4 column of data

                       



                      FNAME = 'temp.txt'
                      NR_FIELDS = 3

                      with open(FNAME) as f:
                      raw = f.readlines()[0].split('|')
                      res = []
                      for i in xrange(len(raw)):
                      if (i % NR_FIELDS == 0):
                      res.append('\n' + rawIdea)
                      else:
                      res.append(rawIdea)

                      print '|'.join(res)


                      -Alex

                    • Help importing ASCII file with returns
                      Miguel Angel Baeyens de Arce

                      Hello,

                      You may use as well something like the following in your load script

                       

                      LOAD Replace(F1, chr(10), '|') AS F1FROM File;


                      Depending on the charmap you may need to add the chr(13) char too.

                      Hope this helps.

                        • Help importing ASCII file with returns

                          Hi Miguel,

                          I already tried several solutions with purgechar and replace. They don't work because I'm importing the file and when you do that, from what I understood, qv script just reads the separators for the fields, creating the structure of the table, and then it replaces the chars inside the fields created. I also thought on breaking the fields on two temp tables and then joining them, but that isn't either a great solution, because I don't have a 'clean' keyfield to do this (because of the separators and artificial lines).

                          Also removing all CR and LF, like Alex also tried, creates a single structure that is hard to break after, creating all the single lines with the correct structure.

                          I'll keep trying when I have some time available, for now I think I'll just have to change the original files.

                          Thanks!

                          Joaquim