4 Replies Latest reply: Dec 2, 2015 2:36 PM by Stefan Wühl RSS

    Combining Data During Load

    Steve Nase

      Hello,

       

      QlikView Novice here seeking some guidance.  I am currently creating a dashboard that is meant to make it easy for people to find potential duplicate invoices.  Originally, I thought that I would create a field during the data load that just concatenates the fields I need: Vendor Number, and Invoice Amount. Then I could just look for where that new field is the same.  I quickly realized that these fields reside in different tables. Is there an easy way for me concatenate 2 fields from 2 different tables into a new field during load? 

       

      It may also be that my initial approach is flawed and I need to consider another way of finding potential duplicates.  The criteria for a duplicate is same invoice amount and same vendor with a similar invoice number (which I don't think QlikView has a way of finding, so I planned to leave this third criteria to human eyes scanning the dashboard)

       

      Any guidance is appreciated.  I realize that the question is a little bit vague, but I'm pretty stuck on how I should make this. 

       

      Thanks in advance,

      Steve

        • Re: Combining Data During Load
          Stefan Wühl

          As a first step, you could create a scatter chart with amount of invoice and invoice number on the axes. then similar amounts and numbers should be clustered. you can filter your chart by selecting your vendors or use vendor as dimension.

          • Re: Combining Data During Load
            Rob Wunderlich

            As a simple starter approach, you can create a Straight Table with a calculated Dimension of "Vendor & Amount" and an expression like "count(InvoiceNumber)". Those with 2 or more are potential duplicates. You can further refine this with Set Analysis to show only those where Count(InvoiceNumber)>1.

             

            You could do something similar in script. It's hard to give specific script without seeing your data model. Can you post a screenshot of your table viewer?

             

            -Rob

              • Re: Combining Data During Load
                Steve Nase

                I appreciate both your quick replies and will check into what you said.  Rob, below is my data model and the fields I mentioned are highlighted - what's tricky about the invoice number is there will never be an exact duplicate since the system doesn't let invoice numbers be duplicated, but users get around that by changing one character or appending a character on the end.  I don't think that Qlikview has any type of fuzzy searching capability like that, which is why I'm leaving that piece out.

                 

                  Data Model.jpg

                  • Re: Combining Data During Load
                    Stefan Wühl

                    If you want to search the invoices with exact same amount and same Vendor, I think Rob's solution should work perfectly fine to identify the candidates.

                     

                    You would then need to inspect the invoice numbers or other attributes to check if they are real duplicates.

                     

                    I have somehow misread your request in a way that amounts could also slightly differ, hence I suggested above more graphical approach. To get the invoice number on the scatter graph axis, it needs to be a number (or you may use another appropriate attribute, like the invoice timestamp). As said, I think Rob's solution is better if you are searching same amounts.

                     

                    If you are interested in string distance functions to compare the invoice numbers, you can start with

                    String metric - Wikipedia, the free encyclopedia

                     

                    AFAIR, there are some threads coping with the Levenshtein algorithm,

                    Levenshtein distance - Wikipedia, the free encyclopedia

                    e.g. here

                    Levenshtein Distance VBScript

                     

                    If users only append characters, there might be easier solutions (match the smaller string with the beginning of larger string).