11 Replies Latest reply: Jun 14, 2017 10:20 AM by Eric Vilarrasa Canadell RSS

    Peek and Concatenate missunderstanding

    Eric Vilarrasa Canadell

      Hello,  I have a little problem that I think that comes from the misunderstanding of the Concatenate or Peek functions. As far as I know those functions do:

       

          Concactenate (Old_Table): Used when loading a table this adds the rows of the new tabla at de end of (in the last position) the Old_Table to create the New_Table.

       

          Peek(Field): Written this way, this function returns the value of the last row of the new table.

       

      Given This, suppose that I have a Table named "Old Table"  whit data of some product that includes sn (Serial Number) , date (Timestamp), and Color. What I want is to Concatenate Table_With_Duplicates to it so i use a where statement with peeks to check if the last row value is the same as the new one I want to introduce and discard it.

       

      My code:


      Concatenate(Old_Table):
      load *
      Resident Table_Sorted_With_Duplicates wherenot(peek(date)=date and sn=peek(sn) and COLOR=peek(COLOR));
      drop table Table_Sorted_With_Duplicates;

       

      With this code, when on the old table still there but if I create first a table without duplicates and then concatenate it works:

      NoConcatenate

      Table_Clean:

      load *
      Resident Table_Sorted_With_Duplicates wherenot(peek(date)=date and sn=peek(sn) and COLOR=peek(COLOR));
      drop table Table_Sorted_With_Duplicates;


      Concatenate(Old_Table):

      load *
      Resident Table_Clean;

      drop table Table_Clean;


       

      My point is that there is something that I misunderstand abount the Concatenate function on QV and when you use it the new rows are not added at the end of the table where the peek is checking. 

       

      I provide you an example to test it. Also, I have testested with peek( Variable, 1)  in order to know if the values are added on the top an does not have worked.

        • Re: Peek and Concatenate missunderstanding
          Peter Rieper

          Cannot rebuild your problem, your assumptions on concatenate and peek are correct.

          You may share some sample data in order to have a closer look

           

          please find a simplified model attached.

           

          Peter

          • Re: Peek and Concatenate missunderstanding
            Julie Decoene

            Hi Eric,

             

            I would suggest to create a temporary key in your first table as follows: AutoNumberHash256(date, sn, COLOR) as Key_temp. Then for the concatenation use the where clause : Where Not Exists('Key_temp', AutoNumberHash256(date, sn, COLOR)).

             

            Julie

              • Re: Peek and Concatenate missunderstanding
                Eric Vilarrasa Canadell

                I didn't know that way. Does it mean that the AutoNumberHash256() function does not have a value (or it is null) when there is a repetition and that is why you charge just when the field exists?

                 

                Also, I think that the one that just applies on the table you are loading from is the previous and that is why it work. :S

                  • Re: Peek and Concatenate missunderstanding
                    Julie Decoene

                    The AutoNumberHash will calculate an unique key for each combination of the expressions inside it.

                    The Exists will compare the new keys with existing ones. Exists checks the field Key_temp in the whole model.


                    exists(field [ , expression ] )

                    Determines whether a specific field value exists in a specified field of the data loaded so far. Field is a name or a string expression evaluating to a field name. The field must exist in the data loaded so far by the script. Expr is an expression evaluating to the field value to look for in the specified field. If omitted, the current record’s value in the specified field will be assumed.


                    I believe this is a more robust solution than only looking at the previous record, since it will check the whole model for duplicates. If you use Peek or Previous you first need to sort your records and apparently it does not work with concatenate, so that's a lot of manipulation that can be avoided with the Exists() function and a key.

                    • Re: Peek and Concatenate missunderstanding
                      Julie Decoene

                      I don't know if you noticed in your example, but the 2 last lines with D2, S2 from table 2 are not loaded, which means in the case of using Concatenate and Peek together, you are comparing all the records of table 2 with the last one of table 1. So it seems to be a consequence the order in which QV is performing the different operations. Sorry I cannot explain more than that why it's not working as you expected.

                        • Re: Peek and Concatenate missunderstanding
                          Eric Vilarrasa Canadell

                          This is True and I did not have noticed about that. Also I checked that if you use peek( Field, 1 ) taking the first value the ones that are not there are S1, D1 that are the ones in the first place.

                           

                          This makes me think in 2 possibilities:

                           

                              1- The QV is inserting the data on some strange place in the middle (and I would like to find where)

                           

                              2- Maybe, doing the peek on the while statement is looking at the Old_Table, but not the concatenation

                           

                          Anyway, thank you for your comments.

                          • Re: Peek and Concatenate missunderstanding
                            Eric Vilarrasa Canadell

                            Hello, I think I Find out what is really happening.

                             

                            The Way QV does the concatenate is not ROW by ROW unless it must do it this way. If is possible for the QV just adds the whole table with one step and then checks the where condition after it. It is easy to see that if the that this consumes much less memory and is faster than do it row by row.

                             

                            This process is like if it was pasting the memory location of the first row just after the last row of the previous table in order to start with the new one.


                            This does not happen if there is a calculated field on the load, then the QV is forced to create it row by row.

                             

                            Event though I marked this answer as the correct (because is the one that answers the question), If someone reads this searching for a way to eliminate duplicates, please, read the answers of the other people, among them there are pretty good ways to do it.