Hello, I have a little problem that I think that comes from the misunderstanding of the Concatenate or Peek functions. As far as I know those functions do:
Concactenate (Old_Table): Used when loading a table this adds the rows of the new tabla at de end of (in the last position) the Old_Table to create the New_Table.
Peek(Field): Written this way, this function returns the value of the last row of the new table.
Given This, suppose that I have a Table named "Old Table" whit data of some product that includes sn (Serial Number) , date (Timestamp), and Color. What I want is to Concatenate Table_With_Duplicates to it so i use a where statement with peeks to check if the last row value is the same as the new one I want to introduce and discard it.
Concatenate(Old_Table): load * Resident Table_Sorted_With_Duplicates wherenot(peek(date)=date and sn=peek(sn) and COLOR=peek(COLOR)); drop tableTable_Sorted_With_Duplicates;
With this code, when on the old table still there but if I create first a table without duplicates and then concatenate it works:
load * Resident Table_Sorted_With_Duplicates wherenot(peek(date)=date and sn=peek(sn) and COLOR=peek(COLOR)); drop tableTable_Sorted_With_Duplicates;
load * ResidentTable_Clean;
My point is that there is something that I misunderstand abount the Concatenate function on QV and when you use it the new rows are not added at the end of the table where the peek is checking.
I provide you an example to test it. Also, I have testested with peek( Variable, 1) in order to know if the values are added on the top an does not have worked.
Hello, I think I Find out what is really happening.
The Way QV does the concatenate is not ROW by ROW unless it must do it this way. If is possible for the QV just adds the whole table with one step and then checks the where condition after it. It is easy to see that if the that this consumes much less memory and is faster than do it row by row.
This process is like if it was pasting the memory location of the first row just after the last row of the previous table in order to start with the new one.
This does not happen if there is a calculated field on the load, then the QV is forced to create it row by row.
Event though I marked this answer as the correct (because is the one that answers the question), If someone reads this searching for a way to eliminate duplicates, please, read the answers of the other people, among them there are pretty good ways to do it.
Hello prieper I have attached you a QV with a better example. I have attatched on the original post a QV with an example. Because I can not do it here. On the example there is a variable named vLoadType that allow you to try both cases. Also on the chart in the main page you will see that the duplicates still there.
I would suggest to create a temporary key in your first table as follows: AutoNumberHash256(date, sn, COLOR) as Key_temp. Then for the concatenation use the where clause : Where Not Exists('Key_temp', AutoNumberHash256(date, sn, COLOR)).
The AutoNumberHash will calculate an unique key for each combination of the expressions inside it.
The Exists will compare the new keys with existing ones. Exists checks the field Key_temp in the whole model.
exists(field [ , expression ] )
Determines whether a specific field value exists in a specified field of the data loaded so far. Field is a name or a string expression evaluating to a field name. The field must exist in the data loaded so far by the script. Expr is an expression evaluating to the field value to look for in the specified field. If omitted, the current record’s value in the specified field will be assumed.
I believe this is a more robust solution than only looking at the previous record, since it will check the whole model for duplicates. If you use Peek or Previous you first need to sort your records and apparently it does not work with concatenate, so that's a lot of manipulation that can be avoided with the Exists() function and a key.
I don't know if you noticed in your example, but the 2 last lines with D2, S2 from table 2 are not loaded, which means in the case of using Concatenate and Peek together, you are comparing all the records of table 2 with the last one of table 1. So it seems to be a consequence the order in which QV is performing the different operations. Sorry I cannot explain more than that why it's not working as you expected.
I like a lot your solution in order to remove duplicates!
It's true that is really robust and you don't need to load the table two times (which is really good to clean the script). I wonder if it is faster or not than the one with previous(), but right now for me this is not much important.