Not applicable

Updated Rows

Hi,

My database table is updated every day.

The table holds rows of errors that are pulled in for review. We review each row and make changes to the underlying data. Obviously, after a change is made, that row will be gone from the table, since it will no longer be pulled in when the table is refreshed.

Let's say there were 10 rows on the first of May.

I make updates to the database that fix 5 of those rows.

On the same day another error happens, and 3 rows are added to the table.

After the table is refreshed at the end of the day, the new table will have 8 rows, since 5 of the original 10 are gone and 3 were added.

What I want is to be able to see the rows added each day.

I want to be able to see what rows were added and how many, and see the trend.

For example, I want to see the creation date of each error: on May 1st these two rows were added, on May 2nd these five rows were added, and so on...

Please give me advice on how to make this happen.

I think assigning today's date to new rows could be one solution.

Thank you!

1 Solution

Accepted Solutions
johnw
Champion III

Backing up a step, in my opinion a QVD should never be the only place that data exists. It's OK to transform the data, even significantly, but you should always be able to trace QlikView data back to a true database or file. QlikView is a reporting tool, not a system of record.

If I understand you correctly, there IS a database that contains ALL of the data. Rows can be marked as being in error. Rows that are in error can then be marked as fixed. You have something like a view of that database that only includes the rows that are marked as being in error. You intend to make use of that view to create an incremental load.

One possible practical difficulty with your plan is that if a row is marked in error and then fixed before the next load, you'd fail to load it, even though it should be part of your error statistics.

And again, QlikView is not a system of record. The underlying database structure should be tracking these changes. Since you are tracking changes to a database, you really need a history or activity database that tracks those changes. It could timestamp each change, record the general type of change (such as "error" or "fixed"), who made the change, and so on. QlikView should then be reading that history or activity database, NOT trying to sort out the activities on its own.

But you may not be in a position to dictate or even request changes to the underlying database structure. So let's assume the database is what it is, and you need to track the changes to it in QlikView.

In that case, to minimize the practical difficulty with missing errors, you'd want to do your incremental load as frequently as is practical, and it needs to run fast.

But the details? The devil's in the details, and unless I'm at your site looking at your database and gathering requirements from your users, I'm not going to be able to tell you the RIGHT data model for QlikView, or the RIGHT script to handle your incremental load. There are just far too many open questions that can have a dramatic effect on what your data model and script should look like. Technically I could treat you as both a user and a DBA, ask you every question, and then write your script, but I don't have time for that.

But maybe I can answer a simple question with a simple answer. Use the today() function to add today's date to your incremental load. Put it in the LOAD portion since it's not SQL. But I doubt that's what you actually want to do, because if the same row showed up last load as an error, I don't think you want to change the "error date" to today's date. And you probably want to use now() as well, because you should be timestamping changes, not just dating them. Eh, so much for a simple answer.
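As a placement illustration only (the source name ERROR_VIEW is invented here), the stamping described above goes in the preceding LOAD, not in the SQL:

```qlikview
// Illustrative sketch: ERROR_VIEW is a placeholder source name.
Errors:
LOAD *,
     Today() as LoadDate,       // date stamp, added on the QlikView side
     Now()   as LoadTimestamp;  // full timestamp, as suggested above
SQL SELECT * FROM ERROR_VIEW;
```

Note that this stamps every loaded row with the current date, so on its own it would overwrite an earlier "error date" for rows that were already present on a previous load; preserving the first-seen date takes an incremental pattern on top of this.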

It should be simple to find resources for how incremental loads work in general, just not how to apply it to your specific case.

Well, that was a long post of rather limited usefulness. I, uh... have to get back to the job they actually pay me for, but good luck!


5 Replies
johnw
Champion III

As you said, you need an add date or probably timestamp for each row. And you need to NOT remove the rows from the table after they are updated. They can go into a reviewed status, or something along those lines, but not be actually deleted. Otherwise you can't load them in to know how many were created. I'd add a reviewed timestamp as well so that you can track and trend errors being reviewed, not just added.

Your trend of rows added is just using Add Date as the dimension, and count(distinct [Error ID]) as your expression. Trend of reviewed is Review Date as the dimension, count(distinct [Error ID]) as your expression.

If you want to be able to see both on the same table, you'll want a slightly different data model. Rather than or in addition to a table of errors, you'd want a table that represents error activities. We've identified at least two - add and review. So now we have a table that has the Error ID and an Activity Type of 'Add' and 'Review' as a key, and it just has a single Date field on it. Now you could do a trend with Date and Activity Type as dimensions, and again, count(distinct [Error ID]) as the expression.
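A minimal sketch of that activity-shaped model, assuming an already-loaded Errors table with fields [Error ID], [Add Date], and [Review Date] (field names invented here):

```qlikview
// Assumed fields: [Error ID], [Add Date], [Review Date] in a loaded Errors table.
ErrorActivity:
LOAD [Error ID],
     'Add' as [Activity Type],
     [Add Date] as Date
RESIDENT Errors;

Concatenate (ErrorActivity)
LOAD [Error ID],
     'Review' as [Activity Type],
     [Review Date] as Date
RESIDENT Errors
WHERE Len(Trim([Review Date])) > 0;  // only errors that have actually been reviewed
```

A chart with Date and [Activity Type] as dimensions and count(distinct [Error ID]) as the expression then shows both trends side by side.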

Those are my initial thoughts on the subject, anyway.

Not applicable
Author

Hi John,

I have a following question.

The table I am pulling data from is updated daily. A job runs against the entire database, filters out the errors, and stores the errored rows in one table; I pull data from that table and display it in QV. Once an error is fixed, the table no longer contains the fixed rows, so they are gone.

So I decided to make use of a QVD file and an incremental load. That way only the new rows are inserted into the QVD, while rows that are removed from the table are not removed from the QVD.

My next challenge is assigning each new row a date.

Each row already has dates of other types, and I am trying to add to each row the date on which it became an error. I hope I am being clear here. Anyway, I don't care about the rows that are already in the table, but I want the new rows coming in starting today to carry today's date, so that I can eventually see the trend of new rows added each day.

LOAD *;
SQL SELECT *
FROM ~~~;

This is how my QVD load script looks right now. Can I add some code to it so that it assigns the incrementally added rows today's date?
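One common insert-only incremental pattern that would fit this skeleton, as a sketch only; Errors.qvd and the key field ErrorID are assumed names:

```qlikview
// New rows from the source get stamped with today's date; previously seen
// rows keep the date from the run that first loaded them.
Errors:
LOAD *,
     Today() as [Added Date];
SQL SELECT *
FROM ~~~;

// Bring back earlier rows from the QVD, skipping keys already loaded,
// so their original [Added Date] survives. ErrorID is an assumed key.
Concatenate (Errors)
LOAD *
FROM Errors.qvd (qvd)
WHERE NOT Exists(ErrorID);

STORE Errors INTO Errors.qvd (qvd);
```

On the very first run the QVD won't exist yet, so the concatenate load would need to be guarded (for example by checking FileSize('Errors.qvd') first).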

Thank you!


Not applicable
Author

Thank you so much, John, for sharing your valuable insight. I will definitely keep your advice in mind.

Again, thank you so much!

apierensv
Contributor III

Hello David,

This problem is frequently encountered by QlikView developers. I solved it, but it is very tricky, and as John said, you should do it in your database before the data comes into QlikView.

To detect the changes and build a validity date interval (two date fields: begin_date and end_date; the BI term for this is a "slowly changing dimension"), you have 5 cases. The first 3 cases are easy to understand; the last 2 are less easy and only useful when you have rows arriving, disappearing, and coming back.

You will also have to store the generation date in a separate file (QVD or CSV), because you will need a loop, and if you store the date in the application and QlikView crashes in the middle of the loop, the last version of your variable won't be saved.
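A sketch of that external bookkeeping, with GenDate.qvd as an invented filename:

```qlikview
// At the end of a successful generation, persist the generation date to disk:
GenDate:
LOAD Now() as LastGenDate
AutoGenerate 1;
STORE GenDate INTO GenDate.qvd (qvd);
DROP TABLE GenDate;

// At the start of the next run, read it back into a variable:
GenDate:
LOAD LastGenDate FROM GenDate.qvd (qvd);
LET vLastGenDate = Peek('LastGenDate');
DROP TABLE GenDate;
```

Because the date lives in a file rather than in a document variable, a crash in the middle of the loop does not lose it.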

For the first generation (at initialization) you assign the date (or timestamp) and set the end date to '31/12/9999'. Then for the next generations you have five cases:

1) The rows of the current generation are completely new

2) The opposite of case 3: rows of the past generation don't exist in the current generation

3) The rows of the previous generation exist in the current generation

4) The rows before the previous generation exist in the current but not in the previous generation

5) The rows of the current generation exist in the generation before the previous but not in the previous generation

To detect differences I used AutoNumberHash256 on all the fields (so you don't need to care about a primary key). This works if you don't have duplicates in your source data; if you do have duplicates, you should add a count() field to de-duplicate, and in your final expressions in the interface you will have to multiply all the numbers by this count() field.
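For example, detecting case 1 (completely new rows) with those fingerprints could look like this sketch, assuming tables DATA_OLD and DATA_NEW with fields field1 and field2 are already loaded:

```qlikview
// Fingerprints of the previous generation, under a distinct field name so
// that Exists() can compare across generations:
OLD_HASH:
LOAD AutoNumberHash256(field1, field2) as OldHash
RESIDENT DATA_OLD;

// Case 1: current rows whose fingerprint never occurs in the old generation.
// AutoNumberHash256 returns the same number for the same value combination
// within one reload, so the two calls line up.
NEW_ROWS:
NoConcatenate
LOAD field1,
     field2
RESIDENT DATA_NEW
WHERE NOT Exists(OldHash, AutoNumberHash256(field1, field2));

DROP TABLE OLD_HASH;
```

NoConcatenate is needed because NEW_ROWS has the same field names as DATA_NEW and would otherwise be auto-concatenated onto it.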

So, imagine you have these two data sets; the last generation date is '20/04/2016' and the new generation date is '21/04/2016':

DATA_OLD:
LOAD field1,
     field2,
     Date(begin_date, 'DD/MM/YYYY') as begin_date,
     Date(end_date, 'DD/MM/YYYY') as end_date
INLINE [
field1,field2,begin_date,end_date
1,1,01/04/2016,31/12/9999
2,8,01/04/2016,31/12/9999
3,4,01/04/2016,15/04/2016
];

DATA_NEW:
LOAD field1,
     field2
INLINE [
field1,field2
1,1
2,5
3,4
];

The algorithm in the attached file will give you this final table, so you can check (yes, it really works!):

field1  field2  begin_date   end_date
1       1       01/04/2016   31/12/9999
2       8       01/04/2016   20/04/2016
2       5       21/04/2016   31/12/9999
3       4       01/04/2016   15/04/2016
3       4       21/04/2016   31/12/9999
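Since the attached file isn't reproduced here, the following sketch shows one way the interval update could be scripted for this example. Comparing only against the still-open old rows happens to handle the comeback row (3,4) in this data, but the general cases 4 and 5 may need more bookkeeping:

```qlikview
LET vGenDate = '21/04/2016';  // the new generation date from the example

// Fingerprint the still-open rows of the old generation:
OPEN_OLD:
LOAD AutoNumberHash256(field1, field2) as OpenOldHash
RESIDENT DATA_OLD
WHERE end_date = Date#('31/12/9999', 'DD/MM/YYYY');

// Fingerprint the new generation:
NEW_HASH:
LOAD AutoNumberHash256(field1, field2) as NewHash
RESIDENT DATA_NEW;

// Keep the old rows, closing open intervals that vanished (case 2):
RESULT:
NoConcatenate
LOAD field1,
     field2,
     begin_date,
     If(end_date = Date#('31/12/9999', 'DD/MM/YYYY')
        and not Exists(NewHash, AutoNumberHash256(field1, field2)),
        Date(Date#('$(vGenDate)', 'DD/MM/YYYY') - 1, 'DD/MM/YYYY'),
        end_date) as end_date
RESIDENT DATA_OLD;

// Open a new interval for rows absent from the open part of the old
// generation (case 1, plus comebacks such as (3,4) here):
Concatenate (RESULT)
LOAD field1,
     field2,
     Date(Date#('$(vGenDate)', 'DD/MM/YYYY'), 'DD/MM/YYYY') as begin_date,
     Date(Date#('31/12/9999', 'DD/MM/YYYY'), 'DD/MM/YYYY') as end_date
RESIDENT DATA_NEW
WHERE not Exists(OpenOldHash, AutoNumberHash256(field1, field2));

DROP TABLES OPEN_OLD, NEW_HASH;
```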

Maybe one day QlikView will be able to build a slowly changing dimension natively, but, I repeat, you should do it in a database.