Hello! Hope someone could help me with this problem.
I have a large dataset (~2mil records) that is updated periodically, and I need to keep track of the changes in this set. Whenever there's a new update, I'll receive the new data dump with ALL records, not just the updated ones. It looks like this:
Update
Period
Item
Value
Update 1
2020-01
A
1000
2020-01
B
500
2020-01
C
4000
Update 2
2020-02
A
1000
2020-02
B
1000
2020-02
C
2000
Update 3
2020-03
A
1000
2020-03
B
1000
2020-03
C
3000
Update 4
2020-04
A
1000
2020-04
B
2500
2020-04
C
3000
2020-04
D
500
My question is, what is the best way to model the data to keep track of Value changes?
If I keep all historical records like the above example, I'll end up with ~8 millions records after just 4 updates.
But if I keep only the changes, or the latest record + changes if any, I cannot visualize the changes in a pivot table (the final requirement). Before() doesn't work, since the data is missing:
Looking forward to your ideas! If you need any clarifications, please feel free to ask.