What's the correct way to apply an upsert against a fact table when using Kudu as the storage engine?
Certain fact tables, e.g. accumulating snapshots, require existing facts to be updated and new facts to be inserted. I've been trying to implement this simple update strategy in Talend, but I'm struggling with the update flow. Here's how it currently works:
1. Select new fact records as input.
2. Lookup against existing fact table (using tMap).
3. Match both streams as inner join (using business key).
4. Send inner join rejects = true to Insert output flow (new records).
5. Send inner join rejects = false to Update output flow (existing records).
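The routing in steps 1 to 5 can be sketched outside Talend as follows. This is a plain-Python illustration of the split logic only, not tMap configuration; the business key name `order_id` and the sample records are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, object]


def split_by_business_key(
    incoming: Iterable[Record],
    existing_keys: Set[object],
    key: str = "order_id",  # hypothetical business key column
) -> Tuple[List[Record], List[Record]]:
    """Route incoming facts to an insert list or an update list.

    A record whose business key is already in the fact table plays the
    role of an inner join match (update flow); any other record plays
    the role of an inner join reject (insert flow).
    """
    inserts: List[Record] = []
    updates: List[Record] = []
    for rec in incoming:
        if rec[key] in existing_keys:
            updates.append(rec)  # inner join rejects = false
        else:
            inserts.append(rec)  # inner join rejects = true
    return inserts, updates


# Hypothetical sample data: keys 1 and 2 already exist in the fact table.
existing = {1, 2}
new_facts = [
    {"order_id": 2, "status": "shipped"},
    {"order_id": 3, "status": "ordered"},
]
inserts, updates = split_by_business_key(new_facts, existing)
```

Here `inserts` holds the record with key 3 and `updates` holds the record with key 2, which mirrors the two tMap outputs.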
I've tested the logic and it correctly splits my incoming records into the two streams. The insert step is then straightforward. However, the update step requires me to update the existing records stored in Kudu. I'm using Impala as the broker, but none of the components seem suitable here: an output component will simply append the records, which isn't the correct behavior. The only option that comes to mind is to store the output in a temporary table and then apply the update using SQL via an Impala input query. This is not ideal, so I'm looking for better ideas.
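For illustration, the staging-table idea could look roughly like the sketch below, which only builds the SQL text. It assumes the statement is run through Impala against a Kudu-backed table, relying on the `UPSERT INTO ... SELECT` statement that Impala documents for Kudu tables. The table and column names are hypothetical, and because `UPSERT` matches rows on the Kudu primary key, this only fits if the business key is that primary key.

```python
from typing import Sequence


def build_upsert_sql(target: str, staging: str, columns: Sequence[str]) -> str:
    """Build an Impala UPSERT that copies rows from a staging table.

    Identifiers are interpolated as-is, so pass only trusted,
    hard-coded table and column names.
    """
    if not columns:
        raise ValueError("at least one column is required")
    col_list = ", ".join(columns)
    return f"UPSERT INTO {target} ({col_list}) SELECT {col_list} FROM {staging}"


# Hypothetical names: fact_orders is the Kudu fact table,
# fact_orders_staging holds the freshly loaded records.
sql = build_upsert_sql(
    "fact_orders",
    "fact_orders_staging",
    ["order_id", "status", "shipped_date"],
)
print(sql)
```

With a statement of this shape, matched rows are updated and unmatched rows are inserted in one pass, so the separate insert and update flows may not be needed.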
Thanks
Hello,
Please refer to this related topic: https://community.talend.com/t5/Design-and-Development/resolved-How-to-work-on-Accumulating-Snapshot...
Best regards
Sabrina