We are starting to load our largest SAP ECC tables (e.g. MSEG, KONV), and we estimate that it may take 5-7 days to finish them. The tricky part is that SAP ECC only retains 3 days (72 hours) of logs. I need your advice: with a Full Load + CDC task, will we lose some delta data because the log retention time is shorter than the load duration? Put simply, is a Full Load + CDC task affected by the source system's log retention time or not?
CDC for a table is started BEFORE the actual fullload for that table starts, to be sure to catch any change made during the load. You can, and really should, see this in the logs you made while testing this.
As such the source DB log retention is irrelevant for a fullload+cdc.
If for other reasons you envision ever needing to start by timestamp from a point earlier than the source DB retention allows, then you could consider using a LOGSTREAM task, where Replicate reads the transaction log and stores it in its own format on its own storage with its own user-chosen retention. A minor catch in this is that you must have selected all tables early enough.
Are you sure you have to accept a 3+ day fullload? Can that not be tuned and improved? Any option for more parallelism/partitioning? More load streams? Appropriate priorities on tables such that the last table to be started (alphabetically) is not the longest one to load? The slowest table to load should ideally be made the first one to start loading.
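To illustrate why the start order matters, here is a minimal sketch, not anything Replicate itself runs. It assumes a fixed number of parallel load streams and made-up per-table load estimates in hours (the numbers and the `makespan` helper are hypothetical). It compares starting tables alphabetically with starting the slowest table first.

```python
import heapq

def makespan(durations, streams):
    """Total elapsed time when tables are started in the given order,
    each one taking the next load stream that becomes free."""
    free_at = [0.0] * streams          # time at which each stream is free
    heapq.heapify(free_at)
    finish = 0.0
    for hours in durations:
        start = heapq.heappop(free_at)  # earliest available stream
        end = start + hours
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

# Hypothetical load estimates in hours, for illustration only.
estimates = {"BSEG": 20, "KONV": 30, "MSEG": 72, "VBAK": 8, "VBAP": 10}

alphabetical = [estimates[name] for name in sorted(estimates)]
slowest_first = sorted(estimates.values(), reverse=True)

alpha_hours = makespan(alphabetical, streams=2)
slowest_first_hours = makespan(slowest_first, streams=2)
print(alpha_hours, slowest_first_hours)
```

With these made-up numbers the alphabetical order leaves the 72-hour table waiting behind a shorter one, so the whole load ends later than when the slowest table starts first.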
HTH,
Hein
Hello Hein,
Thank you for your timely reply. I had assumed that no incremental data is processed during the full load.
But from your comments I understand that the source log retention is irrelevant to Full Load + CDC. So where and when exactly is the incremental data applied to the target system during a full load?
Take Table VBAP as an example:
Suppose that when the third line is loaded, its amount is 300. After the third line has been loaded, but before the whole full load is done, the third line changes to 1500. When will the change to 1500 be reflected on the target?
A specific example in Qlik Replicate: the 5 tables in the red box below are executing a full load, and there is no data in Change Processing until the full load is completed; only then does incremental data come in.
Please correct me if I misunderstand.
If you look at Monitor - FullLoad and click on the 'Loading' elevator bar, there is a column 'Cached Changes'.
Those are the changes made on the source while the table was loading, and they will be applied immediately after the fullload for that table is done. Immediately after that the table is 'live' and updates may be seen in the Monitor - Change Processing page. I often like to sort that by descending last change date&time.
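The per-table sequence described above can be pictured with a toy model. This is only a sketch of the idea, not Replicate's implementation: the `TableTask` class and its methods are hypothetical, and the row values mirror the VBAP example from the question (line 3 goes from 300 to 1500 while the table is still loading).

```python
class TableTask:
    """Toy model of one table in a fullload+cdc task (hypothetical names)."""

    def __init__(self, source_rows):
        self.source = dict(source_rows)
        self.target = {}
        self.cached_changes = []   # changes seen while the table is loading
        self.state = "queued"

    def start_load(self):
        # In this model, change capture is already active before the
        # first row is read.
        self.state = "loading"
        self._to_read = iter(sorted(self.source))

    def load_next_row(self):
        key = next(self._to_read, None)
        if key is None:
            self._finish_load()
            return False
        self.target[key] = self.source[key]
        return True

    def source_update(self, key, value):
        self.source[key] = value
        if self.state == "loading":
            self.cached_changes.append((key, value))  # held back for now
        elif self.state == "live":
            self.target[key] = value                  # normal change processing

    def _finish_load(self):
        # Cached changes are applied right after this table's load ends.
        for key, value in self.cached_changes:
            self.target[key] = value
        self.cached_changes.clear()
        self.state = "live"


vbap = TableTask({1: 100, 2: 200, 3: 300})
vbap.start_load()
for _ in range(3):
    vbap.load_next_row()            # lines 1..3 copied, line 3 as 300
vbap.source_update(3, 1500)         # change arrives while still loading
during_load = vbap.target[3]
cached = len(vbap.cached_changes)
vbap.load_next_row()                # nothing left: load ends, cache applied
after_load = vbap.target[3]
vbap.source_update(3, 1600)         # table is live now
print(during_load, cached, after_load, vbap.target[3])
```

In this model the change to 1500 sits in the cache until that one table finishes loading, not until the whole task finishes, which is the point made above.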
Trust but verify - study the REPTASK_xxx.log! First with the default information logging level, and 'PERFORMANCE' set to trace typically. Once you understand most of that, switch on 'trace' level logging for LOAD and UNLOAD for a few minutes and study that. After that's back to informational, try SOURCE_CAPTURE for a few seconds - to trace.
Enjoy.
Hein.