Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi folks,
New to Qlik Replicate and trying to understand proper syntax to use the parallel load (full load) feature by data segment for a date column.
Source database is Oracle and a table we're trying to fully load into Sql server 2019 has over 83 million rows. It has over 1300 partitions by date and the full load by partitions will take just over 14 hours. (Throughput records is 1635 and Throughput volume is 1605.)
I wanted to test if manually defining the date segments via "use data ranges" and having fewer plus more evenly distributed segments, still by date, would offer better performance; however, I keep getting some sort of error.
For the segment boundaries, the instructions indicate to specify the "upper limit of the range":
Segment 1: '2019-01-19 00:00:00'
Segment 2: '2019-07-01 00:00:00'
Segment 3: '2020-01-01 00:00:00'
...
Segment 7: '2025-01-01 00:00:00'
This attempt resulted in:
"Failed to init unloading table '<database>'.'<table>'
ORA-01841: (full) year must be between -4713 and +9999, and not be 0 "
&
"Stream component 'st_1_<task>' terminated
Stream component failed at subtask 1, component st_1_<task>
Error executing source loop"
Any thoughts or advice?
Thanks
Hi @rurquhart
Please post the data type for that column being segmented. It should work without the single quotes around the timestamp.
@rgurquhart
1. could you check with your source on this table , what is the column that you use for the segment, what is the datatype ?
2. could you verify the datetime from that column, exactly ?
Hi @rurquhart
Please post the data type for that column being segmented. It should work without the single quotes around the timestamp.
Hi Steve,
Source column 'abc' has data type defined as a DATE. It's also partitioned by this column 'abc' in the source database.
Here's one snippet of the source database ddl:
NOCOMPRESS
NOPARALLEL
PARTITION BY RANGE (abc)
(
PARTITION P_20130327 VALUES LESS THAN (TO_DATE(' 2013-03-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
LOGGING
TABLESPACE DW_LARGE_DATA01
PCTFREE 10
INITRANS 1
STORAGE
In the Qlik app, I used column abc as the segment column.
Looks like the issue with date format, we can use function similar to below and convert date to DD-MM-YYYY format and then test.
SELECT to_char(to_date('2019-01-19 00:00:00','RRRR-MM-DD HH24:MI:SS'), 'DD-MM-RRRR HH24:MI:SS') FROM dual
Thanks, everyone, for your suggestions. @Alan_Wang This recommendation helped get past the syntax error I would normally receive; however, I'm now getting duplicate rows loaded based only on the segment values listed as such above.
Reminder:
Segment 1: '2019-01-19 00:00:00'
Segment 2: '2019-07-01 00:00:00'
Segment 3: '2020-01-01 00:00:00'
...
Segment 7: '2025-01-01 00:00:00'
Does this section allow for logic to be applied? Example, something along these lines:
Segment Boundaries abc
1 <'2019-01-19 00:00:00'
2 abc<'2019-07-01 00:00:00' AND abc > '2018-12-31 23:59:59'
3 abc<'2020-01-01 00:00:00' AND abc > '2019-06-30 23:59:59'
4 abc<'2020-04-01 00:00:00' AND abc>'2019-12-31 23:59:59'
5 abc<'2020-07-01 00:00:00' AND abc>'2020-03-31 23:59:59'
6 abc>'2020-03-31-23:59:59'
7 !=0
Thanks for any further guidance/suggestions.
the logic is some what correct, below is from my task , so that you see the logic
you can enable source_unload to trace and see the how the segment is done.
example in my task :
Line 2232: 00007976: 2022-05-25T14:56:43:735596 [SOURCE_UNLOAD ]T: Order by '', Resume Where '', Transformation Where ' ( ([MyDate1] <= '2013-09-10 10:01:04') ) ' (endpointshell.c:8433)
Line 4511: 00007956: 2022-05-25T14:56:43:860593 [SOURCE_UNLOAD ]T: Order by '', Resume Where '', Transformation Where ' (NOT (([MyDate1] <= '2013-09-10 10:01:04') )) AND ( ([MyDate1] <= '2013-09-10 10:01:05') ) ' (endpointshell.c:8433)
Line 15208: 00003040: 2022-05-25T14:56:44:39594 [SOURCE_UNLOAD ]T: Order by '', Resume Where '', Transformation Where ' (NOT (([MyDate1] <= '2013-09-10 10:01:05') )) AND ( ([MyDate1] <= '2014-09-10 10:01:06') ) ' (endpointshell.c:8433)
Line 349199: 00003248: 2022-05-25T14:56:47:276590 [SOURCE_UNLOAD ]T: Order by '', Resume Where '', Transformation Where ' (NOT (([MyDate1] <= '2014-09-10 10:01:06') )) ' (endpointshell.c:8433)
---
for duplicate, it could happen if the date is not the whole PK, Segment normally work with PK, see user guide :
-For all endpoints, the Unique Index column is automatically selected. Select which additional columns whose data you wish to use to delineate the ranges and then click OK.