I have a table A in sqlserver, and table A have snp_date column which is (YYYYMMDD)20200101, it has millions records, I cannot load in a big chunk, I want to loop column snp_date to load day by day to snowflake.
Could you advise in detail how to achieve it?
Replicate does not have a looping construct, nor a delay construct. It does have a scheduler and 'passthru' queries.
Let's first step back to your problem statement, because it has several problems.
>> I have a table A in sqlserver, and table A have snp_date column which is (YYYYMMDD)20200101
Fine, but sloppy. To be clear, the column is VARCHAR(8) and 20200101 is a sample value, possibly running all the way to the current day?
>> it has millions records,
Fine, but sloppy. Millions of ROWS is no big deal. Per day, perhaps? Are they extra-wide rows? Many VARCHAR(MAX) columns? What is the total volume in GB per day or per year?
>> I cannot load in a big chunk,
WHY NOT, and if not in one chunk, how much can you handle? Source overload? Target overload? Network overload? Paid by the hour?
>> I want to loop column snp_date to load day by day to snowflake.
There is no loop construct.
Day by day, a day at a time? I guess not, as you'll never catch up. A day every hour? That'll take two days. Typically Snowflake just sneezes at a few million rows: seconds, not minutes. It loves loading billions of rows.
How long did it take to load a large day in your testing (you did run several tests, right?)
Maybe, just maybe...
- you can schedule a full load job every hour and calculate a day number to load for that hour in a passthru filter (variant 1 in the sketch after this list)?
- you can schedule a full load job every hour on the hour and join with a VIEW in the passthru filter, where that view is updated by a SQL Server job every hour on the half hour to point to the next day (variant 2 in the sketch)? If something stumbles, or if you are not ready, change it to a non-existent (future) day and nothing will happen until the issue is researched and a fresh, valid restart value is set.
- Replicate sends the load files in chunks (Advanced settings - Max file size (MB): - default 100 MB). Use a slow network link to make the chunks come in slowly enough?
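Here is a rough sketch of what those first two passthru-filter variants could look like. Treat it as illustrative only: the kickoff timestamp and the control view name are my assumptions, and I'm taking snp_date to be VARCHAR(8) in YYYYMMDD form as discussed above.

-- Variant 1: derive one day per hourly run from the hours elapsed
-- since an assumed kickoff of '2024-06-01 00:00' (placeholder value).
-- Style 112 converts a date to the YYYYMMDD string snp_date uses.
snp_date = CONVERT(varchar(8),
                   DATEADD(day,
                           DATEDIFF(hour, '2024-06-01 00:00', GETDATE()),
                           '2020-01-01'),
                   112)

-- Variant 2: a control view that a SQL Server Agent job rewrites every
-- hour on the half hour to hold the next day to load. The name
-- dbo.next_snp_date is made up for the example.
CREATE OR ALTER VIEW dbo.next_snp_date AS
SELECT CAST('20200101' AS varchar(8)) AS snp_date;

-- The passthru filter then reduces to a subquery against that view:
snp_date = (SELECT snp_date FROM dbo.next_snp_date)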
>> Could you advise in detail how to achieve it?
Ha, you want details, and you provide little to none yourself. Hilarious.
Cheers,
Hein.
Hello @JimXu ,
Thanks for reaching out to Qlik Community!
I'm not sure what difficulties you are facing; however, if you want to replicate a very big table, you may use Parallel Load to speed up the full load process. You may use different data ranges, e.g. partitions, sub-partitions, or you can define segment boundaries yourself, e.g. using a date column (see the sketch below).
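To make the segment idea concrete: with snp_date as the segment column and user-defined boundaries, a parallel full load behaves roughly like several concurrent range queries. The boundary values below are made up for the example:

SELECT * FROM A WHERE snp_date < '20210101';
SELECT * FROM A WHERE snp_date >= '20210101' AND snp_date < '20220101';
SELECT * FROM A WHERE snp_date >= '20220101';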
BTW, the supported source and target endpoints can be found here.
Hope this helps.
Regards,
John.
Maybe something like this:
// Assumes a connection to the SQL Server source is already open (e.g. LIB CONNECT).
// Loop over every day from 2020-01-01 up to yesterday.
for i = 0 to (today() - makedate(2020) - 1)
  // Format the day as a YYYYMMDD string to match the snp_date-style column
  let v = text(date(makedate(2020) + $(i), 'YYYYMMDD'));
  t: SQL SELECT * FROM YourTable WHERE YourDate = '$(v)';
  store t into [t_$(v).qvd] (qvd);
  drop table t;
next
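Two caveats with that loop: the day value is quoted in the WHERE clause because snp_date looks like VARCHAR(8), and the loop only stages one QVD per day on the Qlik side; landing those files in Snowflake would still be a separate load step.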