JimXu
Contributor

Loop a date column to load the data incrementally

I have a table A in SQL Server, and table A has a snp_date column which is YYYYMMDD (e.g. 20200101). It has millions of records; I cannot load them in one big chunk, so I want to loop over snp_date to load the data day by day into Snowflake.

 

Could you advise in detail how to achieve it?

 

1 Solution

Accepted Solutions
Heinvandenheuvel
Specialist III

Replicate does not have a looping construct, nor a delay construct. It does have a scheduler and 'passthru' queries.

Let's first step back to your problem statement, because it has several problems.

>> I have a table A in sqlserver, and table A have snp_date column which is (YYYYMMDD)20200101

Fine, but sloppy. To be clear, the column is VARCHAR(8) and 20200101 is a sample value, possibly running all the way to the current day?

>>  it has millions records,

Fine, but sloppy. Millions of ROWS is no big deal. Per day, perhaps? Are they extra-wide rows? Many VARCHAR(MAX) columns? What is the total volume in GB per day or per year?

>> I cannot load in a big chunk,

WHY NOT, and if not in one chunk, how much can you handle? Source overload? Target overload? Network overload? Paid by the hour?

>> I want to loop column snp_date to load day by day to snowflake.

There is no loop construct. 

Day by day, one day at a time? I guess not, as you'll never catch up. A day every hour? That'll take two days. Typically Snowflake just sneezes at a few million rows: seconds, not minutes. It loves loading billions of rows.

How long did it take to load a large day in your testing? (You did run several tests, right?)

Maybe, just maybe...

- you can schedule a full load job every hour and calculate the day number to load for that hour in a passthru filter? (a sketch follows below this list)

- you can schedule a full load job every hour on the hour and use a join with a VIEW in the passthru filter, where that view is updated by a SQL Server job every hour on the half hour to point to the next day? If something stumbles, or if you are not ready, change it to a non-existent (future) day and nothing will happen until it has been researched and a fresh valid restart value is set. (also sketched below)

- Replicate sends the load files in chunks (Advanced settings - Max file size (MB), default 100 MB). Use a slow network link to make the chunks come in slowly enough?
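To make the first idea a bit more concrete, here is a minimal sketch of what such a passthru filter condition could look like in SQL Server syntax. It is an assumption about the setup, not tested Replicate configuration: snp_date is taken to be a VARCHAR(8) YYYYMMDD string, 20200101 is an assumed fixed start date, and the full load task is assumed to run once per hour, so each run derives "its" day from the number of hours elapsed:

-- hypothetical passthru filter condition (SQL Server syntax);
-- hours elapsed since the assumed start date = the day number to load on this run
snp_date = CONVERT(CHAR(8),
                   DATEADD(DAY, DATEDIFF(HOUR, '20200101', GETDATE()), '20200101'),
                   112)   -- style 112 = YYYYMMDD

The second idea could look roughly like the sketch below, again with assumed object names (dbo.LoadControl, dbo.NextDayToLoad, dbo.TableA). A one-row control table holds the next day to load, the passthru filter selects from a view over it, and a SQL Server Agent job run every hour on the half hour advances the pointer (or parks it on a far-future day to pause loading):

-- one-row control table plus view holding the next snp_date to load (assumed names)
CREATE TABLE dbo.LoadControl (next_day CHAR(8) NOT NULL);
INSERT INTO dbo.LoadControl VALUES ('20200101');
GO
CREATE VIEW dbo.NextDayToLoad AS SELECT next_day FROM dbo.LoadControl;
GO
-- passthru filter condition for the hourly full load task:
--   snp_date = (SELECT next_day FROM dbo.NextDayToLoad)
-- statement for the SQL Server Agent job (every hour on the half hour) to advance the pointer:
UPDATE dbo.LoadControl
SET next_day = CONVERT(CHAR(8), DATEADD(DAY, 1, CAST(next_day AS DATE)), 112);

The clock-based variant silently skips a day if a scheduled run is missed, which is exactly why the control-view variant with an explicit restart value is the safer of the two.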

>> Could you advise in detail how to achieve it?

Ha, you want details, and you provide little to none yourself. Hilarious.

Cheers,

Hein.

 

 


3 Replies
john_wang
Support

Hello @JimXu ,

Thanks for reaching out to Qlik Community!

I'm not sure what difficulties you are facing; however, if you want to replicate a very big table, you may use Parallel Load to speed up the full load process. You can use different data ranges, e.g. partitions or sub-partitions, or you can define the segment boundaries yourself, for example using the date column.
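As a very rough illustration of that last option, the hypothetical SQL Server query below (dbo.TableA is an assumed name for table A, snp_date comes from the question, and the 8-way split is an arbitrary assumption) computes evenly spaced snp_date values that could be entered as segment boundaries for a data-range Parallel Load. It is only a sketch of the idea, not Replicate configuration:

-- split dbo.TableA into 8 roughly equal segments by snp_date and report the
-- 7 boundary values between them (run once, then copy the values into the
-- Parallel Load segment definition); note this scans the whole table
SELECT bucket, MIN(snp_date) AS boundary_value
FROM (
    SELECT snp_date,
           NTILE(8) OVER (ORDER BY snp_date) AS bucket
    FROM dbo.TableA
) AS s
WHERE bucket > 1
GROUP BY bucket
ORDER BY bucket;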

BTW, the supported source and target endpoints can be found here.

Hope this helps.

Regards,

John.

 

marcus_sommer

Maybe something like this:

for i = 0 to (today() - makedate(2020) - 1)
   // build the day to load as a YYYYMMDD number, e.g. 20200101
   let v = num(num#(text(date(makedate(2020) + $(i), 'YYYYMMDD'))));
   // pull one day from the source and store it as its own QVD
   t: sql select * from YourTable where YourDate = $(v);
   store t into [t_$(v).qvd] (qvd);
   drop tables t;
next
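Note that this is plain Qlik load script rather than a Replicate task: it assumes a connection to the SQL Server source (an ODBC/OLE DB connect statement) has been opened earlier in the script, and it stores each day into its own QVD file rather than loading into Snowflake directly, so a separate step would still be needed to push those daily extracts on to Snowflake.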
