
How-to guide: Bulk Insert to Azure SQL Server
If you are trying to import/insert a lot of data into Azure SQL Server (the PaaS offering, not on-prem SQL Server or SQL Server hosted on a cloud VM), you will run into some issues, especially if you are trying to copy data from one Azure SQL Server to another.
My architecture is Azure Cloud, Talend Cloud Data Integration, and a Remote Engine deployed on a virtual machine in the same datacentre/region as all my resources (as much as possible) in order to get good performance.
I will spare you all the things that I tried; the main issues are:
- tMsSqlOutput creates row-by-row transactions, which is not very efficient when you want to insert millions and millions of records.
- Azure SQL Server is a PaaS, so it doesn't have the concept of a filesystem. A traditional bulk load will not work here: you cannot just point it at the location of a file, because there is no such thing.
The gist of the solution is using two jump hosts: the Remote Engine as a filesystem, and an Azure Storage blob container as a location that Azure SQL Server can access.
The pipeline that eventually worked: [screenshot]
And the Talend job that orchestrates it: [screenshot]
Detailed instructions:
1. First, create a SAS (Shared Access Signature) for the Azure Storage account that will be used as your second jump host.
2. If you followed the instructions correctly, you will get a SAS connection string in the following form:
sv=2015-12-11&ss=b&srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z&spr=https&sig=copyFromAzurePortal
Don’t forget to remove the ‘?’ at the beginning of the string after copying the SAS connection string!
3. The next stage is to create the credential in the target DB:
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2015-12-11&ss=b&srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z&spr=https&sig=copyFromAzurePortal';
The SECRET is the SAS you created in stage (2).
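One note the original steps skip: CREATE DATABASE SCOPED CREDENTIAL requires a database master key to exist in the target DB. If yours does not have one yet, create it first; a minimal sketch (the password is just a placeholder, pick your own):
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<YourStrongPassword123!>';
-- Only needed once per database; skip this if a master key already exists.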
4. Now we need to create an External Data Source:
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://myazureblobstorage.blob.core.windows.net',
CREDENTIAL= MyAzureBlobStorageCredential);
Notice that myazureblobstorage in the LOCATION URL is the name of your storage account; MyAzureBlobStorage is just the name you give the external data source.
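To sanity-check that both objects were created (not part of the original walkthrough, just the standard catalog views):
SELECT name FROM sys.database_scoped_credentials; -- should list MyAzureBlobStorageCredential
SELECT name, type_desc, location FROM sys.external_data_sources; -- should list MyAzureBlobStorage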
5. The BULK INSERT statement itself will look something like this:
BULK INSERT Product
FROM 'data/product.csv'
WITH ( DATA_SOURCE = 'MyAzureBlobStorage');
* DATA_SOURCE is the external data source that was created in stage (4).
** Notice the 'data' in FROM 'data/product.csv': it is the container in the Azure Storage account, and product.csv is the blob. You can also add subfolders in the path if needed.
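If your file is a real CSV (header row, quoted fields), BULK INSERT takes a few more options. A hedged sketch, assuming SQL Server 2017+/Azure SQL CSV support and a comma-delimited file with a header row and LF line endings:
BULK INSERT Product
FROM 'data/product.csv'
WITH ( DATA_SOURCE = 'MyAzureBlobStorage',
FORMAT = 'CSV', -- parse as CSV, honouring quoted fields
FIELDQUOTE = '"', -- quoted values may contain commas and newlines
FIRSTROW = 2, -- skip the header row
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a', -- LF line endings
TABLOCK ); -- coarser locking, faster bulk load
The FORMAT = 'CSV' / FIELDQUOTE combination is also the usual answer when a column value contains an embedded newline: quote the field when writing the file, and BULK INSERT will not treat the newline as a new row.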
More info is available in Microsoft's BULK INSERT documentation.
And the result: [screenshot]
Cheers
Maayan

Regards
Shong

Hey,
thanks for the help. Amazing info you have given.

Hi,
With tDBRow I am getting the error below:
"... could not be opened. Operating system error code 5 (Access is denied.)"
I have also tried tDBBulkExec, and I am getting the same error there.
Please suggest.

From which component? This is weird; Azure SQL Server doesn't expose an OS. Maybe it is coming from the Remote Engine?

What can I do if a column has a newline in the data? While loading into the file, it is created as a new row, so I am getting an error while doing the bulk insert.
