I have a system that provides a monthly 2-4 GB backup via BCP (Bulk Copy Program - SQL Server). Can Talend import a BCP file? I need to parse it and break it up into different sections for insertion into S3 buckets (as Parquet files).
Sorry for the vague question. We don't have a SQL Server to use as an intermediary, so the flow I'm looking at is: take the big file, extract the data, push to S3.
Hello,
Talend does not support reading or parsing native-format BCP (Bulk Copy Program) files directly. Native mode is a SQL Server–specific binary format and requires SQL Server tools or a format definition to interpret the data structure.
Without SQL Server or a BCP format file, Talend cannot reliably determine column boundaries or data types.
Recommended approach:
1. Convert the BCP file at the source to a standard flat format (CSV, delimited, or fixed-length) using BCP character mode (-c) or a format file.
2. Ingest the converted file in Talend using standard file input components.
3. Process and write the data to S3 (for example as Parquet using Big Data / Spark components).
Talend can process large files (2–4 GB) efficiently when using streaming components, but the input must be in a supported file format.
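To illustrate the streaming idea outside Talend, here is a minimal Python sketch. It assumes the file has already been exported in character mode with a tab field terminator, and that one column identifies the "section" each row belongs to. The function name iter_section_batches and the parameters key_index and batch_rows are hypothetical, chosen only for this example. Writing each batch as Parquet and uploading it to S3 is deliberately left out, since that depends on the tooling you pick.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = List[str]


def iter_section_batches(
    lines: Iterable[str],
    key_index: int,
    batch_rows: int = 50_000,
    delimiter: str = "\t",
) -> Iterator[Tuple[str, List[Row]]]:
    """Read delimited text once and yield (section_key, rows) batches.

    Assumption: the input is a character-mode export, one record per
    line, with fields separated by `delimiter`. Rows are buffered per
    section and handed out as soon as a buffer reaches `batch_rows`.
    """
    # QUOTE_NONE: treat quote characters as ordinary data.
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    pending: Dict[str, List[Row]] = defaultdict(list)

    for row in reader:
        if not row:  # skip blank lines
            continue
        key = row[key_index]
        bucket = pending[key]
        bucket.append(row)
        if len(bucket) >= batch_rows:
            yield key, bucket
            pending[key] = []  # start a fresh buffer for this section

    # Flush whatever is left once the input is exhausted.
    for key, bucket in pending.items():
        if bucket:
            yield key, bucket


if __name__ == "__main__":
    # Tiny in-memory stand-in for the exported file.
    sample = io.StringIO("1\tEU\talice\r\n2\tUS\tbob\r\n3\tEU\tcarol\r\n")
    for section, rows in iter_section_batches(sample, key_index=1, batch_rows=2):
        print(section, len(rows))
```

For a real file, pass a handle opened with open(path, newline="") instead of the StringIO object. Memory use is bounded by roughly batch_rows times the number of distinct sections, so a 2-4 GB input never has to be loaded in one piece.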
Thanks,
Gourav
Hi @pthomas,
Is your BCP file just a flat text file with a .BCP extension, or is it a binary file meant to be read only by SQL Server? Can you please confirm that?
I understand you don't have a SQL Server to use as an intermediary, but why not? I believe you could use SQL Server Developer edition as a temporary staging area.
Regards,
Mark Costa
It's binary. I was hoping Talend had a way to interpret the format but I guess not. I'm in an anti-Microsoft shop so dealing with SQL Server makes ears pop up.
The good news - for me - is that they decided to go a different route to get the data.