Dear Community,
I would appreciate your help with the query below as soon as possible:
We are exploring the Talend Open Studio (TOS) for Data Quality tool for our DQ profiling. Our source data, which needs to be profiled, is in Azure Data Lake Storage (ADLS).
Is there a way to connect to ADLS directly/indirectly from TOS for Data Quality?
Also, is there an option to load Parquet/ORC/Avro files in TOS for Data Quality? If yes, kindly point us to some documentation.
Please feel free to reply if you have any questions.
Thanks,
Senthil
Hello,
I'm afraid that this feature is not available in Talend Open Studio for Data Quality.
Regarding your question, "How to connect to Azure storage from the data profiling perspective":
After checking with our DQ experts, they recommend trying to create an HDInsight cluster to connect to Azure storage from the DQ side. (This solution requires an HDInsight installation on Azure storage.)
Profiling Parquet files in the “profiling perspective” is a new feature for us.
In addition, you can read Parquet and CSV files from Azure storage in the Talend Studio integration perspective, in Standard jobs and Big Data Batch jobs. The "tFileInputParquet" component is available for both DI and BD (Batch/Streaming) jobs.
Hope it helps.
Best regards
Sabrina
Thank you very much for the response. It was really helpful for understanding the availability of the feature.
Hello,
Feel free to let us know if there is any further help we can give.
Best regards
Sabrina
Hello,
We can support profiling ADLS Gen2 files through a JDBC driver; see TDQ-20315 and TDQ-18068.
For profiling ADLS Gen2, we have documentation: https://help.talend.com/r/en-US/Cloud/studio-user-guide-api-services-platform/profiling-adls-databri...
Thanks