Hi,
I ran into an issue with the tBigQueryOutput component. It worked well until I changed the separator character; then my job failed.
When I replaced the comma with a pipe character, I got the error message below.
Exception in component tBigQueryOutput_5_tBQBE (JB_GCP_CREATE_TABLE)
java.lang.RuntimeException: Job failed: BigQueryError{reason=invalid, location=gs://int-storage-bucket-datalake-bm/staging/2019/03/15/TARRPMS.csv, message=Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.}
at bm_gcp.jb_gcp_create_table_0_1.JB_GCP_CREATE_TABLE.tFileInputDelimited_3Process(JB_GCP_CREATE_TABLE.java:3942)
at bm_gcp.jb_gcp_create_table_0_1.JB_GCP_CREATE_TABLE.runJobInTOS(JB_GCP_CREATE_TABLE.java:4515)
at bm_gcp.jb_gcp_create_table_0_1.JB_GCP_CREATE_TABLE.main(JB_GCP_CREATE_TABLE.java:4130)
My data does not contain the pipe character in those columns.
Below is some sample data from the CSV file:
455|44245|A2010101|000083|²description info +|3100|1530|8.0|20190301|20170515|20200131|1901001|REG|8.81|M2|8,018131|8,812499|18,418928|0|M2|CM|FI|" 5"|A2010101|17,682171
455|44245|A2010101|000083|²description info+|3100|1530|8.0|20190301|20170515|20200131|1901001|REG|8.81|M2|8,018131|8,812499|18,418928|0|M2|CM|EM|" 5"|A2010101|15,287711
455|44245|A2010101|000083|²description info+|3100|1530|8.0|20190301|20170515|20200131|1901001|REG|8.81|M2|8,018131|8,812499|18,418928|0|M2|CM|EN|" 5"|A2010101|14,550954
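One way to double-check the claim that the data contains no stray pipe characters is to parse the rows locally and count the fields. The following is only a minimal sketch using Python's standard `csv` module, not the parser BigQuery or Talend uses; the helper name `field_counts` is made up for illustration, and the two rows are copied from the sample above.

```python
import csv
import io

# Two of the sample rows from the post above (pipe-delimited).
SAMPLE = (
    '455|44245|A2010101|000083|²description info +|3100|1530|8.0|20190301|'
    '20170515|20200131|1901001|REG|8.81|M2|8,018131|8,812499|18,418928|0|'
    'M2|CM|FI|" 5"|A2010101|17,682171\n'
    '455|44245|A2010101|000083|²description info+|3100|1530|8.0|20190301|'
    '20170515|20200131|1901001|REG|8.81|M2|8,018131|8,812499|18,418928|0|'
    'M2|CM|EM|" 5"|A2010101|15,287711\n'
)


def field_counts(text, delimiter):
    """Return the number of fields found in each row of `text`."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    return [len(row) for row in reader]


# Parse once more to inspect the quoted field that must keep its leading space.
rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="|", quotechar='"'))

print(field_counts(SAMPLE, "|"))  # every row should report the same count
print(repr(rows[0][22]))          # the quoted value, leading space preserved
```

If every row reports the same field count, the file itself is consistent, which matches the observation that a manual load of the same file works.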
I will add one thing: when I take the same file (with the pipe separator) and ingest it manually in GCP, it works. So I don't understand what is wrong with the component.
Could you help me, please? Have you ever run into this issue?
I think it is a real bug in the component.
Regards,
Hello,
What's your input source? Could you please post a screenshot of your whole job design on the forum?
Best regards
Sabrina
Hi,
Source: tFileInputDelimited (CSV)
Separator is ";"
Header is: 1
The second row of the file (the first is the header) is:
455;44245;A33333;000083;description;3100;1530;8;20190220;20190301;20190329;1901001;REG;8.810000;M2;8,018131;8,812499;18,191532;1;M2;55;FI;" 5";E87952;17,463871
In the file I have " 5"; the space is important, so I really need double quotes to enclose it.
tMap: no transformation at the moment; it is only used to pass the source to the target.
tBigQueryOutput: builds another CSV file, sends it to Google Cloud Storage, and ingests it into a BigQuery table.
In the advanced parameters, the separator is a comma and the encoding is UTF-8.
I would like to change the comma separator in the target file because the comma is used as the decimal separator (French format).
I tried the pipe and many other characters in place of the comma. The ingestion fails with the same error: the CSV file contains too many errors...
I also tried replacing tBigQueryOutput with two components, tGSPut and tBigQueryBulkExec, to avoid creating a new CSV file, but I had the same issue.
If you need more details, tell me.
Regards
Siedaen.
Hello,
Sorry for the silence. We try our best to answer as many posts as we can. Could you upload an input file to the forum so that we can test on V 7.1.1 and see whether it is a bug?
There is a Jira issue about "tBigqueryOutput handles empty strings as null", and that issue is fixed in 7.2.1 and 7.1.2.
Thank you very much for your time.
Best regards
Sabrina
Is the issue fixed? I am having the same issue. Also, is there a way to use [chr(198) or Æ] as the delimiter in tBigQueryBulkExec when inserting into BigQuery?
It works with the SDK:
bq load --field_delimiter="Æ" --source_format=CSV --schema="<schema>" <tablename> <filename>
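As a possible workaround while the component ignores the delimiter, the same `bq load` call could be assembled and run from a script once the file is in Cloud Storage. This is a rough sketch only: the helper `build_bq_load_command`, the table name, the bucket path, and the schema string are all hypothetical placeholders; the flags are the ones used in the command above, plus `--skip_leading_rows` for a header line.

```python
import shlex


def build_bq_load_command(table, source, schema, field_delimiter="|",
                          skip_leading_rows=0):
    """Build the argument list for a `bq load` call with a custom delimiter.

    All argument values are supplied by the caller; nothing is executed here.
    """
    cmd = [
        "bq", "load",
        "--source_format=CSV",
        f"--field_delimiter={field_delimiter}",
        f"--schema={schema}",
    ]
    if skip_leading_rows:
        cmd.append(f"--skip_leading_rows={skip_leading_rows}")
    cmd += [table, source]
    return cmd


# Hypothetical table, bucket path, and schema, for illustration only.
cmd = build_bq_load_command(
    table="my_dataset.my_table",
    source="gs://my-bucket/staging/file.csv",
    schema="col_a:STRING,col_b:STRING",
    field_delimiter="|",
    skip_leading_rows=1,
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(cmd))
```

The list could then be passed to `subprocess.run(cmd, check=True)` on a machine where the `bq` tool is installed and authenticated.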
It is a bug: no matter what you put in the "Field Delimiter" value, if you check the job in BigQuery (Jobs > Project Jobs) it always says "COMMA".
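If the load job really does run with a comma delimiter, a pipe-delimited file with French decimal commas would be split in the wrong places. The sketch below illustrates this with Python's standard `csv` module, which is not BigQuery's parser and may differ from it in details such as quote handling; the helper `split_row` is a made-up name, and the row is the first sample row from the original post.

```python
import csv
import io

# First sample row from the original post (pipe-delimited, decimal commas).
ROW = (
    '455|44245|A2010101|000083|²description info +|3100|1530|8.0|20190301|'
    '20170515|20200131|1901001|REG|8.81|M2|8,018131|8,812499|18,418928|0|'
    'M2|CM|FI|" 5"|A2010101|17,682171'
)


def split_row(line, delimiter):
    """Split a single CSV line using the given delimiter."""
    return next(csv.reader(io.StringIO(line), delimiter=delimiter, quotechar='"'))


as_pipe = split_row(ROW, "|")    # what the file was written for
as_comma = split_row(ROW, ",")   # what a comma-delimited load would see

print(len(as_pipe), len(as_comma))
```

A column count that does not match the table schema on the very first row would be consistent with the "Rows: 1; errors: 1" message in the stack trace.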
How can we find out the status of this? When will it be fixed? (I'm using 7.2.)
The issue occurs when we use a service account as the authentication mode.
Also, the options 1. create table if not exists, 2. truncate, and 3. append disappear.
Any update on this?
related to your ticket https://jira.talendforge.org/browse/TBD-5960?page=com.atlassian.jira.plugin.system.issuetabpanels%3A...
I can't understand that.
It is supposed to be resolved in 7.1.1, but it is not; I tested in that same version.
Also, priority Low? Being able to choose a delimiter is something we need...
Why is it closed if it's not fixed?