Hi,
I'm trying to stream JSON messages from Kafka and save them as a .json file, but the output file is empty.
I'd be grateful if somebody could help.
I'm using TOS for Big Data 6.4
Kafka ver.: 0.9.0.1
Kafka ver. set in TOS: 0.9.0.1
I'm also looking for example solutions where Kafka is used to receive messages in JSON format and the output is written to a .json file.
I want to evaluate whether TOS is a good tool for reading JSON using Kafka.
Kafka producer and consumer input/output (example):
My simple jobs.
1.
2.
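For context, here is a minimal sketch of what a producer/consumer pair like the one in those screenshots looks like in plain Java against the 0.9 client. The broker address, the topic name "services", and the sample payload are assumptions for the example, not taken from the screenshots:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaJsonRoundTrip {
    public static void main(String[] args) {
        // Producer: send one JSON message to the topic.
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            producer.send(new ProducerRecord<>("services",
                "{\"data\":[{\"Service_Description\":\"Pets Allowed\",\"Service_Code\":\"PET\"}]}"));
        }

        // Consumer: read the message back as a plain string.
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "test-group");
        c.put("auto.offset.reset", "earliest");
        c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(Collections.singletonList("services"));
            ConsumerRecords<String, String> records = consumer.poll(5000); // 0.9 API: timeout in ms
            for (ConsumerRecord<String, String> r : records) {
                System.out.println(r.value());
            }
        }
    }
}
```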
Hi,
If the output file is empty, it means you are parsing the JSON incorrectly.
It would be better if you could attach screenshots of your tExtractJSON and tWriteJSON settings.
Hi,
I've checked two options of the same job.
1. With Kafka input: no data in the .json file (the file is empty), and no data in the .txt file output either (tFileOutputRaw component).
2. With the tFixedFlowInput component: all the data and structure end up in the .json file. I've also checked export to an .xls file and it works as well.
In both cases I'm not using the tWriteJSON component. Should I?
1. Kafka input job
tExtractJSON config
tFileOutputJSON
Output schema, columns 1:1
2. Job where I used tFixedFlowInput component
tFixedFlowInput
JSON here:
{"data":[{"Service_Description":"Pets Allowed","Service_Code":"PET"},{"Service_Description":"Swimming Pool","Service_Code":"SWI"},{"Service_Description":"Tennis Court","Service_Code":"TEN"},{"Service_Description":"Dry Cleaning","Service_Code":"DRY"},{"Service_Description":"Internet Access","Service_Code":"INT"},{"Service_Description":"WIFI Internet Access","Service_Code":"WIF"},{"Service_Description":"Fitness Room","Service_Code":"FIT"},{"Service_Description":"Concierge","Service_Code":"CON"}]}
output .json file
output .xlsx file
Please note that tLogRow_2 displays the same output in both jobs.
I will check later with your sample.
Generally I use JSONPath to parse JSON.
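For example, with the Jayway json-path library (the same expression style tExtractJSONFields can use in its JsonPath mode), the sample document above can be queried like this. A quick standalone sketch, not the code Talend generates:

```java
import java.util.List;
import com.jayway.jsonpath.JsonPath;

public class JsonPathDemo {
    public static void main(String[] args) {
        // Shortened version of the sample document from the thread.
        String json = "{\"data\":[{\"Service_Description\":\"Pets Allowed\",\"Service_Code\":\"PET\"},"
                    + "{\"Service_Description\":\"Swimming Pool\",\"Service_Code\":\"SWI\"}]}";

        // "$.data[*]" plays the role of the loop query; field paths are relative to it in Talend.
        List<String> codes = JsonPath.read(json, "$.data[*].Service_Code");
        List<String> descriptions = JsonPath.read(json, "$.data[*].Service_Description");

        System.out.println(codes);         // ["PET","SWI"]
        System.out.println(descriptions);  // ["Pets Allowed","Swimming Pool"]
    }
}
```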
I made a few tests with your schema.
My jobs are built a little differently and that didn't affect the result for me, but it looks like everything depends on the configuration of the tKafkaInput component.
Kafka, like any MQ, is oriented toward non-stop operation, and depending on how your component is set up, it opens and closes the output file differently.
If you use auto-disconnect by timeout or by number of received messages, everything is fine:
If you stop the Job manually, the file is not closed properly.
As an alternative, you could use tFlowToIterate and route the output into separate JSON files, or append to the same delimited file, something like:
In this case each file contains a single message from Kafka; all of them are closed independently and can be processed afterwards.
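To illustrate the file-closing point outside Talend, here is a rough sketch of a consumer that stops after a fixed number of messages and writes each message to its own .json file, so every file is closed cleanly. The topic name, group id, message limit, and file naming are made up for the example:

```java
import java.io.FileWriter;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OneFilePerMessage {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "json-writer");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        int written = 0;
        int maxMessages = 100; // stop condition, like Talend's "maximum number of messages"
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("services"));
            while (written < maxMessages) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> r : records) {
                    // One file per message: each writer is opened and closed independently,
                    // so stopping the job can never leave a half-written file behind.
                    try (FileWriter out = new FileWriter(
                            "message_" + r.partition() + "_" + r.offset() + ".json")) {
                        out.write(r.value());
                    }
                    written++;
                }
            }
        }
    }
}
```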
Hello
Could you kindly help me with this, please? I'm having a problem in my first experience working with Talend OS_BD. I'm consuming Kafka messages (JSON format) into Talend, and I eventually need to output them into a PostgreSQL data warehouse (in JSON format as well). My problem is that the PostgreSQL table is not created; if I create it manually, no data is inserted into it, although the job does acknowledge that the table already exists. I also cannot save the Kafka messages to a .json file; it is always empty. Do I have to convert the JSON string to an object first so that I can insert it into PostgreSQL? I've been stuck for days and I hope you can help me out. If you have a better solution for loading these data into the data warehouse (PostgreSQL), I would appreciate it. Below is a brief description of my setup.
Here is the JSON I'm receiving from the Kafka topic, before any extraction:
Another one:
I then put a pipeline in place as follows:
I'm just selecting 2 columns from the JSON for the PostgreSQL table; here is the output in tLogRow:
Here is my configuration for tKafkaInput:
And the tExtractJSONFields configuration is as follows:
I hope you can help me with ideas on how to solve this.
Regards
Hi
I think your problem is in the tPostgreSQL component.
By default it commits and inserts data in batches of 10000 rows; you need to change this in the Advanced settings tab of the component so that it inserts and commits each row (see the sketch below).
In the original solution, the job was deliberately split into 2 sub-jobs to make sure each sub-job finishes its work before the next iteration starts. The same issue could also occur with saving to a file (not sure).
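For what it's worth, this is roughly what the "Commit every" setting boils down to in plain JDBC. A sketch with a made-up services table and connection details, not the code Talend generates:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CommitEveryDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/warehouse", "user", "password")) {
            conn.setAutoCommit(false);
            int commitEvery = 1; // the Advanced-tab value: 10000 by default, 1 here
            int pending = 0;
            String[] rows = { "PET", "SWI", "TEN" }; // stand-in for the incoming Kafka flow
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO services (service_code) VALUES (?)")) {
                for (String code : rows) {
                    ps.setString(1, code);
                    ps.executeUpdate();
                    if (++pending >= commitEvery) {
                        conn.commit(); // with commitEvery = 1, every row is flushed immediately
                        pending = 0;
                    }
                }
            }
            conn.commit(); // flush any remainder
        }
    }
}
```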
Thanks for the response.
The problem now is that the "Commit every" setting cannot be empty.
What prevents you from using 1? 🙂
Thank you Sir, it worked! I have also noted that in the Kafka configuration the timeout precision is -1 by default.
Now, do you know how I can convert this JSON string in Talend OS for BD
so that it looks like the one below, without the backslashes?
Regards
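A note on the backslash question above: if the backslashes appear because the message is a JSON string that itself contains JSON (i.e. double-serialized), parsing it once removes them. A minimal Jackson sketch, assuming that is indeed the cause; the sample value is made up:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class UnescapeJson {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        // What arrives from Kafka: a JSON string whose content is escaped JSON.
        String escaped = "\"{\\\"Service_Code\\\":\\\"PET\\\"}\"";

        // First parse: unwraps the outer string, removing the backslashes.
        String inner = mapper.readValue(escaped, String.class);

        // Second parse: the inner text is now a regular JSON document.
        JsonNode node = mapper.readTree(inner);
        System.out.println(node.get("Service_Code").asText()); // PET
    }
}
```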