Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
sergsyb
Contributor III
Contributor III

Parse json at Qlik Replicate level

Hi guru,

There is following environment:

MongoDB is a Source and Sybase IQ is a Target. All data in MongoDB store in json format.

Is it possible at QR level, in some way, to parse incoming  json LOB and transfer  already parsed values instead original LOB (as it doing by default) to the target end point?

 

For example,  following LOB is incoming from the Source to the Qlik Replicate

{"employees":[ 

  {"name":"Shyam", "email":shyamjaiswal@gmail.com}, 

  {"name":"Bob", "email":bob32@gmail.com}, 

  {"name":"Jai", "email":jai87@gmail.com} 

]} 

 

We need to parse it at QR level (parser we will develop ourselves) and send to the Target (Sybase IQ) following cmds:

Insert into employees values ("Shyam", "shyamjaiswal@gmail.com")

Insert into employees values ("Bob ", " bob32@gmail.com")

Insert into employees values ("Jai", " jai87@gmail.com")

 

Labels (3)
1 Solution

Accepted Solutions
Heinvandenheuvel
Specialist II
Specialist II

I largely agree with @john_wang 

Your designers/developers chose MongoDb? Well, It's tempting to say - "You've made your bed, now lie in it ".

I'm not a fan. Another saying that comes to mind is 'pay me now, or pay me later'. - You will pay at some point.

Still, that doesn't help you here and now. As John indicates the only option that Replicate would appear to offer is 'UDF' - User Defined Functions. However those do not deal with CLOB arguments and even if they did, then  the user-written json parser behind the udf would have to be called over and over, once for every column trying to be exposed.

I suspect the solution is best designed to run on the target DB having Replicate dump the inserts/changes in a staging table with trigger or periodic sweeper procedure.

If one were to insist on using Replicate to somehow deliver JSON (or XML) document based source cleanly to a columns based traditional SQL RDB then I might try to explore a solution with FILE based target/source.

Can you get the MongoDB changes delivered in a flat file? Could you create a first stage Replicate task to grab the MongoDB input and dump into a flat file? With Flat file as source you can request/define "File preprocessing command" (RTFM - 8.6 Using a file as a source)

Such user provided script (any language/script/tool vs C coded needed for UDF), could parse each row into distinct columns in a well defined CSV arrangement. That output could in turn become input for a Flat File to DB task. I'm just dreaming this up. I'm not sure how practical it all might be.

fwiw,

Hein.

 

View solution in original post

4 Replies
john_wang
Support
Support

Hello @sergsyb ,

Really an interesting topic. In my opinion the parser should not be implemented within Replicate as:

1- MongoDB is no schema, it's hard to guarantee the valid RDBMS SQL produced (for example any column values maybe is NULL or any column maybe missed in a row etc);

2- I'm afraid the performance even if you can develop a parser by using the User-defined transformations especially while the JSON is complex and big

So far the common approach is that, set Store Changes mode in task setting and store the LOB (in fact it's the data come from MongoDB) in Sybase IQ, parse the JSON/LOB while consuming the datas in your apps/parser programs. In this way the parser stage is out of replication process and it will not impact the replication performance, and any data from MongoDB will not generate invalid SQL issues, it's much easier to control in your programs. 

Hope this helps.

Regards,

John.

 

 

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
Heinvandenheuvel
Specialist II
Specialist II

I largely agree with @john_wang 

Your designers/developers chose MongoDb? Well, It's tempting to say - "You've made your bed, now lie in it ".

I'm not a fan. Another saying that comes to mind is 'pay me now, or pay me later'. - You will pay at some point.

Still, that doesn't help you here and now. As John indicates the only option that Replicate would appear to offer is 'UDF' - User Defined Functions. However those do not deal with CLOB arguments and even if they did, then  the user-written json parser behind the udf would have to be called over and over, once for every column trying to be exposed.

I suspect the solution is best designed to run on the target DB having Replicate dump the inserts/changes in a staging table with trigger or periodic sweeper procedure.

If one were to insist on using Replicate to somehow deliver JSON (or XML) document based source cleanly to a columns based traditional SQL RDB then I might try to explore a solution with FILE based target/source.

Can you get the MongoDB changes delivered in a flat file? Could you create a first stage Replicate task to grab the MongoDB input and dump into a flat file? With Flat file as source you can request/define "File preprocessing command" (RTFM - 8.6 Using a file as a source)

Such user provided script (any language/script/tool vs C coded needed for UDF), could parse each row into distinct columns in a well defined CSV arrangement. That output could in turn become input for a Flat File to DB task. I'm just dreaming this up. I'm not sure how practical it all might be.

fwiw,

Hein.

 

sergsyb
Contributor III
Contributor III
Author

Thanks John_wang and Heinvandenheuvel for your thoughts on this not trivial qustion. Indeead we can not use User-defined transformations because LOB does not support transformations in QR.
But we will review aproach suggested by Heinvandenheuvel (use FILE) with followed processing .

Thanks again.

Michael_Litz
Support
Support

Hi,

I agree with Hein and John and this would be a performance hit for the task

You can in fact do various transformations on a LOB field if you do a source lookup for the field value it would come back as a string.

Please check this article out: Transformation: Source Lookup - Oracle ROWID 5/4
https://community.qlik.com/t5/Knowledge/Transformation-Source-Lookup-Oracle-ROWID/ta-p/1805170

You would need to disregard the original LOB field and create a new field in the target and then for it's value,  use the source lookup and then you can do string functions to parse the results.

Thanks,
Michael Litz