Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
lucaL
Partner - Contributor II
Partner - Contributor II

Parse json without knowing his schema

Hi all,

I'm wondering if there is a way o simplfy the way I'm imaging i need to do a job.

I will have lots of json, all made with a similar form, that is something like this:

{
	"columns": [
		{
			"name": "user",
			"type": "string"
		},
		{
			"name": "organization",
			"type": "string"
		},
		{
			"name": "site",
			"type": "string"
		},
	],
	"rows": [
		{
			"user": {
				"value": "username1"
			},
			"organization": {
				"value": "org1"
			},
			"site": {
				"value": "site1"
			},
		},
		{
			"user": {
				"value": "username2"
			},
			"organization": {
				"value": "org2"
			},
			"site": {
				"value": "site2"
			}
		}
	]
}

 

So there would be a section with the schema (rows) and a section of the plain data (columns)....

I would like to have a job that parse one of these json file, and write the data on a DB table automatically, without the need to create a job for every JSON with the need to write manually the schemas, the mappings between the schema and the fields and so on... Any ideas from you, talend experts? 🙂

Labels (3)
1 Solution

Accepted Solutions
fcolagiacomo
Contributor III
Contributor III

Hi,

I fixed the job. I left you to improve date converting to send to database; you can use the same strategy of "columnTypes" or you can create more complex data structure.  

View solution in original post

4 Replies
fcolagiacomo
Contributor III
Contributor III

Hi,

I believe I found a solution to your issue.

My solution executes the follow logic steps:

  1. it reads Json file and loads in globalMap object the columns and rows objects;
  2. it reads columns object and produces csv header file and Dynamic Schema file with name and type;
  3. it read rows objects and appends in csv file the data;
  4. it loads Dynamic Schema produced in step 3;
  5. it reads csv data file with dynamic schema and writes into databases.

I attach the job and subjob sample.

    

lucaL
Partner - Contributor II
Partner - Contributor II
Author

Hi,

thanks for your effort... During the weekend I was thinking on a similar solution, but was not able to implement it... I guss the road is correct, actually, trying your project have some problems that "columns" in the json file are switched on the output table on a more complex example as the one below, where username is written on the dateAccess col and the others are switche of one....

 

{
	"columns": [
		{
			"name": "user",
			"type": "string"
		},
		{
			"name": "organization",
			"type": "string"
		},
		{
			"name": "site",
			"type": "string"
		},
		{
			"name": "role",
			"type": "string"
		},
		{
			"name": "dateAccess",
			"type": "date"
		}
	],
	"rows": [
		{
			"user": {
				"value": "username1"
			},
			"organization": {
				"value": "org1"
			},
			"site": {
				"value": "site1"
			},
			"role": {
				"value": "role1"
			},
			"dateAccess": {
				"value": "12/05/2024 12:15:32"
			}
		},
		{
			"user": {
				"value": "username2"
			},
			"organization": {
				"value": "org2"
			},
			"site": {
				"value": "site2"
			},
			"role": {
				"value": "role2"
			},
			"dateAccess": {
				"value": "15/05/2024 15:12:45"
			}
		}
	]
}

 

This is the relative data csv:
user;organization;site;role;dateAccess
site1;role1;org1;12/05/2024 12:15:32;username1
site2;role2;org2;15/05/2024 15:12:45;username2

 

I could also ask developers to slightly change the json, but not sure if that helps... Like this maybe:

 

{
	"columns": [
		{
			"name": "user",
			"type": "string"
		},
		{
			"name": "organization",
			"type": "string"
		},
		{
			"name": "site",
			"type": "string"
		},
		{
			"name": "role",
			"type": "string"
		},
		{
			"name": "dateAccess",
			"type": "date"
		}
	],
	"rows": [
		{
			"user": "username1",
			"organization": "org1",
			"site": "site1",
			"role": "role1",
			"dateAccess": "12/05/2024 12:15:32"
		},
		{
			"user": "username2",
			"organization": "org2",
			"site": "site2",
			"role": "role2",
			"dateAccess": "15/05/2024 15:12:45"
		}
	]
}

 

fcolagiacomo
Contributor III
Contributor III

Hi,

I fixed the job. I left you to improve date converting to send to database; you can use the same strategy of "columnTypes" or you can create more complex data structure.  

lucaL
Partner - Contributor II
Partner - Contributor II
Author

Hi,

thanks for your update, I will look into it! Yesterday I've elaborated also on your solution and ended up wit hthis, by sligtly modifying the input json as attached...

Now I'm looking also to see if there's a way to save a dynamic schema on a file by making it auto contructing from a db table, that would be another solution to this automatism.

 

Thanks,

Luca