Dear friends,
I have a problem importing data from a flat file into MongoDB using tMongoDBOutput, tExtractJSONFields and tWriteJSONField, as described in the scenario below:
Ref: the example comes from the internet; you can find it by searching for some of the words in my sample data set. Sorry, I cannot attach a URL.
Here is a simple example of how to read a CSV file and convert the data into JSON format.
CSV data (original data):
WDCi,Lean,KL
WDCi,Kai Herng,KL
WDCi,Walter,Sydney
WDCi,Deborah,Sydney
WDCi,Terry,US
FPT,Minh-Hieu,Hanoi
FPT,Anthony,Paris
FPT,Luis,Paris
JSON data (wanted result) in MongoDB:
{
"_id" : ObjectId("554b7fdcb42309e07933f70f"),
"name" : "WDCi",
"locations" : [
{
"location" : "KL",
"staffs" : [ ... ]
},
{
"location" : "Sydney",
"staffs" : [ ... ]
},
{
"location" : "US",
"staffs" : {
"staff" : "Terry"
}
}
]
}
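{
"_id" : ObjectId("554b7fdcb42309e07933f710"),
"name" : "FPT",
"locations" : [
{
"location" : "Hanoi",
"staffs" : {
"staff" : "Minh-Hieu"
}
},
{
"location" : "Paris",
"staffs" : [ ... ]
}
]
}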
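For illustration only, the grouping behind the wanted result can be sketched in plain Python, outside Talend. This is a minimal sketch, not what the Talend components do internally: the function name `group_rows` is hypothetical, the column order (company, staff, location) is read off the sample CSV, and the shape of the entries inside each "staffs" array is an assumption. The sketch always writes "staffs" as an array of {"staff": ...} objects, even for a single staff member, and it omits "_id" because MongoDB generates that on insert.

```python
import csv
import io
import json

# Sample rows from the post; assumed column order: company, staff, location.
CSV_TEXT = """\
WDCi,Lean,KL
WDCi,Kai Herng,KL
WDCi,Walter,Sydney
WDCi,Deborah,Sydney
WDCi,Terry,US
FPT,Minh-Hieu,Hanoi
FPT,Anthony,Paris
FPT,Luis,Paris
"""


def group_rows(csv_text):
    """Group (company, staff, location) rows into one document per company."""
    # company name -> {location -> [staff names]}; dicts keep insertion order.
    companies = {}
    for company, staff, location in csv.reader(io.StringIO(csv_text)):
        companies.setdefault(company, {}).setdefault(location, []).append(staff)

    documents = []
    for name, locations in companies.items():
        documents.append({
            "name": name,
            # "locations" is a plain array directly under the company document.
            "locations": [
                # Assumption: every staff entry is wrapped as {"staff": name}.
                {"location": loc, "staffs": [{"staff": s} for s in staff_names]}
                for loc, staff_names in locations.items()
            ],
        })
    return documents


documents = group_rows(CSV_TEXT)
print(json.dumps(documents, indent=2))
```

The point of the sketch is only the target shape: one document per company, with "locations" holding the array itself rather than another object.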
Firstly, I will use a tFileInputDelimited component to read the CSV file:
http://blog.wdcigroup.net/wp-content/uploads/2012/07/fileschema.png
Figure 1.1
Then, I will drop the tWriteJSONField component from the palette and link the output row of the tFileInputDelimited component to it. After that, I define the schema for the tWriteJSONField component. In this scenario, I create one column named “company” and set it as the output column of the component. The output row of the tWriteJSONField component is linked to a tExtractJSONFields component, so that I can insert the extracted fields into MongoDB later.
The result I see in MongoDB is:
{
"_id" : ObjectId("554b7fdcb42309e07933f70f"),
"name" : "WDCi",
"locations" : {
"locations" : [
{
"location" : "KL",
"staffs" : [ ... ]
},
{
"location" : "Sydney",
"staffs" : [ ... ]
},
{
"location" : "US",
"staffs" : {
"staff" : "Terry"
}
}
]
}
}
/* 2 */
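{
"_id" : ObjectId("554b7fdcb42309e07933f710"),
"name" : "FPT",
"locations" : {
"locations" : [
{
"location" : "Hanoi",
"staffs" : {
"staff" : "Minh-Hieu"
}
},
{
"location" : "Paris",
"staffs" : [ ... ]
}
]
}
}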
The problem is: the array of embedded location documents is wrapped inside an extra "locations" element. How can I get the wanted result?
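To make the difference between the two shapes concrete, here is a minimal Python sketch of the transformation that is missing. It is not Talend configuration and does not describe how any Talend component works; the function name `unwrap_locations` and the small sample document are hypothetical, used only for illustration.

```python
def unwrap_locations(doc):
    """Return a copy of doc with {"locations": {"locations": [...]}} flattened
    to {"locations": [...]}; other documents are returned unchanged."""
    fixed = dict(doc)  # shallow copy, so the input document is not modified
    inner = fixed.get("locations")
    if isinstance(inner, dict) and isinstance(inner.get("locations"), list):
        fixed["locations"] = inner["locations"]
    return fixed


# Hypothetical, shortened version of the document currently stored in MongoDB.
current = {
    "name": "WDCi",
    "locations": {
        "locations": [
            {"location": "US", "staffs": {"staff": "Terry"}},
        ]
    },
}

wanted = unwrap_locations(current)
print(wanted)
```

In other words, the value stored under "locations" should be the inner array itself, not the object that contains it.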
Thanks very much.
P.S. I tried unchecking the "Get nodes" box for location in tExtractJSONFields, but then the result in MongoDB was:
{
"_id" : ObjectId("554b7ba4b42394748949a3e3"),
"name" : "WDCi",
"locations" : ""
}
{
"_id" : ObjectId("554b7ba4b42394748949a3e4"),
"name" : "FPT",
"locations" : ""
}