czuriaga
Contributor

Read list from MongoDB collection

Hi, I have a MongoDB collection of users, each with their Twitter follower counts recorded over time:

> db.persons.find().pretty()
{
    "_id" : ObjectId("5665a4b75d25764fa615ac5d"),
    "name" : "Phillip Parker",
    "screen_name" : "PhilP",
    "social_networks" : {
        "twitter" : {
            "followers" : [
                {
                    "date" : ISODate("2015-10-17T12:00:47Z"),
                    "value" : 125
                },
                {
                    "date" : ISODate("2015-10-19T05:17:51Z"),
                    "value" : 129
                },
                {
                    "date" : ISODate("2015-11-01T17:20:22Z"),
                    "value" : 135
                },
                {
                    "date" : ISODate("2015-11-04T14:13:26Z"),
                    "value" : 137
                }
            ]
        }
    }
}


I'm trying to read it using a tMongoDBInput component to get these rows

"Phillip Parker","PhilP","2015-10-17T12:00:47Z",125
"Phillip Parker","PhilP","2015-10-19T05:17:51Z",129
"Phillip Parker","PhilP","2015-11-01T17:20:22Z",135
"Phillip Parker","PhilP","2015-11-04T14:13:26Z",137


How should I configure tMongoDBInput (or the Talend job) to get the data in this format?

PS: Once the data is in this format, the goal is to derive a month field in a tMap from the date field (yyyy-mm, or a month relative to another date), then aggregate by user and month, keeping the max() follower count of each month. For this example, that means generating the following result (2 documents) in a new collection:

> db.persons_aggr.find().pretty()
{
    "_id" : ObjectId("5665a8df5d25764fa615ac5e"),
    "name" : "Phillip Parker",
    "screen_name" : "PhilP",
    "month" : "2015-10",
    "social_networks" : {
        "twitter" : {
            "followers" : 129
        }
    }
}
{
    "_id" : ObjectId("5665a8f05d25764fa615ac5f"),
    "name" : "Phillip Parker",
    "screen_name" : "PhilP",
    "month" : "2015-11",
    "social_networks" : {
        "twitter" : {
            "followers" : 137
        }
    }
}
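
To make the intended transformation concrete, here is a minimal sketch in plain JavaScript, run outside Talend and MongoDB. It is not a tMongoDBInput or tMap configuration; the function name monthlyMaxFollowers and the in-memory persons array are hypothetical, and the month is taken in UTC on the assumption that this matches the ISODate values above.

```javascript
// Hypothetical helper: flatten each follower sample into one row,
// then keep the highest value per (screen_name, month).
function monthlyMaxFollowers(persons) {
  const best = new Map();
  for (const person of persons) {
    const samples = person.social_networks.twitter.followers;
    for (const sample of samples) {
      // "yyyy-mm" in UTC (assumption: dates are stored as UTC ISODate values)
      const month = sample.date.toISOString().slice(0, 7);
      const key = person.screen_name + "|" + month;
      const current = best.get(key);
      if (current === undefined ||
          sample.value > current.social_networks.twitter.followers) {
        best.set(key, {
          name: person.name,
          screen_name: person.screen_name,
          month: month,
          social_networks: { twitter: { followers: sample.value } },
        });
      }
    }
  }
  return Array.from(best.values());
}

// Sample document copied from the collection shown in the question.
const persons = [{
  name: "Phillip Parker",
  screen_name: "PhilP",
  social_networks: { twitter: { followers: [
    { date: new Date("2015-10-17T12:00:47Z"), value: 125 },
    { date: new Date("2015-10-19T05:17:51Z"), value: 129 },
    { date: new Date("2015-11-01T17:20:22Z"), value: 135 },
    { date: new Date("2015-11-04T14:13:26Z"), value: 137 },
  ] } },
}];

const result = monthlyMaxFollowers(persons);
console.log(JSON.stringify(result, null, 2));
```

For the sample document this yields two entries, month "2015-10" with 129 followers and month "2015-11" with 137, which is the shape of the persons_aggr documents above without the _id field.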
1 Reply
czuriaga
Contributor
Author

By running an aggregate command in the Mongo console, I get the dataset I was looking for:

> db.persons.aggregate( [ 
  { $unwind: '$social_networks.twitter.followers' },
  { $project : { _id: 0,
                 name: '$name',
                 screen_name:'$screen_name',
                 date:'$social_networks.twitter.followers.date',
                 value:'$social_networks.twitter.followers.value' } } ] )

{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-17T12:00:47Z"), "value" : 125 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-19T05:17:51Z"), "value" : 129 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-01T17:20:22Z"), "value" : 135 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-04T14:13:26Z"), "value" : 137 }

But this statement only works in the Mongo console. tMongoDBInput only executes find(), and tMongoDBRow doesn't produce an output flow that can be processed.
If I use the same statement with $out:
> db.persons.aggregate( [ 
  { $unwind: '$social_networks.twitter.followers' },
  { $project : { _id: 0,
                 name: '$name',
                 screen_name:'$screen_name',
                 date:'$social_networks.twitter.followers.date',
                 value:'$social_networks.twitter.followers.value' } },
  {$out:"persons_aggr"} ] )

> db.persons_aggr.find({},{_id:0})
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-17T12:00:47Z"), "value" : 125 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-19T05:17:51Z"), "value" : 129 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-01T17:20:22Z"), "value" : 135 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-04T14:13:26Z"), "value" : 137 }

In the Mongo console this works fine and a new collection is created, but again, the statement doesn't work in the tMongoDBRow component.
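
For completeness, the monthly maximum from the original question could probably be computed in the same aggregate command, so that no tMap step is needed once the statement can be executed. The sketch below assumes MongoDB 3.0 or later (for $dateToString) and has only been reasoned through for the Mongo console, not run through Talend; the variable name pipeline is just for illustration, and $out replaces any existing persons_aggr collection.

```javascript
// Sketch of a pipeline that unwinds the samples, groups by user + month,
// keeps the maximum follower count, and reshapes to the target format.
const pipeline = [
  { $unwind: "$social_networks.twitter.followers" },
  { $group: {
      _id: {
        name: "$name",
        screen_name: "$screen_name",
        // "yyyy-mm" string derived from each sample's date
        month: { $dateToString: {
          format: "%Y-%m",
          date: "$social_networks.twitter.followers.date" } }
      },
      followers: { $max: "$social_networks.twitter.followers.value" }
  } },
  { $project: {
      _id: 0,
      name: "$_id.name",
      screen_name: "$_id.screen_name",
      month: "$_id.month",
      social_networks: { twitter: { followers: "$followers" } }
  } },
  { $out: "persons_aggr" }
];

// In the Mongo console: db.persons.aggregate(pipeline)
console.log(pipeline.map(stage => Object.keys(stage)[0]).join(" -> "));
```

Note that $group does not guarantee the order of its output documents, so a $sort stage would be needed before $out if the order matters.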