czuriaga
Contributor

Read list from MongoDB collection

Hi, I have a MongoDB collection of users that stores each user's Twitter follower counts over time:

> db.persons.find().pretty()
{
    "_id" : ObjectId("5665a4b75d25764fa615ac5d"),
    "name" : "Phillip Parker",
    "screen_name" : "PhilP",
    "social_networks" : {
        "twitter" : {
            "followers" : [
                {
                    "date" : ISODate("2015-10-17T12:00:47Z"),
                    "value" : 125
                },
                {
                    "date" : ISODate("2015-10-19T05:17:51Z"),
                    "value" : 129
                },
                {
                    "date" : ISODate("2015-11-01T17:20:22Z"),
                    "value" : 135
                },
                {
                    "date" : ISODate("2015-11-04T14:13:26Z"),
                    "value" : 137
                }
            ]
        }
    }
}


I'm trying to read it with a tMongoDBInput component to get these rows:

"Phillip Parker","PhilP","2015-10-17T12:00:47Z",125
"Phillip Parker","PhilP","2015-10-19T05:17:51Z",129
"Phillip Parker","PhilP","2015-11-01T17:20:22Z",135
"Phillip Parker","PhilP","2015-11-04T14:13:26Z",137


How must I configure tMongoDBInput or Talend to get data in this format?

PS: Once the data is in this format, the goal is to add a month field in a tMap (yyyy-mm, or a month relative to another date) derived from the date field, and then aggregate by user and month, keeping the max() number of followers for each month. That is, to generate this result (2 documents in this example) in a new collection:

> db.persons_aggr.find().pretty()
{
    "_id" : ObjectId("5665a8df5d25764fa615ac5e"),
    "name" : "Phillip Parker",
    "screen_name" : "PhilP",
    "month" : "2015-10",
    "social_networks" : {
        "twitter" : {
            "followers" : 129
        }
    }
}
{
    "_id" : ObjectId("5665a8f05d25764fa615ac5f"),
    "name" : "Phillip Parker",
    "screen_name" : "PhilP",
    "month" : "2015-11",
    "social_networks" : {
        "twitter" : {
            "followers" : 137
        }
    }
}
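
To make the target aggregation concrete, here is a minimal plain JavaScript sketch of the month + max() step, working on the four flattened rows listed earlier. It does not use Talend or MongoDB; `maxFollowersByMonth` is a hypothetical helper name, and the month is taken in UTC, which is an assumption.

```javascript
// The four flattened rows from the question (dates kept as ISO 8601 strings).
var rows = [
  { name: 'Phillip Parker', screen_name: 'PhilP', date: '2015-10-17T12:00:47Z', value: 125 },
  { name: 'Phillip Parker', screen_name: 'PhilP', date: '2015-10-19T05:17:51Z', value: 129 },
  { name: 'Phillip Parker', screen_name: 'PhilP', date: '2015-11-01T17:20:22Z', value: 135 },
  { name: 'Phillip Parker', screen_name: 'PhilP', date: '2015-11-04T14:13:26Z', value: 137 }
];

// Hypothetical helper: keep the highest follower count per user and month.
function maxFollowersByMonth(inputRows) {
  var best = {};
  inputRows.forEach(function (row) {
    // "yyyy-mm" prefix of the UTC timestamp, e.g. "2015-10".
    var month = new Date(row.date).toISOString().slice(0, 7);
    var key = row.screen_name + '|' + month;
    var current = best[key];
    if (!current || row.value > current.social_networks.twitter.followers) {
      best[key] = {
        name: row.name,
        screen_name: row.screen_name,
        month: month,
        social_networks: { twitter: { followers: row.value } }
      };
    }
  });
  // One result object per user + month combination.
  return Object.keys(best).map(function (key) { return best[key]; });
}

var monthly = maxFollowersByMonth(rows);
```

With the sample rows, `monthly` holds two objects: month "2015-10" with 129 followers and month "2015-11" with 137, matching the documents above apart from `_id`.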
1 Reply
czuriaga
Contributor
Author

By executing an aggregate command, I get the dataset I was looking for:

> db.persons.aggregate( [ 
  { $unwind: '$social_networks.twitter.followers' },
  { $project : { _id: 0,
                 name: '$name',
                 screen_name:'$screen_name',
                 date:'$social_networks.twitter.followers.date',
                 value:'$social_networks.twitter.followers.value' } } ] )

{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-17T12:00:47Z"), "value" : 125 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-19T05:17:51Z"), "value" : 129 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-01T17:20:22Z"), "value" : 135 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-04T14:13:26Z"), "value" : 137 }
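
For readers less familiar with the pipeline, the effect of the $unwind and $project stages on one document can be sketched in plain JavaScript. This is only an illustration of the reshaping, not how MongoDB implements it; `flattenFollowers` is a hypothetical helper name.

```javascript
// One document shaped like those in db.persons (without _id).
var person = {
  name: 'Phillip Parker',
  screen_name: 'PhilP',
  social_networks: {
    twitter: {
      followers: [
        { date: new Date('2015-10-17T12:00:47Z'), value: 125 },
        { date: new Date('2015-10-19T05:17:51Z'), value: 129 },
        { date: new Date('2015-11-01T17:20:22Z'), value: 135 },
        { date: new Date('2015-11-04T14:13:26Z'), value: 137 }
      ]
    }
  }
};

// Hypothetical helper mirroring $unwind + $project:
// emit one flat row per element of the followers array.
function flattenFollowers(doc) {
  return doc.social_networks.twitter.followers.map(function (entry) {
    return {
      name: doc.name,
      screen_name: doc.screen_name,
      date: entry.date,
      value: entry.value
    };
  });
}

var flat = flattenFollowers(person);
```

Each array element becomes its own row, with the user fields repeated on every row.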

But this statement only works in the Mongo console. tMongoDBInput only executes find(), and tMongoDBRow doesn't produce an output flow that can be processed.
If I use the same statement with $out:
> db.persons.aggregate( [ 
  { $unwind: '$social_networks.twitter.followers' },
  { $project : { _id: 0,
                 name: '$name',
                 screen_name:'$screen_name',
                 date:'$social_networks.twitter.followers.date',
                 value:'$social_networks.twitter.followers.value' } },
  {$out:"persons_aggr"} ] )

> db.persons_aggr.find({},{_id:0})
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-17T12:00:47Z"), "value" : 125 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-10-19T05:17:51Z"), "value" : 129 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-01T17:20:22Z"), "value" : 135 }
{ "name" : "Phillip Parker", "screen_name" : "PhilP", "date" : ISODate("2015-11-04T14:13:26Z"), "value" : 137 }

In the Mongo console this works fine and a new collection is created, but again, this statement doesn't work in the tMongoDBRow component.
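
As a side note, if the aggregation can be run from the Mongo console anyway, the monthly maximum from the original question could probably be computed in the same pipeline instead of in a tMap. The following is an untested sketch: it assumes a MongoDB version that supports $dateToString (3.0 or later), takes the month in UTC, and uses a hypothetical variable name `monthlyMaxPipeline`. It does not address the Talend limitation described above.

```javascript
// Sketch of a pipeline that groups by user and month and keeps the max value.
var monthlyMaxPipeline = [
  // One document per element of the followers array.
  { $unwind: '$social_networks.twitter.followers' },
  // Group by user and "yyyy-mm" month, keeping the highest follower count.
  { $group: {
      _id: {
        name: '$name',
        screen_name: '$screen_name',
        month: { $dateToString: {
          format: '%Y-%m',
          date: '$social_networks.twitter.followers.date'
        } }
      },
      followers: { $max: '$social_networks.twitter.followers.value' }
  } },
  // Reshape to the target document layout.
  { $project: {
      _id: 0,
      name: '$_id.name',
      screen_name: '$_id.screen_name',
      month: '$_id.month',
      social_networks: { twitter: { followers: '$followers' } }
  } },
  // Write the result to persons_aggr, replacing any existing collection of that name.
  { $out: 'persons_aggr' }
];

// In the Mongo console:
// db.persons.aggregate(monthlyMaxPipeline)
```

Note that $out replaces the contents of persons_aggr on each run, so the target collection name should be chosen with care.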