Hello guys,
I've been developing in QlikView for the last 8 months on and off (doing SQL and Java as well),
and I've stumbled upon a major problem.
My company deals with user data for cellular companies.
We have a subscriber base of 10M+ subscribers for the medium-sized companies,
and for each subscriber we retain usage data (SMS, minutes of use, spending...) for the last 90 days,
so far that's 10M subscribers x 90 days = ~900M rows.
Each subscriber can participate in ~700 marketing activities (MAs), and can be invited to each activity several times during the 90 days.
Using this data,
I need to calculate and present, in various charts and tables, the usage for each subscriber, per marketing activity, relative to the distance of the usage date from the invitation date.
The working hypothesis is that each subscriber can enter each MA 15 times at most,
so at worst we are talking about ~15 BILLION records.
The requirement is for this granularity of data: (subscriber, MA, days from invitation).
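To make that grain concrete, here is a minimal sketch of what I mean - the QVD and field names (Invitations.qvd, Usage.qvd, SubscriberID, ActivityID, InviteDate, UsageDate, SMSCount, MinutesOfUse, Spend) are placeholders, not our real ones:

Invitations:
LOAD SubscriberID,
     ActivityID,
     InviteDate
FROM Invitations.qvd (qvd);

// Attach every usage row to that subscriber's invitations
// (this is where the fan-out to billions of rows happens).
LEFT JOIN (Invitations)
LOAD SubscriberID,
     UsageDate,
     SMSCount,
     MinutesOfUse,
     Spend
FROM Usage.qvd (qvd);

// Collapse to the required grain: (subscriber, MA, days from invitation).
// Dates are numeric duals, so a plain subtraction gives the day distance.
FactUsage:
LOAD SubscriberID,
     ActivityID,
     UsageDate - InviteDate AS DaysFromInvite,
     Sum(SMSCount)          AS SMSCount,
     Sum(MinutesOfUse)      AS MinutesOfUse,
     Sum(Spend)             AS Spend
RESIDENT Invitations
GROUP BY SubscriberID, ActivityID, UsageDate - InviteDate;

DROP TABLE Invitations;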
All the data model approaches I have tested (generic loads, crosstables, etc.) have had little success;
the data is just huge.
I want to start a discussion on how to design the data model for the best possible performance.
Should I use one fact crosstable of some design, break it down into dimensions and a fact table, use key/link tables, try hashing the keys, use ApplyMap... I don't know. A rough sketch of the dims-and-fact option follows below.
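For example, the dims-and-fact idea with a MAPPING LOAD / ApplyMap instead of a join to denormalise one attribute into the fact - all table, file and field names here are invented for illustration:

// Small mapping table: subscriber -> segment, kept out of the data model itself.
SegmentMap:
MAPPING LOAD SubscriberID,
             SubscriberSegment
FROM Subscribers.qvd (qvd);

// Dimension table, linked to the fact by ActivityID.
ActivityDim:
LOAD DISTINCT ActivityID,
              ActivityName,
              CampaignType
FROM Activities.qvd (qvd);

// One narrow fact table at the (subscriber, MA, days from invitation) grain.
FactUsage:
LOAD SubscriberID,
     ActivityID,
     DaysFromInvite,
     ApplyMap('SegmentMap', SubscriberID, 'Unknown') AS SubscriberSegment,
     SMSCount,
     MinutesOfUse,
     Spend
FROM FactUsage.qvd (qvd);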
I need to be able to load and process the data in ~3 hours and then present it. QlikView table objects take forever to render that much data; charts behave better.
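On the reload window, one pattern I could combine with whatever model we pick is an incremental load, so only new usage rows are pulled from the database each run and the rest comes from a QVD. A rough sketch - the database connection, the vLastReloadDate variable and all table/field names are assumptions:

// An ODBC/OLE DB CONNECT to the source database is assumed to have run already.

// 1) Pull only the new rows from the database.
Usage:
SQL SELECT SubscriberID, UsageDate, SMSCount, MinutesOfUse, Spend
FROM dbo.SubscriberUsage
WHERE UsageDate > '$(vLastReloadDate)';

// 2) Append the history already stored in a QVD, trimmed to the rolling 90 days.
CONCATENATE (Usage)
LOAD SubscriberID, UsageDate, SMSCount, MinutesOfUse, Spend
FROM UsageHistory.qvd (qvd)
WHERE UsageDate >= Today() - 90;

// 3) Write the refreshed history back for the next run.
STORE Usage INTO UsageHistory.qvd (qvd);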
Please help; this is as much a data modeling problem as it is a technical challenge.
Thank you in advance,
Matan.
lol
Big data, baby - it's the future; if you're not there, you're gone!