Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I'm currently in the process of doing an ETL process and during it i've encountered an obstacle that i can't quite find a solution for it. So basically my datasets are from a .csv file and im loading them into my datawarehouse, my fact table has as its composite primary key "schoolID" "courseID" "locationID" and as facts "totalMale" "totalFemale" and "total". However on the .csv file i have 2 rows that have the same primary key but different values for "genre" and "totalRegistered", I would like to unite those 2 rows into a single one that has both totalMale and totalFemale as shown in the picture bellow. I have no idea how to do it right now as I'm quite new to the software, any suggestions?
Separate M from F using tFilterRow then compute (sum) total per genre with tAggregateRow giving 2 linked tHashOutput.
Join both tHash rows using a tMap with appropriate filter on genre for each and sum gendered total giving gran total.
Looks like this:
tAggregate are identicals:
And tMap is like this:
Here is the final result:
[statistics] connecting to socket on port 3832 [statistics] connected .--------+--------+----------+---------+-----------+-----. | tLogRow_14 | |=-------+--------+----------+---------+-----------+----=| |schoolId|courseId|locationId|totalMale|totalFemale|total| |=-------+--------+----------+---------+-----------+----=| |1 |2 |3 |40 |30 |70 | '--------+--------+----------+---------+-----------+-----' [statistics] disconnected
Separate M from F using tFilterRow then compute (sum) total per genre with tAggregateRow giving 2 linked tHashOutput.
Join both tHash rows using a tMap with appropriate filter on genre for each and sum gendered total giving gran total.
Looks like this:
tAggregate are identicals:
And tMap is like this:
Here is the final result:
[statistics] connecting to socket on port 3832 [statistics] connected .--------+--------+----------+---------+-----------+-----. | tLogRow_14 | |=-------+--------+----------+---------+-----------+----=| |schoolId|courseId|locationId|totalMale|totalFemale|total| |=-------+--------+----------+---------+-----------+----=| |1 |2 |3 |40 |30 |70 | '--------+--------+----------+---------+-----------+-----' [statistics] disconnected