Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I ran into a funny situation. Do you know why the result between the two formula is different?
Following is the script:
OracleTable:
Load
select
Col_A, Col_B,Col_C, Number from Oracle Table;
RangeTable:
Load * inline
Start, End, Group
1, 10, A
11, 20 B
21, 30 C
left Join
IntervalMatch(Number)
Load:
Start, End
Resident RangeTable;
Suppose there are 20000 rows returned from Oracle and all rows from the Oracle Table have successfully matched to the Range Table. I have a different results between the following two different expressions from the front-end.
Expression 1
Count(Col_A) -- A column that is compulsory from the oracle table.
Result: 20000 -- This is correct
Expression 2
Count(Group) -- A column from the link table.
Result : 180 -- This is crazy and is unexpected.
As mentioned above, all rows from the Oracle Table are successfully matched to the range table, I am expecting the expression 2 should also return me 20000, but it only returned me 180.
Do you have any reason for this happened?
Thanks in advance!
Yes, because the LEFT JOIN joins a larger table of discrete Number/Start/End combinations to your RangeTable. You can inspect this larger table if you remove/comment the LEFT JOIN prefix and open the Table Viewer.
Indeed you will now find a synthetic key that is connecting fields from this larger table to the counterparts in other tables (new table for all of them, Oracle Table for Number, RangeTable for the others). This synthetic key is not dangerous, and as long as your combinations aren't too many (for example while developing your document) you can leave it there. It will not negatively influence associativity.
Based on your code (almost exact copy), I did two tests. When my OracleTable contains three discrete values in field Number, I get this:
If on the other hand I enter 5 discrete values (same number of rows) in my OracleTable, I get this:
If your OrcleTable has a Number field with 180 different values, you'll get a RangeTable with 180 entries after the JOIN. You want to keep only 3 entries in RangeTable? Then omit the LEFT JOIN and let the new table + synthetic key
take care of all combinations of Number and Start/End range limits.
Best,
Peter
Are there 180 rows in the Range table? Because that is what you are counting with the second expression - not the number of rows in the Oracle table.
Hi Jonathan, the range table only have three rows. And because of intervalmatch (number), I am able to match the range table against all rows from the Oracle table using the number field from oracle. But I just don't know why when I count the number of tows that have a value in the group column, it just returned me 180 while should be 20k something.
Hi Elim, I think Jonathan refers to the result table after loading. In other words: there are 180 different values for Number field?
What if you remove 'left Join' ?
The range table cannot have just 3 rows after you perform the INTERVALMATCH, because you are LEFT JOINing the synthetic key onto the range table itself. The Oracle field Number probably has 180 distinct values. This results in the large number of Group values for all distinct rows in Range. The latter now has a layout like:
Number, Start, End, Group
If I remove the left join, I will have a synthetic key. Therefore, removing a left join is not an option.
Peter, I don't have synthetic key.
I think Peter, Jonathan and Ruben are right. If we read your script correctly, you should have (after running the script) two tables, OracleTable and RangeTable, both linked by Number field. You should have a record in RangeTable per distinct Number.
So if you count a field in one table, then count a field in the other table, why should the result be the same?
Or in other words, you asked: "Count from a column joined by intervalmatch give different value from counting from a normal column"
But intervalmatch itself hasn't joined a column to the table with the 'normal column', there are still two tables, right?
I think the only thing missing is a final JOIN of the RangeTable to the OracleTable:
OracleTable:
Load
select
Col_A, Col_B,Col_C, Number from Oracle Table;
RangeTable:
Load * inline
Start, End, Group
1, 10, A
11, 20 B
21, 30 C
left Join
IntervalMatch(Number)
Load:
Start, End
Resident RangeTable
LEFT JOIN (OracleTable)
LOAD Number, Group RESIDENT RangeTable;
DROP TABLE RangeTable;
Now you should have only one table and the count of both fields should be the same.
Yes, I meant dropping the unnecessary RangeTable too which is actually causing a Composite Key - join. Rather join the interval-matched table with the source table. Stefan has already posted the piece of code below for the same.