Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I know we use LEFT JOIN for one to one and KEEP for one to many relations
I want to know one real time scenario implementation for LEFT JOIN and one REAL TIME scenario implementation for LEFT KEEP while designing data model
Customers:
LOAD Cust_ID,
First_Name as Cust_First_Name,
Last_Name as Cust_Last_Name
FROM
[..\BASE QVD\Base_Customers.qvd]
(qvd);
Orders:
//loading data from Order Details.Qvd
LOAD Order_ID,
Prod_ID,
Qty,
U.Price
FROM
[..\BASE QVD\Base_Order Details.qvd]
(qvd);
Temp_DimensionsWithoutFacts:
right keep(Customers)
Load
Distinct Cust_ID
Resident Orders;
Drop Table Temp_DimensionsWithoutFacts;
edited sorry : I have given right keep ...
Understanding Join, Keep and Concatenate
The above post demonstrates exactly how each works and is easy to understand
You only use keep we you want to retain matching data from only one table
I need the real time scenario
Script error:
Field not found - <Cust_ID>
Temp_DimensionsWithoutFacts:
RIGHT KEEP (Customers)
Load
Distinct Cust_ID
Resident Orders
Vibhu Sharma wrote:
I know we use LEFT JOIN for one to one and KEEP for one to many relations
I don't think that is always true. I mostly decide to Join or Keep based on the fact if I want to merge the two tables into one or keep them as separate tables. There are times when I have used Left Keep when the relation was One-to-One and I have also left joined when the relation was One-to-Many (although this would be rare).
in which scenario you have merged two tables into one?
Well I guess there is not general rule of thumb here, but one scenario where I have joined in the past is when I way to many tables and I wanted to clear the clutter to see the data model clearly.
Secondly, I have heard that a single fact table performs slightly better than a star schema when the data is large (although I have tested this to be not true all the time), but if you test tells you that a joined table is going to more efficient and you don't mind extra time for joining while reloading, then you know to join the tables.
Again, each scenario is different and its very difficult to generalize. Would love to hear what others have to say here -> johnw, pcammaert
I have a guilty confession - I don't use KEEP. I don't even know the syntax. I guess this is as good a time to learn as any.
...several minutes pass...
OK, that's easy. So why haven't I been using that for the past 10 years?
I guess if I needed a left or right keep, I just loaded my tables in the correct order and did a where exists, which doesn't take any more code, and I'm guessing doesn't take any more time. For inner keep, well, I guess I must just inner join, as I can't remember ever wanting the functionality of an inner keep, even if it seems clever. Heck, I almost never use right or outer join. Just inner and left join. That's almost all I do in practice. I'm not saying that's the right way to go, just that it's the habit I'm in.
As far as when I merge tables into one vs. leaving them separate, it's often as simple as "how many fields from this table do I need?" If it's just one, join or map. If it's lots, leave them separate. Two or three, eh, I'll decide based on if it "feels" like the data deserves a separate table or not.
One thing I don't like in my data models is a whole bunch of tables that just hold a little bit of dimensional data. I hate seeing description tables, say. Like I would never create a Customers table if all that was on it was a Customer ID and a Customer Name. I'd join or map the customer name back onto the main table. But if I had all sorts of customer data - address, customer sales representative, billing preferences, who knows what else - then I would almost certainly create a separate Customers table. I almost always create a separate Calendar table if I'm dealing with dates, but I have no good justification for doing so vs. joining it onto the main table. Maybe just so that I'm always handling dates the same way and there are no surprises.
I've also never dealt with real time data. All of our loads are periodic, and I don't think any load more often than hourly, but a lot load hourly, or at least hourly during work hours.