Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Regarding my script... goal is a single Master file that contains unique schoolIDs, with respective SchoolName and SchoolType.
Thus, to combine/merge two datasets(ds), let's use T and P, where P has two columns and T has 3 columns. Some SchoolID values will be common in both, and some different, hence why we want to combine for a Master SchoolID ds. We want to also pull in the SchoolName and SchoolType for each SchoolID.
For each SchoolID, the SchoolName MUST be the same, but of course it's not... they are similar in several cases and one may have an extra space, or a hyphen, i.e. Smoky Hill HS vs. Smoky Hill High School.
For SchoolType, only T has this field, values are either a text string or Null.
RULE: if there are multiple name values for a given schoolid, 1) keep only a single record and 2) from table "B:" (more current) .
I've tried a day's worth of things and decided it's time to ask the experts... OUTER JOIN, vs. Where not exists vs. MaxString.... please advise.
I really want to say something like this...
Names:
LOAD * from P
OUTER JOIN LOAD * from T
Please help 🙂
Stephanie
Maybe this?
Master:
LOAD
SchoolID
,SchoolName as PSchoolName
FROM P
;
OUTER JOIN (Master)
LOAD
SchoolID
,SchoolName as TSchoolName
,SchoolType
FROM T
;
LEFT JOIN (Master)
LOAD
SchoolID
,if(len(PSchoolName),PSchoolName,TSchoolName) as SchoolName
RESIDENT Master
;
DROP FIELDS
PSchoolName
,TSchoolName
;
Hi John,
thanks for the solution idea. I also came up with a workaround very similar to yours except I left SchoolName as the same column name, and used "MaxString" for SchoolName and then for SchoolType. It was a few extra steps. Your way makes good sense since we know that SchoolName and SchoolType have different values and should not JOIN on them... hence the "len" option.
Thanks for your time and response.
Stephanie Lind
Hello John
I stumble across this post and this is exactly what I was looking for in solving merging two QVDs together with different columns.
I needed to load old legacy ERP Supplier data with new ERP Supplier data and only show the missing Suppliers that were NOT in the new ERP system.
So, below script solved the issue.
Thanks
Les
Master:
LOAD
Text([Supplier Id]) as "Supplier Id",
'R' AS RType
FROM
[..\Data\Vendors\Suppliers.qvd]
(qvd)
;
OUTER JOIN (Master)
LOAD
Text([Supplier Id]) as "Supplier Id"
,Supplier as FSupplier,
[Supplier Full Name] AS FName,
'F' AS FType
FROM
[..\Data\Vendors\FSSuppliers.qvd]
(qvd)
;
FSSupplier:
LOAD
[Supplier Id]
,FSupplier as Supplier
,FName as "Supplier Full Name"
,FType as "Type"
RESIDENT Master
WHERE IsNull(RType) and FType='F'
;
DROP TABLE Master;
Results: FSSupplier only contains Suppliers that were in the old legacy ERP system and NOT in the new ERP system that we purchased.