Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
Can you help me please,
I receive each month two files xlsx (fileA, FileB) and I want to store in one QVD file:
1) the intersection of the two files (using unique ID)
2) the records of fileB not present in fileA
like this :
fileA |
|
|
ID | MR |
|
1 | 1 |
|
2 | 4 |
|
3 | 8 |
|
|
|
|
fileB |
|
|
ID | MR |
|
1 | 2 |
|
2 | 5 |
|
4 | 15 |
|
|
|
|
file result in one QVD? |
| |
ID | MR | data source from |
1 | 1 | fileA |
2 | 4 | fileA |
4 | 15 | fileB |
thanks for your help,
Thierry
Sure it is. Change this:
Final:
NOCONCATENATE LOAD *
RESIDENT FileA
WHERE Exists(IDX,ID);
CONCATENATE (Final)
LOAD IDX as ID, MR, Amount, FilenameB, FiledateB
RESIDENT FileB
WHERE not Exists(ID,IDX);
DROP TABLES FileA, FileB;
store Final into Final.qvd (QVD)
to
Final:
NOCONCATENATE LOAD *
RESIDENT FileA
WHERE Exists(IDX,ID);
CONCATENATE (Final)
LOAD IDX as ID, MR, Amount, FilenameB as Filename, FiledateB as Filedate
RESIDENT FileB
WHERE not Exists(ID,IDX);
DROP TABLES FileA, FileB;
store Final into Final.qvd (QVD)
Hi thiery,
the issue of JOINs is always the same, regardless of the exact scripting_language or whatever you use:
=> The best way to visualize it IMHO is two circles that have an intersection.
=> If you want just that intersection (present in circle_A AND in circle_B) -> inner_join (standard in QV)
=> If you want the part of circle_A outside the intersection (present in A, BUT NOT in B) -> left_outer/ left
=> If you want the part of circle_B outside the intersection (present in B, BUT NOT in A) -> right_outer/ right
=> If you want the parts of both circles, but not the intersection (present EITHER in A or in B) -> full_outer
HTH
Best regards,
DataNibbler
This should do the trick:
FileA:
LOAD *, 'File A' as [Data Source] INLINE [
ID,MR
1,1
2,4
3,8
];
FileB:
LOAD ID as IDB, MR, 'File B' as [Data Source] INLINE [
ID,MR
1,2
2,5
4,15
];
Final:
NOCONCATENATE LOAD *
RESIDENT FileA
WHERE Exists(IDB,ID);
CONCATENATE (Final)
LOAD IDB as ID, MR, [Data Source]
RESIDENT FileB
WHERE not Exists(ID,IDB);
DROP TABLES FileA, FileB;
Example file is also attached.
DataNibbler said it all. Make sure to use different table names for each and use NoConcatenate to avoid the tables automatically merging again after the load because of the same field names.
Thanks a lot,
1) If I good understand, in the line "LOAD IDB as ID, MR, [datasource]" the datasource can be replaced by any field from fileB ?
2) I tried to write the script. I take in account the name of the two files like date and filename. How can I put the filedateB and filenameB in the field filedate and filename ?
My script :
FileA :
LOAD
filename() as Filename,
DATE(DATE#(left(right(filename(),11),6),'YYYYMM'),'MM/YYYY') as Filedate,
ID,
MR,
Amount
FROM
[201202.xlsx]
(ooxml, embedded labels, table is Sheet1);
store FileA into FileA.qvd (QVD);
FileB :
LOAD
filename() as FilenameB,
DATE(DATE#(left(right(filename(),12),6),'YYYYMM'),'MM/YYYY') as FiledateB,
ID as IDX,
MR,
Amount
FROM
[201202X.xlsx]
(ooxml, embedded labels, table is Sheet1);
store FileB into FileB.qvd (QVD);
Final:
NOCONCATENATE LOAD *
RESIDENT FileA
WHERE Exists(IDX,ID);
CONCATENATE (Final)
LOAD IDX as ID, MR, Amount, FilenameB, FiledateB
RESIDENT FileB
WHERE not Exists(ID,IDX);
DROP TABLES FileA, FileB;
store Final into Final.qvd (QVD)
result :
ID | MR | filedate | filenameA | filedateB | filenameB |
1 | 1 | 02/2012 | 201202.xlsx | ||
2 | 4 | 02/2012 | 201202.xlsx | ||
4 | 15 | 02/2012 | 201202X.xlsx |
is it possible to have in the final :
ID | MR | filedate | filename |
1 | 1 | 02/2012 | 201202.xlsx |
2 | 4 | 02/2012 | 201202.xlsx |
4 | 15 | 02/2012 | 201202X.xlsx |
Sure it is. Change this:
Final:
NOCONCATENATE LOAD *
RESIDENT FileA
WHERE Exists(IDX,ID);
CONCATENATE (Final)
LOAD IDX as ID, MR, Amount, FilenameB, FiledateB
RESIDENT FileB
WHERE not Exists(ID,IDX);
DROP TABLES FileA, FileB;
store Final into Final.qvd (QVD)
to
Final:
NOCONCATENATE LOAD *
RESIDENT FileA
WHERE Exists(IDX,ID);
CONCATENATE (Final)
LOAD IDX as ID, MR, Amount, FilenameB as Filename, FiledateB as Filedate
RESIDENT FileB
WHERE not Exists(ID,IDX);
DROP TABLES FileA, FileB;
store Final into Final.qvd (QVD)
Thanks a lot,
It is perfect.
Thierry