Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
hopkinsc
Partner - Specialist III
Partner - Specialist III

Best way to add 2 tables together?

Hi all,

Can someone guide me on the best or fastest way to add 2 tables together please?

Currently i am building an application locally on my laptop, i want to try and get it right before i hand it over to the administrator to put it on the server.

There are over 40 million records currently on our server which will need to be reloaded into this app, and obviously the data is increasing daily, so i am looking for the best way to add 2 tables together.

The tables have 20+ fields and roughly half of the fields have the same field names.

Here are my 2 tables...

SET cFileName = $(vcRetailQVD)StoreStockMovement.qvd;
IF QvdCreateTime('$(cFileName)') >= 0 THEN
StoreStockMovement:
  LOAD DISTINCT
     StoreNumber,
     ProductCode,
     EANCode,
     IssueNumber,
     CurrentRSP,
     CostPrice,
     StockLevel,
     ProductStatus,
     ExtractedDate,
     RentalCopies,
     NumberOfRentals,
     NumberOfRefunds,
     ExRentalPrice,
     ExRentalQuantity,
     NoExRentalSold,
     RentalRevenues,
     OverdueCharges,
     PriceBand,
     MediaFormat,
     RentalItem,
     OnOrderQuantity,
     MinimumStockLevel,
     ReorderQuantity,
     StockOnOrder,
     ReservedStock,
     StockInTransit,
     SafetyDays,
     OrderWeightWeek1,
     OrderWeightWeek2,
     OrderWeightWeek3,
     OrderWeightWeek4,
     OrderWeightWeek5,
     OrderWeightWeek6,
     OrderWeightWeek7,
     OrderWeightWeek8,
     OrderWeightWeek9,
     OrderWeightWeek10,
     OrderWeightWeek11,
     OrderWeightWeek12,
     OrderWeightWeek13,
     MinimumOrderQuantity,
     SoldOnWebsite,
     DiscountCategoryName,
     RSPAdjustment,
     AdjustmentReason

     FROM $(cFileName) (qvd);

ENDIF

SET cFileName = $(vcRetailQVD)StoreProductStock.qvd;
IF QvdCreateTime('$(cFileName)') >= 0 THEN

StoreProductStock:
LOAD DISTINCT
  StoreNumber,
     StoreDate,
     [Stock Movement Reason Code],
     ProductCode,
     EANCode,
     IssueNumber,
     CommodityGroup,
     EmployeeId,
     Narrative,
     StockQty,
     StockValue,
     LinesAffected,
     CostValueDiff,
     RSPValueDiff,
     ExtractedDate,
     StoreTime
     FROM $(cFileName) (qvd);

ENDIF

Thanks

5 Replies
ToniKautto
Employee
Employee

Joining these two sources will lead to a table with a lot of NULL values, since they do not share all fields. Could you provide a sample of data to demonstrated what the resulting table should looks like, since this determines what you should do in order to get the desired result?

fernandotoledo
Partner - Specialist
Partner - Specialist

I think the best way is to concatenate them.

You shall create a unique date field that works for both stock and movement.

If you want to know movement, you use sum(OrderQuantity), if you want stock use sum(StockQty)

Hope it helps

Fernando

Not applicable

If you've got 40 million records don't be shocked if you run out of memory... my laptop can't load ~21 million rows (~2gb csv).  It can load stuff from that file with WHERE clauses but it can't actually load all 21 million rows.

hopkinsc
Partner - Specialist III
Partner - Specialist III
Author

Hi, yes i know i cannot load all that data on my laptop. i am working locally with limited data, but the server will be loading in all records. I have used an INNER JOIN which seems to work ok.

Thanks

johnw
Champion III
Champion III

Is it necessary to "load distinct"?  That's really going to slow things down, as you'll be getting an unoptimized load instead of an optimized load.

Beyond that, it's down to what you're trying to do.  If inner join is producing the results you want, then stick with it.  Other forms of join or a concatenate will behave differently.