2 Replies Latest reply: Nov 12, 2013 9:22 PM by Stephen Daspit RSS

    scripting to get summary/aggregate statistics on Cartesian product of two fields in table

      Hello! I've tried Google and the QV forums, but this is a fairly complicated one to explain and search for. I don't think I'm missing an obvious solution, but then again...


      I have a single table with ~30 million records. To simplify greatly, assume the table looks like this:




      For each unique DateSerial+Code, I need summary/aggregation statistics like count() and sum(). I have "solved" this problem, but my solution is incredibly slow! The Cartesian product of DateSerial and Code is ~60,000; there are ~300 unique DateSerials and ~200 unique Codes.


      After already scrubbing out nulls in DateSerial and Code, I loaded distinct values of each of those fields into their own new tables, then I did a join on the tables and concatenated the fields' values to get my ~60,000 unique DateSerial+Code values (example: "41590_AA"). I loop through that Cartesian table using a For Next loop hanging on RecNo(); for each Cartesian unique, I left join from my master table. I then left join onto that table the aggregation values of count(), numbercount(), and sum(). Finally, I take the topmost row from each Cartesian table and store that into a storage table.


      The above craziness works, but it's incredibly slow. I end up with a table that looks like this:


      Cartesian IndexCounted TotalNum Counted TotalSum Total


      For reference, my script has been running for ~3 hours, and its on record ~250 of the ~60,000 Cartesian uniques. Help! I appreciate any guidance you can offer.


      Unfortunately, due to the nature of the data, I can't really put the .QVW up here for reference.