12 Replies Latest reply: Oct 28, 2016 4:31 PM by Sunny Talwar RSS

    Group by statement

    Jes Lee

      Hi Everyone,

       

      I'm trying to resolve an issue with my data loads where the fields are not being joined by date.  As a result, when I table them together I end up with a lot of nulls in adjacent cells and duplicate dates.  Please see screenshot #1 .   So I tried using the Group By statement to join the fields by MonthYear, which is the preferred date format.   However, I get an error message , pls see Screenshot #2.

       

      I hope someone would be able to help me.  Thanks in advance.

       

       

       

      I used the following script after my main load step.

       

      Load *

      Resident Total_Nonfarm_SA_thous

      Group by MONTHYEAR ;

       

       

      Screenshot #1:

       

       

       

       

      Screenshot #2: 

        • Re: Group by statement
          Sunny Talwar

          What is the script you have used?

            • Re: Group by statement
              Jes Lee

              Here is the entire script -

               

               

              Total_Nonfarm_SA_thous:
              LOAD
                  series_id,
                  "year",
                  period,
                  Date(Makedate("year",right(period,2)), 'MMMYYYY') as MONTHYEAR,
                  value,
                  If(series_id = 'SMS01000000000000001', value) as [Alabama_NF_thous],  
                  If(series_id = 'SMS02000000000000001', value) as [Alaska_NF_thous]
                 
              FROM [lib://BLS Employment Series AllData]
              (txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

              Load
              MONTHYEAR,
              [Alabama_NF_thous],
              [Alaska_NF_thous]
              Resident Total_Nonfarm_SA_thous
              Group by MONTHYEAR ;

                • Re: Group by statement
                  Sunny Talwar

                  May be this:

                   

                  Total_Nonfarm_SA_thous:
                  LOAD
                      series_id,
                      "year",
                      period,
                      Date(Makedate("year",right(period,2)), 'MMMYYYY') as MONTHYEAR,
                      value,
                      If(series_id = 'SMS01000000000000001', value) as [Alabama_NF_thous], 
                      If(series_id = 'SMS02000000000000001', value) as [Alaska_NF_thous]
                     
                  FROM [lib://BLS Employment Series AllData]
                  (txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

                  Load
                  MONTHYEAR,
                  Sum([Alabama_NF_thous]) as [Alabama_NF_thous],
                  Sum([Alaska_NF_thous]) as [Alaska_NF_thous]
                  Resident Total_Nonfarm_SA_thous
                  Group by MONTHYEAR ;

                    • Re: Group by statement
                      Maxim Senin

                      Exactly!

                       

                      When you use Group By you should use some aggr. function in the Load. Please give a look at the manual.

                       

                      Best regards,

                      Maxim

                      • Re: Group by statement
                        Jes Lee

                        Thanks, Sunny.  It loaded, but it still hasn't resolved the original problem.  When I place them in the table, there are still lots of duplicate dates and nulls.... I'm wondering if there's a another solution for this.

                         

                        • Re: Group by statement
                          Jes Lee

                          Sunny,

                           

                          I tried rerunning the group by statement, but it's not working.  I'm not sure what I'm doing wrong this time.

                           

                          Here's the original script.

                           

                          BLS_data_1:
                          LOAD
                              series_id,
                              "year",
                              period,
                              value,
                              Date(Makedate("year",right(period,2)),'M/D/YYYY') as DATE,
                              Date(Makedate("year",right(period,2)), 'MMMYYYY') as MONTHYEAR,
                              If(series_id ='SMS01000000000000001', value) as [Alabama],
                              If(series_id ='SMS02000000000000001', value) as [Alaska]

                          FROM [lib://BLS Employment Series AllData]
                          (txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

                          BLS_data:
                          LOAD
                              MONTHYEAR,
                              DATE,
                              SUM([Alabama]) as [Alabama_NF],
                              SUM([Alaska]) as [Alaska_NF]

                          Resident BLS_data_1
                          Group by MONTHYEAR ;
                          Drop Table BLS_data_1;

                           

                            • Re: Group by statement
                              Sunny Talwar

                              Anytime you have a non-aggregating field (one which doesn't have Sum, Min, Max, Median, Only, Concat.... function), you need to add it to the Group By Statement

                               

                              BLS_data_1:
                              LOAD
                                  series_id,
                                  "year",
                                  period,
                                  value,
                                  Date(Makedate("year",right(period,2)),'M/D/YYYY') as DATE,
                                  Date(Makedate("year",right(period,2)), 'MMMYYYY') as MONTHYEAR,
                                  If(series_id ='SMS01000000000000001', value) as [Alabama],
                                  If(series_id ='SMS02000000000000001', value) as [Alaska]

                              FROM [lib://BLS Employment Series AllData]
                              (txt, codepage is 1252, embedded labels, delimiter is '\t', msq);

                              BLS_data:
                              LOAD
                                  MONTHYEAR,
                                  DATE,
                                  SUM([Alabama]) as [Alabama_NF],
                                  SUM([Alaska]) as [Alaska_NF]

                              Resident BLS_data_1
                              Group by MONTHYEAR, DATE ;
                              Drop Table BLS_data_1;