Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I am trying to sum a variable on 2 fields. It seems to work for one, but for two it does not execute correctly. This works:
Sum:
LOAD ch_id Sum(trans_amt) as tr_sum Resident flag_13_table Group By ch_id Order By ch_id;
This also works:
Sum:
LOAD temp_trans_date Sum(trans_amt) as tr_sum Resident flag_13_table Group By temp_trans_date Order By temp_trans_date ;
This does not:
Sum:
LOAD temp_trans_date,ch_id, Sum(trans_amt) as tr_sum Resident flag_13_table Group By temp_trans_date,ch_id Order By temp_trans_date,ch_id;
I get a column of ch_id values and temp_trans_date values, but no sum of trans_amt. How can I fix this?
I also want to left join it to my original table on both ch_id and temp_trans_date. What would the code be for this? I've not done it on multiple fields.
Thanks!
May be something along these lines
TempTable:
LOAD temp_trans_date,
ch_id,
Sum(trans_amt) as tr_sum
Resident flag_13_table
Group By temp_trans_date,ch_id;
Left Join (flag_13_table)
LOAD *,
If(tr_sum > 1.25 * Previous(tr_sum), 1, 0) as Flag1
Resident TempTable
Order By ch_id, temp_trans_date;
That us strange that it doesn't give you a sum, but for left join, you can do this:
Left Join (flag_13_table)
LOAD temp_trans_date,
ch_id,
Sum(trans_amt) as tr_sum
Resident flag_13_table
Group By temp_trans_date,ch_id
Order By temp_trans_date,ch_id;
I think you probably don't need Order By here...
Hello,
The reason I need an Order By is because I want it to be ordered first by CH_ID, then by date. It will look like this:
CH_ID TEMP_TRANS_DATE
3443 Nov-16
3443 Dec-16
3443 Jan-17
4553 Nov-16
....etc.
The reason is that the next step for me is to compare the value of trans_amt for ch_id 3443 in Dec-16 against the value of trans_amt for ch_id 3443 in Nov-16. If it is increased by 25% of more, I need to create a new field that has a '1'. If it is not increased, the new field will have a value of '0'. That is my end goal with this exercise.
Currently when I do the code as it is, it orders by CH_ID correctly, but the temp_trans_date is not ordering correctly - it gives me March, then May, then jumps back and forth in no real order.
Temp_trans_date is a created variable with this code:
(Date(monthstart(trans_date), 'MMM-YY')) as temp_trans_date, /*gives date in MMM-YY format */
The variable trans_date is the original date variable, but it is in the format 12/31/2016 - I need the date to be monthly.
Do you have any suggestions on how to code this full process? To recap, steps required:
- Sort data by CH_ID, then date (which is month-year) - with the oldest date at the top
- Create a field which has a value of 1 if trans_amt for a particular CH_ID on a particular date is 25% greater than the trans_amt for a particular CH_ID the month before it, and '0' if not.
- There are additional parameters which I have not mentioned, because I think I can figure them out. But if you are able to help that would be fantastic. It is all in the SAS code below.
An additional bonus is to ignore the first month of data in all cases (since there is no previous month to compare it to)... I have no idea how to even begin with that.
For reference, my SAS code that I did this previously is:
proc sql;
create table tr_sum_flag1 as select
CH_ID, temp_trans_date, flag1, trans_ID,
sum(trans_amt) as tr_sum,
count(trans_id) as tr_count
from combined_pcard_data
group by CH_ID, temp_trans_date
order by CH_ID, temp_trans_date ;
quit;
proc sql noprint;
select intnx('day',min(temp_trans_date),0,'E') into: MinDate
from tr_sum_flag1;
quit; /*this code identifies the first month of data, so that flag 1 can be performed on all data AFTER the first month. */
data tr_flag1;
set tr_sum_flag1;
by CH_ID temp_trans_date ;
if first.temp_trans_date then do;
if temp_trans_date > &MinDate. AND tr_sum ge 100 AND tr_count ge 10 AND tr_sum ge 1.25*(lag(tr_sum)) then flag1 = 1;
else flag1=0 ;
output;
end;
run;
Thanks so much for your help - any insight at all that you can provide is much appreciated!
May be something along these lines
TempTable:
LOAD temp_trans_date,
ch_id,
Sum(trans_amt) as tr_sum
Resident flag_13_table
Group By temp_trans_date,ch_id;
Left Join (flag_13_table)
LOAD *,
If(tr_sum > 1.25 * Previous(tr_sum), 1, 0) as Flag1
Resident TempTable
Order By ch_id, temp_trans_date;
Thanks Sunny, that works very well!
I tried to replicate this method for another problem I am working on, but without success. I think I may be missing something small. Can you help? This goes directly after the code you provided:
tr_sum_flag3:
LOAD ch_id,
temp_trans_date,
merchant,
trans_amt,
Sum(trans_amt) as tr_sum_merchant
Resident flag_13_table
Group By ch_id, temp_trans_date;
tr_sum_:
LOAD temp_trans_date,
ch_id,
Sum(trans_amt) as tr_sum_,
count(trans_amt) as tr_count_
Resident tr_sum_flag3
Group By ch_id, temp_trans_date;
Left Join (flag_13_table)
LOAD *,
If(tr_sum_ >= 100 AND tr_count_ >= 10 AND (tr_sum_merchant/tr_sum_) >= 0.4, 1, 0) as flag_3_test
Resident tr_sum_
Order By ch_id, temp_trans_date;
I believe the logic is similar, but for some reason it is giving me an error.
Thanks!
You are not grouping by all your non-aggregated dimensions here:
May be this:
tr_sum_flag3:
LOAD ch_id,
temp_trans_date,
merchant,
trans_amt,
Sum(trans_amt) as tr_sum_merchant
Resident flag_13_table
Group By ch_id, temp_trans_date, merchant;