Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Set analysis causing out of object memory

Good evening, im using a bit of set analysis to identify duplicate records. Prior to the set analysis being used the script runs fine up to this point.

I then use this :-

=If(Count(TOTAL <Payment_Reference_In_CA>CA) > 1,'X')

As soon as I do (looking at task manager) the memory usage goes through the roof until I get out of object memory.

I have 4.7 million lines of data loaded which I dont really need anyway.

Can anyone help me by limiting the load by restricting the data to only load data relating to Payment_Reference_In_CA if it has more than 1 record under field CA.

Thanks.

Paul.

1 Solution

Accepted Solutions
Not applicable
Author

Sorted it, I just told it to load entries where len>0 there were quite a few entries that were blank therefore useless to me, this then made my script only load what I needed

View solution in original post

13 Replies
swuehl
MVP
MVP

Paul,

could you please clarify if your issue arise while executing your load script or when you add an expression to a chart object?

You are saying, '..the script runs fine to this point' so I assume you are executing a load script.

But your posted expression looks like a chart object expression.

Also, the expression you posted

=If(Count(TOTAL <Payment_Reference_In_CA>CA) > 1,'X')

does not contain a set analysis construct, I only can see a total qualifier with field list as part of your aggregation function.

The expression itselft doesn't look problematic to me with regard to extreme memory consumption, but I also know next to nothing about your setting, so it may be helpful if you could add some information about your data model, chart dimensions and expressions (if you are using that expression in chart object), or the relevant part of your load script.

If possible, attaching a small sample application that demonstrate your issue might speed things up.

Regards,

Stefan

Not applicable
Author

Hi Stefan, thanks for your reply, I am running a load script where I only want it to provide non-distinct records. Here is the script I am using:-

Main Table:

LOAD BP,

     CA,

     Incoming_Payment_Method,

     Annual_DD_Payer,

     Standing_Order,

     Payment_Reference_In_CA

FROM*

I am wanting to make the script load where field 'Payment_Reference_In_CA' has more than 1 relating entry for Field 'CA'.

The Expression =If(Count(TOTAL <Payment_Reference_In_CA>CA) > 1,'X' works well if there isnt a large amount of data to pull. I am wanting to achieve a restricted load in order to limit the load to what I only need.

nilesh_gangurde
Partner - Specialist
Partner - Specialist

Dear mrpjspencer,

You can follow the below mentioned things:

1) Use of if statement in the expression is not the good practice,because for the large data it wont work.

2) You can load the limited no. of records then make selections and then load the whole data. To avoid selection erasing use lock.

Hope this helps..

Regards,

Nilesh Gangurde

Not applicable
Author

I am trying to construct an IF statement to limit the load to what I need. This is where I am wanting to go, am drawing blanks at the moment.

swuehl
MVP
MVP

I don't think a single if() statement would be sufficient, since you can't decide to load / not load the record only looking at the one single record, right?

I believe you probably need to do a group by load first, something like

FILTER:

LOAD Payment_Reference_In_CA where Count >1;

LOAD

Payment_Reference_In_CA,

count(CA) as Count

FROM ... group by Payment_Reference_In_CA;

LOAD

Main Table:

LOAD BP,

     CA,

     Incoming_Payment_Method,

     Annual_DD_Payer,

     Standing_Order,

     Payment_Reference_In_CA

FROM ... where exists(Payment_Reference_In_CA);

drop table FILTER;

edit: Forgot the group by clause ...

Not applicable
Author

Hi,

I think you need a preceeding LOAD with a group by clause to fix the records you need during the following (main-)load.

LOAD

    CA,

Where CountDistinct > 1

;

LOAD

    Count(DISTINCT Payment_Reference_In_CA)    AS CountDistinct,

    CA,

Resident RawUser

Group By CA,

;

Second:

Do the Load with en exists() like

LOAD

     *

where exists(CA)

This should be faster.

HtH

Roland

Not applicable
Author

This seems a bit messy where I am only after achieving something quite simple. (to anglicise what I want:-)

(this goes for all entries within my database, there are millions)

Only load property if it has more than 1 telephone number,

or

Only load record A if record B has more than 1 entry thats linked to record A...

swuehl
MVP
MVP

Maybe I still don't get what you are after, but as I said, this is not a simple request, since you can't decide based on only looking at single records, you need to aggregate, right?

Also in your chart expression, that's what you are doing (using dimensions to group by):

=If(Count(TOTAL <Payment_Reference_In_CA>CA) > 1,'X')

How would you answer your question right, if a record A needs to be included, without looking at all records for B?

Not applicable
Author

This seems a bit messy where I am only after achieving something quite simple. (to anglicise what I want:-)

(this goes for all entries within my database, there are millions)

Only load property if it has more than 1 telephone number,

or

Only load record A if record B has more than 1 entry thats linked to record A...