Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
harsha
Creator
Creator

Compare two itemIDs' values and highlight with flag

Hi Experts,

Hope everyone is doing great..!

I want to compare the value of pointID (60634843286) with other value of pointID (60634843285)

Ex:

CompanyID: 4129034, year: 2017, value of pointID: 60634843286 is 287966000 greater than the 
CompanyID: 4129034, year: 2017, value of pointID: 60634843285 is 145077000

I have applied below script to get anomalies and It was working awesome with 9 million data.

"

tab1:

LOAD RowNo() As RowID,* INLINE [

    CompanyID, PointID, Year, Value

    635939, 5290, 2017, 350

    635939, 5291, 2017, 250

    635939, 5290, 2018, 200

    635939, 5291, 2018, 300

    635939, 5290, 2019, 420

    635939, 5291, 2019, 280

    516278, 5290, 2017, 810

    516278, 5291, 2017, 850

    516278, 5290, 2018, 460

    516278, 5291, 2018, 340

    516278, 5290, 2019, 290

    516278, 5291, 2019, 310

    635939, 6320, 2017, 460

    635939, 6321, 2017, 320

    635939, 6320, 2018, 650

    635939, 6321, 2018, 750

    635939, 6320, 2019, 820

    635939, 6321, 2019, 600

    516278, 6320, 2017, 270

    516278, 6321, 2017, 350

    516278, 6320, 2018, 510

    516278, 6321, 2018, 750

    516278, 6320, 2019, 470

    516278, 6321, 2019, 360

];

 

tab2:

LOAD CompanyID, Year, Value As Value1

Resident tab1

Where PointID=5290;

 

Left Join(tab2)

LOAD CompanyID, Year, Value As Value2

Resident tab1

Where PointID=5291;

 

Left Join(tab1)

LOAD CompanyID, Year, If(Value1<Value2, 'Yes','No') As Glsag

Resident tab2;

Drop Table tab2;

"

I received a new QVD file from other team to find the discrepancies today with 26 million data and I have applied the same solution but this time it is 3 hours still the data is in loading stage. I tried making some changes to see if the loading gets faster but no use.  

Request you to please help ..!

23 Replies
Taoufiq_Zarra

@harsha  can you elaborate ?

or share the input data and the expected output from the sample ?

Regards,
Taoufiq ZARRA

"Please LIKE posts and "Accept as Solution" if the provided solution is helpful "

(you can mark up to 3 "solutions") 😉
harsha
Creator
Creator
Author

Hi @Taoufiq_Zarra  , 

Thank you for your response.

Requirement: If the companyID and Periods are same and then compare value of pointID (5290) with other value of pointID (5291) and find  if pointID 5290 value is (350) greater than the pointID  5291 value  (250)

Ex:

CompanyID: 4129034, year: 2017, value of pointID: 5290 is 350 greater than the 
CompanyID: 4129034, year: 2017, value of pointID: 5291 is 250 

attached is a sample

Taoufiq_Zarra

@harsha 

if we have that :

CompanyID: 4129034, year: 2017, value of pointID: 5290 is 350 greater than the 
CompanyID: 4129034, year: 2017, value of pointID: 5291 is 250 

what's the next action to take?


the expected output ?

Regards,
Taoufiq ZARRA

"Please LIKE posts and "Accept as Solution" if the provided solution is helpful "

(you can mark up to 3 "solutions") 😉
harsha
Creator
Creator
Author

I want to compare value of 5290 is greater than 5291 then create a Flag_1 in a new dimension.

Like wise, i want to compare 6320 is greater than 6321 as Flag_2 in a new dimension.

7618 is greater than 7619 as Flag_3 in a new dimension.

 

GregoireVG
Contributor II
Contributor II

Hello Harsha,

Did you try with the peek function? It'll be way lighter than left joins.

 

tab2:
LOAD
	CompanyID,
    Year,
    PointID,
    Value,
    IF(Value < IF(CompanyID=Peek('CompanyID'), Peek(Value), 0), 'Yes', 'No') AS Glsag
RESIDENT tab1
//WHERE MATCH(PointID, 5290, 5291)
ORDER BY CompanyID, Year, PointID DESC
;

 

This will create a flag as you requested (see attachment)

 

harsha
Creator
Creator
Author

Hi  @GregoireVG ,

Thanks for your response.. seems working. however, i want to highlight both IDs if condition meets.

Ex: company ID: 635939, year : 2018 should highlight both PointID as yes.. pls see below

Untitled1.png

sunny_talwar

May be try this

tab1:
LOAD * INLINE [
    CompanyID, PointID, Year, Value
    635939, 5290, 2017, 350
    635939, 5291, 2017, 250
    635939, 5290, 2018, 200
    635939, 5291, 2018, 300
    635939, 5290, 2019, 420
    635939, 5291, 2019, 280
    516278, 5290, 2017, 810
    516278, 5291, 2017, 850
    516278, 5290, 2018, 460
    516278, 5291, 2018, 340
    516278, 5290, 2019, 290
    516278, 5291, 2019, 310
    635939, 6320, 2017, 460
    635939, 6321, 2017, 320
    635939, 6320, 2018, 650
    635939, 6321, 2018, 750
    635939, 6320, 2019, 820
    635939, 6321, 2019, 600
    516278, 6320, 2017, 270
    516278, 6321, 2017, 350
    516278, 6320, 2018, 510
    516278, 6321, 2018, 750
    516278, 6320, 2019, 470
    516278, 6321, 2019, 360
];

tab2:
LOAD CompanyID,
     Year,
     PointID,
     Value,
     If(CompanyID = Previous(CompanyID) and Year = Previous(Year) and PointID = Peek('PointID') + 1, If(Value < Peek(Value), 'Yes', 'No'), '') AS tmpFlag,
     If(CompanyID = Previous(CompanyID) and Year = Previous(Year) and PointID = Peek('PointID') + 1, Peek('SNo'), RangeSum(Peek('SNo'), 1)) as SNo
Resident tab1
Order By CompanyID, Year, PointID;

Left Join(tab2)
LOAD CompanyID,
	 Year,
	 SNo,
	 MaxString(tmpFlag) as Flag
Resident tab2
Group By CompanyID, Year, SNo;

DROP Table tab1;
DROP Field tmpFlag;
harsha
Creator
Creator
Author

Hi @sunny_talwar , thank u for your response. 

I'm getting an error while loading... Error.png

sunny_talwar

Did you create SNo?

image.png