Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Independent validation for trusted, AI-ready data integration. See why IDC named Qlik a Leader: Read the Excerpt!
cancel
Showing results for 
Search instead for 
Did you mean: 
JamieKay
Contributor II
Contributor II

Pull Random Sample of Table Records

We are trying to load random rows from the source table in Oracle to do validation of what is in our target.  Is there a way to do that?

A query in Oracle would look something like:

SELECT * FROM <TABLE> 

SAMPLE(<sample percentage>)

Thank you!

Qlik Replicate 

 

 

 

 

 

 
Labels (1)
3 Solutions

Accepted Solutions
john_wang
Support
Support

Hello @JamieKay ,

It looks like you are trying to replicate random rows using a Full Load ONLY task, which may not be meaningful for the CDC (Change Processing) stage. If my understanding is correct, here are a few possible approaches:

  1. Filter Random Rows Using a Primary Key (PK) Range

    • You can retrieve random rows by applying a filter based on the PK random values(or other column values).
    • Example:
      WHERE PK BETWEEN TRUNC(DBMS_RANDOM.VALUE(1, 30000))  
                   AND TRUNC(DBMS_RANDOM.VALUE(1, 30000)) * 2​
       
    • Add this condition as a passthrough filter.
  2. Use a View to Select Random Rows

    • You can define a VIEW that selects a random subset of rows from the base table and then include the view in a Full Load ONLY task.
    • Example:
      • Select a percentage of rows:
        SELECT <column_list> FROM <table> SAMPLE(10);​
         
      • Select a fixed number of rows:
        SELECT <column_list> FROM <table>  
        ORDER BY DBMS_RANDOM.VALUE  
        FETCH FIRST 10 ROWS ONLY;​
         

Hope this helps!

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

View solution in original post

JamieKay
Contributor II
Contributor II
Author

That worked - thank you!!! 

Can I do the same using a Record Selection Condition?   

What I am wanting is 10 records randomly chosen where <some date column> >= to_char(current_date - 180, 'YYYY-MM-DD').

Thank you again, @john_wang!

View solution in original post

john_wang
Support
Support

Hello @JamieKay ,

Thanks for the update. Yes, you can; however, the passthrough filter offers the best performance since it eliminates unnecessary rows before they are retrieved from the source database. Please see my comments on the comparison of the three filters.

Regards,

John.

 

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

View solution in original post

5 Replies
john_wang
Support
Support

Hello @JamieKay ,

It looks like you are trying to replicate random rows using a Full Load ONLY task, which may not be meaningful for the CDC (Change Processing) stage. If my understanding is correct, here are a few possible approaches:

  1. Filter Random Rows Using a Primary Key (PK) Range

    • You can retrieve random rows by applying a filter based on the PK random values(or other column values).
    • Example:
      WHERE PK BETWEEN TRUNC(DBMS_RANDOM.VALUE(1, 30000))  
                   AND TRUNC(DBMS_RANDOM.VALUE(1, 30000)) * 2​
       
    • Add this condition as a passthrough filter.
  2. Use a View to Select Random Rows

    • You can define a VIEW that selects a random subset of rows from the base table and then include the view in a Full Load ONLY task.
    • Example:
      • Select a percentage of rows:
        SELECT <column_list> FROM <table> SAMPLE(10);​
         
      • Select a fixed number of rows:
        SELECT <column_list> FROM <table>  
        ORDER BY DBMS_RANDOM.VALUE  
        FETCH FIRST 10 ROWS ONLY;​
         

Hope this helps!

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
JamieKay
Contributor II
Contributor II
Author

That worked - thank you!!! 

Can I do the same using a Record Selection Condition?   

What I am wanting is 10 records randomly chosen where <some date column> >= to_char(current_date - 180, 'YYYY-MM-DD').

Thank you again, @john_wang!

john_wang
Support
Support

Hello @JamieKay ,

Thanks for the update. Yes, you can; however, the passthrough filter offers the best performance since it eliminates unnecessary rows before they are retrieved from the source database. Please see my comments on the comparison of the three filters.

Regards,

John.

 

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
JamieKay
Contributor II
Contributor II
Author

Thank you @john_wang!

john_wang
Support
Support

Thank you so much for your support @JamieKay 

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!