Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Please help me get the string pattern of each field.

Hi!  Please let me know the formula to get the string pattern of each field .

12 Replies
MK_QSL
MVP
MVP

What you exactly want?

Can you provide some sample data ?

Not applicable
Author

It's part of the data profiling project Im working on. I want to determine the typical string pattern of each column. So that after getting the typical pattern, I can compare and check each value  as valid or not dependending on the pattern defined for that column.

Ex.  Product_detail_code, Modified_date, etc..

PURCHASE_DATEPRODUCT_DETAIL_CODE
14-Nov-07Others
Dec 201010128738
26-Feb-0811197529
14-Dec-0711438953
18-Nov-0511425786
11-Sep-1311208766
30-Aug-1311209023
2-Sep-1311208856
23-Oct-1311208959
10-Sep-1311209074
12-Feb-1311208907
8-Nov-1311198348
Nov 201043005270
9-Sep-1311170032
MK_QSL
MVP
MVP

Something Like Below?

Load

RowNo() as NO,

PRODUCT_DETAIL_CODE,

PURCHASE_DATE as Original_PURCHASE_DATE,

Date(Alt(

  Date#(PURCHASE_DATE,'DD-MMM-YY'),

  Date#(PURCHASE_DATE,'MMM YYYY'),

  Date#(PURCHASE_DATE,'YYYYMMDD'),

  Date#(PURCHASE_DATE,'D-MM-YY'),

),'DD-MM-YYYY') as Modified_PURCHASE_DATE

Inline

[

  PURCHASE_DATE, PRODUCT_DETAIL_CODE

  14-Nov-07, Others

  Dec 2010, 10128738

  26-Feb-08, 11197529

  14-Dec-07, 11438953

  18-Nov-05, 11425786

  11-Sep-13, 11208766

  30-Aug-13, 11209023

  2-Sep-13, 11208856

  23-Oct-13, 11208959

  10-Sep-13, 11209074

  12-Feb-13, 11208907

  8-Nov-13, 11198348

  Nov 2010, 43005270

  9-Sep-13, 11170032

];

Not applicable
Author

Thanks Manish for quick response but thats not what I want. I want to determine the   typical pattern of  each field

Step 1 : Determine the typical pattern of the column. What is that formula to get the pattern

                                                                 Output

formulaXYZ(PURCHASE_DATE )  ==>  "dd-mm-yy"

formulaXYZ(Product_detail_code) ==> "NNNNNNNN"  where n = 0 to 9  , 8 digits

Step 2: Validate each values in that column follows the right pattern

Purchase_date                    Remarks

     Dec 2010           ==>      Invalid Format

     14-Nov-07          ==>     Valid

Product_Detail_Code

     Others               ==>      Invalid

10128738               ==>      Valid (8 digits)

MK_QSL
MVP
MVP

Got it...

Sorry but I dont have idea.. May be someone can suggest...

jagannalla
Partner - Specialist III
Partner - Specialist III

Hi,

Hope it helps you,

LOAD *,if(IsNum(PRODUCT_DETAIL_CODE),'Valid','InValid') as ProdRemark,

If(Alt(Date#(PURCHASE_DATE,'DD-MMM-YY'),0)=0,'InValid','Valid') as PurDateRemark;

LOAD * INLINE [

    PURCHASE_DATE, PRODUCT_DETAIL_CODE

    14-Nov-07, Others

    Dec 2010, 10128738

    26-Feb-08, 11197529

    14-Dec-07, 11438953

    18-Nov-05, 11425786

    11-Sep-13, 11208766

    30-Aug-13, 11209023

    2-Sep-13, 11208856

    23-Oct-13, 11208959

    10-Sep-13, 11209074

    12-Feb-13, 11208907

    8-Nov-13, 11198348

    Nov 2010, 43005270

    9-Sep-13, 11170032

];

If you want to check for 8 digits, use this calc if(IsNum(PRODUCT_DETAIL_CODE) and Len(PRODUCT_DETAIL_CODE)<=8,'Valid','InValid') as ProdRemark

Thanks,

Jagan

Not applicable
Author

Thanks Jagan.  Any suggestions how to get the typical pattern each column?

jagannalla
Partner - Specialist III
Partner - Specialist III

Rhodora - What do you mean by typical pattern each column ?

MarcoWedel

Hi Rhodora,

you are asking for a very complex function that most likely does not exist "off the shelve" in QlikView.

If you realy intend to implement such a thing (and you would surely be welcome to post the result) then using regular expressions might be the most promising approach.

Here are some links on using RegEx in QlikView:

hope this helps

regards

Marco