Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Please help me get the string pattern of each field.

Hi!  Please let me know the formula to get the string pattern of each field .

12 Replies
MK_QSL
MVP
MVP

What you exactly want?

Can you provide some sample data ?

Not applicable
Author

It's part of the data profiling project Im working on. I want to determine the typical string pattern of each column. So that after getting the typical pattern, I can compare and check each value  as valid or not dependending on the pattern defined for that column.

Ex.  Product_detail_code, Modified_date, etc..

PURCHASE_DATEPRODUCT_DETAIL_CODE
14-Nov-07Others
Dec 201010128738
26-Feb-0811197529
14-Dec-0711438953
18-Nov-0511425786
11-Sep-1311208766
30-Aug-1311209023
2-Sep-1311208856
23-Oct-1311208959
10-Sep-1311209074
12-Feb-1311208907
8-Nov-1311198348
Nov 201043005270
9-Sep-1311170032
MK_QSL
MVP
MVP

Something Like Below?

Load

RowNo() as NO,

PRODUCT_DETAIL_CODE,

PURCHASE_DATE as Original_PURCHASE_DATE,

Date(Alt(

  Date#(PURCHASE_DATE,'DD-MMM-YY'),

  Date#(PURCHASE_DATE,'MMM YYYY'),

  Date#(PURCHASE_DATE,'YYYYMMDD'),

  Date#(PURCHASE_DATE,'D-MM-YY'),

),'DD-MM-YYYY') as Modified_PURCHASE_DATE

Inline

[

  PURCHASE_DATE, PRODUCT_DETAIL_CODE

  14-Nov-07, Others

  Dec 2010, 10128738

  26-Feb-08, 11197529

  14-Dec-07, 11438953

  18-Nov-05, 11425786

  11-Sep-13, 11208766

  30-Aug-13, 11209023

  2-Sep-13, 11208856

  23-Oct-13, 11208959

  10-Sep-13, 11209074

  12-Feb-13, 11208907

  8-Nov-13, 11198348

  Nov 2010, 43005270

  9-Sep-13, 11170032

];

Not applicable
Author

Thanks Manish for quick response but thats not what I want. I want to determine the   typical pattern of  each field

Step 1 : Determine the typical pattern of the column. What is that formula to get the pattern

                                                                 Output

formulaXYZ(PURCHASE_DATE )  ==>  "dd-mm-yy"

formulaXYZ(Product_detail_code) ==> "NNNNNNNN"  where n = 0 to 9  , 8 digits

Step 2: Validate each values in that column follows the right pattern

Purchase_date                    Remarks

     Dec 2010           ==>      Invalid Format

     14-Nov-07          ==>     Valid

Product_Detail_Code

     Others               ==>      Invalid

10128738               ==>      Valid (8 digits)

MK_QSL
MVP
MVP

Got it...

Sorry but I dont have idea.. May be someone can suggest...

jagannalla
Partner - Specialist III
Partner - Specialist III

Hi,

Hope it helps you,

LOAD *,if(IsNum(PRODUCT_DETAIL_CODE),'Valid','InValid') as ProdRemark,

If(Alt(Date#(PURCHASE_DATE,'DD-MMM-YY'),0)=0,'InValid','Valid') as PurDateRemark;

LOAD * INLINE [

    PURCHASE_DATE, PRODUCT_DETAIL_CODE

    14-Nov-07, Others

    Dec 2010, 10128738

    26-Feb-08, 11197529

    14-Dec-07, 11438953

    18-Nov-05, 11425786

    11-Sep-13, 11208766

    30-Aug-13, 11209023

    2-Sep-13, 11208856

    23-Oct-13, 11208959

    10-Sep-13, 11209074

    12-Feb-13, 11208907

    8-Nov-13, 11198348

    Nov 2010, 43005270

    9-Sep-13, 11170032

];

If you want to check for 8 digits, use this calc if(IsNum(PRODUCT_DETAIL_CODE) and Len(PRODUCT_DETAIL_CODE)<=8,'Valid','InValid') as ProdRemark

Thanks,

Jagan

Not applicable
Author

Thanks Jagan.  Any suggestions how to get the typical pattern each column?

jagannalla
Partner - Specialist III
Partner - Specialist III

Rhodora - What do you mean by typical pattern each column ?

MarcoWedel

Hi Rhodora,

you are asking for a very complex function that most likely does not exist "off the shelve" in QlikView.

If you realy intend to implement such a thing (and you would surely be welcome to post the result) then using regular expressions might be the most promising approach.

Here are some links on using RegEx in QlikView:

hope this helps

regards

Marco