3 Replies Latest reply: Dec 10, 2012 4:16 AM by LionelBdv

    Datascript pattern matching

      Hi,

       

      In a transform function I'm trying to implement controls on postal codes.

       

      For example, I need to check if a French postal code has the following format: DDDDD (5 digits).

       

      In order to do this, I do the following pattern matching:

       

      string.match(input.POSTCODE, '^%d%d%d%d%d$')

       

      It works fine, but is there a more compact way to write the same pattern?

       

      I've tried %d{5} and %d[5], but neither seems to work.

       

      Thanks

      Lionel

        • Re: Datascript pattern matching
          Johannes Sunden

          For QlikView, you might want to use IsNum(), a boolean function that checks whether a value can be interpreted numerically, together with Len(), which returns the length of its argument as an integer. For a valid postal code, that length should be 5.

          • Re: Datascript pattern matching

            Thank you for raising this question as it gives me an opportunity to detail pattern matching in QlikView Expressor.

             

            In Expressor there are actually two different pattern matching syntaxes.  In your post, you employed the syntax that is used in the Datascript string functions.  This is the Lua pattern matching syntax, and the pattern you used is correct.  Lua pattern matching does not support curly braces { } as a repetition count, and square brackets [ ] enclose a set or range of characters, not a repetition count.
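
Although the pattern syntax itself has no repetition count, the pattern string can still be built more compactly in code. The following is only a minimal sketch: it assumes the standard Lua string.rep function is available among the Datascript string functions, and the names FIVE_DIGITS and is_french_postcode are hypothetical, introduced for illustration.

```lua
-- Lua patterns have no {n} quantifier, so the repetition is built with
-- string.rep instead. FIVE_DIGITS is a hypothetical name for illustration.
local FIVE_DIGITS = '^' .. string.rep('%d', 5) .. '$'   -- same as '^%d%d%d%d%d$'

-- Hypothetical helper: true only when the value is a string of exactly five digits.
local function is_french_postcode(value)
  return type(value) == 'string' and string.match(value, FIVE_DIGITS) ~= nil
end

print(is_french_postcode('75008'))   -- true
print(is_french_postcode('7500'))    -- false (only four digits)
print(is_french_postcode('75O08'))   -- false (letter O instead of zero)
```

Building the pattern once and reusing it keeps the digit count in a single place, which helps if the same check is later needed for codes of a different length.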

             

            Expressor also supports the standard regular expression syntax when you are defining a constraint on an atomic type.  This is something you could do when creating a shared atomic type or defining a composite type.  This syntax does support the curly braces { } to indicate a repeating character.

             

            There are a few other differences between these syntaxes: they use different escape characters and they interpret parentheses differently.
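
To illustrate the Lua side of these two differences, here is a short sketch using only standard Lua string functions. The sample values are made up for the example, and the comments about regular expressions describe common regular expression behavior, which I am assuming also applies to the constraint syntax.

```lua
-- In Lua patterns the escape character is %, where regular expressions
-- commonly use a backslash: '%.' is a literal dot and '%d' is a digit.
local literal_dot = string.match('3.14', '^%d%.%d+$')
print(literal_dot)    -- 3.14

-- Parentheses in a Lua pattern only mark captures. A repetition cannot be
-- applied to a parenthesized group, as regular expressions commonly allow.
local dept, rest = string.match('75008', '^(%d%d)(%d%d%d)$')
print(dept)           -- 75
print(rest)           -- 008
```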

             

            So which syntax should you use?  That depends on what you want to do when you find a bad pattern.  For example, if you want to skip a record containing a bad pattern, reject and recover the record, or replace the bad value with a default value, then applying your pattern matching test as a type constraint is appropriate.  This way, your application can handle records with an invalid pattern without any coding.

             

            But if you have some processing logic that you might be able to use to correct an invalid pattern, you could then use a Transform operator, the string.match function and some additional code in an attempt to correct the value.
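
As a rough illustration of such correction code, the sketch below tries two repairs before validating. The function name fix_postcode and both repair rules (removing whitespace, and restoring a leading zero on a four-digit value) are assumptions chosen for the example, not rules taken from your data.

```lua
-- Hypothetical helper: try to repair a postal code before validating it.
-- Returns the repaired five-digit string, or the original value unchanged
-- when no repair rule produces a valid code.
local function fix_postcode(value)
  if type(value) ~= 'string' then
    return value
  end
  -- Remove all whitespace, e.g. '75 008' becomes '75008'.
  local cleaned = string.gsub(value, '%s+', '')
  -- Assumed rule: a four-digit value lost its leading zero ('1000' becomes '01000').
  if string.match(cleaned, '^%d%d%d%d$') then
    cleaned = '0' .. cleaned
  end
  if string.match(cleaned, '^%d%d%d%d%d$') then
    return cleaned
  end
  return value   -- could not be corrected; leave the value as it is
end

print(fix_postcode('75 008'))   -- 75008
print(fix_postcode('1000'))     -- 01000
print(fix_postcode('ABCDE'))    -- ABCDE
```

In a transform function this might be called on input.POSTCODE, with the result assigned to the corresponding output attribute. Because an uncorrectable value is returned unchanged, a constraint on that output attribute can still catch it.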

             

            In fact, you could use both techniques.  For example, identify the invalid pattern in a Transform operator and attempt to correct it.  If your code cannot resolve the problem, leave the value as it is.  A constraint assigned to the corresponding attribute in the output from the Transform will then fail the record, and you will be able to skip or reject it.

             

            To learn more about the different pattern matching syntaxes in Expressor, search the documentation for the word 'pattern' and read the topics 'Datascript Pattern Matching' and 'Reference: Constraint Pattern Matching'.  Also read 'Reference: Constraints on Semantic Types' and review the discussion of the Error Handling option on the Transform and Input operators.