Qlik Community

QlikView Documents

Documents for QlikView related information.

Pretty fast Pattern Collector

MVP & Luminary
MVP & Luminary

Pretty fast Pattern Collector

Hi all,

inspired by Barry’s great post “A very elegant link table script” about speeding up link table generation in Qlik by using the FieldValue function I looked for other use cases. Operations on the symbols of loaded tables in Qlik are blazing fast compared to record based operations (join, group by, etc.)…

Considering an ETL chain with Data Quality in mind one typical task is to check the data pre-load to discover issues (eg. unallowed formats or patterns in data, value ranges, thresholds, missing dimensional values etc.) to have an ability to stop loading processes in an early stage. Data Profiling is a usual task to solve these kind of problems.

This sparked the idea in me to create a Pretty fast Pattern Collector based on the above principle:

Looping over all fields and symbols of a loaded table to collect all formats and patterns from the data values

Pseudocode Pattern Collector

The performance of this simplified Data Profiling approach does not depend on the amount of records but on the amount of symbols: distinct values per field

The trade off is, you do not have a relation from the symbol to the record and so you cannot do aggregations like count, avg or sum, although min/max will work.

Pattern Collector App

I was surprised about the load speed during my testings! Just try yourself and download the QlikView app here. The Load Script code will work in Qlik Sense also.

I’m curious what other use cases would exist for using the speed of Qlik’s symbol processing.

Happy profiling!

Ralf

Attachments
Comments
santiago_respane
Valued Contributor

Great article!

Thanks for sharing!

0 Likes
Not applicable

useful one.

thank you

0 Likes

Indeed, performance is incredible.

However, I almost immediately got stuck on something weird resulting from the code in your first screenshot. The FieldValue() assignment loses the text representation of all dual values and replaces it with a question mark. At least, that's what shows up in a List box or a Table box.

DOC-17340-tablebox.jpg

I found two workarounds for this weird behavior:

  • Either override the Value column format in the Table box using the Properties->Number tab (not very attractive as the value type differs for each field), or
  • Recreate a real dual value using the code below. But that looks a bit stupid to me.

DOC-17340-code.jpg

I don't really know what is causing this hickup. Possibly forgot to read a few essential KB articles.

Peter

[Edit] Text(FieldValue) and Num(FieldValue) as dual() parameters now destroy all text-only values. Removing num() and text() makes the dual call work in all (?) cases.

MVP & Luminary
MVP & Luminary

Had same issue so I used Text(FieldValue( )) in the uploaded app. I think dual() could make sense for some use cases (min/max) but probably not for the pattens here.

0 Likes

Well, In my case I had to remove the text() and num() functions from the dual() call because they seemed to destroy all text-only values. dual(FieldValue(), FieldValue()) seems to work (so far). Weird...

0 Likes
MVP & Luminary
MVP & Luminary

Yea, that's how dual() works getting duals 😉

0 Likes
MVP & Luminary
MVP & Luminary

Weird, the first solutions should work too. Can you attache it?

0 Likes
Version history
Revision #:
1 of 1
Last update:
‎2016-09-19 09:51 AM
Updated by: