hic
Former Employee

The search functionality is central to QlikView. You enter a string, and QlikView immediately searches in the active list box and displays the matches. But what really defines a match? For example, should you find strings containing ‘Š’ when your search string contains an ‘S’? Or ‘Ä’ when you search for ‘A’?

These may seem odd questions to people with English as their first language, but for the rest of us who use “strange” characters daily they matter, because the answers affect not just search results but also sort orders.

This is called collation.

A collation algorithm defines how to compare two character strings: whether they match, and which of them should come before the other. So the collation affects everything from which search results you get in a query to how the phone directory is sorted.

Collation is defined differently in different languages. Examples:

  • The English collation considers A, Å and Ä to be variants of the same letter (matching in searches and sorted together), but the Swedish collation does the opposite: it considers them to be different letters.
  • The English collation considers V and W to be different letters (not matching, and not sorted together), but the Swedish collation does the opposite: it considers them to be variants of the same letter.
  • Most Slavic languages consider S and Š to be different letters, whereas most other languages consider them to be variants of the same letter.
  • In German, Ö is considered to be a variant of O, but in Nordic and Turkic languages it is considered a separate letter.
  • In most western languages I is the upper-case version of i, but in Turkic languages such as Turkish, I is the upper case of dotless ı, and İ (dotted) is the upper case of dotted i.
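The Turkish case mapping in the last bullet can be sketched in Python. Python's built-in `str.upper()` and `str.lower()` are not locale-aware, so the helper names `turkish_upper` and `turkish_lower` below are hypothetical, written only for this illustration. They handle just the two i's and leave everything else to the default mapping; this is not how QlikView does it.

```python
def turkish_upper(text):
    """Upper-case text using the Turkish rule for the two i's (toy helper)."""
    # Map dotted i -> dotted İ first; the default upper() then maps ı -> I.
    return text.replace("i", "İ").upper()


def turkish_lower(text):
    """Lower-case text using the Turkish rule for the two I's (toy helper)."""
    # Map İ -> i and I -> ı explicitly before the default lower() runs.
    return text.replace("İ", "i").replace("I", "ı").lower()


print("istanbul".upper())          # ISTANBUL  (default, "western" mapping)
print(turkish_upper("istanbul"))   # İSTANBUL
print(turkish_lower("ISPARTA"))    # ısparta
```

With the default mapping, a search for "istanbul" that ignores case would be compared against "ISTANBUL"; under the Turkish rule the upper-case form is "İSTANBUL", so the two rules disagree about what counts as a match.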

An example of how these differences affect sort orders and search results can be seen in the pictures below:

English.png   Swedish.png

The search string is the same in both cases, and should match all field values that have words beginning with ‘a’ or ‘v’. Note that sort orders as well as search results differ.
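As a rough illustration of why the same search can give different matches and a different sort order, here is a minimal Python sketch of two toy collations. It is not how QlikView or the operating system implements collation: the function names (`english_key`, `swedish_key`, `matches_word_prefix`) and the sample values are made up for the example, and the rules cover only the A/Å/Ä/Ö and V/W cases from the list above.

```python
import unicodedata

# Toy Swedish alphabet: å, ä, ö are separate letters after z; w is folded into v.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvxyzåäö"


def strip_marks(text):
    """Drop combining diacritical marks, e.g. 'š' -> 's', 'ä' -> 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def english_key(text):
    """Toy English-style key: ignore case and diacritics."""
    return tuple(strip_marks(text.casefold()))


def swedish_key(text):
    """Toy Swedish-style key: keep å/ä/ö distinct, treat w as a variant of v."""
    key = []
    for ch in unicodedata.normalize("NFC", text.casefold()):
        base = ch if ch in "åäö" else strip_marks(ch)
        for c in base:
            c = "v" if c == "w" else c
            rank = SWEDISH_ALPHABET.find(c)
            # Letters sort by alphabet position; anything else sorts before them.
            key.append((1, rank) if rank >= 0 else (0, ord(c)))
    return tuple(key)


def matches_word_prefix(value, query, key):
    """True if any word in value begins with query under the given key."""
    q = key(query)
    return any(key(word)[: len(q)] == q for word in value.split())


values = ["Anna", "Ärta", "Åsa", "Viktor", "Walter", "Zebra"]

print(sorted(values, key=english_key))
# ['Anna', 'Ärta', 'Åsa', 'Viktor', 'Walter', 'Zebra']
print(sorted(values, key=swedish_key))
# ['Anna', 'Walter', 'Viktor', 'Zebra', 'Åsa', 'Ärta']
print([v for v in values if matches_word_prefix(v, "a", english_key)])
# ['Anna', 'Ärta', 'Åsa']
print([v for v in values if matches_word_prefix(v, "a", swedish_key)])
# ['Anna']
```

The design choice here is to reduce each string to a comparison key and then use that one key for both sorting and matching, which is why a change of collation shifts search results and sort order together.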

Hence: A number of differences exist between languages that have special characters or characters with diacritic marks, e.g. Å, Ä, Ö, Æ, Ø, Þ, Ś, Ł, Î, Č. Sometimes these characters are considered separate letters, sometimes not. Some languages even have collation rules for letter combinations and for where in the word an accent is found. An overview can be found on Wikipedia.

So, how does QlikView handle this?

When QlikView is started, the collation information is fetched from the regional settings of the operating system. This information is then stored into the qvw file when the script is run.

Locale.png

Usually you don’t need to think about this, but should you want to test it yourself, just change the regional settings in the control panel (the Formats tab – not the Location tab), restart QlikView, and run the script of your application.

Bottom line – should you need to change the collation, you should do it on the computer where the script is run.

HIC

Further reading related to this topic:

Text searches

The Search String

The Expression Search

26 Comments
luciancotea
Specialist

Valuable insights from HIC, as always!

One thing:

"You enter a string, and QlikView immediately searches..."

Could a short delay be added so that I get a chance to type more letters? When searching in big files it takes forever to get past the first letter.

Not applicable

Great and useful post!

One question remains for me: why does QlikView let developers explicitly define the number representation (decimal and thousand separators, time format etc.) in the script, but use the regional settings for collation? To me it would make more sense to either define both in the script or get both from the regional settings.

hic
Former Employee

I agree that it would make sense to have an environment variable for the collation also. I really don't know why there isn't one already. It's only recently that we have become aware of this "anomaly". We just haven't thought of it earlier, I guess...

HIC

PS I just spoke to our main developer about this, and his answer was that it has historical reasons: collation information for regional settings other than the current one was simply not available in earlier versions of Windows. So, at the time, it didn't make sense to have a variable for collation.

Not applicable

Great post..

About regional settings, it would be a big advantage if the user could choose the number and date representation in the qvw file. For example, a US user and an EU user who open the same file could each choose their regional settings via a variable.

rbecher
MVP

Thanks Henric, very good insights! How can this work in a multilingual company / environment if it is load-based ("This information is then stored into the qvw file when the script is run.")?

simondachstr
Luminary Alumni

"You enter a string, and QlikView immediately searches in the active list box and displays the matches."

You're entirely correct and this is a major issue. In bigger applications this makes search boxes completely useless (loading times of over 10 seconds before I can type the second letter). And big applications with many columns and rows are where you need the search functionality the most.

hic
Former Employee

If by "multilingual" you mean that you'd want different collations for different users, my answer is: It can't. The evaluation is done by the engine in the back-end, and it works independently of the user settings.

And I am not sure that you'd want it to work differently. It would imply that a bookmark containing a search would result in different selections in Sweden and Germany. Imagine the confusion if we were to call each other and discuss the result of the bookmark selection...

HIC

rbecher
MVP

It would be good to have a threshold so the search starts only after 2-3 characters or so.

rbecher
MVP

You're right. But then it could be difficult to find names in a multilingual scenario with local branches.

jerrysvensson
Partner - Specialist II

Yeah, we found this about a year ago.

The accounts welul and velul were considered the same.
