Not applicable

autonumberhash128 VS autonumber

Hello,

Can anybody explain what the advantage is of using the expression 'autonumberhash128' versus the 'classic' autonumber?

Currently I don't see any advantages, since it is not even possible to create multiple counter instances, which is possible with autonumber. Perhaps autonumberhash128 has better performance, although this is not clearly stated in the reference manual.

Is there anyone who can shed more light on this issue?

Kind regards,

Daniel

8 Replies
Not applicable
Author

Daniel,

As I read it, autonumber stores the expression value and assigns it a unique integer, whereas autonumberhash128 stores just the 128-bit hash of the expression value. Therefore, autonumberhash128 should be more efficient in data storage (particularly when the expression value is large), and so the document size should be reduced.
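To make the difference concrete, here is a minimal Python sketch of the two bookkeeping strategies as I understand them. It is not QlikView's implementation, which is undocumented. MD5 is used only as a convenient 128-bit stand-in hash, and the class names are made up for illustration.

```python
import hashlib


class AutoNumber:
    """Maps each distinct expression value to the next sequential integer."""

    def __init__(self):
        self._ids = {}  # full expression value -> integer

    def get(self, value: str) -> int:
        # len() is evaluated before insertion, so a new value gets count + 1.
        return self._ids.setdefault(value, len(self._ids) + 1)


class AutoNumberHash128:
    """Same idea, but only a 128-bit digest of the value is kept as the lookup key."""

    def __init__(self):
        self._ids = {}  # 16-byte digest -> integer

    def get(self, value: str) -> int:
        # Assumption: MD5 stands in for whatever 128-bit hash the product uses.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return self._ids.setdefault(digest, len(self._ids) + 1)
```

In this sketch the second table holds 16 bytes per distinct value regardless of how long the key is, which is where the storage saving would come from. The price is that two different values with the same digest would receive the same number.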

Willing to be proven wrong though!

Regards,

Gordon

Not applicable
Author

Hello Gordon,

Thank you for your reply! Your reasoning seems logical to me. If I find the time, I will do a test to prove you are right.

In addition to my first question, I'm now doubting whether my unique 'semantic key', which I use as input for the autonumberhash128 function, always generates a unique hash and corresponding autonumber. According to Wikipedia (http://en.wikipedia.org/wiki/Hash_function), most hash algorithms cannot guarantee unique hashes for unique inputs. I don't read anything about it in the QlikView documentation, but I wonder if autonumberhash128 creates unique autonumbers for unique inputs. If not, I don't see a use for this function.

Can you (or anyone else) clarify on this one?

Kind regards,

Daniel

Not applicable
Author

Hi to all,

I am also very interested in a detailed (technical) description of autonumber and autonumberhash128 and their differences regarding:

- reliability
- memory usage
- performance
- storage

Would be nice if anyone from the QlikView-Team could post some additional information about these methods here ...

Best regards

Stefan WALTHER

johnw
Champion III


ddoord wrote: I wonder if autonumberhash128 creates unique autonumbers for unique inputs. If not, I don't see a use for this function.


While hash functions don't usually guarantee unique results, there are lots of ways for hash tables to handle collisions. I'd be shocked if QlikView isn't using something robust. Hashing can be pretty basic.

Speaking VERY generally and with no testing and no knowledge of their internal implementation, I'm GUESSING that the advantage of hashing over autonumber isn't in space utilization (it's a 16 byte result, after all), but rather in load speed. Let's say you have a million keys in your table. In an autonumber table, these keys are numbered 1, 2, 3... 1000000. Big loads would never finish if they were just linearly searching this table every time they come across a key during the load (O(n^2) performance), so QlikView is probably using some sort of self-balancing tree with O(n log n) performance. A hash table, on the other hand, will have O(n) performance in the typical case, which is going to be faster on large data sets, and quite possibly even on small data sets due to the simplicity of hashing compared to maintaining a self-balancing tree.
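As a rough worked example of those growth rates, the following Python sketch estimates the number of lookup steps for a load with n distinct keys. The function name and the step counts are illustrative assumptions only. They ignore constant factors and say nothing about what QlikView actually does internally.

```python
import math


def estimated_steps(n_keys: int) -> dict:
    """Order-of-magnitude step counts for numbering n_keys distinct keys."""
    return {
        # Linear search: on average about half the table is scanned per key.
        "linear_scan": n_keys * n_keys // 2,
        # Self-balancing tree: about log2(n) comparisons per key.
        "balanced_tree": round(n_keys * math.log2(n_keys)),
        # Hash table: about one probe per key in the typical case.
        "hash_table": n_keys,
    }


steps = estimated_steps(1_000_000)
```

For a million keys this gives about 5 * 10^11 steps for the linear scan, roughly 20 million for the tree, and 1 million for the hash table.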

I imagine the load speed difference would be negligible in most cases, though.

Perhaps I should do some testing.

Not applicable
Author

Can I reactivate this thread? Is there any further information from QlikTech on this, in relation to the uniqueness properties of the various AutoNumber/AutoNumberHash functions?

tresesco
MVP

I wonder why there is no official information on this. If anyone has any updated information, please share.

Regards,

tresesco

Not applicable
Author

I wonder if autonumberhash128 creates unique autonumbers for unique inputs

This is technically impossible to guarantee. No hash function can guarantee unique outputs for distinct inputs unless its output is at least as large as the input itself, in which case there is no point in using such a hash function.

But collisions are rare for a good hash function.
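To put a number on "rare", here is a small Python sketch of the standard birthday-bound approximation. It assumes the hash behaves like a uniformly random function, which is an idealization, and the function name is made up for illustration.

```python
import math


def collision_probability(n_keys: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_keys distinct inputs share a hash.

    Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2^bits)).
    """
    exponent = n_keys * (n_keys - 1) / (2 * 2 ** hash_bits)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


p_128 = collision_probability(1_000_000, 128)  # a million keys, 128-bit hash
p_16 = collision_probability(1_000, 16)        # a thousand keys, 16-bit hash
```

Under that assumption, a million distinct keys hashed to 128 bits collide with a probability on the order of 10^-27, while a thousand keys hashed to only 16 bits collide almost certainly. So the risk is never zero, but at 128 bits it is far smaller than most other sources of error in a load.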

dtbit123
Partner - Contributor II

I SUGGEST NOT TRUSTING THOSE FUNCTIONS

IF YOU NEED THEM TO IMPROVE PERFORMANCE, CHECK THEIR RESULTS CAREFULLY...
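One way to do such a check, sketched in Python: export the original key together with the generated number and look for any number that was assigned to more than one distinct key. The export step and the function name are assumptions for illustration, not a QlikView feature.

```python
from collections import defaultdict


def find_collisions(pairs):
    """Return {number: sorted keys} for every number assigned to more than one distinct key.

    `pairs` is an iterable of (original_key, generated_number) tuples,
    for example rows read from an exported table.
    """
    keys_by_number = defaultdict(set)
    for key, number in pairs:
        keys_by_number[number].add(key)
    return {
        number: sorted(keys)
        for number, keys in keys_by_number.items()
        if len(keys) > 1
    }
```

An empty result means every generated number corresponds to exactly one original key in the data checked.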