Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
sujit_nath
Creator III
Creator III

Start Schema on big data set

I'm working on a large data set with csv file with 7-10 gb size. Now I need to combine 2(or more) files on a key field that is alphanumeric. How should I acheive this in the most optimized way?

Labels (1)
1 Solution

Accepted Solutions
marcus_sommer

To associate the tables within a qvw you could use: autonumber(YourKeyField) as KEY. But if you need this key already before, for example for incremental approaches autonumber() might not be suitable anymore.

An alternatively could be to replace the alpha-numeric key with a numeric ones, for example by summing/multiplying ID fields instead of string-concatenate them, for example one of my most important keys look like:

([StoreID] * pow(10, 4)) + ([OrderID]) * pow(10, 8)) + [PosID] as [UniqueKey]

Beside this the data-size in Qlik might be significantely smaller as the rawdata especially if there are only a rather low cardinality of the field-values and if not it might be worth to look if it could be optimized in that way.

- Marcus

View solution in original post

1 Reply
marcus_sommer

To associate the tables within a qvw you could use: autonumber(YourKeyField) as KEY. But if you need this key already before, for example for incremental approaches autonumber() might not be suitable anymore.

An alternatively could be to replace the alpha-numeric key with a numeric ones, for example by summing/multiplying ID fields instead of string-concatenate them, for example one of my most important keys look like:

([StoreID] * pow(10, 4)) + ([OrderID]) * pow(10, 8)) + [PosID] as [UniqueKey]

Beside this the data-size in Qlik might be significantely smaller as the rawdata especially if there are only a rather low cardinality of the field-values and if not it might be worth to look if it could be optimized in that way.

- Marcus