Hello,
I have several problems
1. Suppose I have a table like this:
name | surname |
---|---|
John | Johnson |
Peter | Peterson |
Jack | Jackson |
Hugh | Hughson |
Gary | Garrison |
I want to shuffle (anonymate, or how to say it - to make the records untraceable and anonymous that do not match the original records) like this:
shuffled_name (desired result) | shuffled_surname (desired result) |
---|---|
John | Peterson |
Hugh | Garrison |
Jack | Hughson |
Gary | Jackson |
Peter | Johnson |
To test, it is pretty easy: IF(name = shuffled_name AND surname = shuffled_surname; 1, 0); then SUM and if result > 0 a record hasn't been shuffled properly. But the question is how to do it?
2. Suppose I have a table like this:
product | anonymated product (desired result) |
---|---|
shampoo | product 1 |
soap | product 2 |
soap | product 2 |
bath gel | product 3 |
shampoo | product 1 |
It would be easy to do if the records would all be unique, but how to do it if there are duplicates like in the table below?
3. Suppose I have a table like this:
address | anonymated address (desired result) |
---|---|
Johnson Street 15 | address 1 |
Gary Street 96 | address 2 |
Jackson Avenue 22 | address 3 |
Gary Street 96 | address 2 |
Johnny Street 21 | address 4 |
Again, it would be very easy to do if all the records would be unique and there would be no empty or <null> values.
4. Also, what would be the best way to explore the data - suppose you have a huge transactions or sales table, and 10± other ones, is it better to use easyqlikviewer or to import all the data and to vizualize and search it? And why?
Thank you,
Juras
QlikView Desktop has a feature to scramble fields (not entire rows or you may lose your links). It works by replacing the current textual values in the symbol table of a field with scrambled data (just plain unreadable jibberish but not garbage data). See Settings->Document Properties->Scrambling
This feature is typically used for anonymizing data that shouldn't be transmitted to any outside party (which creates problems when for example you want help with the design of your document from a remote 3rd party)
However, AFAIK you cannot dictate what those scrambled values should look like. For that, you may need to design and implement a translation of your own. For example, replace all product names with "Product "+index number of the original field value (or autonumber). May require quite a few mapping tables
#2 and #3, can be solved using autonumber function.
LOAD address,
'Address ' and autonumber(address) as anonymated_address
.....
For #1,
probably you can use something like:
LOAD name, above(surname) as surname
....