Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Oct 21, 2022 7:51:21 AM
Oct 21, 2022 7:46:53 AM
Growing up as a childhood one of my fondest memories, thoughts, activities was of Secret Decoder Rings. The ability to convert cryptic keys into something more readable just fascinated me. Turn the ring to a coded image and a letter/number would appear. Or place the ring in a certain position to reveal a clue. Very likely why I love analytics so much. Take a bunch of raw 0's and 1's, spin data stewards from the business units around to find the super top secret Boolean logic and out comes actionable insights.
If you were intrigued with Secret Decoder Rings as a child, or still are, then you understand immediately why I would bring them up and would probably be happy if I elaborated even more. But since you may not be, I will get right to the point of why I chose this imagery.
My buddy Dave Freriks was recently tasked with creating a Qlik Cloud Data Integration project that moves SAP data from raw to analytics ready. To me the term "analytics ready" means that millions of data workers can work with the data in a self-service fashion. Let me share two problems that Dave had to solve in making that vision a reality.
As you can imagine, SAP provides thousands of columns of data. Yeah!
What you might not have known is that all of the columns are named with cryptic German shorthand abbreviations like VBELN, MATNR, KUNNR. Yuck!
Let me be clear, in a centralized SAP "reporting" world for experts that's brilliant :
Thus, the problem isn't for them, the problem is in decentralizing all of that data to a more active intelligence space:
The answer to both questions is an emphatic NO. For Dave, the problem was in building a project that moved the SAP data to a cloud platform, say Google Big Query, in such a manner that the operational data stores, the data warehouses and the data marts contained user friendly names. Making the data consumable by them, not continuing their reliance on the SAP reporting experts. Which means that Dave needed to apply the SAP Secret Decoder Ring they have in their heads, while moving the 0's and the 1's.
Fortunately for Dave, we have one of those super duper, SAP Secret Decoder Rings embedded in his brain, people on staff named Vincent Hammond. Out of the kindness of his heart, and his faith it would be returned, Vincent transferred his brain to Dave so that he could translate the thousands of column names needed for this work.
Obviously I'm just kidding. But he did the next best thing.
You see, although SAP uses the german short hand codes for the column names, it also retains metadata descriptions about them. Ones that it's global customers can customize as desired in the languages they need. So Vincent wrote a query in SAP that extracted all of that lovely metadata, and then exported it to a CSV file.
You see metadata, about the data, is always the secret decoder ring. Within a few minutes we had a fantastic translation map for over 8,000 column names from 175 extractors. It's not as shiny and fun to look at as the AI generated secret decoder ring I generated for this post, but it's the real deal translations.
If you are familiar with data storage you know that columns don't just exist on their own, they are grouped into tables. Like columns, those tables also have names. In the SAP world they have names like 0ACCOUNT_ATTR, 0CHRT_ACCTS_ATTR, 0CUSTOMER_ATTR, 2LIS_11_VAHDR and 2LIS_11_V_ITM.
Those names are more readable than the column names for sure but they also posed a problem for Dave. You see Google Big Query doesn't allow names that start with numbers. Don't go shaking your finger at Google, all data storage vendors have name restrictions of some sort.
Compounding the issue is the fact those metadata descriptions for the column names also contain characters that Google Big Query, and all vendors, don't like to have in names either.
Fortunately for Dave, Qlik Cloud Data Integration provides a way to rename tables and columns via what is called Transformation Rules. He simply pointed the rule to a CSV file that renames the tables to compliant names. Ones that Google Big Query and the other vendor's data targets are happy with. Ones that are also a lot more "analytics ready" because they are more human-readable.
The column name issue is slightly more complicated than the table names. You see, each column's metadata descriptions can vary for each table they are contained in.
YIKES!!!!!
That means that my poor buddy Dave not only needed the translation for those german short hand code names, he needed the translation to be on a table by table level and in a format that was formatted for Google Big Query. To do that my buddy Dave, reached out to his buddy, me.
I utilized Qlik Sense to read that massive SAP metadata export from Vincent, and with a few lines of code (attached) replaced all of the characters and blank spaces and generated the needed table level translations needed.
Dave was then able to provide the different translations for each of the tables in his Qlik Cloud Data Integration project. He simply set up a Transformation rule for each of the datasets in the project.
And then pointed the transformation rule to the personalized dictionary/csv file.
Qlik Cloud Data Integration then made him a cup of coffee and proceeded to apply the secret decoder ring to all of the column names in each table/dataset.
To be honest, it didn't really make him a cup of coffee, and the point of this post wasn't to make you realize he deserves one. But there are a few really key points I want to ensure you don't miss: