Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Friends,
I got a requirement that is Categories the types of Incidents from ServiceNow data.
Our source data from ServiceNow, every day we will load data from ServiceNow to Qlik sense. Actually the data flow is ServiceNow >> Gsheet >> Qliksense.
Actually the requirement is in the below:
Please find the below sample incident data which was raised by users with different short descriptions.
Incident_number | Short Description | Created By | Open Date | Due Date | State |
INC001 | Unable to access XYZ stream, please provide access | ABC | 3/27/2024 | 4/1/2024 | Open |
INC002 | Got error message like 'Access Denied' while accessing the ABC dashboard URL link | CDE | 3/26/2024 | 4/1/2024 | Open |
INC003 | I am unable to access the folder | FGH | 3/25/2024 | 4/1/2024 | Open |
INC004 | I am unable to sign in Workday portal | IJK | 3/24/2024 | 4/1/2024 | Open |
INC005 | Can you grand me access to the database | LMN | 3/23/2024 | 4/1/2024 | Open |
INC006 | Batch job failed to run | OPQ | 3/22/2024 | 4/1/2024 | Open |
INC007 | XYZ Dashboard qmc task aborted | RST | 3/21/2024 | 4/1/2024 | Open |
INC008 | Failed - Informatica xyz_inflow task incomplete | UVW | 3/20/2024 | 4/1/2024 | Open |
Here row number from 1 to 5 should fall under "Access Related Issue" category and row number from 6 to 8 should fall under "Batch Job Related Issue".
Category | Count of Incidents |
Access Related Issue | 5 |
Batch Job Related Issue | 3 |
How can we achieve this. Can you please help me this.
Appreciate your help and thanks in advance.
This is a very hard task without using a flag or a default list of descriptions.
Thanks @F_B for your comment.
Yes it is hard task. I tried with wildmatch function with hardcode a specific words but clients are not accepting. They are approaching ML and Python way to implement this task.
I am not an expert in ML and even Qlik also so want your all help for this post.
Thank you.
Indeed. Perhaps you can use SubStringCount() or WildMatch() to identify specifics in the text.
Actually clients requested to don't use hardcode anything in the dashboard. Why because please see the below example you can understand the reason of it
Ex. If the user creating an incident regarding about server issues. Accidentally user mentioned the short description like "serrver" instead of "server".
But in our wildmatch function and Subfield function we wrote like "server" only then this incident won't be fall under "Server Related Issue" Category right.
I hope you can understand.
The difficulties are mainly not related to the used tool else to define the appropriate rules and logic which are suitable to the scenario. Implementing it in any tool will have their challenges and efforts.
If I had this task I would probably do it in Qlik with the following measurements:
whereby the real order may differ to the provided listing and also be combined in some way and not always applied sequentially.
Like hinted above the main-work is not to apply the various measurement in one or two dozens statements else defining all the stuff. Getting a proper categorizing for 30% to 50% will be surely quite easy, the following 10% - to 20% will need more efforts and the costs to get 90% to 98% will definitely increase exponentially and you won't never reach 100%.
Much simpler as the above would be to include some mandatory fields within the incident data with pre-defined values in a dropdown.
Hi @marcus_sommer ,
Thanks for your update. But here you are using hardcode the text right.
Can we approach different way without hardcoding the text. Is it possible ?
Of course, it was meant without hard-coding the most things else using external maintained lists and rules - maybe within Excel - and loading them as mappings against the text/words.
The company-definitions are specific and would need to be done from your business-departments but the other things like the mentioned classifying/cleaning/preparing could happens against (commercial) dictionaries and rules respectively derived programming logic.
Just a further hint - instead of diving very deep in the cleaning-measurements by looking and fixing double-chars within the words or correcting very common mistakes by chars like p and b / t and d and so on you may also implement the LevenshteinDist - script and chart function | Qlik Cloud Help to check the similarity between the words.