JustinDallas
Specialist III

AutoML-Classify By Tags from Description

Hello Folks,

I am working with AutoML. My task is this: given a set of articles, some of which already have tags attached, how do I use AutoML to tag the untagged ones?

For instance, let's say this is my data model.

[BI Platforms]:
Load * Inline
[
	'Application Id', 'Application', 'Is Tagged', 'Description'
    1, Qlik, 'Y', '... Excepteur sint occaecat cupidatat...'
    2, Tableau, 'Y', '...adipisci velit, sed quia non numqua...'
    3, Cognos, 'N', '...Sed ut perspiciatis unde omnis iste natus error sit voluptatem accu...'
    4, PowerBI, 'N', '...erum facilis est et expedita d...'
]
;

[BI Platforms Tags]:
Load * Inline
[
	'Application Id', 'Tag'
    1, 'Enterprise Offering'
    1, 'Cloud Offering'
    1, 'Green'
    1, 'No Templates'    
    2, 'Templates'
    2, 'Enterprise Offering'
    2, 'Cloud Offering'
]
;

Exit Script
;

 

In the model, we see that Qlik and Tableau have tags attached based on their Description column.  What I want to do is create tags for Cognos and PowerBI based on their description.

 

Is this possible with AutoML?

I've worked through the Wine Quality tutorial, but it only allows you to select a single field to target.  This use case is a little different.

Any help is greatly appreciated.

2 Replies
KellyHobson
Former Employee

Hey @JustinDallas ,

Thanks for reaching out to the AutoML forum.  At this time, AutoML is designed to work with labeled datasets for binary classification, multi-class classification, and regression problems. 

In your case, text modeling or generative AI may be a good option to explore, since you are trying to generate tags from a string/text field.

Best,

Kelly 

 

christy2951hernandez
Contributor

Hello there @JustinDallas ,

You're right, the standard AutoML classification functionality in Qlik isn't directly designed to handle multi-label classification based on descriptions and existing tags. However, there are a couple of approaches you can consider to achieve your goal.
1. Preprocess the "Description" text. This might involve cleaning, tokenizing, and converting the text into numerical features suitable for machine learning models (e.g., TF-IDF).
2. Create a binary feature for each existing tag you want to predict (e.g., "Has_Enterprise_Offering", "Has_Cloud_Offering"). You can derive these from the existing tags table, for example with an If() expression in the load script or a CASE expression in SQL.
3. Train an AutoML model for multi-class classification. Set the target to a single field that combines the binary tag features you created, so that each predicted class stands for a set of tags.
4. Use the trained model to predict tags for your untagged articles (the Cognos and PowerBI descriptions).
5. Analyze the model's feature importance to understand which parts of the descriptions contribute most to specific tag predictions. This makes the results easier to interpret.
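
As a rough illustration of the binary tag features and the combined target described above, here is a minimal sketch in plain Python, run outside Qlik. The tag rows are copied from the [BI Platforms Tags] table in the question. The helper names (flag_name, build_tag_flags, combined_target), the "|" separator, and the "No_Tags" label are assumptions made up for this example, not part of AutoML.

```python
from collections import defaultdict

# Tag rows copied from the [BI Platforms Tags] table in the question.
tag_rows = [
    (1, "Enterprise Offering"),
    (1, "Cloud Offering"),
    (1, "Green"),
    (1, "No Templates"),
    (2, "Templates"),
    (2, "Enterprise Offering"),
    (2, "Cloud Offering"),
]


def flag_name(tag):
    """Turn a tag such as 'Cloud Offering' into 'Has_Cloud_Offering' (hypothetical naming)."""
    return "Has_" + tag.replace(" ", "_")


def build_tag_flags(rows):
    """Return {application_id: {flag_name: 0 or 1}} with one flag per known tag."""
    all_tags = sorted({tag for _, tag in rows})
    tags_by_app = defaultdict(set)
    for app_id, tag in rows:
        tags_by_app[app_id].add(tag)
    return {
        app_id: {flag_name(t): int(t in tags) for t in all_tags}
        for app_id, tags in tags_by_app.items()
    }


def combined_target(flags):
    """Join the active flags into one class label usable as a multi-class target."""
    active = sorted(name for name, value in flags.items() if value == 1)
    return "|".join(active) if active else "No_Tags"


flags = build_tag_flags(tag_rows)
for app_id, row in sorted(flags.items()):
    print(app_id, row, combined_target(row))
```

Note that with a combined target, every distinct combination of tags becomes its own class, so a small tagged set may leave very few examples per class. Training one binary model per flag is a common alternative when that happens.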

Alternatively, train an AutoML model for text classification, using "Description" as the input and the existing tags (one-hot encoded) as the target, and use it to predict tags for your untagged articles.
Set a confidence threshold for the predicted tags. This filters out low-confidence predictions and reduces the risk of assigning inaccurate tags.
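
The confidence threshold can be applied after prediction with a few lines of code. This is only a sketch: it assumes you have exported one predicted probability per tag for each untagged article. The select_tags function and the scores below are invented for illustration and are not real model output.

```python
def select_tags(probabilities, threshold=0.5):
    """Keep tags whose predicted probability meets the threshold, highest first."""
    kept = [(tag, p) for tag, p in probabilities.items() if p >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in kept]


# Hypothetical per-tag probabilities for one untagged application (made-up numbers).
cognos_scores = {
    "Enterprise Offering": 0.91,
    "Cloud Offering": 0.48,
    "Templates": 0.67,
    "Green": 0.05,
}

print(select_tags(cognos_scores, threshold=0.6))
```

Raising the threshold yields fewer but more reliable tags. Lowering it tags more articles at the cost of more mistakes, so it is worth tuning against articles whose tags you already know.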

 

Best Regards,