Qlik AutoML is a tool within Qlik Cloud for quickly training and deploying models and then making predictions against them. This article covers best practices for preparing training datasets for ML experiments and apply datasets for generating predictions.
Guidelines
- Use camelCase for column names, or at least avoid spaces and punctuation.
- Do not use special characters in column names.
- When working with Excel data, remove all formatting such as bold/italics, borders, currency, or color formats.
- Make sure data types are consistent between the training and prediction (apply) datasets. It may be worth checking in Qlik Catalog to confirm that each column is profiled with the data type you expect.
- Qlik AutoML only works with structured, tabular data. Any flat file that can be uploaded and profiled in Qlik Cloud can be used by AutoML. Based on experience, we have had the best results with CSV, QVD, and XLSX formats.
- AutoML does not support sentiment analysis. This would require a third-party service (such as Amazon Comprehend) to generate structured data points for AutoML to use. You can use Comprehend in Qlik Sense directly using our connector and then use that output as a feature in AutoML.
- For multi-table files such as Excel workbooks, only the first sheet is used as the table.
- Date columns are currently treated as a categorical feature type. Engineer date columns into numeric features before using the dataset in an ML experiment.
- Be aware of dataset size limits based on your tenant type.
- For null data, a column that contains more than 50% null values is dropped.
Otherwise (50% or fewer null values), the nulls are filled in as follows:
For numeric type: the mean is used.
For categorical type: the value 'other' is used.
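
To see how the null handling rule would affect a dataset before uploading it, the rule can be approximated locally. The following is a minimal sketch using pandas, assuming that columns above the 50% threshold are dropped and that remaining nulls are filled with the mean (numeric) or 'other' (categorical). The function name `preview_null_handling` and the sample columns are hypothetical, and this is not AutoML's actual implementation.

```python
import pandas as pd


def preview_null_handling(df: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """Approximate the null handling described above on a local copy.

    Hypothetical helper for inspection only; AutoML applies its own logic.
    """
    # Share of null values per column (0.0 to 1.0).
    null_fraction = df.isna().mean()

    # Keep only columns whose null share does not exceed the threshold.
    kept = df.loc[:, null_fraction <= threshold].copy()

    for name in kept.columns:
        if pd.api.types.is_numeric_dtype(kept[name]):
            # Numeric columns: fill nulls with the column mean.
            kept[name] = kept[name].fillna(kept[name].mean())
        else:
            # Categorical columns: fill nulls with the value 'other'.
            kept[name] = kept[name].fillna("other")
    return kept


# Illustrative data (made up for this sketch).
sample = pd.DataFrame(
    {
        "monthlySpend": [10.0, None, 30.0, 20.0],
        "planType": ["basic", None, "premium", "basic"],
        "mostlyEmpty": [None, None, None, 1.0],
    }
)
cleaned = preview_null_handling(sample)
print(cleaned)
```

In this sample, `mostlyEmpty` is 75% null and is removed, while the single gaps in the other two columns are filled. If the mean or 'other' is not a sensible default for a column, consider filling the nulls yourself before uploading.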
We will continue to update this list as we encounter other issues related to data used in Qlik AutoML.
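
Several of the guidelines above (column naming, date columns, and consistent data types) can be applied in a small preparation step before upload. The following is a rough sketch using pandas, not a Qlik-provided utility. The helper names `to_camel_case`, `add_date_features`, and `dtype_mismatches`, as well as the sample columns, are hypothetical.

```python
import re

import pandas as pd


def to_camel_case(name: str) -> str:
    """Turn a name such as 'Order Date' into 'orderDate'.

    Only ASCII letters and digits are kept; everything else is treated
    as a separator and removed.
    """
    words = re.findall(r"[A-Za-z0-9]+", name)
    if not words:
        return name
    first, rest = words[0], words[1:]
    return first[0].lower() + first[1:] + "".join(w[0].upper() + w[1:] for w in rest)


def add_date_features(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Replace a date column with numeric year, month, and day-of-week columns."""
    parsed = pd.to_datetime(df[column], errors="coerce")
    out = df.drop(columns=[column])
    out[f"{column}Year"] = parsed.dt.year
    out[f"{column}Month"] = parsed.dt.month
    out[f"{column}DayOfWeek"] = parsed.dt.dayofweek  # Monday is 0
    return out


def dtype_mismatches(train: pd.DataFrame, apply_df: pd.DataFrame) -> dict:
    """List shared columns whose pandas dtype differs between the two datasets."""
    return {
        c: (str(train[c].dtype), str(apply_df[c].dtype))
        for c in train.columns
        if c in apply_df.columns and train[c].dtype != apply_df[c].dtype
    }


# Illustrative training data (made up for this sketch).
train = pd.DataFrame(
    {"Order Date": ["2024-01-15", "2024-03-02"], "Total $ Amount": [19.5, 42.0]}
)
train.columns = [to_camel_case(c) for c in train.columns]
train = add_date_features(train, "orderDate")

# An apply dataset where the amount was accidentally read as text.
apply_df = pd.DataFrame({"totalAmount": ["19.5", "42.0"]})
mismatches = dtype_mismatches(train, apply_df)
print(list(train.columns))
print(mismatches)
```

Which date parts are useful depends on the use case; year, month, and day of week are only examples. The dtype check compares pandas types locally, so it is still worth confirming the profiled types in Qlik Catalog after upload.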
Environment
Qlik AutoML
The information in this article is provided as-is and is to be used at your own discretion. Depending on the tool(s) used, customization(s), and/or other factors, ongoing support on the solution below may not be provided by Qlik Support.
Related Content
How To Get Started with Qlik AutoML