Qlik Community

Knowledge

Search or browse our knowledge base to find answers to your questions ranging from account questions to troubleshooting error messages. The content is curated and updated by our global Support team

Announcements
Support Case Portal has moved to Qlik Community! Read the FAQs to start exploring Support resources.

Using Sagemaker AutoML in an Automation

NikoNelissen-Qlik

Using Sagemaker AutoML in an Automation

This article shows how the Sagemaker connector in Qlik Application Automation can be used to leverage the AutoML features from Sagemaker AutoPilot. In this example we are using data in a CSV file to train a model using AutoML. Next we use the newly created endpoint from Sagemaker to get predictions on data in a Google Sheet. We write the predictions to a new column in the same Google Sheet and we visualize this data in a Qlik Sense app.

Next, we make the automation available using a button on a sheet in the Qlik Sense app and we add an input to the automation so that the user can vary the input data, get new predictions and immediately see the visualization of these new predictions in Qlik Sense.

Prerequisites

  • GSuite account with Google Sheet access (optional, the automation can be modified to use S3 only)
  • AWS account with Sagemaker and S3 access
  • Qlik Sense Enterprise account on Qlik Cloud
  • Understanding of Sagemaker Autopilot

Create an AutoML job

We have a CSV file with input training data stored in a bucket on S3.
We are using this dataset for demo purposes: https://datahub.io/machine-learning/bank-marketing

We're using this file to create an AutoML job in an automation using the block "Create AutoML job from S3" in the Sagemaker connector. We're using the same S3 bucket for output:

Sagemaker Create AutoML blockSagemaker Create AutoML block

Getting the status of the AutoML job

Once the AutoML job is created, we'll use the "get AutoML job" block to fetch its status:

Sagemaker get AutoML job blockSagemaker get AutoML job block

It can take minutes or hours for the job to complete, we're running the Automation manually every hour or so until the AutoMLJobStatus in the response is "Completed" (bottom of screenshot):

Result of Get AutoML jobResult of Get AutoML job

 

Getting predictions from the endpoint

We are taking note of the name of the endpoint that was created (from the "Get AutoML job" block output, or we can use the lookup feature), and we use the block "Invoke endpoint" to get prediction on one row of new CSV input data:

Get a prediction from SagemakerGet a prediction from Sagemaker

Saving the predictions

In the following example we have input data in a Google Sheet, we're looping over the lines in the sheet and invoking the endpoint for each line:

Getting predictions for rows in a Google SheetGetting predictions for rows in a Google Sheet

Note that we apply the "csv" formula on the row data because we have set the content-type to text/csv.

We write the prediction and probability back to the Google Sheet in a separate column Q and R:

Writing the prediction to a cell in the Google SheetWriting the prediction to a cell in the Google Sheet

We use the following json path to calculate the correct line number, needed in the input "Start Cell". We take the index of the loop (which starts with 0) and we add 2 because the first line number in Google Sheet is 1 and this contains a header line so we want to start writing on line 2:

Calculating the line number to updateCalculating the line number to update

Visualizing the data

Finally we are doing a reload of a Qlik Sense app that visualizes the data from the Google Sheet:

Do reload of Qlik Sense appDo reload of Qlik Sense app

Note that we set up a datasource in Qlik Sense pointing to the same Google sheet file as the Automation is using, in order to visualize its content:

Datasource in Qlik Sense to Google SheetDatasource in Qlik Sense to Google Sheet

What-if simulations

We can add an input in the Automation, to allow the user to do "what-if" simulations. In this example, we're allowing the user to change one variable, which is the "number of days since last contact":

Automation with input for what-if simulationAutomation with input for what-if simulation

In the Automation, we replace one column of the data from the Google Sheet with the input that the user provided, before we send it to Sagemaker to get a prediction. We do this using a variable of type "object". First we set the variable equal to the line from the Google Sheet, and next we replace one key with the input from the user:

Variable to replace one value in the row from Google SheetVariable to replace one value in the row from Google Sheet

Same as before, we write the prediction to a new column in the Google Sheet (as well as the user input). This is what our Google Sheet will look like:

Google Sheet with predictions in green added by automationGoogle Sheet with predictions in green added by automation

And at the end of the automation, again we reload the Qlik Sense app.

Finally, we're adding a button on a Sheet in the Qlik Sense app, that we link to the Automation.

Button in Qlik Sense sheet linked to an automationButton in Qlik Sense sheet linked to an automation

Now we support a full cycle of a What-if simulation:

  • User clicks the button and the Automation detail page is opened
  • User enters a value for the what-if simulation when the Automation runs
  • Predictions are made and the Qlik Sense app is reloaded so that the user immediately sees the result of the simulation:

Visualization of Sagemaker AutoML predictionsVisualization of Sagemaker AutoML predictions

Download

See attachments to download the automation template (json file) and the Google Sheet file used in this article.

In the automation editor, right mouse click on the canvas (center pane) and select "upload workspace". Next, select the json file to import the template of the automation. 

Attachments
Version history
Revision #:
1 of 1
Last update:
‎2021-09-09 07:29 AM
Updated by:
 
Contributors