How to read Amazon S3 files - approach #1 - Web File

    SUMMARY: This short how-to will show you how to read Amazon S3 files that Qlik Sense or QlikView can consume via either a Web File data connection or the Qlik REST Connector. That will cover CSV, TSV, other text-formats, HTML, XML, JSON, XLSX, XLS, QVD, QVX to name the most important file types. The requirement for using this approach is that the files or the bucket in question has to be public. That is that the permissions has been set to at least READ in the permissions for the file or bucket. Other HOW-TOs will follow with additional approaches for S3 that include full security and authentication.

     

    Introduction

    I know there are a lot of people that might be struggling with finding out how to access data they have in Amazon S3 (Simple Storage System). Amazon S3 is quite popular as a storage system due to the cost and also the eco-system surrounding it. The eco-system constitutes the number of applications both in Amazon's EMR and Amazon's other applications and almost any software that will happily run in the virtual machines of Amazon EC2, all of whom that can access this as a low cost storage. Additionally there is nothing stopping you from accessing this remotely either.

     

    Limitations for this approach

    I have several approaches to share. I will start with the simplest and most straight forward. The reason that this is not enough for all use-cases is that this method is limited to files or buckets that you have marked as public. You can however make these public files with very convoluted and cryptic file names. This will make it extremely hard if not impossible for anyone to figure out which can constitute an acceptable form of security for many use-cases.

     

    The recipe (the Qlik Sense version - QlikView is very similar though):

     

    1. Go to Amazon Console for S3 and into the bucket where you have your file. Alternatively you create a bucket.
    2. Upload a file to the bucket.
    3. Set the file permissions to public by either pressing the button "Make Public" or go into the permissions and select READ for Everyone for the file.
    4. Copy the full link for the file that looks something like this: https://s3.us-east-1.amazonaws.com/mybucket/SampleData.xlsx.
    5. Go into Qlik Sense or QlikView and create a new connection and select "Web File".
    6. Paste the link into the URL and give it a sensible name and save.
    7. You can then click the Select Data icon and make the selections you want and press Insert Script.

      LOAD

          *

      FROM [lib://MySpreadsheet] (ooxml, embedded labels, table is Sheet1);

    8. Press Load Data.
    9. If everything was done correctly you should have the data populated into a Qlik table in your data model.
    10. You are now ready to make your visualizations in the UI.

     

    Any type of file can be uploaded. To access it from either Qlik Sense or QlikView it is necessary that the access method that will be "Web File" or "Qlik REST Connector" for this example supports the file type that you want to read from S3.