Re: Looking for Best Practise regarding loading da... - Qlik Community

Skage

Hi all.
I'm looking for some suggestions/advice on how to extract data from a GraphQL API where the data needs to be explored and adapted during development.

The flexibility of the API makes it extremely difficult to iterate, since the script will most likely be generated differently each time the query is altered. I'd like to stay away from attempting to write everything by hand.

My strategy, when working with regular REST APIs, is to keep each script untouched, save all tables with the names the script generated, and do the joins and renaming in the transform script.

This works, but once the iterations start, I have a hard time aligning the old and the new names and structures. This is a huge API, and I'm tackling the most complex part of it, with paging and "cost" constraints for the queries.

So, the strengths of GraphQL when it comes to frontend development and flexibility make it a nightmare to work with in Qlik.

Working with the current rather dx-unfriendly REST connector can be a challenge at times, but this is at a completely different level.

Any suggestions to keep me sane are appreciated!
/lars

marksouzacosta

Hi @Skage,

One approach worth considering, depending on your environment, is to stop trying to handle GraphQL in the Qlik Cloud REST Data Connector altogether and let a scripting layer deal with it instead.

The idea is simple: a Python script (or whatever language fits your stack) talks to the GraphQL API. It handles the query, the pagination, the cost tracking, and any retry logic. The output is just flat CSV files dropped into a cloud storage service (AWS, Azure, Dropbox, etc.) or even onto an on-prem server with Qlik Data Gateway installed to expose the files to Qlik Cloud.
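As a rough illustration of that scripting layer, here is a minimal sketch of the pagination and CSV part. It assumes a Relay-style cursor (`pageInfo { hasNextPage endCursor }`), which your API may or may not use. The `orders` field, the query text, and the `execute` callable (whatever function actually sends the query over HTTP and returns the parsed JSON) are all hypothetical names, not something from your API.

```python
import csv
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical query: the `orders` field and the Relay-style `pageInfo`
# block are assumptions about the API, used only for illustration.
QUERY = """
query ($after: String) {
  orders(first: 50, after: $after) {
    nodes { id status }
    pageInfo { hasNextPage endCursor }
  }
}
"""

# Any function that sends (query, variables) and returns the parsed JSON body.
Execute = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def fetch_nodes(execute: Execute, query: str, root: str) -> Iterator[Dict[str, Any]]:
    """Yield every node under `root`, following the cursor until the last page."""
    cursor = None
    while True:
        data = execute(query, {"after": cursor})["data"][root]
        yield from data["nodes"]
        page = data["pageInfo"]
        if not page["hasNextPage"]:
            break
        cursor = page["endCursor"]


def write_csv(path: str, rows: List[Dict[str, Any]], columns: List[str]) -> None:
    """Write rows with a fixed column order; fields not listed are ignored."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
```

Keeping `execute` as a parameter means the loop can be tested against canned pages before it is pointed at the real endpoint.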

So, when your GraphQL query changes, you fix the Python script. Qlik doesn't care. The contract between the two sides is just the CSV column names, and you control those.
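One way to make that contract explicit is a single mapping from stable CSV column names to paths in the GraphQL response. This is only a sketch: the column names and field paths below are made up for illustration.

```python
from typing import Any, Dict

# Stable CSV column name -> dotted path inside a GraphQL response node.
# Every name here is a hypothetical example.
COLUMNS = {
    "order_id": "id",
    "order_status": "status",
    "customer_name": "customer.name",
}


def get_path(node: Dict[str, Any], path: str) -> Any:
    """Walk a dotted path; return None if any step is missing or null."""
    current: Any = node
    for key in path.split("."):
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def to_row(node: Dict[str, Any], columns: Dict[str, str] = COLUMNS) -> Dict[str, Any]:
    """Flatten one nested node into a flat row using the column contract."""
    return {name: get_path(node, path) for name, path in columns.items()}
```

If the query is restructured, only the paths on the right-hand side change; the column names Qlik loads stay the same.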

It also buys you proper tooling for the hard parts: cost budgeting, pagination loops, logging, and debugging in an actual IDE instead of inside the script editor.
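For the cost budgeting piece, a small sketch of the waiting logic. It assumes a "leaky bucket" style cost model in which the API tells you how many cost points are currently available and how fast they are restored. Whether your API reports cost this way is an assumption on my part, and the function and parameter names are hypothetical.

```python
import time
from typing import Callable


def seconds_until_affordable(query_cost: float, available: float, restore_rate: float) -> float:
    """Return how long to wait before a query of `query_cost` points fits the budget.

    `available` is the number of points left right now and `restore_rate` is
    the number of points restored per second (both assumed to be reported by the API).
    """
    if restore_rate <= 0:
        raise ValueError("restore_rate must be positive")
    shortfall = query_cost - available
    return max(0.0, shortfall / restore_rate)


def wait_for_budget(
    query_cost: float,
    available: float,
    restore_rate: float,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep just long enough for the budget to cover the next query; return the delay."""
    delay = seconds_until_affordable(query_cost, available, restore_rate)
    if delay > 0:
        sleep(delay)
    return delay
```

Calling `wait_for_budget` before each page request keeps the pagination loop itself free of throttling details.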

Where you run it depends on what you have available. AWS Lambda, Azure Functions, even a scheduled task somewhere. If you're already inside a cloud environment there's usually an obvious place to hook it in.

Might be overkill depending on how locked-in you are to the REST connector approach, but for a large API with cost constraints and changing queries, the separation of concerns tends to be worth the setup. Happy to go deeper if this direction makes sense for your setup.

Regards,

Mark Costa


Skage

Thank you @marksouzacosta

I briefly considered this option so thank you for confirming that it is a way forward.

Unfortunately, this is for a customer who wants to minimize dependencies and keep the solution self-contained within Qlik.

The current solution "works", but since the API is so huge, no query is ever really finalized, and new ideas keep changing and expanding the scope. The business logic is in the original client, and I/we have to try to "discover" many of the rules as we go.

I think the only way forward is to minimize the queries. Instead of trying to fetch nested structures, I will do many shallower fetches. So instead of modifying (that is, recreating) a working query, I add another query that fetches the id plus the new fields/structures. That means nothing existing changes, and if more fields or structures are needed, they get their own isolated handling. Never change anything: add, and eventually disable.
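To make the "add, never change" idea concrete, here is a language-neutral sketch written in Python rather than Qlik script. Each shallow query returns the id plus a few fields, queries are switched off with a flag instead of being edited, and the results are combined on the id. The query names and fields are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Append-only registry: new needs get a new entry, old entries are only
# ever disabled. Names and fields are hypothetical examples.
QUERIES = [
    {"name": "orders_base", "enabled": True, "fields": "id status"},
    {"name": "orders_totals", "enabled": True, "fields": "id totalAmount"},
    {"name": "orders_legacy", "enabled": False, "fields": "id oldField"},
]


def active_queries(registry: List[Dict[str, Any]]) -> List[str]:
    """Return the names of the queries that should still be run."""
    return [query["name"] for query in registry if query["enabled"]]


def merge_by_id(*extracts: Iterable[Dict[str, Any]], key: str = "id") -> List[Dict[str, Any]]:
    """Combine several shallow extracts into one row per id.

    Rows keep the order in which each id was first seen; an id missing from
    a later extract simply lacks that extract's fields.
    """
    merged: Dict[Any, Dict[str, Any]] = {}
    for extract in extracts:
        for row in extract:
            merged.setdefault(row[key], {key: row[key]}).update(row)
    return list(merged.values())
```

In the Qlik transform script the same effect would come from keeping one table per shallow query and joining them on the id there.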

The total load time will increase but perhaps that enables me to make incremental loads once the queries stabilize.

I will definitely build a POC to evaluate...

The world is moving to APIs of varying quality, and the DX and features of the REST connector are more or less unchanged. It is quirky to work with for anything beyond demos or simple plain table extracts.

/lars

Looking for Best Practise regarding loading data from an GraphQL-api via REST connector
