Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
DoctorPolidori
Contributor III
Contributor III

Best practice to format EXCEL files as data source

I am preparing some guidelines on how EXCEL files should look like to serve as source data for Qlik Sense.
(see draft below)

Is there any document available in QLIK COMMUNITY for further inspiration?

1. Structure of the Excel File

1.1. Headers & Field Names

  • The first row should contain clear and unique field names (column headers).
  • Field names should not contain special characters (e.g., @, #, $, %).
  • Avoid spaces in field names; use underscores (_) instead (e.g., Customer_ID instead of Customer ID).
  • Field names should be consistent across multiple sheets or files (if used for data concatenation).

1.2. Data Consistency

  • Each column should contain one type of data (e.g., numeric, text, or date).
  • Dates should be formatted as dates, not as text.
  • Ensure numbers are not stored as text to avoid conversion issues.
  • Avoid merged cells, as they can cause data misalignment.

1.3. Data Integrity

  • Each row should represent a single record (flat-table structure).
  • No blank rows within the data table.
  • No subtotals or aggregated values (Qlik Sense performs aggregations dynamically).

1.4. Table Format

  • Use an Excel Table Format (Ctrl + T) if necessary, but Qlik Sense can also work with raw tabular data.
  • Ensure each sheet follows a tabular structure (rows as records, columns as attributes).

2. Sheet & Workbook Considerations

2.1. Sheet Naming

  • If using multiple sheets, give them meaningful names (Sales_Data, Customers_List, etc.).
  • Avoid default names like Sheet1, Sheet2.

2.2. File Size Optimization

  • Qlik Sense can handle large files, but reduce unnecessary columns to improve performance.
  • Compress data if possible, removing redundant or duplicate records.

2.3. Use of Multiple Sheets

  • If multiple sheets represent different datasets, ensure they have a common key for linking (e.g., Customer_ID in both Sales and Customers sheets).

3. Data Formatting & Preprocessing

3.1. Handling Missing Data

  • Replace empty fields with NULL values instead of leaving them blank.
  • Use a placeholder (N/A, Unknown) if missing values need identification.

3.2. Date Formats

  • Store dates in YYYY-MM-DD or standard formats (MM/DD/YYYY or DD/MM/YYYY) to ensure proper recognition.
  • Ensure date fields contain only valid date values.

3.3. Numeric Values

  • Do not include currency symbols ($, €, £) directly in numeric fields; Qlik Sense can handle formatting separately.
  • Ensure percentages are stored as decimal values (0.25 instead of 25%).

4. Data Relationships & Keys

4.1. Primary & Foreign Keys

  • If using multiple tables, define unique key fields (Customer_ID, Product_Code) for linking.
  • Avoid duplicate keys in a primary key column.

4.2. Avoid Synthetic Keys

  • Synthetic keys occur when multiple tables have more than one field in common.
  • Use concatenation or rename fields (Customer_ID_Sales vs. Customer_ID_Orders) to prevent auto-linking errors.

5. Naming Conventions & Best Practices

  • CamelCase or Underscores (ProductName or Product_Name).
  • Avoid spaces and special characters in file names (Sales_Data.xlsx instead of Sales Data.xlsx).
  • Keep sheet names meaningful to avoid confusion.

6. Special Cases

6.1. If Using Hierarchical Data

  • Flatten the hierarchy in Excel or use separate columns for levels (e.g., Category, Subcategory, Product).

6.2. If Using Time Series Data

  • Ensure a continuous date range without missing values.
  • Use a separate Date/Time field instead of embedding it in other fields.

7. Data Validation & Pre-Qlik Preparation

  • Check for duplicate records.
  • Remove unnecessary columns before loading into Qlik.
  • Use data validation in Excel to restrict incorrect data entry.
Labels (1)
1 Solution

Accepted Solutions
F_B
Specialist II
Specialist II

Hi @DoctorPolidori ,

I find quite helpful Qlik guidelines:

Loading data from Microsoft Excel spreadsheets | Qlik Sense on Windows Help

You can add more specific instructions, but it depends on the distinct case necessities and, in my opinion, can't be summarized in a general way.

View solution in original post

7 Replies
henrikalmen
Specialist II
Specialist II

That's a lot of guidelines... Why? What is the context? Is it some kind of app that will be able to import a standardized excel file without any manual fixing in load script/data load editor?

I think perhapsit will be difficult for people to verify their files against this checklist? Perhaps so difficult that they simply will not do it?

1.1 Why do you want to "Avoid spaces in field names"? That shouldn't be necessary.

1.2 In my opinion, the first three bullets are not needed. Data load should be built so that formatting is handled. Perhaps some source data actually uses text format for dates, then there shouldn't be a need for a user to manually fix such things in Excel. The loading process in qlik should make sure things end up correctly in the data model.

3.1 Why? In some scenarios you may want blank values, not nulls. And how do you want peiople to do that in excel,so that the value is actually read by qlik as a null value? If you do need to make all blank values to nulls, load them in Qlik using trim().

2.3 and 4.1 is more or less the same instruction. But I get a bit confused here at 4.1 and 4.2 (and 6.1). The overall feeling is that these instructions are for people that aren't very famiiar with how to treat data in files. But here you mention "primary & foreign keys", concepts that will likely not be understood.

Section 5 is already mentioned elsewhere in the document (kind of).

6.2 Really? If someone has got a huge excel file with many rows and there are dates in there, do you actually want them to go over the data in excel to make sure there are no gaps in the time series? I strongly believe that should be taken care of when data is loaded instead of people spening lots of time to fix their file first.

I could probably comment more, or I should have commented less, if I knew more about the context. But basically: I would not like to be handed these instructions myself.

henrikalmen
Specialist II
Specialist II

Oh, I suddenly realise that I know who you are and therefore possibly know a tiny bit more about the context. 😄 Well, I am probably right then that the guidelines are aimed at "regular" people, not data people. So I do believe parts of this is too technical for people to understand. But at the same time these non data people are often building apps themselves (without the support of a developer) in your organization, right? 

Do you actually have a template app and you want people to be able to prep data files so that it can be loaded by this app, or are the instructions more aimed at best practice so that it will be easier for them to build apps from scratch? I guess they are using the data load manager more than the script editor?

diegozecchini
Specialist
Specialist

thanks for sharing!

F_B
Specialist II
Specialist II

Hi @DoctorPolidori ,

I find quite helpful Qlik guidelines:

Loading data from Microsoft Excel spreadsheets | Qlik Sense on Windows Help

You can add more specific instructions, but it depends on the distinct case necessities and, in my opinion, can't be summarized in a general way.

amonjaras_c40
Luminary
Luminary

If you must absolutely do it, this is a good set of guidelines, just make sure to warn the users that they risk losing a limb if they break them 😉. In practice, they will; warning or not!

5dlmhc6431s51.jpg

diegozecchini
Specialist
Specialist

😂

F_B
Specialist II
Specialist II

So true!