Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
evan_kurowski
Specialist
Specialist

.XLSX copied & pasted from email into load repository produces syntax error. Opening spreadsheet and saving as same .XLSX extension "fixes" problem.

Hello Qlik Community,

I'm receiving an .XLSX spreadsheet being delivered from an internal client.

The .XLSX arrives attached to an email, and I am taking the attachment and copying & pasting from email into the loading directory.

When attempting to reload after pasting the .XLSX into the load folder, this step consistently produces a load script syntax error.

This issue can be resolved by opening the download in Excel and saving again as an .XLSX (with the exact same file name & extension it arrived as).  Once this is done, reload proceeds.


Is there any way to find out what attributes are changing between the arriving form & the resaved form, and what is preventing the reload in original delivery form?

(this particular disruption has been used to good effect in our corporate environment to inhibit many instances of excel content automation.  A resolution here might free up a lot of prevented automation)

4 Replies
marcus_sommer

I fear that there would be rather no solutions but there are workarounds. I think this will be useful to bypass the origin problem: Re: Cannot read certain XLSX files

- Marcus

marcus_sommer

Addition:

If the file is not too big/complex and you have a bit time you could compare the origin and re-saved file on the xml-level. The xlsx is just a zip-file and after extracting it you could open each part within an editor and comparing old vs. new.

But I think the knowledge to know the core of the issue is only partly useful because I doubt that you could adjust the load in a way to bypass the problem. But maybe you could inform the creator of the file so that they change the source.

- Marcus

evan_kurowski
Specialist
Specialist
Author

Hello Marcus, both of these responses were good suggestions, thank you for responding.

That thread you referenced describes my same issue, and that was written in March 2018 so we have other users in our present timeframe experiencing the same problem.

I exported before & after versions via 'Save As .XML Spreadsheet 2003' and ran the .XML through text comparison.

(My only concern here was losing programmatic differences even during the 'Save As XML' step.  If differences are "cured" by resaving as .xlsx, then maybe this export to .xml step also removes evidence of the issue?)


The .XML comparison showed differences in:

<Last Saved>
<WindowHeight><WindowWidth><WindowTopY>

The original also had a small set of <Print> instructions, that aren't present on the resaved version.

And then it assigned new Style id's to 3 numbered styles.  These relate to date format, and you can see each data row referenced the new Style ID #'s instead of the old. 

20180424_thread_299283_XLSX_pasted_from_email_not_loading.png

Also checked the encoding of the .XLSX versions.  We've had some issues with stuff coming off Linux devices sometimes changing the BOM presence, but both versions show the same opening hex sequence 50 4B 03 04.


? ?

marcus_sommer

It doesn't look like massive differences but obviously they made the difference between loadable in third-party tools or not.

I could imagine that the cause is something like that those indexes starts with an invalid number maybe with 0 although the specification defined a start with 1 (maybe just a small mistake by creating an array to adress/write these data especially if there are some inconsistency by the start-point of an array - even by the various Qlik functions there are both types - 0 and 1 - present) and therefore Excel starts a kind of silent repair and re-creates these indexes new. This doesn't explained the index-offset of 3 in your example but maybe there are some further dependencies to this index.

- Marcus