rwunderlich
Partner Ambassador/MVP

Document Analyzer V1.15

I've published an updated version 1.15 of Document Analyzer to

http://robwunderlich.com/downloads/

Changes in this update:

- Fix a hang caused by Unicode values in variables or expressions. Property files are now written in Unicode.

If you have any questions or problems with Document Analyzer, post to this thread or email the address given on the Document Analyzer "About" sheet.

-Rob

29 Replies
rwunderlich
Partner Ambassador/MVP
Author

There is an issue with processing large docs where the reload phase hangs. I'm addressing this in a new version I'm working on. In the meantime, try this as a workaround.

1. Open the Module (Ctrl-M).

2. Comment out line 107 with a REM statement so it looks like this:

REM If IsInternal() Then ActiveDocument.Reload

That will suppress the Reload phase. After the "Process Doc" completes (hopefully), reload manually using Ctrl-R.

If that doesn't resolve the problem, please post the last several lines of the CollectProperties log, which you can find in your

C:\Users\userid\AppData\Local\Temp

folder.
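To make grabbing those lines easier, here is a minimal sketch (Python, purely as an illustration; any text editor works just as well) that prints the tail of the log. The log file name and temp-folder location are taken from this thread; adjust the path for your own user profile.

```python
from pathlib import Path
import tempfile

def tail(path, n=15):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.readlines()[-n:]

# Document Analyzer writes its log under the user's temp folder,
# e.g. C:\Users\<userid>\AppData\Local\Temp (per this thread).
log_path = Path(tempfile.gettempdir()) / "CollectProperties.log"
if log_path.exists():
    print("".join(tail(log_path)))
else:
    print(f"No log found at {log_path}")
```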

Re the second issue about line breaks: multi-line expressions should be properly escaped and read as multiple lines, and my simple testing works out OK. Can you post the expression you are having issues with so I can test it?

-Rob

Not applicable

Issue 1) Cannot Process Doc

I've commented out the line you referred to and tried again to "Process Doc". After 21 minutes, the application still seems to be spinning, with CPU usage spiking all over the place. Is this a normal processing time?

Server details

Processor: Quad-Core AMD Opteron(tm) Processor 8381 HE 2.50 GHz (8 processors)

Memory (RAM): 32.0 GB

System type: 64-bit Operating System

Issue 2) Fields "not in use" are actually in use

At least these three fields were identified as "not in use" but are actually used in the expression below.

IFS.INTERFACE_CATEGORY

IFS.SPEED_MAX

IFS.CAPACITY_STATUS

Here is the sanitized expression where they are used. I have replaced actual customer values with 'DummyValue' in the Set Analysis.

sum ({$<LOAD_DATE_IDM={'$(TrendEndPORT)'}, IFS.INTERFACE_CATEGORY={'DummyValue'}, IFS.SPEED_MAX={'DummyValue'}, INVENTORY_DEVICE_MAIN.DEVICE_CLASS = {'DummyValue'},
SC={'DummyValue'}, CONTAINER.CONTAINER_CLASS_DESC = {'DummyValue'},IFS.CAPACITY_STATUS={'DummyValue'}, BRANCH_NAME={'DummyValue'}>} InterfaceCount) -
sum({$<LOAD_DATE_IDM={'$(TrendStartPORT)'}, IFS.INTERFACE_CATEGORY={'DummyValue'}, IFS.SPEED_MAX={'DummyValue'}, INVENTORY_DEVICE_MAIN.DEVICE_CLASS = {'DummyValue'},
SC={'DummyValue'}, CONTAINER.CONTAINER_CLASS_DESC = {'DummyValue'},IFS.CAPACITY_STATUS={'DummyValue'}, BRANCH_NAME={'DummyValue'}>} InterfaceCount)

rwunderlich
Partner Ambassador/MVP
Author

1) No, that is not a normal processing time. Is there anything in the CollectProperties.log?

2) I'll test with that expression and get back to you.

Does DocAnalyzer complete normally on other documents?

-Rob

Not applicable

1) I reviewed the CollectProperties.log and didn't see anything strange. It looks like it really is just processing the document, but this particular document must have so many objects that processing takes a very long time. Here is the tail end of the log after I manually killed the process.

...

Writing xml file C:\Users\...\AppData\Local\Temp\4\qvwork\QlikApp\Document\LB1284.xml

Extracting Expressions

Extracting Dimensions

Extracting Fonts

Processing Sheet Objects for Sheet Document\SH16

Writing xml file C:\Users\...\AppData\Local\Temp\4\qvwork\QlikApp\Document\TX6119.xml

Extracting Expressions

Extracting Dimensions

Extracting Fonts

Writing xml file C:\Users\...\AppData\Local\Temp\4\qvwork\QlikApp\Document\TX6210.xml

DocAnalyzer works fine in the same environment on much smaller files around 20MB.

rwunderlich
Partner Ambassador/MVP
Author

I've seen documents with 1000 objects take a couple of minutes to extract. Can I infer from the object IDs you posted above that you have many thousands of objects?

-Rob

rwunderlich
Partner Ambassador/MVP
Author

Hi Philip,

Another thought. Every object is visited, and therefore calculated. Even hidden objects and container objects are calculated. So if it's a big app, that could take a long time.

Make some selections and use File > Reduce Data to make a smaller copy, then analyze that.

-Rob

Not applicable

I am out of the customer environment now so I can't check the number of objects, but it is probably true that there are thousands of objects in the QVW. It makes sense that processing would take a long time, since from the log file I can see that EVERY object is profiled. Although a 400 MB file is not huge, if it contains thousands of objects it will take a long time to process.

Any news on the expression, and on my idea that the line breaks may not be handled properly (#2 above)?

rwunderlich
Partner Ambassador/MVP
Author

I added your expression to a test document and the field references were detected fine. I have not encountered any problems with line breaks in the past, and line breaks in expressions are quite common, so I don't know why the fields aren't being detected in your case. It could have something to do with the document's size or complexity.
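For what it's worth, a plain text scan is indifferent to line breaks. The sketch below is purely illustrative (it is not Document Analyzer's actual parsing code) and shows that qualified field names in a multi-line set-analysis modifier are still found by a simple regular expression:

```python
import re

# Illustrative only -- not Document Analyzer's actual logic.
# Match an (optionally qualified) identifier that precedes "={"
# in a set-analysis modifier, e.g. IFS.SPEED_MAX={'...'}.
SET_FIELD = re.compile(r"([A-Za-z_][\w.]*)\s*=\s*\{")

# A trimmed, multi-line version of the expression from this thread.
expr = """sum ({$<LOAD_DATE_IDM={'$(TrendEndPORT)'}, IFS.INTERFACE_CATEGORY={'X'},
IFS.SPEED_MAX={'X'},
IFS.CAPACITY_STATUS={'X'}>} InterfaceCount)"""

# findall scans the whole string; embedded newlines do not stop it.
print(sorted(set(SET_FIELD.findall(expr))))
# -> ['IFS.CAPACITY_STATUS', 'IFS.INTERFACE_CATEGORY', 'IFS.SPEED_MAX', 'LOAD_DATE_IDM']
```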

I'll be releasing a major update to Document Analyzer within a few days. Try that version when it's available.

-Rob

Not applicable

I will.

Thanks for all the help!

rwunderlich
Partner Ambassador/MVP
Author

Note that a new version of Document Analyzer is now available. Read more about it here.

Document Analyzer V2 -- a major update

-Rob