Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
ryanhicks
Contributor III
Contributor III

ApplyMap increasing file size???

I am working on optimizing an application.  It is 800 MB in size and the issue it that is takes 60 GB to run the script (300 mill rows).

Well I did get it optimized (20GB) but now the file size is almost double 1,350 MB.  I have isolated the script that causes the increase I am just unsure as to why.

I have a date field stored as a numeric value (e.g. 123 = January 2007).

----EDIT--- The date field is not a timestamp, it is a 3 character numeric field----

I need to get the month name from this field so loaded all dates into a mapping table along with the month name into a mapping table:

     MonthMap:

     mapping load

          TimeValue,

          MonthName(TimeValueEndDate) as [Month Name]

     from

     ..\TIME.qvd (qvd);

and then apply map in the load script:

     applymap('MonthMap', TimeValue) as [Month Name]

this is what causes the increase in file size of 1,350 MB.

If I instead take this out and do a left join:

     Left join (table1)

     Load TimeValue,

              MonthName(TimeValueEndDate) as [Month Name]

     from

     ..\TIME.qvd (qvd);

     drop field TimeValue;

The resulting file size is 805K (No other changes were made)

I can use the left join without a problem and have implemented this into the app but I am still concerned about the affect that the applymap had on file size since I use this alot in my development.  Any idea why the mapping load would cause that large of an increase in file size?

Message was edited by: ryan hicks

12 Replies
swuehl
MVP
MVP

Do you have a match in your mapping table for every TimeValue in table1?

If not, the first version using the applymap() will use the TimeValue as default value, where the second version will create a NULL. So the number of distinct values of both versions might hugely differ, which might explan a difference in file size.

sebastiandperei
Specialist
Specialist

I think it is because Applymap returns the Monthname in String format, with Left join, you are calculating the Monthname, and it saves like "date" format (more efficient).

Try in Mapping, use Num(Monthname(...)) instead of only Monthname(...), and when you use ApplyMap, use Monthname([Month Name]). This will take the same effect that Left Join, but using Maps.

Please, let me know if it works!!

preminqlik
Specialist II
Specialist II

HI,

use distinct in left join you can get some accurate results and  this will reduce memory too

Left join (table1)

     Load DISTINCT TimeValue,

              MonthName(TimeValueEndDate) as [Month Name]

     from

     ..\TIME.qvd (qvd);

     drop field TimeValue;

---> check the rows of table1 before and after leftjoining , and check with disinct too

regards

Premhas

jonathandienst
Partner - Champion III
Partner - Champion III

Ryan

I suspect the problem has to do with the resolution of the timestamp you are using (down to milliseconds?), which means that you have a very large number of distinct values. I also suspect that the qvd contains one line per day (hence swuehl's point).

The time portion plays no part in the month selection, so I would use applymap(), but do it like this:

For the mapping table:

MonthMap:

mapping load

    Distinct Floor(TimeValue),

    MonthName(TimeValueEndDate) as [Month Name]

from ..\TIME.qvd (qvd);

and for the applymap:

Applymap('MonthMap', Floor(TimeValue), 'Missing') as [Month Name]

That should prevent an explosion of memory or disk space. Note the 'Missing' if no match is found. You could also base the mapping table on the actual data, rather than a separate qvd (which would mean no unmatched values), or use the Month() or MonthName() function on the timestamp during your load.

HTH

Jonathan

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
ryanhicks
Contributor III
Contributor III
Author

swuehl,

Thank you for the reply.  there is a 100% match between teh tables.  the funny thing is there are only 3 possible values for this load (test load).  there will be more once we implement the app in production.

ryanhicks
Contributor III
Contributor III
Author

the QVD only contains one value for each month and there is a 100% match between the QVD's. so there are only around 50 distinct values (i make sure of this when I create the Time.qvd.  also the date field that i am mapping on is only a 3 character numeric field not a timestamp.

while the TimeValueEndDate is a timestamp field, there is only one per record and that is removed with the MonthName function.

As far as using the monthname function during the load that is not an option as the main data source has only the 3 character field so i need to map that character to a monthname hence the left join/map options.

swuehl
MVP
MVP

So your TIME.qvd is really small, right?

Have you compared the two versions of your qvws with regard to number of distinct field values and rows in your resident tables? You can do this easily looking at the tables tab of Document Properties.

Is the count and distinct count of field [Month Name] equal in both qvws?

edit: And please also check the other fields / tables. In general, is the resulting data model the same in both versions?

ryanhicks
Contributor III
Contributor III
Author

yes i have compared them. in both apps the number of values and the number of distinct values are the same in both apps.  68 mill values (one for each record) 3 distinct values.

count and count distinct are equal in both qvws.

I have also run the document analyzer on both of the qvw's and there is no difference.  they should logically be exactly the same file.

swuehl
MVP
MVP

Sorry, I don't have another idea right now, if all other qvw settings (file compression) are identical as well.

Which version are you using? Can you replicate this issue with a very downstripped version of your app (just the TimeValue mapping)? Can you share this simplified app then?

If not, I would consider creating a ticket with QT.