Qlik Community

QlikView Scripting

Discussion Board for collaboration on QlikView Scripting.

richardcripps
New Contributor III

Concatenation - De - duplication of Rows

Hello

I guess this question stems around the behaviour of the concatenate function, but I was wondering if any one had seen anything similar.

I have a piece of script that loads in some data (c. 60 million rows) from a QVD and then takes some data from an existing resident table (c. 85k rows) and concatenates it to the data from the qvd. I would expect the result of this piece of script to be a table with a total number of rows that is the number from the QVD and the number from the resident table i.e. 60 million + 85k. However, the result is a table with around 40 million rows in it.

Below is the script in question.

UserAction:
LOAD  join.Subscriber,
     join.ResourceCode,
     join.UserActionType,
     Month,
  Year,
  YearMonth,
  Week,
  YearWeek,
  Hour,
  Day,
  join.ResourceCommentID,
  ActionCount
FROM
[UserAction.qvd]
(qvd);

CONCATENATE(UserAction)  
LOAD DISTINCT Resource.SITEUSERCODE AS join.Subscriber,
  join.ResourceCode,
  99 AS join.UserActionType,
   MONTH(join.PubDate) AS Month,
  YEAR(join.PubDate) AS Year,
  YEAR(join.PubDate) & NUM(MONTH(join.PubDate),'00') AS YearMonth,
  WEEK(join.PubDate) AS Week,
  YEAR(join.PubDate) & NUM(WEEK(join.PubDate),'00') AS YearWeek,
  HOUR(join.PubDate) AS Hour,
  DAY(join.PubDate) AS Day,
  null() AS join.ResourceCommentID,
   1 AS ActionCount
   Resident Resource;

So what I did was to amend my code so that when loading from the QVD it uses the recno() function to basically make every row unique. And this resulted in a table with the number of rows that I would expect, see the code below.

UserAction:
LOAD recno() AS REC,
  join.Subscriber,
     join.ResourceCode,
     join.UserActionType,
     Month,
  Year,
  YearMonth,
  Week,
  YearWeek,
  Hour,
  Day,
  join.ResourceCommentID,
  ActionCount
FROM
[UserAction.qvd]
(qvd);

CONCATENATE(UserAction)  
LOAD DISTINCT Resource.SITEUSERCODE AS join.Subscriber,
  join.ResourceCode,
  99 AS join.UserActionType,
   MONTH(join.PubDate) AS Month,
  YEAR(join.PubDate) AS Year,
  YEAR(join.PubDate) & NUM(MONTH(join.PubDate),'00') AS YearMonth,
  WEEK(join.PubDate) AS Week,
  YEAR(join.PubDate) & NUM(WEEK(join.PubDate),'00') AS YearWeek,
  HOUR(join.PubDate) AS Hour,
  DAY(join.PubDate) AS Day,
  null() AS join.ResourceCommentID,
   1 AS ActionCount
   Resident Resource;

Has anyone else seen any similar behaviour? Is the concatenation function supposed to behave like this?

Thanks

Tags (2)
1 Solution

Accepted Solutions
simon_hallworth
New Contributor II

Concatenation - De - duplication of Rows

Hi Richard,

This is the way that concatenate works in QlikView. The DISTINCT will be applied to the whole of the UserAction table even though it is only defined in the second statement. When I originally came across this behaviour I used the same approach as you to work around it, i.e. create a unique row ID in the first table.

I have not clarified this behaviour with QlikTech support but it was highlighted as being WAD by one of their UK consultants some time ago.

Kind regards,

Simon

4 Replies
simon_hallworth
New Contributor II

Concatenation - De - duplication of Rows

Hi Richard,

This is the way that concatenate works in QlikView. The DISTINCT will be applied to the whole of the UserAction table even though it is only defined in the second statement. When I originally came across this behaviour I used the same approach as you to work around it, i.e. create a unique row ID in the first table.

I have not clarified this behaviour with QlikTech support but it was highlighted as being WAD by one of their UK consultants some time ago.

Kind regards,

Simon

richardcripps
New Contributor III

Concatenation - De - duplication of Rows

Cheers Simon!

I would prefer it if I could avoid having the row id in the document to reduce the size of it.

However it doesn't look like it is possible.

I tried to drop the field after I had done the concatenation and that reduced the number of rows in my table as well.

So I will have to leave it in.

Not applicable

Re: Concatenation - De - duplication of Rows

I just noticed this as well after doing multiple concatenations so now I am placing a RecNo or RowNo all the time on concatenated tables to prevent a distinct from happening

MVP
MVP

Re: Concatenation - De - duplication of Rows

if you need a distinct only on the small table Resource I think you can (without recno / rowno)

TmpResource:

load distinct ......Resident Resource;

UserAction:

noconcatenate load * ........... FROM [UserAction.qvd] (qvd);

concatenate (UserAction) load * resident Resource;

drop table Resource;

Community Browser