android1686764069
Contributor

tDenormalizeSortedRow: how does it differ from tDenormalize, and what does "Input rows count" do?

Hello,

I understand what tDenormalize does. However, I don't know when I should use tDenormalize and when I should use tDenormalizeSortedRow.

Also, when I use tDenormalizeSortedRow, what does "Input rows count" do?

I put a random number in "Input rows count", and it seems that when "Input rows count" is greater than the actual number of rows in my input, the output is wrong, as in my case below:

[Screenshot: 0695b00000mRDhsAAG.png]

[Screenshot: 0695b00000mRDiWAAW.png]

Can you help me?

4 Replies
lcupito
Contributor

In Talend, the tDenormalize component is used to transform and denormalize data from a normalized format to a denormalized format, whereas the tDenormalizeSortedRow component is specifically used when you want to denormalize data based on a sorted key column.

You would typically use tDenormalize when you have data in a normalized format, where related information is split into multiple rows, and you want to combine this information into a single row.
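To make the idea concrete, here is a minimal Python sketch of what "combining related rows into a single row" means. It is not Talend code: the function name `denormalize`, the column names `id` and `value`, and the comma separator are assumptions chosen only for illustration.

```python
from collections import OrderedDict

def denormalize(rows, key="id", column="value", separator=","):
    """Merge all rows sharing `key` into one row, joining the `column` values.

    Rows are collected per key in a dictionary, so the input order does not matter.
    """
    groups = OrderedDict()
    for row in rows:
        groups.setdefault(row[key], []).append(row[column])
    return [{key: k, column: separator.join(v)} for k, v in groups.items()]

# Hypothetical normalized input: key "A" is spread over two non-adjacent rows.
rows = [
    {"id": "A", "value": "x"},
    {"id": "B", "value": "y"},
    {"id": "A", "value": "z"},
]
result = denormalize(rows)
print(result)
```

Because every group is held in memory until the whole input has been read, this style of grouping works on unsorted data but costs more memory on large inputs.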

tDenormalizeSortedRow, on the other hand, is used when you have sorted data based on a key column, and you want to denormalize it based on this sorting. It is useful when you want to combine sorted rows with similar key values into a single row.
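The effect of sorting can be sketched in Python as follows. This is an illustration of grouping consecutive rows with the same key, not the component's actual implementation; `denormalize_sorted` and the sample rows are hypothetical.

```python
from itertools import groupby

def denormalize_sorted(rows, separator=","):
    """Join the values of consecutive (key, value) rows that share a key."""
    return [
        (key, separator.join(value for _, value in group))
        for key, group in groupby(rows, key=lambda row: row[0])
    ]

unsorted_rows = [("A", "x"), ("B", "y"), ("A", "z")]
sorted_rows = sorted(unsorted_rows, key=lambda row: row[0])

split_result = denormalize_sorted(unsorted_rows)
merged_result = denormalize_sorted(sorted_rows)
print(split_result)   # key "A" appears twice because its rows were not adjacent
print(merged_result)  # one row per key
```

Only the current group has to be kept in memory, which is why this approach depends on the input being sorted by the key first.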

When you use tDenormalizeSortedRow, the "Input rows count" parameter specifies the number of sorted rows to be used to form a single denormalized row. For example, if you set the "Input rows count" to 5, it means that 5 sorted rows with the same key value will be combined into a single denormalized row.

android1686764069
Contributor
Author

Thanks for your reply.

I have to run some tests to understand what you wrote regarding tDenormalizeSortedRow.

 

There is one thing I still don't really get: the "Input rows count" parameter.

In your example with 5, it means I know that I have 5 rows with the same key.

However, in my CSV, let's say I have 30 rows:

  • 5 of them may share one key,
  • then another 10 of them may share a second key,
  • then the last 15 of them may share a third key.

What am I supposed to set as the "Input rows count" parameter?

 

Also, does that mean that when I use tDenormalizeSortedRow, the rows are supposed to be sorted?

In the example in my screenshot, my CSV is connected directly to tDenormalizeSortedRow, but now I understand I have to use a tSortRow component before it, right?

lcupito
Contributor

 

The "Input rows count" parameter in the tDenormalizeSortedRow component is the total number of rows entering the component, not the size of each key group. In your case, where you have 30 rows with various keys, you should set this parameter to the total number of rows you have, which is 30. This lets the component know how many rows it has to process in total.

Regarding sorting, yes, you are correct. To use tDenormalizeSortedRow, your input rows should be sorted based on the key you want to denormalize. You can use the tSortRow component to sort your rows based on the key before feeding them into tDenormalizeSortedRow. This ensures that rows with the same key are processed together for denormalization. So, your job flow would typically be:

  1. Read your CSV data.
  2. Sort the rows using tSortRow based on the key.
  3. Pass the sorted rows to tDenormalizeSortedRow, and set the "Input rows count" to 30 or the total number of rows you have.

This setup will help you correctly denormalize your data based on the common keys.
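The flow above can be modelled in a few lines of Python using the 30-row example from the question. The way the row counter is used here is an assumption made for illustration (the last group is emitted only when the counter reaches `input_rows_count`); it is one plausible explanation for wrong output when the value is larger than the real row count, not a description of Talend's generated code. The function name and sample data are hypothetical.

```python
def denormalize_sorted_rows(rows, input_rows_count, separator=","):
    """Hypothetical model: emit a group whenever the key changes, and emit the
    final group only when the row counter reaches input_rows_count."""
    output = []
    current_key, values = None, []
    for index, (key, value) in enumerate(rows, start=1):
        if values and key != current_key:
            # Key changed: the previous group is complete.
            output.append((current_key, separator.join(values)))
            values = []
        current_key = key
        values.append(value)
        if index == input_rows_count:
            # Assumed end-of-input signal: flush the group in progress.
            output.append((current_key, separator.join(values)))
            values = []
    return output

# Step 1: 30 rows read from the CSV: 5 with key "k1", 10 with "k2", 15 with "k3".
rows = (
    [("k1", str(i)) for i in range(5)]
    + [("k2", str(i)) for i in range(10)]
    + [("k3", str(i)) for i in range(15)]
)
# Step 2: sort by key (the role of tSortRow).
rows.sort(key=lambda row: row[0])

# Step 3: denormalize with the total row count, then with a value that is too large.
correct = denormalize_sorted_rows(rows, input_rows_count=30)
too_large = denormalize_sorted_rows(rows, input_rows_count=40)
print([key for key, _ in correct])
print([key for key, _ in too_large])
```

Under this assumption, a count of 30 yields all three keys, while a count of 40 never triggers the final flush, so the last group is missing from the output.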

android1686764069
Contributor
Author

Thank you.