Re: tdenormalizesortedrow : difference with tdenormalize and what does "Input rows count" do? - Qlik Community

android1686764069 · ‎2023-10-02

Hello,

I understand what tDenormalize does.

However, I don't know when I should use tDenormalize or tDenormalizeSortedRow.

Also, when I use tDenormalizeSortedRow, what does "Input rows count" do?

I put in a random number for "Input rows count", and it seems that when "Input rows count" is actually greater than the number of rows in my input, the output is wrong in my case below.

Can you help me?

lcupito · ‎2023-10-05

In Talend, the tDenormalize component converts data from a normalized format to a denormalized one, whereas the tDenormalizeSortedRow component is specifically for denormalizing input that is already sorted on a key column.

You would typically use tDenormalize when you have data in a normalized format, where related information is split into multiple rows, and you want to combine this information into a single row.

tDenormalizeSortedRow, on the other hand, is used when you have sorted data based on a key column, and you want to denormalize it based on this sorting. It is useful when you want to combine sorted rows with similar key values into a single row.
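To illustrate why the sort order matters, here is a minimal Python sketch of the two ideas. It is not Talend code and not how the components are implemented internally; the function names are hypothetical, and it assumes that a "sorted" denormalizer only merges adjacent rows that share a key.

```python
from itertools import groupby

def denormalize_any_order(rows, delimiter=","):
    """Group values by key with a dict, so the input order does not matter."""
    merged = {}
    for key, value in rows:
        merged.setdefault(key, []).append(value)
    return [(key, delimiter.join(values)) for key, values in merged.items()]

def denormalize_sorted(rows, delimiter=","):
    """Merge only adjacent rows that share a key (assumes input sorted by key)."""
    return [
        (key, delimiter.join(value for _, value in group))
        for key, group in groupby(rows, key=lambda row: row[0])
    ]

unsorted_rows = [("A", "1"), ("B", "2"), ("A", "3")]
sorted_rows = sorted(unsorted_rows, key=lambda row: row[0])

print(denormalize_any_order(unsorted_rows))  # key A appears once
print(denormalize_sorted(sorted_rows))       # key A appears once
print(denormalize_sorted(unsorted_rows))     # key A appears twice: input was not sorted
```

In this sketch, the adjacent-rows approach gives the same result as the dict-based one only when the input is sorted on the key first.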

When you use tDenormalizeSortedRow, the "Input rows count" parameter specifies the number of sorted rows to be used to form a single denormalized row. For example, if you set the "Input rows count" to 5, it means that 5 sorted rows with the same key value will be combined into a single denormalized row.

android1686764069 · ‎2023-10-06

Thanks for your reply.

I have to run some tests to understand what you wrote regarding tDenormalizeSortedRow.

There is one thing I still don't really get. It's about the "Input rows count" parameter.

In your example with 5, it means I know that I have 5 rows with the same key.

However, in my CSV, let's say I have 30 rows:

5 of them may have the same key,
then the next 10 of them may have another same key,
then the last 15 of them may have another same key.

What am I supposed to set as the "Input rows count" parameter?

Also, does that mean that when I use tDenormalizeSortedRow, the rows are supposed to be sorted?

In the example in my screenshot, my CSV is connected directly to tDenormalizeSortedRow, but now I understand I have to use a tSortRow component before it, right?

lcupito · ‎2023-10-06

The "Input rows count" parameter in the tDenormalizeSortedRow component refers to the number of rows you expect to group together based on a common key. In your case, where you have 30 rows with various keys, you should set this parameter to the total number of rows you have, which is 30. This parameter helps the component know how many rows to process as a group before denormalizing.

Regarding sorting, yes, you are correct. To use tDenormalizeSortedRow, your input rows should be sorted based on the key you want to denormalize. You can use the tSortRow component to sort your rows based on the key before feeding them into tDenormalizeSortedRow. This ensures that rows with the same key are processed together for denormalization. So, your job flow would typically be:

1. Read your CSV data.
2. Sort the rows using tSortRow based on the key.
3. Pass the sorted rows to tDenormalizeSortedRow, and set the "Input rows count" to 30 or the total number of rows you have.

This setup will help you correctly denormalize your data based on the common keys.
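The job flow above can be sketched outside Talend in a few lines of Python. This is only an illustration of the logic, not Talend-generated code: the keys `K1`, `K2`, `K3`, the values, and the `;` delimiter are made up for the example, and `input_rows_count` simply stands for the value you would type into the "Input rows count" field.

```python
import random
from itertools import groupby

# Hypothetical data: 30 rows with keys K1 (5 rows), K2 (10 rows), K3 (15 rows).
rows = (
    [("K1", f"a{i}") for i in range(5)]
    + [("K2", f"b{i}") for i in range(10)]
    + [("K3", f"c{i}") for i in range(15)]
)
random.Random(0).shuffle(rows)  # step 1: simulate a CSV read in arbitrary order

# Step 2: sort on the key (the role tSortRow plays in the job).
sorted_rows = sorted(rows, key=lambda row: row[0])

# Value for "Input rows count" following the advice above: the total row count.
input_rows_count = len(sorted_rows)

# Step 3: merge adjacent rows that share a key into one row per key.
denormalized = [
    (key, ";".join(value for _, value in group))
    for key, group in groupby(sorted_rows, key=lambda row: row[0])
]

print(input_rows_count)
for key, values in denormalized:
    print(key, len(values.split(";")))
```

With this sample data the sketch yields one output row per key, holding 5, 10, and 15 merged values respectively.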

android1686764069 · ‎2023-10-10

thank you

tdenormalizesortedrow : difference with tdenormalize and what does "Input rows count " do?

Talend Big Data

Talend Studio

v8.x