Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
rubenmarin

Split groups of columns in rows

Hello,

I have a table with 30 values reported in a single line, and each value is composed of 5 fields, ex .:

Key1, Key2, Field101, Field201, Field301, Field401, Field501, Field102, Field202, Field302, Field402, Field502, ...Field530

I would have to distribute in rows each group of 5 fields, it is a qvd of several million records, for now I have it done with a loop but it has to read 30 times and it takes a lot. Do you know if there is a way to avoid reading it 30 times?

FOR i = 1 to 30

    LET vNum = Num ($(i), '00');

    Concatenate (TableName)

    LOAD Key1,

          Key2,

          Field1$(vNum) as Field1,

          Field2$(vNum) as Field2,

          Field3$(vNum) as Field3,

          Field4$(vNum) as Field4,

          Field5$(vNum) as Field5

    From

        [$(vPath)FileName.qvd]

        (qvd)

    Where

        Field1$(vNum) <> 0

    ;

NEXT

I'm not looking for delta/incremental loads solutions.

Thanks,

Regards.

25 Replies
rubenmarin
Author

Hi Marcus, I'm not sure to understand the code

Fields go from Field101 to field530, so 150 fields, if I'm not wrong the bucle should be from 1 to 30 and the qvd should be readed 30 times, isn't?


Thanks for the tip about optimized load, real code also does some transformations of the fields that prevents for optimized load but I take note of this to check if it's worth doing it in 2 steps.


I tried "load * from qvd where not Exists(chkTest,HIDECA$(vNumDef))" but I don't see the load optimized. Any tip on how it should be?

petter
Partner - Champion III
Partner - Champion III

No looping required - one single load statement is sufficient.

This logic could form the basis for UNPIVOTING the columns completely. You can even identify which numbers come from exactly which column by using RecNo() and RowNo(). The SubField()-function does the hard work and unpivot the columns:

DATA:

LOAD

  K1,

  K2,

  RecNo() AS Rec#,

  RowNo() AS Row#,

  SubField( F10 & '|' & F11 & '|' & F20 & '|' & F21 , '|' ) AS VAL

INLINE [

K1, K2, F10, F11, F20, F21

A, X, 1, 2 , 3 ,4

A, Y, 5, 6, 7, 8

];

2017-11-24 14_38_59-Dialog.png

marcus_sommer

Sorry, I didn't notice that it was really 30 iterations and not  just 30 columns. In this case my suggestion would be also need 30 iterations.

Beside them I don't believe that there are really faster solutions than such kind of loop. Of course there are solutions with crosstable thinkable but they are quite performance intense by millions of records so that a optimized qvd-load should be faster.

The only option to keep it optimized is to remove the where-clause (and applying it within a following resident-load) or a where exists() with only one parameter.

Beside this if you could change or maybe extend the logic which creates these qvd you might be able to skip this task.

- Marcus

rubenmarin
Author

Thank Petter, that's pretty close to the idea I get from Jonathan's answer, in my case from each record 1 to 30 rows will be created with those 5 fields (and keys and other fields) for each row.

I'll try next day I have access and go back with the results.

rubenmarin
Author

Only one parameter in Exists(), that's what I didn't tested.

It's not that bad to change the logic behind, just I didn't like the bucle and looking for better ways.

I'm very optimistic of reducing time loads with the ideas I get. Thanks

avkeep01
Partner - Specialist
Partner - Specialist

Hi Ruben,

My solution is based on building a script in a variable and then use that variable as text in a script. It just takes a bit more scripting to get there. I've got the app attached.

/* Creating a dynamic script with in a variable */


// Step 1 - Initial setup by only loading the header fields from the datasource with FIRST 1 LOAD.
temp01_Setup:
FIRST 1 LOAD
NULL() AS TempField,
A,
B,
C,
D,
E,
F,
G,
H,
I,
J,
K,
L
FROM
\\SSCDATA09.frd.shsdir.nl\LT_516745$\Downloads\DummyData.xlsx
(
ooxml, explicit labels, table is Data);

// Step 2 - Create a crossteable with the fieldnames in one column, so they become values instead of fieldnames. 
temp02_Setup:
CROSSTABLE (ExcelColumnName,ColumnNameLOAD * RESIDENT temp01_Setup;
DROP TABLE temp01_Setup;


// Step 3 - Rename the different values to create fields (ExcelColumnName), aliasses (ColumnName), and counter fields to set up the loop.
temp03_Setup:
NOCONCATENATE LOAD
ExcelColumnName,
LEFT(ColumnName,6) AS ColumnName,
RIGHT(LEFT(ColumnName,6),1) AS ColumnNo,
RIGHT(ColumnName,1) AS ColumnGroup
RESIDENT temp02_Setup;
DROP TABLE temp02_Setup;

/* Create a table for the loop size */

// Step 4 - Set up the loop boundaries based on the maximum groups and on the maximum fields, in this set there are 2 groups, and 5 fields per group.
MAX_Groups:
LOAD
MAX(ColumnGroup) AS MaxColumnGroup,
MAX(ColumnNo) AS MaxColumnNo
RESIDENT temp03_Setup
WHERE NOT MATCH(ColumnName,'Key1','Key2');

// Set length of loop for the group
LET vGroups = PEEK('MaxColumnGroup',0,'MAX_Groups');

// Set length for the fields within the group
LET vColumns = PEEK('MaxColumnNo',0,'MAX_Groups');

DROP TABLE MAX_Groups;

/* LOOP START */

FOR i = 1 TO $(vGroups)  // start loop over the groups

FOR j = 1 TO $(vColumns) // start loop over the fields in the group

// First run
IF $(j) = 1 THEN

// Only pick the fieldname and the alias for the first field in a group
temp04_Setup:
NOCONCATENATE LOAD
ExcelColumnName,
ColumnName
RESIDENT temp03_Setup
WHERE NOT MATCH(ColumnName,'Key1','Key2')
AND ColumnNo = $(j)
AND ColumnGroup = $(i);

// Fill a variable with the values from the setup table. For example: 1 AS Group, C AS Field1
LET vScript = $(i) & ' AS Group,' &  PEEK('ExcelColumnName',0,'temp04_Setup') &' AS '& PEEK('ColumnName',0,'temp04_Setup');

DROP TABLE temp04_Setup;

ELSE

temp04_Setup:
NOCONCATENATE LOAD
ExcelColumnName,
ColumnName
RESIDENT temp03_Setup
WHERE NOT MATCH(ColumnName,'Key1','Key2')
AND ColumnNo = $(j)
AND ColumnGroup = $(i);

// Add the following fields to the variable vScript, so every field loop an extra field is added, for example, D AS Field2 etc.
LET vScript = '$(vScript)'&','&  PEEK('ExcelColumnName',0,'temp04_Setup') &' AS '& PEEK('ColumnName',0,'temp04_Setup');

DROP TABLE temp04_Setup;

END IF

NEXT j

// With the previous build script the first group is loaded into the result tabel. The variable vScript is executed as a script text. For the current example: 1 AS Group, C AS Field1, D AS Field2, etc.
// Because the column names are exactly the same the next group will be auto concateneated.
Result:
LOAD
A AS Key1,
B AS Key2,
$(vScript)
FROM \\SSCDATA09.frd.shsdir.nl\LT_516745$\Downloads\DummyData.xlsx
(
ooxml, no labels, header is 1 lines, table is [Data]);

// Go and get the next group.
NEXT i

DROP TABLE temp03_Setup; 

petter
Partner - Champion III
Partner - Champion III

This is my last go at it - I know the question has been "answered" but I think it might be interesting for other purposes as well.

2017-11-24 17_08_02-Edit Script [C__Users_Petter_Documents_# QC 2017-11-24 UnPivot.qvw].png

This could of course been reduced to one table by jusing the FIELDNAMES table as a MAPPING LOAD and use ApplyMap() in the DATA table...

rubenmarin
Author

Thanks, this also gives me an idea to create the scripting with a variable.

rubenmarin
Author

Thanks again Petter, I tested with this sample, changed it a bit, attached is the modified sample.

I'll try next time I have access to real data.

MarcoWedel

Hi,

another solution might be:

QlikCommunity_Thread_282877_Pic1.JPG

tabData:

CrossTable (FldNam,FldVal,2)

LOAD * FROM [https://community.qlik.com/servlet/JiveServlet/download/1388746-305317/DummyData.xlsx] (ooxml, embedded labels, table is Data);

tabResult:

Generic

LOAD Key1,

    Key2,

    Right(FldNam,2) as Group,

    Left(FldNam,6),

    FldVal

Resident tabData;

DROP Table tabData;

Probably combined with Rob's generic table recombination code

Use cases for Generic Load | Qlikview Cookbook

for those who tend to the evil side of the synthetic-key-discussion 😉

Qlik Design Blog

Should We Stop Worrying and Love the Synthetic Key?

Synthetic Keys

hope this helps

regards

Marco