Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Split fields from fields in a text line

Hi,

I have the following Problem.

Normally, the field names an data in a file look like that:

ID:          1

Date:      20.04.2018

Item1:     Content1

Item2:     Content2

ID:          2

Date:      21.04.2018

Item1:     Content1

Item2:     Content2

...

But in many cases it looks like that:

ID:          1

Date:      20.04.2018

Item1:     Content1     Item2:     Content2

ID:          2

Date:      21.04.2018

Item1:     Content3     Item2:     Content4

The result should be:

IDDateItem1Item2
120.04.2018Content1Content2
221.04.2018Content3Content4

Does anyone has an idea, how to solve the problem?

Best regards

Michael

16 Replies
sunny_talwar

Did you look at the response here?

Re: Data Transformation

Anonymous
Not applicable
Author

Yes, I've looked at it and the problem is solved. But this is an additional Problem from a different file with that slightly different structure of data.

sunny_talwar

I don't see much difference from the previous thread.... is this data coming from Excel file?

Anonymous
Not applicable
Author

No, its an exctract from a PDF. The difference is, that you have some of the fieldnames and the content of that fields are comming from one text line, whereas in the the former file each fieldname/content has its separate line.

sunny_talwar

Oh, so you want to know how to split them in two columns... try SubField for that

Table:

LOAD *,

PurgeChar(Field, ':') as NewField,

If(Field = 'ID', Data, Peek('RowNum')) as RowNum;

LOAD Trim(SubField(F1, ':', 1)) as Field,

Trim(SubField(F1, ':', 2)) as Data;

LOAD * INLINE [

F1

    ID:          1

    Date:      20.04.2018

    Item1:    Content1

    Item2:    Content2

    ID:          2

    Date:      21.04.2018

    Item1:    Content1

    Item2:    Content2

];


FinalTable:

LOAD Distinct RowNum

Resident Table;


FOR i = 1 to FieldValueCount('NewField')


LET vField = FieldValue('NewField', $(i));


Left Join (FinalTable)

LOAD RowNum,

Data as [$(vField)]

Resident Table

Where NewField = '$(vField)';


NEXT


DROP Table Table;

Capture.PNG

Anonymous
Not applicable
Author

Thank you, but the structure of the input data is a little bit different.

It's not:

    ID:          1

    Date:      20.04.2018

    Item1:    Content1

    Item2:    Content2

    ID:          2

    Date:      21.04.2018

    Item1:    Content1

    Item2:    Content2

but

    ID:          1

    Date:      20.04.2018

    Item1:    Content1 Item2:    Content2

    ID:          2

    Date:      21.04.2018

    Item1:    Content1 Item2:    Content2


The information of two fields is put in one input line.

    Item1:    Content1 Item2:    Content2

So QV is interpreting the line as:

Fieldname: Item1

Content: Content1 Item2:    Content2


So the second fieldname/content is "hidden" in the content of the first field.

sunny_talwar

oh okay, I got it now... sorry for the confusion... checkin

sunny_talwar

May be like this

Table:

LOAD *,

PurgeChar(Field, ':') as NewField,

If(Field = 'ID', Data, Peek('RowNum')) as RowNum;

LOAD Trim(SubField(F1, ':', 1)) as Field,

Trim(SubField(F1, ':', 2)) as Data

Where Len(Trim(SubField(F1, ':', 2))) > 0;

LOAD If(Index(F1, 'Item'), 'Item' & SubField(F1, 'Item'), F1) as F1;

LOAD * INLINE [

    F1

    ID:          1

    Date:      20.04.2018

    Item1:    Content1 Item2:    Content2

    ID:          2

    Date:      21.04.2018

    Item1:    Content1 Item2:    Content2

];


FinalTable:

LOAD Distinct RowNum

Resident Table;


FOR i = 1 to FieldValueCount('NewField')


LET vField = FieldValue('NewField', $(i));


Left Join (FinalTable)

LOAD RowNum,

Data as [$(vField)]

Resident Table

Where NewField = '$(vField)';


NEXT


DROP Table Table;

Anonymous
Not applicable
Author

That looks great, but what to do if the fieldnames are not nearly identical (item1, item2) but e.g. "item" and "color"