Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Newsense2020
Contributor III
Contributor III

Seperate Field by string

Hello,

I have an issue with the following topic and hope to find a solution from you experienced minds. I hope the question is placed here correctly.

I have a table with the following customer information:

CustomerContact
AMobile +49 1520999999; E-Mail test@web.de
Bcompany +49 2222222; Mobile +49 3333333; E-Mail a.tester@gmx.de; Private +49 5555555

 

So I have several information unstructured in the fields. The goal ist to have the following:

CustomerE-MailcompanyMobilePrivate
Atest@web.de  +49 1520999999 
Ba.tester@gmx.de +49 2222222Handy +49 3333333 +49 5555555

 

Does anybody have an idea how to solve that?

Thanks in advance.

 

1 Solution

Accepted Solutions
ArnadoSandoval
Specialist II
Specialist II

@Newsense2020 

I found a couple of things:

  1. The data in the Load ... InLine[] statement is delimited with tabs!!! in your script the tabs are gone, probably when you took the code I posted above those tabs become spaces; I replaced them with pipes (they are visible, and less chances to become missing in translation -- e.g. cut and paste).
  2. The last preceding load, where the Replace(Lower(Subfield(... are, the semicolons (;) become colons (:) breaking the logic!

These are the sections of the scripts I fixed, first the Replace(Lower(Subfield(...

NoConcatenate
MyTable:
Load Customer,
  Replace(Lower(Subfield(E_Mail, ';', 1)), 'e-mail', '') as E_Mail,
  Replace(Lower(Subfield(Company, ';', 1)), 'company', '') as Company,
  Replace(Lower(Subfield(Mobile, ';', 1)), 'mobile', '') as Mobile,
  Replace(Lower(Subfield(Private, ';', 1)), 'private', '') as Private
  ;

and the Load ... InLine[] with pipes instead:

Load Customer,
	 Contact
Inline [
Customer|Contact
A|Mobile +49 1520999999; E-Mail test@web.de
B|company +49 2222222;  Mobile +49 3333333; E-Mail a.tester@gmx.de; Private +49 555555
C|Company +49 4444444; MOBILE +49 5555555; E-MAIL B.Tester@gmx.de; PRIVATE +49 7777777
] (delimiter is '|');

Just update those parts of the script !!!

Arnaldo Sandoval
A journey of a thousand miles begins with a single step.

View solution in original post

19 Replies
ArnadoSandoval
Specialist II
Specialist II

Hi @Newsense2020 

You should use the SubField function, it is something like this:

SubString(Contact, ';', 1) As Field1
SubString(Contact, ';', 2) As Field2
SubString(Contact, ';', 3) As Field3

Hope this helps, 

Arnaldo Sandoval
A journey of a thousand miles begins with a single step.
Newsense2020
Contributor III
Contributor III
Author

Thanks for your quick response.  It might not have been clearly enough. But the order of the topics is not always the same. So it could be that E-Mail ist the first string befor the ';', sometimes mobile. Therefore I think your solution would not work for my problem.

My intention, though I do not know how to transform this logic into qlik, would be, that I search for the word mobile, then give me the following signs until the next space.

 

marcus_sommer

I think subfield() is really suitable to split the strings but I wouldn't create multiple fields with it (the number of string-parts and also their order seems to be quite different) else using subfield() without the third parameter. In this case subfield looped through all string-parts and creates for each a new record.

Within the following steps comes the real work - to identify and clean the content. With your example-records it would be quite easy - searching for a @ to identify a mail and + for a phone-number (and the + could be used within another subfield() to split this part further). But quite probably there will be all possible varieties of values - with/out + or () or - or … and probably in those strings isn't just one contact …

Therefore should there be any possibility to get these data in a well-build data-structure you should use them.

- Marcus

ArnadoSandoval
Specialist II
Specialist II

@Newsense2020 

Yes, I noticed that, you should combine the SubField function with Index function! you will also need Replace

If(Index(Contact, 'Mobile') = 0, '',
   SubField( Mid( Contact, Index(Contact, 'Mobile')) & ';', 1) ) As Mobile

If(Index(Contact, 'company') = 0, '',
   SubField( Mid( Contact, Index(Contact, 'company')) & ';', 1) ) As Company 

If(Index(Contact, 'E-Mail') = 0, '',
   SubField( Mid( Contact, Index(Contact, 'E-Mail')) & ';', 1) ) As E_Mail

If(Index(Contact, 'Private') = 0, '',
   SubField( Mid( Contact, Index(Contact, 'Private')) & ';', 1) ) As Private

Then, you have to clean each one of these fields with an expression like the one below (one per field)

For Mobile: Replace(Mobile, 'Mobile', '')
For Company: Replace(Company, 'company', '')
For E_mail: Replace(E_mail, 'E-Mail', '')
For Private: Replace(Private, 'Private', '')

All these expressions ARE NOT case sensitive, use LOWER or UPPER to make them case sensitive.

Hope this helps,

Arnaldo Sandoval
A journey of a thousand miles begins with a single step.
Newsense2020
Contributor III
Contributor III
Author

@marcus_sommer 

I am currently working on getting the data base, but this seems to be difficulty. 

 

@ArnadoSandoval :

As you might have recognized im am not a professional qlik sense user.  I tried to include your formula into the load command:

LOAD,

Contact,

your if sentence(s),

your Replace(s)

 

But that did not work.

ArnadoSandoval
Specialist II
Specialist II

Hi @Newsense2020 

No worries, your script should be like the one below, it was tested with your data and it works, also it is no case sensitive, it parse the contact details regardless of the casing of the Contact tokens; the script is using a concept know as preceding load, this one has 3 preceding loads after loading the source data, each preceding load with a function (you read preceding loads bottom up), the first preceding load compute the tokens positions in the Contact string (eg. Company, Mobile, e-mail and Private), the second preceding load Cut (that's the action of the Mid function) the Contact string at the position where the token was found (appending a ';' at the end of the Cut), it the contact is missing (pos = 0) it returns a ';'), the final preceding load applies the SubField using ';' as the field separator, always picking the first field, and Replace the token with an empty string!

NoConcatenate
MyTable:
Load Customer,
     Replace(Lower(SubField(E_Mail,  ';', 1)), 'e-mail', '')  As E_Mail,
     Replace(Lower(SubField(Company, ';', 1)), 'company', '') As Company,
     Replace(Lower(SubField(Mobile,  ';', 1)), 'mobile', '')  As Mobile,
     Replace(Lower(SubField(Private, ';', 1)), 'private', '') As Private
;

Load Customer,
     If(Pos_E_Mail = 0, ';',
        Mid( Contact, Pos_E_Mail) & ';') As E_Mail,
     If(Pos_Company = 0, ';',
        Mid( Contact, Pos_Company) & ';') As Company, 
     If(Pos_Mobile = 0, ';',
        Mid( Contact, Pos_Mobile) & ';') As Mobile,
     If(Pos_Private = 0, '',
        Mid( Contact, Pos_Private) & ';') As Private
;

Load Customer,
     Index(Lower(Contact), 'e-mail')  As Pos_E_Mail,
     Index(Lower(Contact), 'company') As Pos_Company, 
     Index(Lower(Contact), 'mobile')  As Pos_Mobile,
     Index(Lower(Contact), 'private') As Pos_Private,
     Contact;

Load Customer,
     Contact
Inline [
Customer	Contact
A	Mobile +49 1520999999; E-Mail test@web.de
B	company +49 2222222; Mobile +49 3333333; E-Mail a.tester@gmx.de; Private +49 5555555
C	Company +49 4444444; MOBILE +49 5555555; E-MAIL B.Tester@gmx.de; PRIVATE +49 7777777
] (delimiter is '\t');

This script should work with your data!!!

Arnaldo Sandoval
A journey of a thousand miles begins with a single step.
Newsense2020
Contributor III
Contributor III
Author

 @ArnadoSandoval : First of all thank you for your willingness and effort to help. 

I tried to imitate your script only with the example date given but that does not work. I have attached the file. Maybe you or others may have a look at it and are able to find the mistake I have made?

 

Thanks in advance!

ArnadoSandoval
Specialist II
Specialist II

@Newsense2020 

I found a couple of things:

  1. The data in the Load ... InLine[] statement is delimited with tabs!!! in your script the tabs are gone, probably when you took the code I posted above those tabs become spaces; I replaced them with pipes (they are visible, and less chances to become missing in translation -- e.g. cut and paste).
  2. The last preceding load, where the Replace(Lower(Subfield(... are, the semicolons (;) become colons (:) breaking the logic!

These are the sections of the scripts I fixed, first the Replace(Lower(Subfield(...

NoConcatenate
MyTable:
Load Customer,
  Replace(Lower(Subfield(E_Mail, ';', 1)), 'e-mail', '') as E_Mail,
  Replace(Lower(Subfield(Company, ';', 1)), 'company', '') as Company,
  Replace(Lower(Subfield(Mobile, ';', 1)), 'mobile', '') as Mobile,
  Replace(Lower(Subfield(Private, ';', 1)), 'private', '') as Private
  ;

and the Load ... InLine[] with pipes instead:

Load Customer,
	 Contact
Inline [
Customer|Contact
A|Mobile +49 1520999999; E-Mail test@web.de
B|company +49 2222222;  Mobile +49 3333333; E-Mail a.tester@gmx.de; Private +49 555555
C|Company +49 4444444; MOBILE +49 5555555; E-MAIL B.Tester@gmx.de; PRIVATE +49 7777777
] (delimiter is '|');

Just update those parts of the script !!!

Arnaldo Sandoval
A journey of a thousand miles begins with a single step.
Newsense2020
Contributor III
Contributor III
Author

@ArnadoSandoval  Thank you very much. That worked out. Thank you as well for your patience!