Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
PaulK
Contributor III
Contributor III

Split field

Hi

I need to split the following column into 2 columns  in the load editor from an Excel file. The first column (Quantity) would include the first numbers and then the second column (Unit) would include all text and numbers after the first numbers in the original column.

Example of the Original column

1 mL
1 mL
20 pouches
100 T
10 pouches
10 pouches
10 pouches
100 T
10µg
1 mL
4.38 µg/mL in Methanol
1 mL
1 mL
1 mL
1 mL
1 mL
10µg
1 mL
1 mL
10 mL
1 mL
10µg
1 mL 
10 mL
100 mM * 1 mL in Water
20 pouches
1 mL
20 pouches
1 mL
10 mL
5mg(107.26 mM * 100 uL in Water)
10 pouches
10µg
100 T
1 mL
1 mL
1 mL
1 mL
1mg (3.14 mM * 1 mL in Methyl acetate)
1 mL
1 mL
10µg
10 pouches
10µg
1 mL-10 pcs
200 μL-2 mL-15 mL
1 mL
10 mL
1 mL
1 ml
1 mL


So the new columns/fields would look like this when split

Original Column  Quantity Unit
1 mL 1 mL
1mg (3.14 mM * 1 mL in Methyl acetate) 1 mg (3.14 mM * 1 mL in Methyl acetate)
20 pouches 20 pouches
100 T 100 T
10 pouches 10 pouches
100 T 100 T
10µg 10 µg
1 mL 1 mL
4.38 µg/mL in Methanol 4.38 µg/mL in Methanol

 

The Original column always starts with a number which I need for the Quantity column and then the Unit column always starts with a letter but can also include numbers, spaces, characters etc and the first numbers are not always separated with a space, sometimes they are not separated example '10µg'

Any help would be much appreciated.

Thanks
Paul

 

Labels (2)
4 Replies
brunobertels
Master
Master

hi 

try this :

 

Left("Original Column ",Len(Keepchar("Original Column ",'0123456789'))) as Quantity
Mid("Original Column ",1+Len(Keepchar("Original Column ",'0123456789'))) as Text

Paul_Sun
Contributor
Contributor

Hi @brunobertels 

Thank you for your response. I've just tried

Left("Original Column ",Len(Keepchar("Original Column ",'0123456789'))) as Quantity
Mid("Original Column ",1+Len(Keepchar("Original Column ",'0123456789'))) as Text

and it hasn't split some of the field correctly, I get the following for some fields.

Original column Quantity Text
2 KU 2.  KU
200 μL-2 mL-15 mL 200 μL -2 mL-15 mL
1mg (3.14 mM * 1 mL in Methyl acetate) 1mg ( 3.14 mM * 1 mL in Methyl acetate)
2.5 mg 2. 5 mg
2.5g 2. 5g
2.5kg 2. 5kg
2.5mg 2. 5mg
7.5mg 7. 5mg
5mg(107.26 mM * 100 uL in Water) 5mg(107.2 6 mM * 100 uL in Water)
4.38 µg/mL in Methanol 4.3 8 µg/mL in Methanol
2g 2. g
1 mL-10 pcs 1 m L-10 pcs
2mg 2. mg

 

I've attached a file that has the original column that needs splitting.

Thanks
Paul

brunobertels
Master
Master

Hi 

Due to complex string the solution is a little bit harder 

after some research in the forum,  try this script who works for me. 

 

1 script example : 

I did 2 preceding load to show who how to extract NUMBER end the how to extract TEXT 

but at he end you can concatenate twice to create the dimension Quantity and the dimension Text :

see script example 2 

Table1 :

load *,

// préceding load to extract from the column Original the text value without the first number part of the string 

mid(Original_column_Quantity,
len(Quantity)+1) as Text;

Load *,

// preceding load to find only first value in string who is a number (can be 1 , 2 , 3.8 etc ) named as Quantity

Mid(Original_column_Quantity,FindOneOf(Original_column_Quantity,'.0123456789'),FindOneOf(Mid(Original_column_Quantity,
FindOneOf(Original_column_Quantity,'.0123456789')),PurgeChar(Original_column_Quantity,'.0123456789'))-1) as Quantity;

// load the column to plit :

load * Inline [
Original_column_Quantity,
2 KU,
200 μL-2 mL-15 mL,
1mg (3.14 mM * 1 mL in Methyl acetate),
2.5 mg,
2.5g,
2.5kg,
2.5mg,
7.5mg,
5mg(107.26 mM * 100 uL in Water,
4.38 µg/mL in Methanol,
1 mL-10 pcs,
2mg,
](delimiter is ',');

 

Table2:


Load *,

Mid(Original_column_Quantity,FindOneOf(Original_column_Quantity,'.0123456789'),FindOneOf(Mid(Original_column_Quantity,
FindOneOf(Original_column_Quantity,'.0123456789')),PurgeChar(Original_column_Quantity,'.0123456789'))-1) as Quantity,

mid(Original_column_Quantity,
len(Mid(Original_column_Quantity,FindOneOf(Original_column_Quantity,'.0123456789'),FindOneOf(Mid(Original_column_Quantity,
FindOneOf(Original_column_Quantity,'.0123456789')),PurgeChar(Original_column_Quantity,'.0123456789'))-1))+1) as Text;

load * Inline [
Original_column_Quantity,
2 KU,
200 μL-2 mL-15 mL,
1mg (3.14 mM * 1 mL in Methyl acetate),
2.5 mg,
2.5g,
2.5kg,
2.5mg,
7.5mg,
5mg(107.26 mM * 100 uL in Water,
4.38 µg/mL in Methanol,
1 mL-10 pcs,
2mg,
](delimiter is ',');

 

 

 

the resulting table should look like this 

brunobertels_0-1674492329286.png

 

hope it helps 

 

MarcoWedel

also using this FindOneOf/PurgeChar approach, a shorter solution might be as well:

MarcoWedel_0-1674504658972.png

 

table1:
LOAD *,
     Left(OriginalColumn,FindOneOf(OriginalColumn,PurgeChar(OriginalColumn,'0123456789.'))-1) as Quantity,
     Trim(Mid(OriginalColumn,FindOneOf(OriginalColumn,PurgeChar(OriginalColumn,'0123456789.')))) as Unit
INLINE [
    OriginalColumn
    1 mL
    1 mL
    20 pouches
    100 T
    10 pouches
    10 pouches
    10 pouches
    100 T
    10µg
    1 mL
    4.38 µg/mL in Methanol
    1 mL
    1 mL
    1 mL
    1 mL
    1 mL
    10µg
    1 mL
    1 mL
    10 mL
    1 mL
    10µg
    1 mL
    10 mL
    100 mM * 1 mL in Water
    20 pouches
    1 mL
    20 pouches
    1 mL
    10 mL
    5mg(107.26 mM * 100 uL in Water)
    10 pouches
    10µg
    100 T
    1 mL
    1 mL
    1 mL
    1 mL
    1mg (3.14 mM * 1 mL in Methyl acetate)
    1 mL
    1 mL
    10µg
    10 pouches
    10µg
    1 mL-10 pcs
    200 μL-2 mL-15 mL
    1 mL
    10 mL
    1 mL
    1 ml
    1 mL
];