Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi
I need to split the following column into 2 columns in the load editor from an Excel file. The first column (Quantity) would include the first numbers and then the second column (Unit) would include all text and numbers after the first numbers in the original column.
Example of the Original column
1 mL |
1 mL |
20 pouches |
100 T |
10 pouches |
10 pouches |
10 pouches |
100 T |
10µg |
1 mL |
4.38 µg/mL in Methanol |
1 mL |
1 mL |
1 mL |
1 mL |
1 mL |
10µg |
1 mL |
1 mL |
10 mL |
1 mL |
10µg |
1 mL |
10 mL |
100 mM * 1 mL in Water |
20 pouches |
1 mL |
20 pouches |
1 mL |
10 mL |
5mg(107.26 mM * 100 uL in Water) |
10 pouches |
10µg |
100 T |
1 mL |
1 mL |
1 mL |
1 mL |
1mg (3.14 mM * 1 mL in Methyl acetate) |
1 mL |
1 mL |
10µg |
10 pouches |
10µg |
1 mL-10 pcs |
200 μL-2 mL-15 mL |
1 mL |
10 mL |
1 mL |
1 ml |
1 mL |
So the new columns/fields would look like this when split
Original Column | Quantity | Unit |
1 mL | 1 | mL |
1mg (3.14 mM * 1 mL in Methyl acetate) | 1 | mg (3.14 mM * 1 mL in Methyl acetate) |
20 pouches | 20 | pouches |
100 T | 100 | T |
10 pouches | 10 | pouches |
100 T | 100 | T |
10µg | 10 | µg |
1 mL | 1 | mL |
4.38 µg/mL in Methanol | 4.38 | µg/mL in Methanol |
The Original column always starts with a number which I need for the Quantity column and then the Unit column always starts with a letter but can also include numbers, spaces, characters etc and the first numbers are not always separated with a space, sometimes they are not separated example '10µg'
Any help would be much appreciated.
Thanks
Paul
hi
try this :
Left("Original Column ",Len(Keepchar("Original Column ",'0123456789'))) as Quantity
Mid("Original Column ",1+Len(Keepchar("Original Column ",'0123456789'))) as Text
Thank you for your response. I've just tried
Left("Original Column ",Len(Keepchar("Original Column ",'0123456789'))) as Quantity
Mid("Original Column ",1+Len(Keepchar("Original Column ",'0123456789'))) as Text
and it hasn't split some of the field correctly, I get the following for some fields.
Original column | Quantity | Text |
2 KU | 2. | KU |
200 μL-2 mL-15 mL | 200 μL | -2 mL-15 mL |
1mg (3.14 mM * 1 mL in Methyl acetate) | 1mg ( | 3.14 mM * 1 mL in Methyl acetate) |
2.5 mg | 2. | 5 mg |
2.5g | 2. | 5g |
2.5kg | 2. | 5kg |
2.5mg | 2. | 5mg |
7.5mg | 7. | 5mg |
5mg(107.26 mM * 100 uL in Water) | 5mg(107.2 | 6 mM * 100 uL in Water) |
4.38 µg/mL in Methanol | 4.3 | 8 µg/mL in Methanol |
2g | 2. | g |
1 mL-10 pcs | 1 m | L-10 pcs |
2mg | 2. | mg |
I've attached a file that has the original column that needs splitting.
Thanks
Paul
Hi
Due to complex string the solution is a little bit harder
after some research in the forum, try this script who works for me.
1 script example :
I did 2 preceding load to show who how to extract NUMBER end the how to extract TEXT
but at he end you can concatenate twice to create the dimension Quantity and the dimension Text :
see script example 2
Table1 :
load *,
// préceding load to extract from the column Original the text value without the first number part of the string
mid(Original_column_Quantity,
len(Quantity)+1) as Text;
Load *,
// preceding load to find only first value in string who is a number (can be 1 , 2 , 3.8 etc ) named as Quantity
Mid(Original_column_Quantity,FindOneOf(Original_column_Quantity,'.0123456789'),FindOneOf(Mid(Original_column_Quantity,
FindOneOf(Original_column_Quantity,'.0123456789')),PurgeChar(Original_column_Quantity,'.0123456789'))-1) as Quantity;
// load the column to plit :
load * Inline [
Original_column_Quantity,
2 KU,
200 μL-2 mL-15 mL,
1mg (3.14 mM * 1 mL in Methyl acetate),
2.5 mg,
2.5g,
2.5kg,
2.5mg,
7.5mg,
5mg(107.26 mM * 100 uL in Water,
4.38 µg/mL in Methanol,
1 mL-10 pcs,
2mg,
](delimiter is ',');
Table2:
Load *,
Mid(Original_column_Quantity,FindOneOf(Original_column_Quantity,'.0123456789'),FindOneOf(Mid(Original_column_Quantity,
FindOneOf(Original_column_Quantity,'.0123456789')),PurgeChar(Original_column_Quantity,'.0123456789'))-1) as Quantity,
mid(Original_column_Quantity,
len(Mid(Original_column_Quantity,FindOneOf(Original_column_Quantity,'.0123456789'),FindOneOf(Mid(Original_column_Quantity,
FindOneOf(Original_column_Quantity,'.0123456789')),PurgeChar(Original_column_Quantity,'.0123456789'))-1))+1) as Text;
load * Inline [
Original_column_Quantity,
2 KU,
200 μL-2 mL-15 mL,
1mg (3.14 mM * 1 mL in Methyl acetate),
2.5 mg,
2.5g,
2.5kg,
2.5mg,
7.5mg,
5mg(107.26 mM * 100 uL in Water,
4.38 µg/mL in Methanol,
1 mL-10 pcs,
2mg,
](delimiter is ',');
the resulting table should look like this
hope it helps
also using this FindOneOf/PurgeChar approach, a shorter solution might be as well:
table1:
LOAD *,
Left(OriginalColumn,FindOneOf(OriginalColumn,PurgeChar(OriginalColumn,'0123456789.'))-1) as Quantity,
Trim(Mid(OriginalColumn,FindOneOf(OriginalColumn,PurgeChar(OriginalColumn,'0123456789.')))) as Unit
INLINE [
OriginalColumn
1 mL
1 mL
20 pouches
100 T
10 pouches
10 pouches
10 pouches
100 T
10µg
1 mL
4.38 µg/mL in Methanol
1 mL
1 mL
1 mL
1 mL
1 mL
10µg
1 mL
1 mL
10 mL
1 mL
10µg
1 mL
10 mL
100 mM * 1 mL in Water
20 pouches
1 mL
20 pouches
1 mL
10 mL
5mg(107.26 mM * 100 uL in Water)
10 pouches
10µg
100 T
1 mL
1 mL
1 mL
1 mL
1mg (3.14 mM * 1 mL in Methyl acetate)
1 mL
1 mL
10µg
10 pouches
10µg
1 mL-10 pcs
200 μL-2 mL-15 mL
1 mL
10 mL
1 mL
1 ml
1 mL
];