Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
datanibbler
Champion
Champion

Clean contents of a field in an Excel table

Hi,

I have to process a field in an Excel_table and decide whether this is a "clean" record or not.

That is really hard because there are so many possible variations - the people populating those lists are very creative 😉

A "clean" record, according to my logic, would look like this

8638294 / 4 Stück

(there is some flexibility, the text could also be 'St' or 'Stk' or 'Stk.', that doesn't matter)

An "unclean" record is one where there are actually several lines in one field. I can check that with the Substringcount() function - well, I can usually do that - but now I have a field that looks like this

SN 8638294 / 4 Stück / S80140915

That actually constitutes a record that I cannot process automatically - though that number beginning with "S" is one that I could use. But how to even classify this as an unclean record? The letter S of course is part of any variation of 'Stk", so I cannot just use Purgechar() and exclude the big and small S;

=> When I do >> LEN(PurgeChar(Mat_pre, ' /-0123456789STUECKstueckÜü.')) <<, the letter S will be excluded and only one (the "N") will be counted as an unwanted letter.

Also, that "SN" need not be there in every case - so I cannot even say that the 'S' may be there twice at the max.

Can anybody suggest me what to do in that case?

Thanks a lot!

Best regards,

DataNibbler

1 Reply
datanibbler
Champion
Champion
Author

I'm thinking - that "SN" at the beginning may or may not be there.

Likewise, that "Stück" may or may not be there.

A "clean" record might also look like this

7589121/1000

Could I somehow identify those two individual numbers (don't mind the slash, they use anything as a separator, but there is always something inbetween) and then go like, following the second number, there must not be anything more to come?