If you can flag it in script, here's a pure brute force approach. You'll probably need to be more sophisticated than this in practice: I'm only treating spaces as word breaks, whereas you'd also want to break on punctuation and the like. I have the feeling I'm missing some simpler way to do this (without using regular expressions).
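To illustrate the word-break point (breaking on punctuation as well as spaces, still without regular expressions), here is a minimal sketch. It is in Python only for illustration, since the thread doesn't say which scripting language is in play, and the function name `split_words` and its `breaks` parameter are made up for this example rather than taken from the snippet above.

```python
import string

def split_words(text, breaks=string.whitespace + string.punctuation):
    """Split text into words, treating every character in `breaks` as a word break.

    `split_words` and `breaks` are hypothetical names used for this sketch only.
    """
    words = []
    current = []  # characters of the word being built
    for ch in text:
        if ch in breaks:
            # A break character ends the current word, if there is one.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    # Flush the last word when the text does not end on a break character.
    if current:
        words.append("".join(current))
    return words

print(split_words("Hello, world! (mostly) fine."))
```

Passing `breaks=" "` gives the spaces-only behaviour described above. One caveat with the default: the apostrophe counts as punctuation, so "It's" comes out as "It" and "s"; you'd need to drop it from `breaks` if contractions should stay whole.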
I like your patterns approach better than my brute force. I ended up with this, which also makes a start on ignoring punctuation, at least when figuring out the format of the word, if not the word itself in this case.