Man, I wholeheartedly hate the inventor of regex, even though I’ll admit that it really can come in handy at times. The main problem with regex is that nobody who is even somewhat sane can maintain the “Cyrillic”-looking strings that a regex ends up as. It’s gibberish, to say the least! Some would even argue that it looks like a quote from Captain Haddock in Tintin!
So – how does this beast look?
Input: “here we WE go again” (the quotation marks are not part of the input)

| Regex | Explanation | Result |
|---|---|---|
| \w | every single word character (letter, digit or underscore), one match per character | h e r e w e W E g o a g a i n |
| \b\w+\b | every whole word (1 <= char count < infinity) | here we WE go again |
| \b\w{1,3}\b | all words with 1 <= char count <= 3 | we WE go |
| \b\w{4,}\b | all words with 4 <= char count < infinity | here again |
| \b\w{2}\b | all words with char count = 2 | we WE go |
| \b(?!\bwe\b)(\w{2})\b | all words with char count = 2, but NOT ‘we’ (case-sensitive, so ‘WE’ still matches) | WE go |
| \b(?!\bgo\b)(\w{2})\b | all words with char count = 2, but NOT ‘go’ | we WE |
| (?i)\b(?!\bwe\b)(\w{2})\b | all words with char count = 2, but NOT ‘we’. Ignore case | go |
| (?i)\b(?!\bwe\b)(?!\bgo\b)(\w{2})\b | all words with char count = 2, but NOT ‘we’ and NOT ‘go’. Ignore case | (nothing) |

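
If you want to check the results yourself, here is a minimal sketch that runs every pattern against the input. The post doesn't name a regex engine, so Python's `re` module is an assumption on my part, and the names `patterns`, `all_matches` and `results` are made up for the example. Other engines may treat details such as the inline `(?i)` flag differently.

```python
import re

text = "here we WE go again"

# The patterns from the table, in the same order.
patterns = [
    r"\w",
    r"\b\w+\b",
    r"\b\w{1,3}\b",
    r"\b\w{4,}\b",
    r"\b\w{2}\b",
    r"\b(?!\bwe\b)(\w{2})\b",
    r"\b(?!\bgo\b)(\w{2})\b",
    r"(?i)\b(?!\bwe\b)(\w{2})\b",
    r"(?i)\b(?!\bwe\b)(?!\bgo\b)(\w{2})\b",
]


def all_matches(pattern, s):
    # group(0) is the whole match, whether or not the pattern has a capture group.
    return [m.group(0) for m in re.finditer(pattern, s)]


results = {p: all_matches(p, text) for p in patterns}

for p, found in results.items():
    print(f"{p:40} -> {' '.join(found) or '(nothing)'}")
```

The `(?!...)` part is a negative lookahead: it peeks at the upcoming text and vetoes the match if the forbidden word is sitting there, without consuming any characters itself.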
It still looks like gibberish to me!