Is RegExLib full of "it"?
Today I read a comment from Randal L. Schwartz on Jeffrey Schoolcraft's regex blog that I felt I needed to respond to. As I started writing my reply, I realized that this is probably news that should be publicly visible, so I'm posting it to my blog and cross-referencing the original comment. First, here is Randal's comment:
Yup. I continue to downvote and negative-comment nearly every entry at "regex lib".
Do not validate email addresses with a regex (unless it's the full regex, as you point out).
Do not parse HTML with a regex. HTML is surprisingly complex.
Do not validate a date with a regex. All these regex I see that try to compute the number of days of february based on the year number just have me going "WTF!".
These are NOT regex tasks. These are dedicated tool tasks.
And yet, "regexlib" is full of them. And full of "it", if you know what I mean.
Randal...
I hear your pain. As the lead developer of RegExLib, I also see the problems that you mention, and at present we haven't provided a good enough toolset for the newbies to help themselves properly. Should the newbies be randomly using regexes that they find on the site? Dunno. That's an argument for another day.
We implemented the rating and comment system in the middle of last year to give some indication of the value of individual patterns, so I'm extremely grateful that diligent members of the community such as yourself are helping out by casting your votes. We also implemented an RSS feed for the comments so that comments such as yours are given public visibility: http://www.regexlib.com/RssComments.aspx
It's a hard battle to win as RegExLib continues to grow; as of today it contains nearly 1000 expressions. There's good news, though. Over the past couple of months a lot of effort has gone into solving these problems, and users of the site will see a vastly improved set of tools to help deal with some of the problems that you've mentioned.
To give you a quick example, one of the new features will give users a shortcut for finding useful AND ACCURATE expressions by offering a box which says: "Enter N examples of what you want to match and N that you don't, and we'll provide you with a list of patterns that match your requirements". This will help to remove the hit-and-miss element of a NOOB scanning through 1000 patterns to find the veritable needle in the haystack.
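To make the idea concrete, here is a minimal sketch of that kind of example-driven lookup. It is not how RegExLib implements the feature: the `PATTERNS` dictionary, its entries, and the `find_patterns` function are all hypothetical stand-ins for illustration, and I've assumed that a pattern has to match an example in full to count.

```python
import re

# Hypothetical stand-in for the pattern library. These entries are
# illustrative only and are not taken from RegExLib.
PATTERNS = {
    "digits only": r"\d+",
    "five-digit zip": r"\d{5}",
    "zip or zip+4": r"\d{5}(-\d{4})?",
    "letters only": r"[A-Za-z]+",
}


def find_patterns(should_match, should_not_match, patterns=PATTERNS):
    """Return the names of patterns that fully match every positive
    example and reject every negative example."""
    hits = []
    for name, source in patterns.items():
        try:
            compiled = re.compile(source)
        except re.error:
            # Skip library entries that are not valid Python regexes.
            continue
        accepts_all = all(compiled.fullmatch(s) for s in should_match)
        rejects_all = not any(compiled.fullmatch(s) for s in should_not_match)
        if accepts_all and rejects_all:
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Two things we want to match, two things we don't.
    print(find_patterns(["90210", "10001-1234"], ["9021", "abcde"]))
```

The more examples a user supplies, the shorter the list gets, which is the whole point: the user's own test cases do the filtering instead of a guess based on a pattern's title.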
The tools that allow users to manage their expressions are also getting an improvement, so hopefully pattern authors will be more responsive in adjusting their patterns based on the feedback they receive.
I hope that, once you see the new features for yourself, you will agree with me that RegExLib has become a much more valuable resource than it is today.