Regular Expression. Why.
Have you ever encountered the same solution to a problem time after time again, never fully grasping why it works, but accepting that it does? And because this magical solution works so well and you’ve got bigger fish to fry, you put off researching why it works and choose to just accept that it’s magic and you shouldn’t question it?
That’s what regular expression is to me. Up until last week, when Gee slacked me “I’ve actually been interested in learning more about regex, like regular expression,” I didn’t realize that “regular expression” was what this magic was called.
The first time I ever (unknowingly) used regex was back in 2010, when I was building a contact form for my personal website and needed a way to make sure the email address was valid.
Here is part of the JavaScript email validation I used in my code:
var hasError = false;
var emailReg = /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;
var emailFromVal = $("#emailFrom").val();
if (emailFromVal == '') {
  $("#emailFrom").after('Please enter a valid email address.');
  hasError = true;
} else if (!emailReg.test(emailFromVal)) {
  $("#emailFrom").after('Please enter a valid email address.');
  hasError = true;
}
/^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;
How does one get from that gibberish to email validation??
I’ve run into regular expressions a few more times since then, most notably in the Ruby labs we had to do at the Flatiron School. For example, one of our labs asked us to break a paragraph string into sentences. To do so, we needed a way to identify punctuation.
So, I turned to Stack Overflow:
string = "I am a lion. Hear me roar! Where is my cub? Never mind, found him."
string.gsub(/[.?!]/, '\0|')
# "I am a lion.| Hear me roar!| Where is my cub?| Never mind, found him.|"
Okay, I can kind of begin to understand what's going on. I think. It’s obvious to me that /[.?!]/ is basically searching for punctuation in the string: the brackets match any one of a period, a question mark, or an exclamation point, and '\0|' puts the matched character back with a | after it. Which makes a lot of sense, because a regular expression (regex) is “a special text string for describing a search pattern.” A wildcard on steroids, if you will.
https://www.reddit.com/r/explainlikeimfive/comments/w1e0p/eli5_how_do_regexeswork/c59gb18/
A regular expression basically gives you a list of things to look for in a particular sequence, often with forward slashes encapsulating the regex. /hello/ - look for an ‘h’ followed by ‘e’ followed by ‘l,’ and so on. Where it gets complex is when you factor in special characters that give things special meanings.
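To make that concrete, here is a small JavaScript sketch of literal matching versus a special character. The helper name `containsHello` and the sample strings are made up for illustration; they are not from any library or from my original code.

```javascript
// Hypothetical helper for illustration: does the text contain "hello"?
function containsHello(text) {
  // /hello/ matches an 'h' followed by 'e', 'l', 'l', 'o', anywhere in the text.
  return /hello/.test(text);
}

console.log(containsHello("well, hello there")); // true
console.log(containsHello("Hello there"));       // false: matching is case-sensitive by default
console.log(containsHello("help, lots"));        // false: the letters are not in sequence

// Special characters change the meaning. An unescaped dot matches any single
// character (other than a line break); an escaped dot matches only a period.
console.log(/h.llo/.test("hallo"));  // true
console.log(/h\.llo/.test("hallo")); // false
console.log(/h\.llo/.test("h.llo")); // true
```

The last three lines are the reason the backslash matters so much: with it, the dot stops being a wildcard and becomes an ordinary period.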
And suddenly the gibberish from above makes sense!
/^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;
/.../ - surrounds the regex. ^ and $ are anchor characters signifying start-of-string and end-of-string. [ .. ] matches any single character listed or in the range specified, and the + that follows means one or more of those characters. So this expression is looking for one or more of the characters in the first bracket (A-Z, a-z, digits, and a few select symbols), followed by an @ sign, followed by one or more characters from the second bracket (A-Z, a-z, digits, ‘-’ and ‘.’, which takes subdomains into account), then a “.” (the literal character, since it’s escaped by a backslash), and finally a run of letters (A-Z and a-z) that is at least two characters long, which is what {2,} means.
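One way to check that reading is to run the pattern against a few inputs. This is only a sketch: the pattern is written with a single backslash before the dot, as in the validation snippet, and the sample addresses are hypothetical ones I picked to exercise each piece.

```javascript
// The email pattern, with a single backslash escaping the dot.
const emailReg = /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;

// Hypothetical sample inputs, one per part of the pattern.
const samples = [
  "jane.doe@example.com",    // true
  "jane@mail.example.co.uk", // true: '.' in the second bracket allows subdomains
  "jane@example",            // false: no literal dot followed by letters
  "jane@example.c",          // false: {2,} needs at least two letters at the end
  "jane doe@example.com",    // false: a space is not in the first bracket
  "@example.com",            // false: + needs at least one character before the @
];

for (const sample of samples) {
  console.log(sample, emailReg.test(sample));
}
```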




















