zanfranceschi Posted July 6, 2006

<pre><?php
$target = "my.e-mail_with_symbols@domain.com.br";
$re = "/^([a-z0-9\._\-]*)(@{1})([a-z0-9\._\-]*)(\.[a-z0-9]{1,3})$/i";

// Show the pattern and the string it is tested against.
echo "<b>$re</b> \t\t\t <b>$target</b>\n\n\n";

// $out receives the whole match plus one entry per parenthesized block.
if (preg_match($re, $target, $out)) {
    echo "<span style=\"color: blue;\">matches</span>";
} else {
    echo "<span style=\"color: red;\">doesn't match</span>";
}

echo "\n\n\n\n";
print_r($out);
?></pre>

I just started trying some things with regular expressions, and so far I haven't found them that difficult, but I'm sure I only understand the basics. If someone who knows regexp could check whether what I wrote is what I mean, I'd be very thankful!

Here it goes:

^ = must start with the first "block";
([a-z0-9\._\-]*) = a to z, 0 to 9, with ".", "_" and "-" allowed, any number of times
(@{1}) = "@" must come after the first block, exactly once
([a-z0-9\._\-]*) = 3rd block, the same as the first
(\.[a-z0-9]{1,3}) = last block, preceded by a mandatory dot and made of alphanumeric chars. This pattern can be found at least once and at most 3 times.
$ = must end after the above
i = makes the pattern case insensitive.

Is all this right? Am I on the right track with regexp? Am I using any unusual or bad practices in the pattern?

Thanks a lot!!
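For readers following along, here is a minimal sketch of what `$out` ends up holding for the address in the post. The pattern is copied from the post; the constant `ORIGINAL_RE` and the helper `captureGroups` are names made up for this illustration, and the pattern is single-quoted only so that PHP leaves the backslashes untouched.

```php
<?php
// Original pattern from the post above (assumed unchanged).
const ORIGINAL_RE = '/^([a-z0-9\._\-]*)(@{1})([a-z0-9\._\-]*)(\.[a-z0-9]{1,3})$/i';

// Hypothetical helper: returns the capture array, or an empty array when there is no match.
function captureGroups(string $address): array
{
    $out = [];
    preg_match(ORIGINAL_RE, $address, $out);
    return $out;
}

print_r(captureGroups('my.e-mail_with_symbols@domain.com.br'));
// [0] => my.e-mail_with_symbols@domain.com.br   whole match
// [1] => my.e-mail_with_symbols                 first block
// [2] => @                                      second block
// [3] => domain.com                             third block (it gives back ".br" so the last block can match)
// [4] => .br                                    last block, dot included
```

The third block is greedy, so it first swallows everything after the "@" and then backs off just far enough for the last block to match at the end of the string.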
justsomeguy Posted July 6, 2006

You're generally on the right track, but there is a "more correct" way to do the last part. Instead of doing this:

([a-z0-9\._\-]*)(\.[a-z0-9]{1,3})

I think this will work better:

([a-z0-9_\-]+\.)+([a-z]{2,3})

This means that you look for a sequence of one or more letters and numbers (plus _ and -), followed by a dot. You look for that entire sequence one or more times, and after that comes a 2-3 character sequence of letters. You probably don't need digits in the TLD; I don't think any TLDs have numbers in them, and there are no 1-character TLDs.

I also replaced the first asterisk with a plus (1 or more instead of 0 or more, because 0 characters for the first subpattern is not valid), so the new pattern looks like this:

$re = "/^([a-z0-9\._\-]+)(@{1})([a-z0-9_\-]+\.)+([a-z]{2,3})$/i";
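A quick way to try the revised pattern is to run it over a few sample addresses. This is only a sketch: it uses {2,3} for the TLD, as the explanation in the post describes, the sample addresses other than the first are invented for the test, and `REVISED_RE` and `matchesRevised` are hypothetical names.

```php
<?php
// Revised pattern as described in the post, with a 2-3 letter TLD.
const REVISED_RE = '/^([a-z0-9\._\-]+)(@{1})([a-z0-9_\-]+\.)+([a-z]{2,3})$/i';

// Hypothetical helper: true when the whole address fits the revised pattern.
function matchesRevised(string $address): bool
{
    return preg_match(REVISED_RE, $address) === 1;
}

$samples = [
    'my.e-mail_with_symbols@domain.com.br', // fits the pattern
    '@domain.com',                          // empty first block, rejected by the "+"
    'guy@domain.c',                         // 1-letter TLD, rejected by {2,3}
    'guy@domain.c0m',                       // digit in the TLD, rejected by [a-z]
];

foreach ($samples as $address) {
    echo $address, ' => ', matchesRevised($address) ? 'matches' : "doesn't match", "\n";
}
```

Only the first sample should be reported as a match; the other three each break one of the rules explained in the post.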
zanfranceschi Posted July 7, 2006

Thanks a lot!!! I learned new things from you. I didn't know that "*" means 0 or more and "+" means 1 or more. You also used a lot of parentheses, so I see it's right to separate pattern groups with ().

And why didn't you put the last dot with the last group? Is this just a matter of taste, or is there a logic behind it that I can't see?

Thanks a lot again!!! I see a new world of possibilities with regexp!!!
justsomeguy Posted July 7, 2006

I moved the last dot because that's just how I see it in my head. After the @ sign, there is this group, 1 or more times:

([a-z0-9_\-]+\.)

All of these will match that pattern:

www.
akjsdf.
w3schools.
mail.

Since 1 or more are allowed, you can have as many of those patterns as you want:

(www.)(mail.)(w3schools.)
www.mail.w3schools.

followed by the TLD, which is just "com" or whatever:

www.mail.w3schools.com

So, if you strip off the TLD, you can separate the entire domain into sequences of letters and numbers, each followed by a dot.

The way you were matching it earlier, like this:

([a-z0-9\._\-]*)(\.[a-z0-9]{1,3})

would have also matched this:

guy@.....com

which obviously isn't a valid domain name.
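The difference is easy to check side by side. The sketch below assumes the two patterns exactly as quoted in the thread (with {2,3} for the TLD in the revised one); the constant and function names are invented for the illustration. It also prints the captures for a multi-label domain, because a repeated group only keeps the text of its last repetition.

```php
<?php
// Both patterns come from the thread; the names below are made up for this sketch.
const OLD_DOMAIN_RE = '/^([a-z0-9\._\-]*)(@{1})([a-z0-9\._\-]*)(\.[a-z0-9]{1,3})$/i';
const NEW_DOMAIN_RE = '/^([a-z0-9\._\-]+)(@{1})([a-z0-9_\-]+\.)+([a-z]{2,3})$/i';

function matchesOld(string $address): bool
{
    return preg_match(OLD_DOMAIN_RE, $address) === 1;
}

function matchesNew(string $address): bool
{
    return preg_match(NEW_DOMAIN_RE, $address) === 1;
}

// Returns the capture array for the revised pattern (empty array on no match).
function newPatternGroups(string $address): array
{
    $out = [];
    preg_match(NEW_DOMAIN_RE, $address, $out);
    return $out;
}

var_dump(matchesOld('guy@.....com')); // bool(true): the dots are swallowed by the third block
var_dump(matchesNew('guy@.....com')); // bool(false): every dot now needs characters before it

print_r(newPatternGroups('guy@www.mail.w3schools.com'));
// [3] holds only "w3schools." (the last repetition of the group), [4] holds "com"
```

If you need every label of the domain separately, the repeated group alone will not give it to you; splitting the matched domain on "." afterwards is one simple option.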