Sniffy, posted December 22, 2006

I really want to get my head around regular expressions, but there's so much to know. I read about all the basic characters and sequences in THIS tutorial, but a lot of things are still unclear to me.

First of all, what are the starting and ending delimiters, and when and why would you use them? Also, what is a delimiter?

Second, what are the parentheses used for, as in do(ugh)nut? For some reason it looks like they mean "maybe".

Any other useful information would be great, like how to use regular expressions, how ereg() can be used in a script, and a detailed explanation of the functions. justsomeguy already gave me the php.net regular expression reference guide.

Also, after you answer my questions, would you mind posting some basic scripts that use valid patterns? Hopefully by the end of this the matter will be cleared up.

EDIT: I've just cleared up most of the matter of the delimiter. Correct me if I'm wrong. A delimiter is something that encloses the regular expression. If I just declared the string "^Sniffy$", it would display as ^Sniffy$; the delimiters are what mark it as a regular expression. A delimiter can be any non-alphanumeric character (other than a backslash) that isn't used in the expression, most commonly a forward slash. But I believe I saw something like /\^Sniffy\\, and I don't know if I'm imagining things or if a backslash can be used along with a forward slash to mean something.

EDIT: I just read about what preg and ereg mean. Please say if I'm correct.

preg - Perl-compatible regular expressions (PCRE).
ereg - POSIX extended regular expressions, PHP's original regex functions.

Note: I haven't really gotten into the differences between the two, but I think that's what they're for. I saw a topic yesterday about which type is faster, and my guess is that ereg would be processed quicker because there is less to load, but that's only my theory.

EDIT: I just saw more code on the forum that I thought had /\ in it, but I was mistaken: it's actually just a / followed by another character, and which character it was confused me even more. It was posted by jesh, and I hope he doesn't mind me asking about it. It's JavaScript, but the same rules probably apply.

    // replace [b] with <b>
    content = content.replace(/\[b\]/g, "<b>");
    // replace [/b] with </b>
    content = content.replace(/\[\/b\]/g, "</b>");

Why is [b] written as \[b\]?

jesh, posted December 22, 2006

The parentheses are for grouping. If your regex was "do(ugh)nut" and you ran a match against some string ("I like doughnuts!"), match $0 would be "doughnut" and match $1 would be "ugh". If you wanted to make "ugh" optional, it would look more like "do(ugh)?nut", or maybe even "do(?:ugh)?nut" if you don't need to capture the group.

A regular expression of "^Sniffy$" would only match "Sniffy". It would not match any of the following:

"Sniffy posted this."
"I heard Sniffy say..."
"sniffy"

The ^, in this context, means only match at the start of the string, and the $, in this context, means only match at the end of the string.

The // is how you declare a regular expression in JavaScript. /test/ is a valid declaration for a regular expression that matches "test". Throwing a "g" on the end makes the regular expression global, meaning it will attempt to make more than a single match: /test/g

In the post that you mention, I wrote: /\[b\]/g

The "\", in this context, is an escape character. It makes the regular expression engine treat the character that follows as a regular character instead of something special. It's similar to escaping double quotes in strings: "This is a \"string\"". In regular expressions, the [] indicates a set of possible characters. For example, [a-zA-Z] will match any upper- or lower-case letter, and [aeiou] will only match lower-case vowels. So, if you need to actually match the "[" character, you need to escape it: "\[".

I hope this helps a bit.

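The points in jesh's post can be tried out with a minimal JavaScript sketch like the one below. It only uses the built-in `match`, `test` and `replace` methods; the variable names are made up for illustration and are not from the thread.

```javascript
// Capturing group: index 0 is the whole match, index 1 is the first group.
const grouped = "I like doughnuts!".match(/do(ugh)nut/);
console.log(grouped[0], grouped[1]);

// Optional group: the "?" after the parentheses makes "ugh" optional.
const optional = /do(ugh)?nut/;
console.log(optional.test("donut"), optional.test("doughnut"));

// Non-capturing group: (?:...) groups without remembering the text,
// so the result array holds only the whole match.
const nonCapturing = "donut".match(/do(?:ugh)?nut/);
console.log(nonCapturing.length);

// Anchors: ^ and $ force the pattern to cover the entire string.
const exact = /^Sniffy$/;
console.log(exact.test("Sniffy"), exact.test("Sniffy posted this."));

// Escaping: \[ and \] match literal brackets instead of opening a set.
// The "g" flag replaces every occurrence, not just the first one.
const converted = "[b]hi[/b] and [b]bye[/b]".replace(/\[b\]/g, "<b>");
console.log(converted);
```

Only the opening tags are replaced in the last step, because the pattern matches the literal text [b] and nothing else.
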
Sniffy, posted December 22, 2006

Whoa, that cleared up quite a bit in just those few posts. I always wondered how the global (g) flag was used.

So in PHP, it would be: \/[b/]\g ?

Sniffy, posted December 22, 2006

Wait, actually the pattern itself, \[b\], would be the same in both PHP and JS, because the delimiters are not part of the pattern.

Okay, now I'd like detailed coverage and examples of:

preg_replace()
preg_split()

I don't really understand the php.net coverage because it's a reference, not a tutorial.

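As a rough illustration of replacing and splitting with a pattern, here is a JavaScript sketch. It uses `String.prototype.replace` and `String.prototype.split`, which play the same role as PHP's preg_replace() and preg_split(). Two differences are worth keeping in mind: in PHP the subject string is the last argument rather than the object the method is called on, and preg_replace() replaces every match by default, so it has no "g" flag. The function names `bbcodeToHtml` and `splitDate` are hypothetical.

```javascript
// Replace: wrap whatever sits between [b] and [/b] in <b> tags.
function bbcodeToHtml(text) {
  // ([\s\S]*?) lazily captures the text between the tags;
  // $1 in the replacement re-inserts that captured text.
  return text.replace(/\[b\]([\s\S]*?)\[\/b\]/g, "<b>$1</b>");
}

// Split: cut a string wherever the separator pattern matches.
function splitDate(date) {
  // [\/.-] matches exactly one slash, dot, or dash.
  return date.split(/[\/.-]/);
}

console.log(bbcodeToHtml("[b]one[/b] and [b]two[/b]"));
console.log(splitDate("12/21/2006"));
```

The lazy `*?` matters here: a greedy `*` would capture everything from the first [b] to the last [/b] as one match.
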
boen_robot, posted December 23, 2006 (edited)

To sum regex all up (correct me if I'm wrong):

() - Group. Contains a regular expression. Useful for applying an operator to several characters at once.
[] - Character class (a set or range of characters). jesh gave the best example.
| - "OR". Matches either of the regular expressions around it.
\ - Escapes the following character. If used before an alphanumeric character, it can indicate a special predefined match.
. - Any single character.
* - Zero or more occurrences of the previous character or group.
? - Zero or one occurrence of the previous character or group.
+ - One or more occurrences of the previous character or group.
^ - The match must begin at the start of the string.
$ - The match must end at the end of the string.
{n} - Exactly "n" occurrences of the previous character or group.
{n,} - "n" or more occurrences of the previous character or group.
{0,n} - Zero up to "n" occurrences of the previous character or group. (The short form {,n} is not portable; many engines, including older PCRE versions, read it as literal text, so write the 0 explicitly.)
{n,m} - At least "n" and at most "m" occurrences of the previous character or group.

As for the "/" character: it's used in Perl and JavaScript to enclose regular expressions. Why? I have no idea; I'm not experienced in either.

(e|p)reg_replace() searches the string in the third argument for matches of the regular expression in the first argument. Each match is replaced with the string in the second argument.

(preg_)?split() turns a string (the second argument) into an array of strings. The separator is given by the regular expression (the first argument). For example, split("/", "12/21/2006") returns an array with the following three members: 12, 21 and 2006. With preg_split() the pattern also needs delimiters, for example preg_split("#/#", "12/21/2006").

Edited December 23, 2006 by boen_robot

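The summary above can be checked with a small table of test cases. This is only a sketch in JavaScript, with patterns and inputs chosen for illustration; each pattern is wrapped in ^ and $ so that the operator being demonstrated has to account for the whole input.

```javascript
// Each entry: [pattern, input, expected result of pattern.test(input)].
const cases = [
  [/^a*$/, "", true],            // *     : zero or more
  [/^a+$/, "", false],           // +     : one or more, so empty fails
  [/^colou?r$/, "color", true],  // ?     : the "u" is optional
  [/^a{3}$/, "aaa", true],       // {n}   : exactly three
  [/^a{2,}$/, "a", false],       // {n,}  : at least two
  [/^a{0,2}$/, "aaa", false],    // {0,n} : at most two
  [/^a{2,3}$/, "aa", true],      // {n,m} : between two and three
  [/^(ab)+$/, "ababab", true],   // ()    : the + applies to the whole group
  [/^(cat|dog)$/, "dog", true],  // |     : either alternative
  [/^.$/, "x", true],            // .     : any single character
];

// Run every pattern against its input and collect the actual results.
const results = cases.map(([pattern, input]) => pattern.test(input));

cases.forEach(([pattern, input, expected], i) => {
  console.log(String(pattern), JSON.stringify(input), results[i] === expected);
});
```
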
Sniffy, posted December 23, 2006

Thanks. Just one thing: I don't really understand how to use the parameters of preg_replace(), so an example would be great.

And as for preg_split(), is it a more advanced version of string.split(), or whatever the equivalent is in PHP?

Now, could someone give me some examples of uses for regular expressions and explain why I should use them?

specialguy, posted December 23, 2006

You asked for an example of a little script, so here it is: a script that tells you whether an email address is valid or not.

    <p>
    <?php
    if (isset($_POST['mail'])) {
        // Escape the submitted value before echoing it back into the page.
        $mail = htmlspecialchars($_POST['mail']);
        // Lowercase only: local part, "@", domain, ".", 2-4 letter extension.
        if (preg_match("#^[a-z0-9._-]+@[a-z0-9._-]{2,}\.[a-z]{2,4}$#", $_POST['mail'])) {
            echo 'The email ' . $mail . ' is <strong>valid</strong>!';
        } else {
            echo 'The email ' . $mail . ' isn\'t valid, try again!';
        }
    }
    ?>
    </p>
    <form method="post">
    <p>
        <label for="mail">Your email?</label>
        <input id="mail" name="mail" /><br />
        <input type="submit" value="Try the email" />
    </p>
    </form>

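For comparison, the same pattern can be tried in JavaScript, where the "#" delimiters become the "/" of a regex literal. This is a sketch only: the names `EMAIL_PATTERN` and `looksLikeEmail` are hypothetical, and the pattern is a loose shape check rather than a complete address validator (it rejects upper-case letters and extensions longer than four letters, for instance).

```javascript
// Same pattern as the PHP script, written as a JavaScript regex literal.
const EMAIL_PATTERN = /^[a-z0-9._-]+@[a-z0-9._-]{2,}\.[a-z]{2,4}$/;

// Returns true when the whole input has the shape local@domain.ext.
function looksLikeEmail(input) {
  return EMAIL_PATTERN.test(input);
}

const samples = [
  "sniffy@example.com",       // plain lower-case address
  "a.b-c_d@mail.example.org", // dots, dashes and underscores are allowed
  "Sniffy@example.com",       // upper-case "S" is not in [a-z0-9._-]
  "sniffy@example",           // no ".ext" part at the end
];

for (const sample of samples) {
  console.log(sample, looksLikeEmail(sample));
}
```

Adding the "i" flag (`/.../i`) would make the check case-insensitive, if upper-case addresses should pass.
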
Sniffy, posted December 23, 2006

Thanks, that's great. Now, if you don't mind, let me try to work out what this means:

#^[a-z0-9._-]+@[a-z0-9._-]{2,}\.[a-z]{2,4}$#

# - The starting and ending delimiter.
^ - What the string begins with.
$ - What the string ends with, which means that the whole string has to be described by this regex.
[a-z0-9._-]+ - The string has any alphanumeric character, a single _ or - once in a row, and these characters appear one or more times. What I don't understand, though, is that some emails don't have _ or - in them, but you're requiring that they must have them, or are you?
{2,} - You're saying that there are at least 2 alphanumeric, ., _ or - characters in the domain name.
\. - The \ means the following character is not a special character in the expression, but part of the string.
[a-z]{2,4} - The domain extension is between 2 and 4 characters.

Just answer my question about the ._- ; everything else is fine.

boen_robot, posted December 23, 2006

An example of replace(), eh? OK. The following example outputs the string with every digit replaced by a hash sign:

    <?php
    // ereg_replace() was removed in PHP 7; the PCRE equivalent is
    // preg_replace("/[0-9]/", "#", $string).
    print ereg_replace("[0-9]", "#", "There are 6 000 000 000 people in the world. Bill Gates has $50 000 000 000, while my personal savings are only about $150.");
    ?>

The output of the above will be:

    There are # ### ### ### people in the world. Bill Gates has $## ### ### ###, while my personal savings are only about $###.

Sniffy, posted December 23, 2006

That's simple, yet cool. :)

Now just to answer my other questions...

boen_robot, posted December 23, 2006

Oh. You mean the "._-" in the range selection? Those are just additional characters that belong to the set. Inside the square brackets most metacharacters lose their special meaning, so the dot is a literal dot. The ones that stay special are "-" (between two characters it forms a range), "^" (at the very start it negates the set), "]" and "\". But since the "-" is placed last here, it doesn't do anything other than match a literal dash.

So the range [a-z0-9._-] includes lowercase letters from "a" to "z", digits from "0" to "9", a dot, an underscore and a dash.

Sniffy, posted December 23, 2006

But it says one or more times, with the +. Also, doesn't . mean any single character? Followed by a single underscore or dash?

EDIT: Or could [._-] mean a single dot and a dash range to nothing?

boen_robot, posted December 23, 2006

Quoting Sniffy: "But it says one or more times, with the +. Also, doesn't . mean any single character? Followed by a single underscore or dash? EDIT: Or could [._-] mean a single dot and a dash range to nothing?"

The "+" is applied to the complete range (as it would be to a group). So the "+" in the above states "one or more occurrences of any alphanumeric character, a dot, an underscore or a dash".

[._-] means "a dot, an underscore or a dash". Nothing more and nothing less.

Sniffy, posted December 23, 2006

Okay. :)

So when those characters are put inside brackets in such a way, they lose their special meaning in the regex.

And when characters are grouped, it just means one or more occurrences of that group?

boen_robot, posted December 23, 2006

Quoting Sniffy: "And when characters are grouped, it just means one or more occurrences of that group?"

To be exact, when characters are grouped, the rest of the regular expression treats them the way it treats a single character. So [._-]* is zero or more occurrences of any character in the range, [._-]? is zero or one occurrence of a character in the range, and so on.

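The behaviour of a bracketed set with and without a quantifier can be checked with a short JavaScript sketch. The constant names below are made up for illustration; the patterns are the ones discussed in the posts above, anchored with ^ and $ so the whole input has to match.

```javascript
const oneOfSet = /^[._-]$/;          // exactly one dot, underscore or dash
const zeroOrMore = /^[._-]*$/;       // any number of them, including none
const zeroOrOne = /^[._-]?$/;        // at most one of them
const localPart = /^[a-z0-9._-]+$/;  // one or more characters from the set

// Inside brackets the dot is a literal dot, not "any character".
console.log(oneOfSet.test("."), oneOfSet.test("a"));

// The quantifier applies to the whole set, not to its last character.
console.log(zeroOrMore.test(""), zeroOrMore.test("._-"));
console.log(zeroOrOne.test(""), zeroOrOne.test("__"));

// "sniffy" passes without any dot, underscore or dash: the set lists
// what is allowed, not what is required.
console.log(localPart.test("sniffy"), localPart.test(""));

// A dash between two characters is a range; a dash at the end is literal.
console.log(/^[a-z]$/.test("m"), /^[az-]$/.test("m"), /^[az-]$/.test("-"));
```
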
Sniffy, posted December 23, 2006

Okay, now just to go over some more uses of regular expressions.

jesh, posted December 27, 2006

Quoting Sniffy: "Okay, now just to go over some more uses of regular expressions."

They're useful any time you need to look in a string for a particular pattern.

Most often, they're used to validate user input (like in specialguy's example). You can use them to make sure a user typed in a date in the correct format, or an email, or a phone number, or a UPC off of a product, or a zip/postal code.

But regular expressions are a powerful tool and can be used for many other purposes.

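A few of the validation cases jesh lists could be sketched in JavaScript as below. The exact formats (MM/DD/YYYY dates, US ZIP codes, 12-digit UPC-A codes) are assumptions picked for illustration, and the `validators` object and `isValid` function are hypothetical names. The patterns only check the shape of the input, not whether the date or code really exists.

```javascript
// Hypothetical formats chosen for illustration only.
const validators = {
  // MM/DD/YYYY: two digits, slash, two digits, slash, four digits.
  date: /^\d{2}\/\d{2}\/\d{4}$/,
  // US ZIP code: five digits, optionally followed by a dash and four more.
  zip: /^\d{5}(-\d{4})?$/,
  // UPC-A: exactly twelve digits (the check digit is not verified).
  upc: /^\d{12}$/,
};

// Looks up the pattern for the given kind and tests the input against it.
function isValid(kind, input) {
  if (!Object.prototype.hasOwnProperty.call(validators, kind)) {
    throw new Error(`Unknown validator: ${kind}`);
  }
  return validators[kind].test(input);
}

console.log(isValid("date", "12/21/2006"));
console.log(isValid("zip", "90210-1234"));
console.log(isValid("upc", "036000291452"));
```

Keeping the patterns in one object makes it easy to add another format later without touching the checking function.
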
Sniffy, posted December 30, 2006

Give me some examples of other uses, please. :) Thanks for the answer as well, though.

boen_robot, posted December 31, 2006

Look at Regexlib for some examples of regular expressions. Use cases for them, you ask? Start building ANY application that interacts with a user (say, a simple form that returns something on certain conditions) and you'll immediately see at least one use for regular expressions.

Sniffy, posted January 1, 2007

Whoa, Regexlib might come in handy, thanks.