
PHP Regular Expressions Coverage - Please clear my mind of wonders


Sniffy


I really want to get my head around regular expressions, but there's so much to know. I read about all the basic characters/sequences in THIS tutorial, but there are still a lot of things that are unclear to me.

First of all, what are the starting/ending delimiters and when/why would you use them? Also, what is a delimiter?

Second, what are the parentheses used for? As in do(ugh)nut? For some reason it looks like it means "maybe".

Any other useful information would be great, like how to use regular expressions, how ereg() can be used in a script, and a detailed explanation of the functions. justsomeguy already gave me the php.net regular expression reference guide.

Also, after you answer my questions, if you don't mind, please post some basic scripts using valid patterns. Hopefully by the end of this the matter will be cleared up.

EDIT
I've just cleared up most of the matter on the delimiter. Correct me if I'm wrong. A delimiter is something that encloses the special characters in the regular expression. I could declare the string "^Sniffy$" and it would display ^Sniffy$. A delimiter would say that it's special text for a regular expression. A delimiter could be anything non-alphanumeric that's not used in the expression, most commonly a backslash. But I believe I saw stuff like /\^Sniffy\\ and I don't know if that's just me imagining things or if a backslash can be used along with a forward slash to have a meaning.

EDIT
I just read about the meanings of preg() and ereg(). Please say if I'm correct.
preg - For Perl Compatible regular expressions.
ereg - PHP's Standard Regular Expressions.
Note: I haven't really gotten into the differences between the two, but that's what they're for, I think. ereg would be processed quicker because there are fewer things to load; I saw something in one of the topics yesterday about which type is faster, and that's my theory.

EDIT
I just saw another piece of code on the forum with /\ in it, but I was mistaken: it's actually just a / followed by a character, and which character it was is what made me even more confused. It was posted by jesh and I hope he doesn't mind that I question it. It's JavaScript, but it probably follows about the same rules.

// replace [b] with <b>
content = content.replace(/\[b\]/g, "<b>");
// replace [/b] with </b>
content = content.replace(/\[\/b\]/g, "</b>");

Why would [b] be written as \[b\]?


The parentheses are for grouping. If your regex was "do(ugh)nut" and you ran a match against some string ("I like doughnuts!"), match $0 would be "doughnut" and match $1 would be "ugh". If you wanted to make "ugh" optional, it'd look something more like "do(ugh)?nut" or maybe even like "do(?:ugh)?nut" (the ?: makes the group non-capturing).

A regular expression of "^Sniffy$" would only match "Sniffy". It would not match any of the following:
"Sniffy posted this."
"I heard Sniffy say..."
"sniffy"
The ^, in this context, means only match at the start of the string and the $, in this context, means only match at the end of the string.

The // is how you declare a regular expression in JavaScript. /test/ is a valid declaration for a regular expression that matches "test". Throwing a "g" on the end of that makes the regular expression global, meaning it will attempt to make more than a single match: /test/g

In the post that you mention, I wrote: /\[b\]/g
The "\", in this context, is an escape character. It makes the regular expression engine treat the character following it as a regular character instead of something special. It's similar to escaping double quotes in strings: "This is a \"string\"". In regular expressions, the [] indicates a set of possible characters. For example, [a-zA-Z] will match any upper- or lower-case letter. [aeiou] will only match vowels. So, if you need to actually match the "[" character, you need to escape it: "\[".

I hope this helps a bit.
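Here is a small sketch of the same points in PHP rather than JavaScript. It assumes the preg_* functions with "/" as the delimiter; the variable names are made up for illustration.

```php
<?php
// Capturing group: $m[0] is the whole match, $m[1] is the first group.
$subject = "I like doughnuts!";
preg_match('/do(ugh)nut/', $subject, $m);
// $m is now ['doughnut', 'ugh']

// Optional, non-capturing group: accepts both spellings.
$both = [];
foreach (['donut', 'doughnut'] as $word) {
    $both[$word] = preg_match('/^do(?:ugh)?nut$/', $word);
}

// Anchors: ^Sniffy$ only matches the exact string "Sniffy".
$anchored = [];
foreach (['Sniffy', 'Sniffy posted this.', 'sniffy'] as $s) {
    $anchored[$s] = preg_match('/^Sniffy$/', $s);
}

// Escaping: \[ and \] match literal square brackets.
$html = preg_replace('/\[b\]/', '<b>', '[b]bold');
```

preg_match() returns 1 when the pattern matches and 0 when it doesn't, so $anchored ends up with 1 for "Sniffy" and 0 for the other two strings.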


Wait, actually it would be a valid pattern in both PHP and JS, because the delimiter is not part of the pattern.

Okay, now I'd like detailed coverage and examples of:
preg_replace()
preg_split()
I don't really understand the php.net coverage because it's a reference, not a tutorial.


To sum regex all up (correct me if I'm wrong):
() - Group. Contains a regular expression. Useful for applying a quantifier or alternation over several characters.
[] - Character class (range). jesh gave the best example.
| - "OR". Matches either of the regular expressions around it.
\ - Escapes the following character. Before an alphanumeric character it can instead indicate a special predefined match (for example, \d for a digit).
. - Any character.
* - Zero or more occurrences of the previous character (or group).
? - Zero or one occurrence of the previous character (or group).
+ - One or more occurrences of the previous character (or group).
^ - The string begins with the following characters.
$ - The string ends with the previous characters.
{n} - Exactly "n" occurrences of the previous character (or group).
{n,} - "n" or more occurrences of the previous character (or group).
{0,n} - Zero up to "n" occurrences of the previous character (or group). Write the 0 explicitly: many PCRE versions treat {,n} as literal text, not as a quantifier.
{n,m} - Minimum "n" and up to "m" occurrences of the previous character (or group).

As for the "/" character, it's used in Perl and JavaScript to enclose regular expressions. Why? I have no idea. I'm not experienced in either.

(e|p)reg_replace() searches the third argument against a regular expression in the first argument. Each match is replaced with the string in the second argument.

(preg_)?split()... I'm not sure, but I think it turns a string (the second argument) into an array. Each array member is a string. The separator is selected with the regular expression (the first argument). For example, split("/", "12/21/2006"), or preg_split("#/#", "12/21/2006") with delimiters around the pattern, will return an array with the following three members: 12, 21 and 2006.
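A short sketch of the argument order for preg_split() and preg_replace(), plus a few of the symbols from the summary. The sample strings and variable names are only assumptions for illustration. "#" is used as the delimiter in the date patterns so the slash doesn't need escaping.

```php
<?php
// preg_split(pattern, subject): split a date on "/".
$parts = preg_split('#/#', '12/21/2006');
// $parts is ['12', '21', '2006']

// preg_replace(pattern, replacement, subject).
// $1, $2 and $3 in the replacement refer to the captured groups.
$iso = preg_replace('#^(\d{2})/(\d{2})/(\d{4})$#', '$3-$1-$2', '12/21/2006');
// $iso is '2006-12-21'

// A few of the symbols from the summary.
$isColor = preg_match('/^colou?r$/', 'colour');    // ? = zero or one "u"
$isPet   = preg_match('/^(cat|dog)s?$/', 'dogs');  // | = either alternative
$isYear  = preg_match('/^[0-9]{4}$/', '2006');     // {4} = exactly four digits
```

The single quotes around the patterns and the replacement matter here: inside double quotes PHP would try to interpret some of the backslashes and dollar signs itself.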

Edited by boen_robot

Thanks, just one thing: I don't really understand how to use the parameters in preg_replace(), so if I could have an example that would be great.

And as for preg_split(), is it a more advanced version of string.split() or whatever it is in PHP?

Now, if someone could give me some examples of the uses of regular expressions and explain why I should use them.


You asked for an example of a little script, so here it is: a script that'll tell you if an email is valid or not.

<p>
<?php
if (isset($_POST['mail']))
{
	// Escape the user's input before echoing it back into the page.
	$mail = htmlspecialchars($_POST['mail']);
	if (preg_match("#^[a-z0-9._-]+@[a-z0-9._-]{2,}\.[a-z]{2,4}$#", $_POST['mail']))
	{
		echo 'The email ' . $mail . ' is <strong>valid</strong>!';
	}
	else
	{
		echo 'The email ' . $mail . ' isn\'t valid, try again!';
	}
}
?>
</p>
<form method="post">
<p>
	<label for="mail">Your email?</label> <input id="mail" name="mail" /><br />
	<input type="submit" value="Try the email" />
</p>
</form>


Thanks, that's great. Now, if you don't mind, let me try to work out what this means:

#^[a-z0-9._-]+@[a-z0-9._-]{2,}\.[a-z]{2,4}$#

# - Is the starting and ending delimiter.
^ - Is what the string begins with.
$ - Is what the string ends with, which means that all of the string is expressed in this regex.
[a-z0-9._-]+ - The string has any alphanumeric character, a single _ or - once in a row, and these characters are in the expression one or more times. What I don't understand, though, is that some emails don't have _ or - in them, but you're requesting that they must have it, or are you?
{2,} - You're saying that there are at least 2 alphanumeric or ._- characters in the domain name.
\. - \ means the following character is not a special character in the expression, but part of the string.
[a-z]{2,4} - The domain extension is between 2 and 4 characters.

Just answer my question about the ._- ; everything else is fine.


Example of replace(), heh? OK. The following example will output the string with all numbers replaced by a sharp sign:

<?php
print ereg_replace("[0-9]", "#", "There are 6 000 000 000 people in the world. Bill Gates has $50 000 000 000, while my personal savings are only about $150.");
?>

The output of the above will be:

There are # ### ### ### people in the world. Bill Gates has $## ### ### ###, while my personal savings are only about $###.
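For reference, ereg_replace() was removed in PHP 7.0, so on a current PHP the same replacement would probably be written with preg_replace(). A minimal sketch, assuming "/" as the delimiter; $text and $masked are just illustrative names:

```php
<?php
// Single quotes keep PHP from interpreting the "$" signs in the text.
$text = 'There are 6 000 000 000 people in the world. Bill Gates has $50 000 000 000, '
      . 'while my personal savings are only about $150.';

// Same character class as before, now wrapped in "/" delimiters.
$masked = preg_replace('/[0-9]/', '#', $text);

print $masked;
```

The only real difference from the ereg version is the pair of delimiters around the pattern.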


Oh. You mean the "._-" in the range selection? As far as I'm familiar with it, those are just additional characters that fall into this range. Inside the square brackets most metacharacters lose their special meaning, so the dot is just a literal dot there. The "-" is still special because it normally defines a range like a-z, but since it's placed last, it really doesn't do anything other than select a dash. So, the range

[a-z0-9._-]

Includes lowercase letters from "a" to "z", numbers from "0" to "9", dot, underscore and a dash.


But it says one or more times, with the +. Also, doesn't . mean a single character? Followed by a single dot or dash?

EDIT: [._-] could mean a single dot and a dash to nothing?

The "+" is applied over the complete range (as with a group). So the "+" in the above states "one or more occurrences of any alphanumeric character, a dot, an underscore or a dash".

[._-] means "a dot, an underscore or a dash". Nothing more and nothing less.

And when characters are grouped, it just means one or more occurrences of that group?

To be exact, when characters are grouped, the rest of the regular expression treats them like it treats a single character. So

[._-]*

is zero or more occurrences of the characters in the range, and

[._-]?

is zero or one occurrence of the characters in the range, etc.
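To see this in action, here is a rough sketch that runs the three quantifiers against a few test strings. The test strings and array keys are made up for illustration; the ^ and $ anchors are added so the whole string has to fit.

```php
<?php
// Each quantifier applies to the whole class [._-], that is, to
// "one character that is a dot, an underscore or a dash".
$tests = ['', '-', '._-'];

$results = [];
foreach ($tests as $s) {
    $results[$s] = [
        'star'     => preg_match('/^[._-]*$/', $s), // zero or more
        'question' => preg_match('/^[._-]?$/', $s), // zero or one
        'plus'     => preg_match('/^[._-]+$/', $s), // one or more
    ];
}

// Inside a class the dot is literal, so [.] does not match the letter "a".
$dotIsLiteral = preg_match('/^[.]$/', 'a') === 0;
```

The empty string passes * and ? but fails +, while "._-" passes * and + but fails ? because it has more than one character.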


Okay, now just to go over some more uses of regular expressions.

They're useful any time you need to look in a string for a particular pattern. Most often, they're used to validate user input (like in specialguy's example). You can use them to make sure a user typed in a date in a correct format, or an email, or a phone number, or a UPC off of a product, or a zip/postal code.

But they're a powerful tool and can be used for many other purposes.
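As a rough illustration of that kind of validation, here is a sketch with two checks. The function names and the exact patterns are assumptions made up for this example, not something from the posts above, and real-world formats may need looser or stricter rules.

```php
<?php
// Hypothetical helper names; the patterns are illustrative only.

// US ZIP code: five digits, optionally followed by "-" and four digits.
function looksLikeZip(string $input): bool
{
    return preg_match('/^\d{5}(?:-\d{4})?$/', $input) === 1;
}

// Date written as MM/DD/YYYY. This checks the shape only,
// not whether the day actually exists in that month.
function looksLikeDate(string $input): bool
{
    return preg_match('#^(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}$#', $input) === 1;
}
```

Something like looksLikeDate($_POST['date']) could then gate a form the same way the email script does, with a calendar check (for example PHP's checkdate()) on top if the date has to be real.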

Look at Regexlib for some examples of regular expressions. Use cases for them, you ask? Start building ANY application that interacts with a user (say, a simple form that returns something on certain conditions) and you'll immediately see at least one use for regular expressions.

