
Reg Expression Syntax


son


I really struggle with the regular expressions. I want to allow all alpha, number, spaces and ;:-' and used

if (!eregi ('^[[:alnum:][:punct:][:space:]]{2,40}$', stripslashes(trim($_POST['heading'])))) {

to find that this does not include the £ symbol. I tried to add:

if (!eregi ('^[[:alnum:][:punct:][:space:]\£]{2,40}$', stripslashes(trim($_POST['heading'])))) {

without luck.

How would I best achieve my objective?

Thanks,
Son


How about:

if (!eregi ('^([[:alnum:][:punct:][:space:]]|\£){2,40}$', stripslashes(trim($_POST['heading'])))) {

?


Tried this before without luck. I was also wondering if there is a more general way of allowing all characters that display OK in the browser and rejecting all those that cause problems (the UTF-8 byte sequences \xE2\x80\x99, \xE2\x80\x98, \xE2\x80\x9C, \xE2\x80\x9D and \xE2\x80\x93, for example, display as the square question mark instead of the character)? I tried to use the regular expression to achieve exactly that...

Son
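
For the Office-style punctuation listed above (curly quotes and the en dash), one option is to normalise it before validating instead of rejecting it. A minimal sketch, assuming the input is UTF-8; normalizeOfficePunctuation() is a made-up helper name, not something from this thread:

```php
<?php
// Hypothetical helper: maps common "smart" punctuation pasted from Office
// applications to plain ASCII equivalents. Assumes $text is UTF-8.
function normalizeOfficePunctuation(string $text): string
{
    $map = [
        "\xE2\x80\x98" => "'", // U+2018 left single quotation mark
        "\xE2\x80\x99" => "'", // U+2019 right single quotation mark
        "\xE2\x80\x9C" => '"', // U+201C left double quotation mark
        "\xE2\x80\x9D" => '"', // U+201D right double quotation mark
        "\xE2\x80\x93" => '-', // U+2013 en dash
    ];

    // strtr() with an array replaces each key by its value in a single pass.
    return strtr($text, $map);
}
```

Running the heading through this first means the later pattern check only has to deal with plain quotes and hyphens. Whether replacing is acceptable (rather than storing the original characters) depends on the site.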

Make sure you use UTF-8 everywhere. Otherwise, there will be differences from one encoding to the next, and the only sure way would be to turn the sign into its entity code.

I believe in regex you can use \xhh to match the character with hex code "hh", e.g.

if (!eregi ('^([[:alnum:][:punct:][:space:]]|\xA3){2,40}$', stripslashes(trim($_POST['heading'])))) {

(I'm not completely sure if A3 is the hex code for a pound... I looked at the Windows character map app)
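
A possible reason \xA3 makes no difference: A3 is the pound sign in Latin-1, but in UTF-8 it is stored as the two bytes C2 A3, and the ereg functions are not multibyte-aware (they were also deprecated in PHP 5.3 and removed in PHP 7). A sketch of the same check with preg_match() and the /u modifier, assuming the input really is UTF-8; isValidHeading() is a hypothetical helper name:

```php
<?php
// Hypothetical helper, not from the thread. Assumes $heading is UTF-8.
function isValidHeading(string $heading): bool
{
    // \p{L} letters, \p{N} digits, \p{P} punctuation, \s whitespace,
    // \x{A3} the pound sign (U+00A3). With /u the pattern and subject are
    // treated as UTF-8, so {2,40} counts characters rather than bytes.
    // \z anchors at the very end of the string.
    return preg_match('/^[\p{L}\p{N}\p{P}\s\x{A3}]{2,40}\z/u', $heading) === 1;
}
```

It could be called as isValidHeading(trim($_POST['heading'])). Note that \p{P} is narrower than [:punct:]: symbols such as < > + = are not punctuation in Unicode terms, so they are rejected here unless added to the class.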


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

so this side of things should be fine. I use the regular expressions to keep out characters which cause display issues. Is there another way to achieve this? I mean, there are professional CMS systems, and I wondered how they are programmed so that this does not happen. Their corporate users surely do a lot of copy/paste from Office applications (as my friend does)?

Son

Also, just to say: the DB table is set to the latin1_swedish_ci collation. Not sure if that would create the issues I am getting...


I said "everywhere", so yeah. If the string you're testing comes from a DB which is not in UTF-8, it's possible that the problem is there.

In addition, having a charset meta doesn't guarantee you get UTF-8 in the end. You should explicitly set it as a header, like so:

<?php header('Content-Type: text/html; charset=utf-8'); ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
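
To see whether a particular string (from the form or from the DB) really arrives as UTF-8, a quick check along these lines may help. This is only a sketch; isUtf8() is a hypothetical helper name. It relies on preg_match() with the /u modifier returning false when the subject is not valid UTF-8:

```php
<?php
// Hypothetical diagnostic helper, not from the thread.
function isUtf8(string $value): bool
{
    // With /u, PCRE validates the subject as UTF-8 before matching.
    // The empty pattern matches any valid string, so the result is 1 for
    // valid UTF-8 and false for invalid byte sequences.
    return preg_match('//u', $value) === 1;
}
```

For example, a pound sign encoded as UTF-8 ("\xC2\xA3") passes, while the single Latin-1 byte "\xA3" does not, which is one way to spot a value that was stored or transferred in a different encoding.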


The DB itself is UTF-8; I only meant the collation. In addition, I have <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> on all pages...???

Son

