
Using regex to parse text as links...


iyeru42


I need a regex that checks whether a URL is already inside BBCode and, if it is, doesn't parse it. It should also end the link at a ) or . or ; or ], and it shouldn't parse the URL if there's a URL= attached to the front of http:// (i.e. URL=http://).

Basically, I want http://www.this.com to parse the way IPB etc. does automatically. Of course, not all URLs will have www. in front.

I would still rather have JUST the regex; I don't need a whole extra extension. I already have BBCode functions, and they work fine. Now I need to auto-parse links that people have not wrapped in BBCode.

I need it to check a variable, $iAutoParse, and see if it's true (I already have this variable defined), then search the post itself, $strPostBody, for any http:// and auto-parse them with <a href="$link[]">$link[]</a> or something.


I was wrong anyway; see my late edit.

This is a pretty complex project... I think it would be easier to install the extension if you have the necessary authority. Here's my first attempt:

((?:http|ftp|https)://(?:\w+\.)+\w{2,3}\.?(?:/[\w.])*/?)(?=\]|\)|\s)

(?:http|ftp|https):// - a mandatory protocol, can be any of those listed
(?:\w+\.)+ - one or more domains before the TLD (com, org, etc.)
\w{2,3}\.? - the TLD itself followed by an optional dot (a more formal way of writing the domain)
(?:/[\w.])*/? - zero or more path elements followed by an optional slash
EDIT: (?=\]|\)|\s) - a lookahead that ensures that either a square bracket, a round bracket, or a space follows the URL

Note that dots and semicolons are part of a URL's syntax and cannot be excluded except in the special case where a space follows them (when a semicolon wouldn't mean anything and a dot would hopefully not be part of the actual path).

EDIT2: Rather than excluding the dot and semicolon, I've now just excluded the space.
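
To see what the pieces described above do together, here is a small sketch that runs the pattern over a sample sentence with preg_match_all(). The "~" delimiters and the sample text are assumptions added for illustration; they are not part of the pattern as posted.

```php
<?php
// The pattern from the post, wrapped in "~" delimiters (an assumption; any
// delimiter that does not occur inside the pattern would do).
$pattern = '~((?:http|ftp|https)://(?:\w+\.)+\w{2,3}\.?(?:/[\w.])*/?)(?=\]|\)|\s)~';

// Hypothetical sample text.
$text = 'See http://www.this.com and (ftp://files.example.org) for details.';

preg_match_all($pattern, $text, $matches);

// $matches[1] holds capture group 1, the URL itself, for every match.
print_r($matches[1]);
```

With this input the two URLs are captured without the surrounding space or bracket. Note that, as written, (?:/[\w.])* accepts only a single character after each slash, so a URL with a longer path such as http://www.this.com/page is not matched at all, and neither is a URL that sits at the very end of the text, because the lookahead requires a following character.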


So, how do I put this into PHP? Do I use ereg_replace? No, I would use preg_replace if I used this. Example:

<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?>

But I don't know how to incorporate your match.


Yeah, it would be something like

$post = preg_replace($regex, '<a href="$1">$1</a>', $post);

Note that I've edited the pattern. I can't figure out how to make sure there's no BBCode tag before the URL. (I could figure it out if the tag had to be directly before the URL, but that's not a real requirement.)
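
For the simpler case mentioned here, where the tag sits directly before the URL, a negative lookbehind is one option. The sketch below assumes PHP 7 or later and BBCode of the form [url]http://...[/url] or [url=http://...]text[/url]. The function name linkBareUrls(), the "~" delimiters, and the simplified URL body [^\s\[\]()]+ are illustrative assumptions, not the pattern from this thread.

```php
<?php
// Hypothetical helper: link bare URLs, but skip any URL that is immediately
// preceded by "]" (as in [url]http://...) or "=" (as in [url=http://...]).
function linkBareUrls(string $post): string
{
    // (?<![=\]])    negative lookbehind: previous character is not "=" or "]"
    // \b            the protocol must start at a word boundary
    // [^\s\[\]()]+  simplified URL body (an assumption, not the thread's pattern)
    $pattern = '~(?<![=\]])\b((?:https?|ftp)://[^\s\[\]()]+)~i';

    $result = preg_replace($pattern, '<a href="$1">$1</a>', $post);

    // preg_replace() returns null on error; fall back to the original text.
    return $result ?? $post;
}

$post = 'Plain http://www.this.com here, [url]http://a.example.org[/url] '
      . 'and [url=http://b.example.org]named[/url].';

echo linkBareUrls($post), "\n";
```

Only the first URL is wrapped in a link; the two inside BBCode are left for the existing BBCode functions. This does nothing for a tag that appears further back in the text, which is the harder case described above.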

Thanks, but now I have a new problem. It seems autoparse is coming back as null, which it isn't; it's 1.

Also, your regex makes my whole posts go blank.

// Convert links to clickable links.
function clickableLinks($post)
{
   // Regex Match
   $regex = "((?:http|ftp|https)://(?:\w+\.)+\w+\.?(?:/[\w.])*/?)(?!\]|\)|[\.;] )";

   // Replace foo'
   $post = preg_replace($regex, '<a href="$1">$1</a>', $post);

   return $post;
}

Using this instead:

$text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
    '<a href="\\1">\\1</a>', $text);
$text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
    '\\1<a href="http://\\2">\\2</a>', $text);
$text = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})',
    '<a href="mailto:\\1">\\1</a>', $text);

and editing appropriately makes it work.
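
Since eregi_replace() was deprecated in PHP 5.3 and removed in PHP 7.0, here is a rough sketch of the same three replacements using preg_replace(), assuming PHP 7 or later. The function name autoLink() and the "!" delimiter are assumptions, and the character classes are lightly tidied (duplicate slash dropped, the dot in www. escaped); treat it as a starting point rather than a drop-in replacement.

```php
<?php
// Hypothetical preg_replace() port of the three eregi_replace() calls.
// "!" is used as the delimiter because it does not occur in the patterns;
// the trailing "i" replaces the case-insensitivity of eregi_replace().
function autoLink(string $text): string
{
    // http:// and ftp:// URLs (https:// is not covered, as in the original).
    $text = preg_replace('!(((f|ht)tp://)[-a-zA-Z0-9@:%_+.~#?&/=]+)!i',
        '<a href="$1">$1</a>', $text);

    // Bare www. addresses after whitespace or a bracket character.
    $text = preg_replace('!([\s()\[{}])(www\.[-a-zA-Z0-9@:%_+.~#?&/=]+)!i',
        '$1<a href="http://$2">$2</a>', $text);

    // E-mail addresses.
    $text = preg_replace('!([_.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})!i',
        '<a href="mailto:$1">$1</a>', $text);

    return $text;
}

echo autoLink('Visit http://www.this.com/a?b=1 or www.example.org, mail me@example.org'), "\n";
```

Because "." is part of the URL character class, a full stop directly after a URL ends up inside the link, so the requirement from the first post about ending the link at a dot is not met by these patterns.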


Lol, that's what I get for not testing. If your code works, great - but I still think the extension would be a better idea, and you don't have to install it without help.
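
A likely reason the clickableLinks() function posted earlier blanked the posts: PHP's preg functions expect the pattern to be enclosed in delimiters, and the $regex string there is not, so the call most likely fails with a warning and returns null, which is then assigned back to $post. Below is a minimal sketch of a corrected version, assuming PHP 7 or later. The widened path part [\w.]+, the dropped \w{2,3} limit, the (?<!\.) lookbehind, and the end-of-text alternative in the lookahead are my own assumptions, and the sample values are made up.

```php
<?php
// Hypothetical rewrite of clickableLinks(); assumes PHP 7+.
function clickableLinks(string $post): string
{
    // "~" delimiters avoid having to escape the slashes in "://".
    // The lookbehind (?<!\.) keeps a trailing dot out of the link, and the
    // lookahead requires ")", "]", whitespace, or end of text after the URL,
    // optionally preceded by one "." or ";".
    $regex = '~((?:https?|ftp)://(?:\w+\.)+\w+(?:/[\w.]+)*/?)'
           . '(?<!\.)(?=[.;]?(?:[\])\s]|$))~';

    $result = preg_replace($regex, '<a href="$1">$1</a>', $post);

    // On a pattern error preg_replace() returns null; keep the post intact.
    return $result ?? $post;
}

// Usage, gated on the flag mentioned earlier in the thread.
$iAutoParse  = 1;
$strPostBody = 'Docs: http://www.this.com/docs/page.html. Mirror (ftp://files.example.org).';

if ($iAutoParse) {
    $strPostBody = clickableLinks($strPostBody);
}

echo $strPostBody, "\n";
```

This is deliberately narrow: hosts containing hyphens and URLs with query strings are not matched and are simply left as plain text.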


