iyeru42

Using regex to parse text as links...


I need a regex that checks whether the URL is already inside BBCode and, if it is, doesn't parse it. It should also end the link at a ) or . or ; or ], and it shouldn't parse a URL that has URL= attached to the front of http:// (i.e. URL=http://). Basically, I want http://www.this.com to be parsed automatically, the way IPB etc. does it. Of course, not all URLs will have www. in front.


I don't manage my php installation, so I can't. I'm also no good at "compiling" either, just like on Linux.


EDIT: *removes foot from mouth* I thought it was loaded by default, but I was wrong. Do you host your own site or pay someone else to?


I would still rather have JUST the regex; I don't need a whole extra extension. I already have BBCode functions, and they work fine. Now I need to auto-parse links that people have posted without BBCode. I need it to check a variable, $iAutoParse, and see if it's true (I already have this variable defined), then search the post itself, $strPostBody, for any http:// and auto-parse them with <a href="$link[]">$link[]</a> or something.
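The flow described above (check the flag, then rewrite the post body) could be sketched roughly as follows. $iAutoParse and $strPostBody are the names from the post; their values here are made up, and the pattern is a deliberately simplified assumption that does not yet handle BBCode or trailing punctuation.

```php
<?php
// Variable names come from the post; the values are invented for illustration.
$iAutoParse  = true;
$strPostBody = 'Docs are at http://www.this.com/help today.';

if ($iAutoParse) {
    // Simplified, assumed pattern: http:// or https:// followed by any run of
    // characters that are not whitespace, angle brackets, or a double quote.
    $strPostBody = preg_replace(
        '~(https?://[^\s<>"]+)~i',
        '<a href="$1">$1</a>',
        $strPostBody
    );
}

echo $strPostBody;
```

With the flag set to false, the post body is left untouched, so the auto-parse step stays optional per post.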


I was wrong anyway; see my late edit. This is a pretty complex project... I think it would be easier to install the extension if you have the necessary authority. Here's my first attempt:

((?:http|ftp|https)://(?:\w+\.)+\w{2,3}\.?(?:/[\w.])*/?)(?=\]|\)|\s)

(?:http|ftp|https):// - a mandatory protocol, which can be any of those listed
(?:\w+\.)+ - one or more domains before the TLD (com, org, etc.)
\w{2,3}\.? - the TLD itself, followed by an optional dot (a more formal way of writing the domain)
(?:/[\w.])*/? - zero or more path elements followed by an optional slash
EDIT: (?=\]|\)|\s) - a lookahead that ensures that either a square bracket, a round bracket, or a space follows the URL

Note that dots and semicolons are part of a URL's syntax and cannot be excluded except in the special case where a space follows them (when a semicolon wouldn't mean anything and a dot would hopefully not be part of the actual path).

EDIT2: Rather than excluding the dot and semicolon, I've now just excluded the space.
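To see how the pieces fit together, here is a minimal sketch that runs the pattern over a made-up sample string. It is not the exact pattern from the post: the `~` delimiters are an arbitrary choice, the path part is assumed to be `[\w.]+` so that a path element can be longer than one character, and `$` is added to the lookahead so a URL at the very end of the text can still match.

```php
<?php
// Sketch only: the posted pattern with three assumed tweaks
// ("~" delimiters, "[\w.]+" for path elements, "$" in the lookahead).
$regex = '~((?:http|ftp|https)://(?:\w+\.)+\w{2,3}\.?(?:/[\w.]+)*/?)(?=\]|\)|\s|$)~';

// Hypothetical sample text.
$sample = 'See http://www.this.com and (https://example.org/docs/index.html) for details.';

// $matches[1] holds the contents of the first capture group for every match.
preg_match_all($regex, $sample, $matches);
print_r($matches[1]);
```

Because of `\w{2,3}`, a longer TLD such as .info would not match; widening that quantifier is another change worth considering.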


So, how do I put this into PHP? Do I use ereg_replace? No, I would use preg_replace if I used this. Example:

<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?>

But I don't know how to incorporate your match.


Yeah, it would be something like

$post = preg_replace($regex, '<a href="$1">$1</a>', $post);

Note that I've edited the pattern. I can't figure out how to make sure there's no BBCode tag before the URL. (I could figure it out if the tag had to be directly before the URL, but that's not a real requirement.)
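One way around the "no tag before the URL" problem, offered as a sketch and not as what either poster ended up using, is to let a single pattern match whole BBCode blocks first and hand them back untouched, so only bare URLs reach the link-building branch. The function name linkBareUrls, the [url] tag syntax, and the simplified URL pattern are all assumptions.

```php
<?php
// Hypothetical helper: links bare URLs, leaves [url]...[/url] blocks alone.
function linkBareUrls(string $text): string
{
    // Alternative 1: a complete [url]...[/url] or [url=...]...[/url] block.
    // Alternative 2: a bare URL, stopping at whitespace, brackets, or quotes.
    $pattern = '~(\[url\b[^\]]*\].*?\[/url\])|((?:https?|ftp)://[^\s\[\]()<>"]+)~is';

    return preg_replace_callback($pattern, function (array $m): string {
        if ($m[1] !== '') {
            return $m[1]; // BBCode block: return it unchanged
        }
        // Treat a trailing "." or ";" as punctuation, not as part of the URL.
        $url  = rtrim($m[2], '.;');
        $tail = substr($m[2], strlen($url));
        return '<a href="' . $url . '">' . $url . '</a>' . $tail;
    }, $text);
}

// Made-up sample text.
$out = linkBareUrls('Plain http://www.this.com. Tagged [url=http://example.org]site[/url] and [url]http://example.net[/url].');
echo $out;
```

The design choice here is that the regex engine consumes the tagged block before it ever sees the URL inside it, which avoids needing a lookbehind of unknown length.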


Thanks, but now I have a new problem. It seems autoparse is being read as null, which it isn't; it's 1. Also, your regex makes my whole posts go blank.

// Convert links to clickable links.
function clickableLinks($post){
   // Regex Match
   $regex = "((?:http|ftp|https)://(?:\w+\.)+\w+\.?(?:/[\w.])*/?)(?!\]|\)|[\.;] )";

   // Replace foo'
   $post = preg_replace($regex, '<a href="$1">$1</a>', $post);

   return $post;
}

Using this instead:

  $text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
    '<a href="\\1">\\1</a>', $text);
  $text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
    '\\1<a href="http://\\2">\\2</a>', $text);
  $text = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})',
    '<a href="mailto:\\1">\\1</a>', $text);

and editing appropriately makes it work.
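A likely reason the posts went blank (an assumption, since no warning output is shown in the thread): preg_replace() expects the pattern to be wrapped in delimiters, and it returns NULL when the pattern cannot be compiled. Without delimiters, PHP takes the first "(" and its matching ")" as the delimiter pair and rejects the rest as unknown modifiers. The sketch below uses a made-up sample string and an arbitrary `~` delimiter to show both cases.

```php
<?php
$post = 'Visit http://www.this.com today.'; // made-up sample

// The pattern from the post, with no delimiters around it.
$broken = '((?:http|ftp|https)://(?:\w+\.)+\w+\.?(?:/[\w.])*/?)(?!\]|\)|[\.;] )';

// Fails to compile, so preg_replace() returns NULL; "@" hides the warning here.
$blank = @preg_replace($broken, '<a href="$1">$1</a>', $post);
var_dump($blank);

// The same pattern wrapped in "~" delimiters compiles and returns a string.
$linked = preg_replace('~' . $broken . '~', '<a href="$1">$1</a>', $post);
var_dump($linked);
```

Assigning that NULL back to $post is what empties it. Note also that eregi_replace() was deprecated in PHP 5.3 and removed in PHP 7, so the preg_ route is the one that keeps working on newer installations.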


Lol, that's what I get for not testing. If your code works, great - but I still think the extension would be a better idea, and you don't have to install it without help.
