Parsing UBB code with PHP

Fmdpa · September 11, 2010

I'm working on a handmade UBB code system. Everything was going smoothly until I got to parsing the img tag. The format I set up was this:

--> <img src="src" alt="alt_text" id="small" class="center" />

The regex I came up with was about twice as long as the email-validation regex, and it didn't even work right. My biggest issue is controlling the greediness of the quantifiers. I couldn't get the preg_replace function to work correctly with multiple UBB img tags in the text. How would you convert it?
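To illustrate the greediness problem described above, here is a minimal sketch. It assumes a simplified tag with a single src attribute (the real tag in this thread has more attributes), and the sample file names are made up for the example. It compares a greedy quantifier, a lazy one, and a negated character class on text containing two tags.

```php
<?php
// Two simplified tags in one string (hypothetical sample input).
$txt = '[img src="a.jpg" /] and [img src="b.jpg" /]';

// Greedy: .+ runs as far as it can, so a single match swallows both tags.
preg_match_all('/\[img src="(.+)" \/\]/', $txt, $greedy);

// Lazy: .+? stops at the first position where the rest of the pattern fits.
preg_match_all('/\[img src="(.+?)" \/\]/', $txt, $lazy);

// Negated class: [^"]+ can never cross a quote, so each tag matches on its own.
preg_match_all('/\[img src="([^"]+)" \/\]/', $txt, $strict);

print_r($greedy[1]); // one capture spanning both tags
print_r($lazy[1]);   // a.jpg, b.jpg
print_r($strict[1]); // a.jpg, b.jpg
```

The negated class is usually the easiest of the three to reason about for quoted attribute values, because it states directly that the value ends at the next quote.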

Synook · September 11, 2010

Have you tried just using the BBCode functions?

Fmdpa · September 11, 2010

No, and although this looks like a potential solution, I don't understand the functions. There are some constants used that are not defined. Could you give some examples?

boen_robot · September 11, 2010

Have you seen the list of defined constants? Is there any one in particular that you see missing from the list, or missing an explanation?

Have you tried the examples provided in each function's manual page? To be more exact, have you tried to install the extension?

Fmdpa · September 11, 2010

Good thing you asked about the extension. No, I didn't install it. I just downloaded it (http://pecl.php.net/package/bbcode), and after some research I found out that I had to compile the DLL by hand from the C source files. Is this true (for Windows)? If so, how do you do it?

I think I'm starting to understand the BBCode functions. How close am I (I want to make all params required)?

$arrayBBCode = array(
    'img' => array(
        'type'      => BBCODE_TYPE_ARG,
        'open_tag'  => '<img src="{PARAM}" alt="{PARAM}" id="{PARAM}" class="{PARAM}"',
        'close_tag' => ' />',
        'childs'    => ''
    ),
);
$bbcode = '[img src="src" alt="alt_text" size="large" align="right" /]';
$BBHandler = bbcode_create($arrayBBCode);
echo bbcode_parse($BBHandler, $bbcode);

P.S. Do you know of reliable freeware that is capable of unzipping TGZ files?

boen_robot · September 11, 2010

Normally, yes, you need to compile the extension into a DLL or whatever first, using Visual Studio or some other compiler.

For Windows, there's this list of precompiled extensions, so that you only extract the archive and place the DLL in your "ext" dir.

For an archiver, I can recommend 7-Zip (for simplicity) or PeaZip (if you also need to create TGZ archives).

Fmdpa · September 11, 2010

Ok, I installed the extension, restarted WAMP, and enabled the bbcode extension. Unfortunately, I received an error:

"PHP Startup: bbcode: Unable to initialize module
Module compiled with module API=20060613
PHP compiled with module API=20090626
These options need to match"

???

boen_robot · September 11, 2010

The DLL at that page is for the PHP 5.2 Thread Safe version... I'd assume VC6, even though it doesn't say that part.

What version are you using? I'd assume PHP 5.3 VC6 Thread Safe? If so, you'll have to either switch WAMP to PHP 5.2 (if there's a way... I don't use WAMP), or learn to recompile the extension for PHP 5.3 VC6 Thread Safe.

Hmm... maybe there's a reason they didn't compile it for other PHP versions. Maybe the Zend API has changed too much for this extension to smoothly handle the change.
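To check which build a DLL has to match, a small script can report the running PHP version, whether it is a thread-safe build, and whether the bbcode extension actually loaded. This is only a sketch: describe_php_build is a hypothetical helper name, while PHP_VERSION, PHP_ZTS and extension_loaded() are standard PHP. For reference, module API 20060613 belongs to PHP 5.2 and 20090626 to PHP 5.3.

```php
<?php
// Hypothetical helper: collect the facts that decide which DLL build fits.
function describe_php_build() {
    return array(
        'version'     => PHP_VERSION,               // e.g. a 5.2.x or 5.3.x string
        'thread_safe' => (bool) PHP_ZTS,            // true on Thread Safe builds
        'bbcode'      => extension_loaded('bbcode') // false if the module failed to load
    );
}

print_r(describe_php_build());
```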

Fmdpa · September 11, 2010

I'm thinking I'm just going to have to dump the idea of using the PHP bbcode functions. Only 10% of hosts that I contacted would allow the installation of the bbcode module. The precompiled version doesn't work with the most recent PHP builds. So, back to regex?

Synook · September 12, 2010

There's also the PEAR module: http://pear.php.net/package/HTML_BBCodeParser.

Fmdpa · September 12, 2010

Yes, I came across that in my research. Are the PEAR modules basically PHP files with functions (or classes) in them?

trevelluk · September 13, 2010

> Yes, I came across that in my research. Are the PEAR modules basically PHP files with functions (or classes) in them?

Yes. PEAR provides an easy way to install the modules and manage dependencies, but there's no need to compile anything, the modules only contain regular PHP files.

Fmdpa · September 14, 2010

First of all, why do PEAR files always require the PEAR.php file in the beginning? Are they dependent upon it in any way?

Secondly, I just finished a regex that replaces the img tag I mentioned above with its HTML equivalent. Here it is:

preg_match('/src=[\"\']([^\"\'].+?)[\"\']/s', $txt, $match);
$src = $match[1];
$im = imagecreatefromjpeg ($src);
if (imagesx($im) > imagesy($im)) $o = '_hor';
if (imagesy($im) > imagesx($im)) $o = '_ver';
preg_replace('/\[img src=[\"\']([^\"\'].+?)[\"\'] size=[\"\']([^\"\'].+?)[\"\'] align=[\"\']([^\"\'].+?)[\"\'] \/\]/s', '<img src=\'$1\' id=\'$2' . $o . '\' class=\'$3\' />', $txt);

It works just great for replacing the BBcode. But notice the PHP code that checks what the orientation of the image is. I append a string indicating the image's orientation into the id attribute. It works fine for the first image tag. But then it never changes. How can I check the orientation for each individual img tag?

justsomeguy · September 14, 2010

> First of all, why do PEAR files always require the PEAR.php file in the beginning? Are they dependent upon it in any way?

Of course, why else would they include it?

> It works fine for the first image tag. But then it never changes.

Well, you're only setting it once, based on $match[1]. You may want to use preg_match_all to get every match, and then use str_replace to replace each one individually. You can also use preg_replace_callback to have a callback function run for the matches to return the replacement string.
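The preg_replace_callback route mentioned above could look roughly like this. It is a sketch under a few assumptions: convert_img_tags and the $sizeOf parameter are names invented for the example, the attribute order is fixed as in the regex posted earlier, and the image dimensions come from a stub lookup so the code runs without real files. In real use $sizeOf could be a thin wrapper around getimagesize(). It needs PHP 5.3 or later for the anonymous functions.

```php
<?php
// Hypothetical helper: $sizeOf is any callable returning array(width, height) for a src.
function convert_img_tags($txt, $sizeOf) {
    $pattern = '/\[img src=["\']([^"\']+)["\'] size=["\']([^"\']+)["\'] align=["\']([^"\']+)["\'] \/\]/';
    // The callback runs once per match, so the orientation is computed per tag.
    return preg_replace_callback($pattern, function ($m) use ($sizeOf) {
        list($w, $h) = call_user_func($sizeOf, $m[1]);
        $o = '';                  // square or unknown size: no suffix
        if ($w > $h) $o = '_hor';
        if ($h > $w) $o = '_ver';
        return "<img src='{$m[1]}' id='{$m[2]}{$o}' class='{$m[3]}' />";
    }, $txt);
}

// Stub sizes (made-up values) so the sketch runs without image files.
$sizes = array('wide.jpg' => array(800, 600), 'tall.jpg' => array(600, 800));
$lookup = function ($src) use ($sizes) {
    return isset($sizes[$src]) ? $sizes[$src] : array(0, 0);
};

$in = '[img src="wide.jpg" size="large" align="right" /] '
    . '[img src="tall.jpg" size="small" align="center" /]';
echo convert_img_tags($in, $lookup);
// <img src='wide.jpg' id='large_hor' class='right' /> <img src='tall.jpg' id='small_ver' class='center' />
```

Because the suffix is computed inside the callback, no loop counter or replacement limit is needed, and every tag is handled in a single pass over the text.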

Fmdpa · September 14, 2010

I found the solution! I used preg_match_all to determine the number of matches. Then I used a FOR loop to loop through the code as many times as there were matches, limiting the preg_replace function to one replacement each time. It works perfectly, as far as I've seen.

Fmdpa · September 15, 2010

Here it is. Some parts of it are dirty, but it works fine.

<?php
function convert_chars($txt) {
    // Simple paired tags. (.+?) is lazy, so each pair closes at its own end tag.
    $txt = preg_replace('/\[i\](.+?)\[\/i\]/s', '<i>$1</i>', $txt); // italics
    $txt = preg_replace('/\[b\](.+?)\[\/b\]/s', '<b>$1</b>', $txt); // bold
    $txt = preg_replace('/\[u\](.+?)\[\/u\]/s', '<u>$1</u>', $txt); // underline
    $txt = preg_replace('/\[em\](.+?)\[\/em\]/s', '<span class="emphasize">$1</span>', $txt); // emphasis

    // [codebox]: escape < and > in the contents, then wrap in <pre><code class="codebox">
    $txt = preg_replace_callback('/\[codebox\](.*?)\[\/codebox\]/s', function ($m) {
        $code = str_replace(array('<', '>'), array('&lt;', '&gt;'), $m[1]);
        return '<pre><code class="codebox"><br/>' . $code . '</code></pre>';
    }, $txt);
    $txt = stripslashes($txt); // strip backslashes, as the earlier version did

    // Tags with a quoted attribute; [^"\']+ stops at the closing quote
    $txt = preg_replace('/\[abbr[ ]{0,3}title=["\']([^"\']+)["\']\]([^\[].*?)\[\/abbr\]/', '<span rel="hint" class="black" title="$1">$2</span>', $txt);
    $txt = preg_replace('/\[link[ ]{0,3}=["\']([^"\']+)["\']\]([^\[].*?)\[\/link\]/', '<a href="$1">$2</a>', $txt);

    // Headings [h1] to [h6]; \1 makes the closing tag match the opening level
    $txt = preg_replace('/\[h([1-6])\](.+?)\[\/h\1\]/s', '<h$1>$2</h$1>', $txt);

    $txt = preg_replace('/\[code\](.+?)\[\/code\]/s', '<code>$1</code>', $txt); // convert to <code>

    // Images: find how many img tags are in $txt, then replace one tag per loop
    $img = '/\[img[ ]{0,3}src=["\']([^"\']+)["\'][ ]{0,3}size=["\']([^"\']+)["\'][ ]{0,3}align=["\']([^"\']+)["\'][ ]{0,3}\/\]/s';
    $return = preg_match_all($img, $txt, $matches);
    for ($i = 0; $i < $return; $i++) {
        $o = ''; // reset for every tag; stays empty for square or unreadable images
        $im = @imagecreatefromjpeg($matches[1][$i]); // src of the i-th tag
        if ($im !== false) {
            if (imagesx($im) > imagesy($im)) $o = '_hor';
            if (imagesy($im) > imagesx($im)) $o = '_ver';
        }
        // limit 1: only the first remaining [img] tag is replaced in this pass
        $txt = preg_replace($img, '<img src=\'$1\' id=\'$2' . $o . '\' class=\'$3\' />', $txt, 1);
    }
    return $txt;
}
