
Trying To Create BBCode Parser


Yahweh


False alarm folks, I've found a solution to my problem :)

I've been working on this problem for a while, and I haven't figured out the solution. I've written a script that lets users define their own bbCodes and HTML templates, then transforms the bbCode into HTML.

The code works like this:

1) A user defines a YCode, specifying the opening tag, and whether it uses arguments or an end tag.

2) The data is sent to the BuildPattern function, which uses it to construct a regular expression pattern. A sample pattern looks like this:

\{url=\s*([^\};]+)\s*;?\}([\S\s]*?)\{\/url\}

The above expression is the regex for a YCode that looks like this:

{url=http://www.google.com}Google Homepage{/url}

The first capture group, ([^\};]+), captures any arguments used, without capturing any character that is a ";" or "}". Semicolons are used to separate arguments for tags using multiple arguments. The "\s*" on either side is meant to capture only the argument, and not the whitespace surrounding it (note that because [^\};]+ is greedy, trailing whitespace still ends up inside the group, so the argument may need trimming afterwards).

The second capture group, ([\S\s]*?), captures the text between the opening and closing tag.
3) I perform a quick regex search and replace. I repeat that process a few times to get any tags that have been nested.

However, I'm having a problem: the code I've written doesn't work for nested tags. For instance, if I define a tag that makes text a certain color, and for some reason prints the name of the color after the text, I would define a code like this:
YCode: "color"
Arguments: 1
EndTag: True
HTMLTemplate: <font color="$arg">$text</font> ($arg1)
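The steps above can be sketched in Python for the simplest case (one argument, with an end tag). This is only an illustration, not the actual ASP script: `build_pattern` and `apply_ycode` are hypothetical names, and treating `$arg` and `$arg1` as the same single argument is an assumption.

```python
import re

def build_pattern(tag):
    """Hypothetical stand-in for BuildPattern: one argument plus an end tag."""
    name = re.escape(tag)
    # Group 1: the argument (no ';' or '}'). Group 2: text between the tags (lazy).
    return r"\{" + name + r"=\s*([^\};]+)\s*;?\}([\S\s]*?)\{\/" + name + r"\}"

def apply_ycode(text, tag, html_template):
    """Replace every {tag=arg}text{/tag} using the HTML template (one pass)."""
    def replace(match):
        arg, body = match.group(1), match.group(2)
        # Assumption: $arg and $arg1 both refer to the single argument.
        html = html_template.replace("$arg1", arg).replace("$arg", arg)
        return html.replace("$text", body)
    return re.sub(build_pattern(tag), replace, text)

url_pattern = build_pattern("url")
color_html = apply_ycode(
    "{color=red}Hello{/color}",
    "color",
    '<font color="$arg">$text</font> ($arg1)',
)
print(url_pattern)
print(color_html)
```

For a single, non-nested tag this behaves as intended; the trouble only starts once tags of the same name are nested.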

But when I run that through the regex, it won't work quite right with nested colors.

{color=red}This is red.
    {color=blue}This is blue nested in red.{/color}
Now its red again.{/color}

It *should* work like this:

        First pass through regex:
        <font color="red">This is red.
            {color=blue}This is blue nested in red.{/color}
        Now its red again.</font> (red)

        Second pass through regex:
        <font color="red">This is red.
            <font color="blue">This is blue nested in red.</font> (blue)
        Now its red again.</font> (red)

Instead, it does this:

        On the first pass through regex, that becomes:
        <font color="red">This is red.
            {color=blue}This is blue nested in red.</font> (red)
        Now its red again.{/color}

        On the second pass through regex, it becomes:
        <font color="red">This is red.
            <font color="blue">This is blue nested in red.</font> (red)
        Now its red again. </font> (blue)

As you can see, the code isn't parsing correctly, because it matches each start tag to its nearest end tag, regardless of where the end tag is nested. That is a consequence of the regex pattern, because I use lazy matching to determine the text between two tags; however, I'm forced to use lazy matching, otherwise I can't put non-nested tags side by side.

Here's why I can't use greedy matching:

{color=red}This is red.
    {color=blue}This is blue nested in red.{/color}
Now its red again.{/color}
{color=orange}This is orange beside red{/color}

        First pass through regex:
        <font color="red">This is red.
            {color=blue}This is blue nested in red.{/color}
        Now its red again.{/color}
        {color=orange}This is orange beside red</font> (red)

        Second pass through regex:
        <font color="red">This is red.
            <font color="blue">This is blue nested in red.{/color}
        Now its red again.</font> (blue)
        {color=orange}This is orange beside red</font> (red)

That obviously isn't what I wanted.
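Both failure modes can be reproduced with a few lines of Python. This is a rough sketch rather than the original script: the sample text is shortened to single letters, `run_passes` is a hypothetical helper, and the template is written with regex backreferences instead of $arg/$text.

```python
import re

# Same template as the "color" YCode, written with backreferences.
TEMPLATE = r'<font color="\1">\2</font> (\1)'

# Lazy body ([\S\s]*?) versus greedy body ([\S\s]*).
LAZY = r"\{color=\s*([^\};]+)\s*;?\}([\S\s]*?)\{\/color\}"
GREEDY = r"\{color=\s*([^\};]+)\s*;?\}([\S\s]*)\{\/color\}"

def run_passes(pattern, text, passes=2):
    """Apply the search-and-replace several times, as the script does."""
    for _ in range(passes):
        text = re.sub(pattern, TEMPLATE, text)
    return text

nested = "{color=red}A{color=blue}B{/color}C{/color}"
side_by_side = nested + "{color=orange}D{/color}"

# Lazy: the red start tag pairs with the first (blue's) end tag.
lazy_result = run_passes(LAZY, nested)
# Greedy: the red start tag pairs with the last (orange's) end tag.
greedy_result = run_passes(GREEDY, side_by_side)

print(lazy_result)
print(greedy_result)
```

In both runs the closing </font> tags end up attached to the wrong color, which matches the hand-traced passes above.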

This is the script I've written for YCode parsing (I'm posting it as a separate link because it's too long for the board software to handle):

http://www.fstdt.com/scripts/ycode/ycode_source.asp

A working version of the script above is available at:

http://www.fstdt.com/scripts/ycode/ycode_beta.asp

I've been working on the nesting problem for the longest time, and I haven't come close to solving it. Would anyone like to give this problem a shot?


Dang, I knew I'd started this thread too soon. I asked the same question in the opening post on a regex forum, and they were quite helpful:

http://regexadvice.com/forums/17123/ShowPost.aspx#17123

Regexes generally can't recurse (only some engines, such as PCRE, support recursive patterns), but recursion can be simulated like this:

\[(color)=\s*([^]]+)]((?:\[\1=[^]]+](?:\[\1=[^]]+][\S\s]*?\[/\1]|[\S\s])*?\[/\1]|[\S\s])*?)\[/\1]

That will only get 0 to 2 levels of nesting, so I'd have to know in advance how deep nesting can go (I wouldn't imagine I'd need more than 5 or 6 levels). It works nicely :)
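Since the pattern above just repeats the same piece once per nesting level, it could be generated for any fixed depth instead of being written by hand. A minimal Python sketch, assuming the [tag=arg]...[/tag] syntax of the pattern above; `nested_pattern` and `render` are hypothetical names, and the template is the "color" one rewritten with backreferences:

```python
import re

def nested_pattern(tag, depth):
    """Build the simulated-recursion pattern for `depth` levels of nesting."""
    open_tag = r"\[\1=[^]]+]"   # a nested opening tag with the same name
    close_tag = r"\[/\1]"       # its closing tag
    body = r"[\S\s]*?"          # innermost level: plain lazy text
    for _ in range(depth):
        # Each level is either a complete nested tag or any single character.
        body = "(?:" + open_tag + body + close_tag + r"|[\S\s])*?"
    return r"\[(" + re.escape(tag) + r")=\s*([^]]+)](" + body + ")" + close_tag

def render(text, tag="color", depth=2):
    """One pass per level: outer tags are replaced first, then inner ones."""
    pattern = nested_pattern(tag, depth)
    for _ in range(depth + 1):
        text = re.sub(pattern, r'<font color="\2">\3</font> (\2)', text)
    return text

sample = "[color=red]A[color=blue]B[/color]C[/color][color=orange]D[/color]"
print(nested_pattern("color", 2))
print(render(sample))
```

With depth set to 2 this produces exactly the pattern quoted above; tags nested deeper than the chosen depth would still be paired incorrectly, so the depth has to be picked with some headroom.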
