Closing HTML Tags

royarellano · January 3, 2011

I'm calling a string from a database which contains HTML tags, but not the entire string, so when there is a "<strong>" tag started it is not closed in the echo, which means that all the rest of the content on the page remains bold. How can I ensure that no matter the type of tag, it is closed after the echo is called?

birbal · January 3, 2011

You could use preg_match (http://php.net/function.preg_match) to find each opening tag and then add a closing tag if one is missing. But I don't think it would be effective, because PHP can't decide where the closing tag should go. For example:

<b>it is bold text

could be closed as

<b>it</b> is bold text

<b>it is </b> bold text

...and so on. Or you could use a string replace function, or preg_replace, to remove the open tags. I think removing the open tags would be more appropriate; I guess you only want to stop them from formatting the rest of the page.

royarellano · January 4, 2011

Well the thing is that the content (the string being queried) is never the same, and always contains various tags like bold, italics, etc. So it would have to be a function which reads the string and closes any open tags. Is that what you mean?

birbal · January 4, 2011

I mean that, rather than

a function which reads the string and closes any open tags

it would be better to write a function which removes the open tags that don't have closing tags.

thescientist · January 4, 2011

Well, using a regex or string functions, you can test for the first occurrence of an HTML element by looking for < and >. Take everything inside them (example: <b>) and store it as a variable; then you know you need to add a </b> as the closing tag if you can't find its match in the rest of the string. It will be a bit harder for nested tags, but it's a start.
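A minimal sketch of that idea, extended with a stack so that nested tags are closed innermost first. The function name closeOpenTags and the short list of void elements are assumptions made for illustration, not something from this thread, and a regex-based scan like this is only a rough approximation of real HTML parsing.

```php
<?php
// Hypothetical helper: append closing tags for any elements left open.
function closeOpenTags(string $html): string
{
    // Drop a tag that was cut in half at the very end, e.g. "<stro".
    $html = preg_replace('/<[^>]*$/', '', $html);

    // Elements that never take a closing tag (partial list, an assumption).
    $void = ['br', 'hr', 'img', 'input', 'meta', 'link'];

    // Capture every complete tag: an optional slash, then the tag name.
    preg_match_all('/<(\/?)([a-z][a-z0-9]*)\b[^>]*>/i', $html, $tags, PREG_SET_ORDER);

    $open = []; // stack of tag names that are currently open
    foreach ($tags as $tag) {
        $name = strtolower($tag[2]);
        if (in_array($name, $void, true)) {
            continue;
        }
        if ($tag[1] === '/') {
            // Closing tag: remove the most recent matching opener, if any.
            for ($i = count($open) - 1; $i >= 0; $i--) {
                if ($open[$i] === $name) {
                    array_splice($open, $i, 1);
                    break;
                }
            }
        } else {
            $open[] = $name;
        }
    }

    // Close whatever is still open, innermost element first.
    foreach (array_reverse($open) as $name) {
        $html .= "</$name>";
    }
    return $html;
}

echo closeOpenTags('<p>Some <strong>bold <em>text'), "\n";
```

Note that the first step also removes a literal "<" that appears in plain text near the end of the string, so this is only suitable when the input is known to be markup.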

justsomeguy · January 4, 2011

You can also just strip all of the tags. If this is something like a preview for a story, you may just want to remove HTML tags from the preview.

Dilated · January 4, 2011

You're entering into tokenizer territory if you want something this advanced. You *might* be able to accomplish this through extremely convoluted regexes, but it wouldn't be easy. You would need to be a genius with lookaheads and conditional patterns.

http://htmlpurifier.org/ - This is a library that would do all the work for you to make sure everything is properly nested. As a bonus, it also removes potentially malicious code. It also ensures the markup is valid according to whatever particular doctype you select.

Another possibility would be to load everything into a DOMDocument, and have DOMDocument parse everything - I'm not sure, but I think it has some kind of error handling that would fix misnested tags.

edit: Sorry, slightly misinterpreted the OP's problem when I first posted everything above. If you're just grabbing a substring of a larger string and hoping for the best when it comes to the markup, there are better ways to do it. One thing you can do is grab each element with a regex, one by one, and as soon as the cumulative string length of all the elements put together is at least the length you want, you can ignore all the other elements. That way there's no chance you would cut off one of the close tags.
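The approach in the edit above could be sketched roughly as follows. This assumes the article body is a flat sequence of <p>...</p> elements, which the thread does not confirm; the function name excerptByParagraphs is made up for the example.

```php
<?php
// Hypothetical helper: collect whole <p>...</p> elements until the
// collected markup is at least $minLength bytes long.
function excerptByParagraphs(string $html, int $minLength): string
{
    // Non-greedy match of each paragraph, including its closing tag.
    preg_match_all('/<p\b[^>]*>.*?<\/p>/is', $html, $matches);

    $excerpt = '';
    foreach ($matches[0] as $paragraph) {
        $excerpt .= $paragraph;
        if (strlen($excerpt) >= $minLength) {
            break; // long enough; ignore the remaining elements
        }
    }
    return $excerpt;
}

$article = '<p>One <strong>bold</strong> line.</p><p>Second paragraph.</p><p>Third.</p>';
echo excerptByParagraphs($article, 20), "\n";
```

Because only complete elements are appended, the excerpt can run past the requested length, but it never ends in the middle of a tag. Nested paragraphs or other block elements would need a real parser.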

royarellano · January 5, 2011

You can also just strip all of the tags. If this is something like a preview for a story, you may just want to remove HTML tags from the preview.

Yes, it is a preview to an article. So I think removing all HTML tags from the string would be the best thing to do. As it states on my profile, I'm a newbie at this PHP development, so my next question is: what is the easiest way to code this?

I am calling the first 250 characters from a string which may contain one or many HTML tags which may be opened, but not closed within that 250 limit. Help is appreciated.

To everyone else who replied, thank you kindly for your time and suggestions.

Dilated · January 5, 2011

Stripping all the tags is fairly simple:

$str = preg_replace('/<[^>]+>/', '', $str);
$excerpt = substr($str, 0, 250);
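As a rough sketch, the two lines above can be wrapped in a small function. The name makeExcerpt and the default length are assumptions for illustration. The order of the two steps matters: the tags are stripped before truncating, because truncating first can cut a tag in half and leave a fragment the regex no longer matches.

```php
<?php
// Hypothetical helper: strip tags first, then cut to the preview length.
function makeExcerpt(string $html, int $length = 250): string
{
    // Remove anything between '<' and the next '>'.
    $text = preg_replace('/<[^>]+>/', '', $html);

    // Truncate the plain text (byte-based; fine for single-byte text).
    return substr($text, 0, $length);
}

$html = '<p>Hello <strong>world</strong></p>';

// Stripping first gives clean text.
echo makeExcerpt($html, 8), "\n";

// Truncating first leaves a half tag behind that the regex cannot remove.
echo preg_replace('/<[^>]+>/', '', substr($html, 0, 14)), "\n";
```

If the articles contain multi-byte characters, a multi-byte-aware truncation would be a safer choice than substr.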

birbal · January 5, 2011

http://php.net/function.strip_tags

Dilated · January 5, 2011

php.net/function.strip_tags

No. Do not use strip_tags. It's notorious for not working properly.

birbal · January 5, 2011

It's notorious for not working properly.

What do you mean by that? Please explain.

Dilated · January 5, 2011

What do you mean by that? Please explain.

As far as I understand, it won't even strip tags that have attributes. Using a regex is more reliable.

birbal · January 5, 2011

As far as I understand, it won't even strip tags that have attributes. Using a regex is more reliable.

But it removes tags that have attributes as well:

<?php
$text = "<b style='color:red'>Red text</b>";
echo strip_tags($text);
echo $text;
?>

Fmdpa · January 5, 2011

Just testing one tag doesn't prove it works properly. Stripping tags from the preview is a good idea, in case the excerpt cuts off a closing tag and the rest of the page is bold text (for example). That will probably only happen if you have a strict XHTML doctype.

ShadowMage · January 5, 2011

No. Do not use strip_tags. It's notorious for not working properly.

Yeah, if you read the warnings on the manual page it states that partial or broken tags can result in too much text being stripped. This is something OP can't risk. The regex is the better option.
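To make the disagreement above concrete, here is a small comparison sketch. The sample strings are invented for the example. As far as I can tell, strip_tags handles a well-formed tag with attributes fine, while a stray "<" that is never closed is where the two approaches differ.

```php
<?php
$pattern = '/<[^>]+>/';

// A well-formed tag with attributes: both approaches remove it.
$clean = "<b style='color:red'>Red text</b>";
echo strip_tags($clean), "\n";
echo preg_replace($pattern, '', $clean), "\n";

// A stray '<' that never gets a matching '>' (a "broken tag").
$broken = 'Stock <5 units, reorder soon';

// strip_tags treats "<5 ..." as the start of a tag and drops the rest.
$viaStripTags = strip_tags($broken);

// The regex needs a closing '>' to match, so it leaves the text alone.
$viaRegex = preg_replace($pattern, '', $broken);

var_dump($viaStripTags);
var_dump($viaRegex);
```

Neither behaviour is right for every input: the regex keeps the text but also keeps a half tag such as "<stro" if one is present.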

Fmdpa · January 6, 2011

PHP is open-source. Why hasn't someone fixed it yet?

royarellano · January 9, 2011

Stripping all the tags is fairly simple:
$str = preg_replace('/<[^>]+>/', '', $str);
$excerpt = substr($str, 0, 250);

Would the above also strip every tag, whatever appears between the tag symbols (e.g., bold, em, etc.)?

ShadowMage · January 10, 2011

Would the above also strip every tag, whatever appears between the tag symbols (e.g., bold, em, etc.)?

You mean something like this:

<div><em>some text</em> some more text</div>

Yes. It'll also remove all attributes, like:

<a href='www.somesite.com' class='aClassName'>Link</a>

Basically, it looks for a '<' symbol, then looks for one or more of any character that isn't a '>' symbol, then a '>' symbol. So absolutely anything that is between < and > symbols will be removed, leaving only your text.
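A quick sketch that runs the pattern over the two samples above, plus one case where it falls short. The third sample string is an invented example, not something from the thread.

```php
<?php
// '<', then one or more characters that are not '>', then '>'.
$pattern = '/<[^>]+>/';

$samples = [
    "<div><em>some text</em> some more text</div>",
    "<a href='www.somesite.com' class='aClassName'>Link</a>",
];

foreach ($samples as $sample) {
    echo preg_replace($pattern, '', $sample), "\n";
}

// Limitation: a '>' inside an attribute value ends the match early,
// so part of the tag is left behind in the output.
$tricky = '<img alt="a > b">';
echo preg_replace($pattern, '', $tricky), "\n";
```

For article previews this limitation rarely matters, but it is one reason a real parser is preferable for untrusted markup.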

royarellano · January 12, 2011

Awesome! Worked perfectly; thank you and God bless.

v/r
Roy Arellano
