
preg_replace


ckrudelux


Your regular expression is likely to be taking up all the content between the first <item> tag and the last </item> tag. You should try a search that's not so greedy, like:
<item>(.*)?<\/item>

Since this is XML, I'd actually recommend using the DOM or XSLT instead.
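
One caution on the pattern suggested in this post: as a later reply in this thread points out, the ungreedy ? has to follow the quantifier itself, so (.*)? is still greedy while (.*?) is lazy. A minimal sketch of the difference, assuming a hypothetical one-line sample input (the variable names are illustrative only):

```php
<?php
// Hypothetical sample input: two items on a single line.
$content = '<item>one</item><item>two</item>';

// Greedy: .* runs on to the LAST </item>, so there is only one match.
$greedy = preg_replace('/<item>(.*)<\/item>/', '<p>$1</p>', $content);

// (.*)? is still greedy: here the ? only makes the whole group optional.
$optional = preg_replace('/<item>(.*)?<\/item>/', '<p>$1</p>', $content);

// Lazy: .*? stops at the FIRST </item>, so each item is matched on its own.
$lazy = preg_replace('/<item>(.*?)<\/item>/', '<p>$1</p>', $content);

echo $greedy, "\n";   // <p>one</item><item>two</p>
echo $optional, "\n"; // <p>one</item><item>two</p>
echo $lazy, "\n";     // <p>one</p><p>two</p>
```

Only the third form replaces both items separately.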


No, this was for another thing; it was just easier to explain this way :)

The first row is replaced, but not the other.
preg_replace("/<item>(.*)<\/item>/", "<p>$1</p>", $content);

<item>content</item><item>content</item>

How do I make both of them get replaced?

I am not sure whether you are talking about matching it as a whole:
<item>content</item><item>content</item>

as $content, OR whether you are trying to match each one in isolation, in different situations:

<item>content</item>

AND

<item>content</item>

I am not sure which one is the second/first row you are referring to, though for the second case I think this will work:

<item>[a-z\s]*<\/item>
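
A small sketch of what this character class accepts, using hypothetical sample strings: it allows lowercase letters and whitespace only, so content with capitals or digits is not matched.

```php
<?php
$pattern = '/<item>[a-z\s]*<\/item>/';

// Hypothetical sample inputs, not taken from the original question.
$samples = [
    '<item>content</item>',     // lowercase only
    "<item>two\nlines</item>",  // \s also covers the line break
    '<item>Content 2</item>',   // capital letter and digit
];

$results = [];
foreach ($samples as $sample) {
    // preg_match returns 1 on a match and 0 when nothing matches.
    $results[] = preg_match($pattern, $sample);
}

print_r($results); // 1, 1, 0
```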


EDIT. See Posts 12-13 below. The principle here is correct, but the character class is messed up.

The biggest problem is the dot character specifically will not match line breaks, so pretty-printed XML/HTML creates a problem. Birbal's solution comes close, but it will miss caps and digits, etc. To really capture EVERYTHING, try this:

"/<item>([^\n\n]*)<\/item>/"

The central part will be interpreted as "match any character that is a line break or not a line break," which really does mean everything. Of course, you can do that with any character, but using the line break sequence will remind you why you are formatting the expression this way.
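
As the EDIT note on this post says, the character class does not behave as described here. A quick check, assuming hypothetical sample strings, suggests that [^\n\n] behaves exactly like [^\n] and therefore cannot cross a line break:

```php
<?php
// Hypothetical inputs: one item spanning two lines, one on a single line.
$multi  = "<item>line one\nline two</item>";
$single = '<item>one line</item>';

// The negation applies to the whole class, so listing \n twice changes nothing.
$doubled = '/<item>([^\n\n]*)<\/item>/';
$plain   = '/<item>([^\n]*)<\/item>/';

// The 5th argument receives the number of replacements performed.
preg_replace($doubled, 'replaced', $multi, -1, $countDoubledMulti);
preg_replace($plain, 'replaced', $multi, -1, $countPlainMulti);
preg_replace($doubled, 'replaced', $single, -1, $countDoubledSingle);

echo $countDoubledMulti, "\n";  // 0: the class cannot match the line break
echo $countPlainMulti, "\n";    // 0: same behavior as the doubled class
echo $countDoubledSingle, "\n"; // 1: single-line content still matches
```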


$content is just referring to this:
<item>content</item><item>content</item>

This is the first match

<item>content</item>

This is the second match (starting at row 2):

<item>content</item>

The second match isn't found for some reason. Maybe preg_replace can only find matches within a line and not across several lines, which I have a hard time believing.


Why are you using two line breaks?

^\n will match all characters except a newline, and \n will match the newline as well.


It still doesn't work; nothing is getting replaced :) This is my line:
preg_replace("/<item>([^\n\n]*)<\/item>/", "replaced", $file)

What I forgot was that the negation operator ( ^ ) affects everything in the brackets, not just the character immediately following. So we'll try a similar idea with a character class that has a matching negated class:

"/<item>([\s\S]*?)<\/item>/"

\s == all whitespace characters, including \n and \r
\S == all characters that are not whitespace characters.

So [\s\S] means "match all whitespace characters and all non-whitespace characters," which again means everything.

The ? operator is needed for this now, just as Ingolme explained in Post #5
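
A minimal sketch of this pattern in use, assuming a hypothetical input in which each item spans two lines. It also shows why the ? is needed: without it, [\s\S]* runs on to the last closing tag.

```php
<?php
// Hypothetical pretty-printed input: two items, each spanning two lines.
$content = "<item>first\nitem</item>\n<item>second\nitem</item>";

// Lazy [\s\S]*? matches any character, including line breaks,
// but stops at the first </item> it can reach.
$lazy = preg_replace('/<item>([\s\S]*?)<\/item>/', '<p>$1</p>', $content, -1, $lazyCount);

// Greedy [\s\S]* swallows everything up to the last </item>.
$greedy = preg_replace('/<item>([\s\S]*)<\/item>/', '<p>$1</p>', $content, -1, $greedyCount);

echo $lazyCount, "\n";   // 2
echo $greedyCount, "\n"; // 1
echo $lazy, "\n";
```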


Okay, I really am sleepy today. I was sure a simple modifier would handle this, but I couldn't remember it. Using the \s character reminded me. The whole pattern can be simplified to this, which is almost what you had to begin with:

"/<item>(.*?)<\/item>/s"

where the "s" modifier allows dot to match linebreaks also.
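
To see the effect of the modifier, here is a short sketch that runs the same pattern with and without "s" through preg_match_all, assuming a hypothetical input whose items span several lines:

```php
<?php
// Hypothetical pretty-printed input: two items, each spanning two lines.
$content = "<item>first\nitem</item>\n<item>second\nitem</item>";

// Without "s", the dot refuses to cross a line break, so nothing matches.
$without = preg_match_all('/<item>(.*?)<\/item>/', $content, $m1);

// With "s", the dot also matches line breaks and both items are found.
$with = preg_match_all('/<item>(.*?)<\/item>/s', $content, $m2);

echo $without, "\n"; // 0
echo $with, "\n";    // 2
print_r($m2[1]);     // the captured contents of both items
```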


That was very helpful and yes it works now.. Thank you very much for this information :)

To clear things up:

The simplest way to do it is:
preg_replace('/<item>(.*?)<\/item>/s', '<p>$1</p>', $content);

An ungreedy sign must always be put after the quantifier, not after the grouping. If it is after anything but the quantifier, it becomes the "{0,1}" quantifier.

A negated set negates all characters in the set. [^\n\n] is redundant, and the same thing as [^\n].

The "g" modifier doesn't exist in PHP; this behavior is enabled by default for preg_replace. (You can limit preg_replace's iterations by using the 4th parameter.)

Also, the reason so many people are inclined to use "/[\s\S]*/" over "/.*/s" is that JavaScript's regular expressions long lacked the "s" modifier (it was only added in ES2018), and a lot of people are used to JavaScript.
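
A brief sketch of the 4th parameter mentioned in this post, assuming a hypothetical three-item input. The 5th parameter, which reports how many replacements were made, is used here only to make the difference visible.

```php
<?php
// Hypothetical sample input with three items.
$content = '<item>a</item><item>b</item><item>c</item>';
$pattern = '/<item>(.*?)<\/item>/s';

// Default limit (-1): every match is replaced, no "g" modifier needed.
$all = preg_replace($pattern, '<p>$1</p>', $content, -1, $countAll);

// Limit of 1: only the first match is replaced.
$first = preg_replace($pattern, '<p>$1</p>', $content, 1, $countFirst);

echo $all, "\n";   // <p>a</p><p>b</p><p>c</p>
echo $first, "\n"; // <p>a</p><item>b</item><item>c</item>
```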

