preg_replace

ckrudelux · November 24, 2010

The first row is replaced, but not the other.

preg_replace("/<item>(.*)<\/item>/", "<p>$1</p>", $content);

<item>content</item><item>content</item>

How do I get both of them replaced?

Ingolme · November 24, 2010

Your regular expression is likely taking up all the content between the first <item> tag and the last </item> tag. You should try a search that's not so greedy, like

<item>(.*)?<\/item>

Since this is XML, I'd actually recommend using the DOM or XSLT instead.
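To illustrate the greedy versus non-greedy point, here is a minimal sketch. The input string is a hypothetical single-line version of the example in the question, and placing the "?" directly after the "*" is an assumption about what a non-greedy search would look like (a "?" after the closing parenthesis only makes the group optional).

```php
<?php
// Hypothetical single-line input modelled on the example in the question.
$content = '<item>one</item><item>two</item>';

// Greedy: .* runs on to the LAST </item>, so the whole string is one match.
$greedy = preg_replace('/<item>(.*)<\/item>/', '<p>$1</p>', $content);

// Lazy: .*? stops at the FIRST </item>, so each item is matched on its own.
$lazy = preg_replace('/<item>(.*?)<\/item>/', '<p>$1</p>', $content);

echo $greedy, "\n"; // <p>one</item><item>two</p>
echo $lazy, "\n";   // <p>one</p><p>two</p>
```

The greedy version reports success but only wraps the outermost pair of tags, which can look as if only "the first row" was replaced.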

ckrudelux · November 24, 2010

No, this was for another thing; it was just easier to explain this way.

ckrudelux · November 24, 2010

Well, adding the "?" didn't make any difference in the output.

Ingolme · November 24, 2010

Actually, you might need to add "g" as a modifier as well.

birbal · November 24, 2010

I am not sure whether you are trying to match the whole string

<item>content</item><item>content</item>

as $content, or whether you are trying to match each item separately, in different situations:

<item>content</item>

and

<item>content</item>

I am not sure which one is the first or second row you are referring to. For the second case, though, I think this will work:

<item>[a-z\s]*<\/item>

jeffman · November 24, 2010

EDIT: See Posts 12-13 below. The principle here is correct, but the character class is messed up.

The biggest problem is that the dot character specifically will not match line breaks, so pretty-printed XML/HTML creates a problem. Birbal's solution comes close, but it will miss caps, digits, etc. To really capture EVERYTHING, try this:

"/<item>([^\n\n]*)<\/item>/"

The central part will be interpreted as "match any character that is a line break or not a line break," which really does mean everything. Of course, you can do that with any character, but using the line break sequence will remind you why you are formatting the expression this way.
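A quick way to check both claims is to run the patterns against pretty-printed input. This is only a sketch: the multi-line string below is a hypothetical stand-in for the real file, which is not shown in the thread. It suggests that the dot problem is real, but that [^\n\n] behaves like [^\n] and so does not match line breaks either, consistent with the EDIT note above.

```php
<?php
// Hypothetical pretty-printed input: the second item spans several lines.
$content = "<item>one</item>\n<item>\n  two\n</item>";

// Without modifiers the dot does not match "\n", so only the first item matches.
$dot = preg_replace('/<item>(.*?)<\/item>/', '<p>$1</p>', $content);

// [^\n\n] is a negated class that lists "\n" twice, so it excludes line breaks.
$negated = preg_replace('/<item>([^\n\n]*)<\/item>/', '<p>$1</p>', $content);

var_dump($dot === $negated); // bool(true)
echo $dot, "\n";             // the multi-line item is left untouched
```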

ckrudelux · November 24, 2010

$content is just referring to this:

<item>content</item><item>content</item>

This is the first match:

<item>content</item>

This is the second match (starting at row 2):

<item>content</item>

The second match isn't found for some reason. Maybe preg_replace can only find matches within a single line and not across several lines, which I have a hard time believing.

ckrudelux · November 24, 2010

Why are you using two line breaks in [^\n\n]?

birbal · November 24, 2010

^\n will match all characters except a new line, and \n will match the new line as well.

ckrudelux · November 24, 2010

It still doesn't work; nothing is getting replaced :) This is my line:

preg_replace("/<item>([^\n\n]*)<\/item>/", "replaced", $file)

jeffman · November 24, 2010

Yeah, you're right. I changed your HTML without realizing it, so I didn't notice the character class is broken. Give me a minute.

jeffman · November 24, 2010

What I forgot was that the negation operator ( ^ ) affects everything in the brackets, not just the character immediately following. So we'll try a similar idea with a character that has a matching negation character:

"/<item>([\s\S]*?)<\/item>/"

\s == all whitespace characters, including \n and \r
\S == all characters that are not whitespace characters

So [\s\S] means "match all whitespace characters and all non-whitespace characters," which again means everything.

The ? operator is needed for this now, just as Ingolme explained in Post #5.

jeffman · November 24, 2010

Okay, I really am sleepy today. I was sure a simple modifier would handle this, but I couldn't remember it. Using the \s character reminded me. The whole pattern can be simplified to this, which is almost what you had to begin with:

"/<item>(.*?)<\/item>/s"

where the "s" modifier allows the dot to match line breaks as well.
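As a rough check that the two working patterns behave the same, here is a small sketch. The multi-line input is hypothetical (the original file is not shown in the thread), and the variable names are made up for illustration.

```php
<?php
// Hypothetical input: the second item spans several lines.
$content = "<item>one</item>\n<item>\n  two\n</item>";

// [\s\S] matches any character, including line breaks; *? keeps the match short.
$withClass = preg_replace('/<item>([\s\S]*?)<\/item>/', '<p>$1</p>', $content);

// The "s" modifier lets the dot match line breaks too.
$withDotAll = preg_replace('/<item>(.*?)<\/item>/s', '<p>$1</p>', $content);

var_dump($withClass === $withDotAll); // bool(true)
echo $withDotAll, "\n";               // both items are now wrapped in <p>...</p>
```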

ckrudelux · November 24, 2010

That was very helpful, and yes, it works now. Thank you very much for this information.

Dilated · November 24, 2010

To clear things up:

The simplest way to do it is

preg_replace('/<item>(.*?)<\/item>/s', '<p>$1</p>', $content);

An ungreedy sign must always be put after the quantifier, not after the grouping. If it is after anything but a quantifier, it becomes the "{0,1}" quantifier.

A negated set negates all characters in the set. [^\n\n] is redundant, and the same thing as [^\n].

The "g" modifier doesn't exist in PHP; this behavior is enabled by default for preg_replace. (You can limit preg_replace's iterations by using the 4th parameter.)

Also, the reason so many people are inclined to use "/[\s\S]*/" over "/.*/s" is that JavaScript's regular expression flavor has historically lacked the "s" modifier, and a lot of people are used to JavaScript.
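The points about the default "replace all" behavior, the 4th parameter, and the position of the "?" can be sketched as follows. The three-item input and the variable names are made up for illustration.

```php
<?php
// Hypothetical input with three items on one line.
$content = '<item>one</item><item>two</item><item>three</item>';
$pattern = '/<item>(.*?)<\/item>/s';

// Every match is replaced by default; no "g" modifier is needed.
$all = preg_replace($pattern, '<p>$1</p>', $content);

// The 4th parameter caps the number of replacements; the 5th receives the count.
$firstOnly = preg_replace($pattern, '<p>$1</p>', $content, 1, $count);

// A "?" after the group makes the group optional; the .* inside stays greedy.
$optionalGroup = preg_replace('/<item>(.*)?<\/item>/s', '<p>$1</p>', $content);

echo $all, "\n";           // <p>one</p><p>two</p><p>three</p>
echo $firstOnly, "\n";     // <p>one</p><item>two</item><item>three</item>
echo $count, "\n";         // 1
echo $optionalGroup, "\n"; // <p>one</item><item>two</item><item>three</p>
```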
