dcole.ath.cx Posted June 26, 2006
What is wrong with this line?
$clear = preg_replace("<!--(.*)-->", "", $cache[1]);
I'm trying to remove all comments from $cache[1].
justsomeguy Posted June 26, 2006
What's the problem?
dcole.ath.cx Posted June 26, 2006
Parse error: parse error, unexpected T_VARIABLE
justsomeguy Posted June 26, 2006
It's probably not caused by that line; there's probably a missing semicolon or quote before it.
You will also probably want to add a question mark after the asterisk in the pattern, so that the match stops at the first --> instead of the last one.
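To illustrate the question-mark advice, here is a minimal sketch of greedy versus lazy matching. The sample string in $html is made up and stands in for $cache[1]; the slash delimiters and the 's' modifier are my additions, not part of the original line.

```php
<?php
// Hypothetical sample input; $html stands in for $cache[1] from the question.
$html = "<p>one</p><!-- first --><p>two</p><!-- second --><p>three</p>";

// Greedy: .* runs from the first <!-- to the LAST -->, taking <p>two</p> with it.
$greedy = preg_replace('/<!--(.*)-->/s', '', $html);

// Lazy: .*? stops at the nearest -->, so each comment is removed separately.
$lazy = preg_replace('/<!--(.*?)-->/s', '', $html);

echo $greedy, "\n"; // <p>one</p><p>three</p>
echo $lazy, "\n";   // <p>one</p><p>two</p><p>three</p>
```

With only one comment in the input both versions give the same result, which is why the greedy form can look fine in a quick test.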
dcole.ath.cx Posted June 27, 2006
How would you replace everything in <head> ... </head>?
$clear = preg_replace("<head>(.*)</head>", "", $cache[1]);
Because that doesn't work.
justsomeguy Posted June 27, 2006
If that's all you want to do, you can use strpos to get the position of the first >, and use substr_replace to do the swap. But it requires having the head tag in its own string.
substr_replace($head, "", strpos($head, ">") + 1, -7);
If you want it to replace in an arbitrary document, maybe try this pattern:
/<head>(.*)<\/head>/is
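A rough sketch of both suggestions side by side, assuming the head tag has no attributes. The strings in $head and $doc are invented for illustration. Note that strpos needs the string to search as its first argument, and the start position is one past the first > so that the opening tag stays intact.

```php
<?php
// Hypothetical input: the head element on its own, no attributes on the tag.
$head = "<head><title>Demo</title></head>";

// Start just after the first ">" and stop 7 characters from the end,
// 7 being the length of "</head>".
$start = strpos($head, ">") + 1;
$emptied = substr_replace($head, "", $start, -7);

echo $emptied, "\n"; // <head></head>

// Hypothetical whole document: the regex removes the head element, tags included.
$doc = "<html><head><title>Demo</title></head><body>Hi</body></html>";
$stripped = preg_replace('/<head>(.*)<\/head>/is', '', $doc);

echo $stripped, "\n"; // <html><body>Hi</body></html>
```

The two are not equivalent: the substr_replace call empties the element but keeps its tags, while the preg_replace call drops the tags as well.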
dcole.ath.cx Posted June 28, 2006
That didn't do what I wanted, but it was closer than nothing...
So I'm going to open a URL and look at the source, and it's going to be like:
<html><head>... style and title...</head><body></body></html>
Anyway, if I remove all HTML and PHP tags, the internal style sheet will still show, and I want to remove that. Thinking about it, I will use style and not head, but it's the same thing.
So how can I remove EVERYTHING in between <style and </style>? This includes A-Z, 0-9, `~!@#$%^&*()_+-=[]{}\|;':",./<>? and returns!
justsomeguy Posted June 28, 2006
The dot represents any character other than a newline. You need to add the 's' pattern modifier to make it include newlines as well.
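A small sketch of the difference the 's' modifier makes on a style block that spans several lines. The page source in $source is made up, and the `\b[^>]*` part (to allow attributes on the style tag) and the lazy `.*?` are my assumptions about what the asker needs.

```php
<?php
// Hypothetical page source with a style block spanning several lines.
$source = "<html><head>\n<style type=\"text/css\">\nbody { color: red; }\n</style>\n</head><body>Hi</body></html>";

// Without "s" the dot cannot cross the newline after the opening tag,
// so the pattern never reaches </style> and nothing is replaced.
$without = preg_replace('/<style\b[^>]*>.*?<\/style>/i', '', $source);

// With "s" the dot also matches newlines and the whole block is removed.
$with = preg_replace('/<style\b[^>]*>.*?<\/style>/is', '', $source);

var_dump($without === $source); // bool(true)
echo $with, "\n";
```

The newlines that surrounded the style block are left behind in $with, since they sit outside the matched text.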
dcole.ath.cx Posted June 28, 2006
What is that?
justsomeguy Posted June 28, 2006
In this pattern:
/<head>(.*)<\/head>/is
there are 'i' and 's' modifiers. The 'i' modifier indicates case-insensitivity, and the 's' modifier changes the dot metacharacter to include newlines. The pattern is enclosed in delimiters (I used forward slashes, which are fairly common for delimiters), and the modifiers appear after the closing delimiter. You can get information on pattern modifiers for Perl-compatible regex here:
http://www.php.net/manual/en/reference.pcr...n.modifiers.php
And pattern syntax here:
http://www.php.net/manual/en/reference.pcr...tern.syntax.php
PHP also includes support for POSIX regex (ereg and friends), but the Perl-compatible functions are more powerful, and are generally faster than the POSIX ones.
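The delimiter and modifier points can be sketched like this. The input in $doc is invented, and the "#" delimiter in the second call is just one alternative I picked for illustration; it saves escaping the slash in the closing tag.

```php
<?php
// Hypothetical input with an upper-case tag and line breaks inside it.
$doc = "<HEAD>\n<title>Demo</title>\n</HEAD><body>Hi</body>";

// Slash delimiters: the "/" in the closing tag must be escaped.
$a = preg_replace('/<head>(.*)<\/head>/is', '', $doc);

// "#" delimiters: no escaping needed for "/".
$b = preg_replace('#<head>(.*)</head>#is', '', $doc);

// Without "i" the lower-case pattern does not match <HEAD>.
$c = preg_replace('/<head>(.*)<\/head>/s', '', $doc);

echo $a, "\n";         // <body>Hi</body>
var_dump($a === $b);   // bool(true)
var_dump($c === $doc); // bool(true)
```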
dcole.ath.cx Posted June 28, 2006
I was wondering why you put "is" in there...
justsomeguy Posted June 28, 2006
You might also try this pattern for matching the head:
/<head\b[^>]*>(.*)<\/head>/is
Regular expressions, while very powerful, are also equally complex. They are difficult to read unless the programmer has read books dedicated to regular expressions (such as this one). If you're like me, you can recognize where regular expressions fit a situation, but you still aren't sure of the specific syntax. When I encounter something like this, if it is a 'common' situation like yours, where you are matching HTML tags, I typically turn to Google and find examples of the same thing. If it's not so common, say if I'm working with proprietary data, then I generally do some research and figure out a pattern that works for the situation, but come time to do it again I will probably go back to Google.
That's just the way regex are: they are equally powerful and complex, and require either an advanced understanding, or some thorough research followed by trial and error. Since regex can't be used for everything, I generally favor the latter (research, trial and error). But if I were working on a project where I knew that regex would play a large part, a book would be worth its weight in gold.
The example above I found through Google and modified a bit.
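A quick sketch of what the extra `\b[^>]*` buys, using preg_match to look at the captured group. Both sample strings are made up; the slash in the closing tag is escaped here because the pattern uses slash delimiters.

```php
<?php
// Hypothetical input: a head tag that carries an attribute.
$doc = "<html><head profile=\"x\">\n<title>Demo</title>\n</head><body>Hi</body></html>";

$pattern = '/<head\b[^>]*>(.*)<\/head>/is';

// preg_match fills $m[1] with whatever the (.*) group captured.
if (preg_match($pattern, $doc, $m)) {
    echo trim($m[1]), "\n"; // <title>Demo</title>
}

// \b keeps the pattern from matching a longer tag name such as <header>.
$headerMatches = preg_match($pattern, "<header>x</header>");
var_dump($headerMatches); // int(0)
```

The plain `<head>` pattern from earlier in the thread would miss the first input entirely, because of the attribute inside the tag.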