
while preg_match not working as it should


Greywacke

Hi there. I need to loop until all flags of the form %TEXT all | optional | characters% are replaced. A flag has only one rule: it is delimited by two percentage symbols and cannot contain a carriage return, a line break, or a percentage symbol.

This is the code I have at the moment. It doesn't seem to loop until all the flags are replaced, which it has to do, because flag replacement texts are allowed to have other flags within them. The line breaks below have merely been added so that the forum thread keeps its shape and does not expand over the width of the screen.

function getflags($input) {
    foreach ($input as $in) {
        // use to split by first space if found, to get flag values
        $in = preg_split("/ /",$in,2);
        // build and return the values to replace with
        switch ($in[0]) {
            // mail configurable flags
            case "LOGO":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT File | Logo | CanopyXchange%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]
                    && $GLOBALS["response"]) {
                    return "%CONFIGTEXT File | Logo | QuoteMe | Canopies%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT File | Logo | CanopyXchange%";
                }
                break;
            case "RESPONSESIGNATURE":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT Email | Consumer | Message | Signature (CX)%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT Email | Consumer | Message | Signature (QM)%";
                }
                break;
            case "PREMIUMSUBJECT":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT Email | Supplier | Premium | Header (CX)%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT Email | Supplier | Premium | Header (QM)%";
                }
                break;
            case "PREMIUMINTRO":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT Email | Supplier | Premium | Message | Intro (CX)%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT Email | Supplier | Premium | Message | Intro (QM)%";
                }
                break;
            case "PREMIUMSIGNATURE":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT Email | Supplier | Premium | Message | Signature (CX)%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT Email | Supplier | Premium | Message | Signature (QM)%";
                }
                break;
            case "FREEMIUMSUBJECT":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT Email | Supplier | Freemium | Header (CX)%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT Email | Supplier | Freemium | Header (QM)%";
                }
                break;
            case "FREEMIUMINTRO":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT Email | Supplier | Freemium | Message | Intro (CX)%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT Email | Supplier | Freemium | Message | Intro (QM)%";
                }
                break;
            case "FREEMIUMSIGNATURE":
                if ($GLOBALS["attribsarr"]["products_description"]) {
                    return "%CONFIGTEXT Email | Supplier | Freemium | Message | Signature (CX)%";
                } elseif ($GLOBALS["attribsarr"]["vehicle_make_model"]) {
                    return "%CONFIGTEXT Email | Supplier | Freemium | Message | Signature (QM)%";
                }
                break;
            // attribute flags
            case "ATTRIBUTE":
                return $GLOBALS["attribsarr"][$in[1]];
                break;
            // consumer flags
            case "CONSUMERNAME":
                return $GLOBALS["consumerfullname"];
                break;
            case "CONSUMEREMAIL":
                return $GLOBALS["consumeremail"];
                break;
            case "CONSUMERCELL":
                return $GLOBALS["consumercell"];
            case "CONSUMERCITY":
                return $GLOBALS["city_town"];
                break;
            // supplier flags
            case "SUPPLIERNAME":
                return $GLOBALS["recipient"][1];
                break;
            case "CONTACTNAME":
                return $GLOBALS["recipient"][0];
                break;
            case "ACCMGR":
                return $GLOBALS["recipient"][11];
                break;
            case "ACCMGRMAIL":
                return $GLOBALS["recipient"][12];
                break;
            case "PAYGBALANCE":
                return $GLOBALS["newbalance"];
            case "INVOICEAMOUNT":
                return $GLOBALS["recipient"][13];
                break;
            // lead flags
            case "LEADREFERENCE":
                return "#".($GLOBALS["leadid"] + 11001000);
                break;
            case "LEADCATEGORY":
                return $GLOBALS["cat"];
                break;
            case "PAYGCOST":
                return $GLOBALS["servicecost"];
                break;
            case "DUPLICATES":
                return $GLOBALS["append"];
                break;
            // other flags
            case "PROSPECTMESSAGE":
                return $GLOBALS["pmsg"];
                break;
            case "SUPPLIERLIST":
                return join("",$GLOBALS["suppliers"]);
                break;
            case "PROSPECTLIST":
                return join("",$GLOBALS["supplierp"]);
                break;
            case "SERVICENAME":
                return $GLOBALS["servicename"];
                break;
            case "REGIONNAME":
                return $GLOBALS["regionname"];
                break;
            // cronjob flags
            case "FROMDATE":
                return $GLOBALS["f"];
                break;
            case "TODATE":
                return $GLOBALS["t"];
                break;
            // configurable text flags
            case "CONFIGTEXT":
                $sql = "SELECT * FROM 18_configurabletexts WHERE
                    18_configurabletexts.text_TextDescription = \"".
                    $in[1]."\" AND 18_configurabletexts.bigint_ServiceID = ".
                    $GLOBALS["service"].";";
                $result = mysql_query_errors($sql, $conn, __FILE__,
                    __LINE__, false);
                if ($result) {
                    if ($row = mysql_fetch_array($result)) {
                        return ($br)?preg_replace(array('/\r\n/','/\n/',
                            '/\r/'),'/<br />/',
                            $row["text_TextFullContent"]):
                            $row["text_TextFullContent"];
                    }
                }
        }
    }
}

It uses global variables to read the values that are set and made available where the function is called. There is something wrong with the while loop, as it only runs once. What would be a better way to make sure there are no more tags in the passed string?

Also, the $GLOBALS variables don't seem to be retrieving the values that are available where the populateflags function is called from the main page. The functions are defined inside an include.

Edited by Pierre 'Greywacke' du Toit

You may want to consider redefining your tags; the current format doesn't sound very easy to parse. How would you parse this, for example:

The values range from 10% according to %CONFIGTEXT Name% to 50% with a 5% margin of error.
This would be easier to parse:
The values range from 10% according to {%CONFIGTEXT Name%} to 50% with a 5% margin of error.
Since your beginning and ending delimiters are the same, something like nesting becomes completely ambiguous. Imagine if PHP replaced both curly brackets as scope delimiters with a single character like |. It would be nearly impossible to determine what the scope is, if you're beginning another scope or ending a previous one.
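
To make the point above concrete, here is a minimal sketch. The helper name find_braced_flags and the exact pattern are assumptions for illustration, not code from this thread. It shows how the single-character % delimiter pairs up stray percent signs in ordinary prose, while the {% ... %} form picks out only the intended flag.

```php
<?php
// Hypothetical helper: find flags written as {%NAME optional parameter%}.
function find_braced_flags(string $text): array
{
    // "{%" opens a flag and "%}" closes it, so a lone "%" in prose
    // is never mistaken for a delimiter.
    preg_match_all(
        '/\{%([A-Z]+)(?: ([^%\r\n]*))?%\}/',
        $text,
        $matches,
        PREG_SET_ORDER
    );

    $flags = [];
    foreach ($matches as $m) {
        // $m[2] is absent when the flag has no parameter.
        $flags[] = ['name' => $m[1], 'param' => $m[2] ?? ''];
    }
    return $flags;
}

$ambiguous = 'The values range from 10% according to %CONFIGTEXT Name% to 50% with a 5% margin of error.';
$braced    = 'The values range from 10% according to {%CONFIGTEXT Name%} to 50% with a 5% margin of error.';

// With % on both sides, the first "flag" found is the prose between
// "10%" and the real opening delimiter.
preg_match('/%([^%\r\n]+)%/', $ambiguous, $firstAmbiguous);
var_dump($firstAmbiguous[1]);

// With distinct delimiters only the intended flag is reported.
print_r(find_braced_flags($braced));
```

The first dump shows the text " according to " being treated as a flag, which is exactly the ambiguity described above.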

Well, they are parsed now. I just had to define the regular expressions more explicitly, like this:

function populateflags($str, $br) {
    // replace flags used in string (parse entire template contents' as string),
    // populating flags individually and sometimes recursively in callback
    // function.
    while (preg_match("/%([^\r\n%]+[\bA-Za-z|]*)%/",$str)) {
        $str = preg_replace_callback("/%([^\r\n%]+[\bA-Za-z|]*)%/",idflags,$str);
    }
    return ($br)?preg_replace(array('/\r\n/','/\n/','/\r/'),'/<br />/',$str):$str;
}

I also picked up a typo in one of the flags in a template ($ instead of % as the opening character). But the consumer response e-mail appears as follows:

/////////////////////// // // // Dear Pierre Du Toit,////Thank you! Your request for quotations was processed successfully, with your detailed requirements and contact details provided to the suppliers listed below. You can expect contact and quotations within the next 24 hours.////// $CONFIGTEXT Email | Consumer | Message | Intro Premium%// // In order to keep our side of the bargain - assisting you with getting quotations - we forwarded your request to additional suppliers with whom we DONT have a Service Level Agreement in place. Therefore, CanopyXchange CANNOT guarantee a response from them. We ask *pretty please* if you decide to contact them directly using the contact details provided here, that you mention being referred by CanopyXchange (with our Ref.# provided).////// // We trust that your customer service experience with CanopyXchange and our supplier network will be a happy one. You may later receive an email survey inviting your valued feedback as part of our efforts to continuously improve this service. Please reply to this email should you have any queries.////Best wishes!////Consumer Support//QuoteMe - Canopies Team////Copyright © 2010 QuoteMe / CanopyXchange. All rights reserved.// //////
I gather that somewhere the opening and closing regex / characters are not required. Are ' quotes required if you wish to use them, and " otherwise? I could not locate these minor details; it is rather hard to google for these characters.
Since your beginning and ending delimiters are the same, something like nesting becomes completely ambiguous. Imagine if PHP replaced both curly brackets as scope delimiters with a single character like |. It would be nearly impossible to determine what the scope is, if you're beginning another scope or ending a previous one.
I think I didn't make myself quite clear here. A flag is replaced with text that is usually retrieved from the database, if it is not a page variable. The replacement text is allowed to contain further flags, but the flags themselves cannot contain other flags. The flags optionally have one parameter, which defines which element of a specific array, or which configurable text, should be loaded from the database. Sometimes a flag is replaced by another, depending on a page-level variable. I seem to have sorted most problems; there is just this one left that I did not see. :)

Edited by Pierre 'Greywacke' du Toit

I have resolved the / issue: there is no need to write the replacement text as a regular expression if it is not one, right?

One problem, though. How would I be able to replace the \n, \r, or \r\n characters with <br />, but only within the XHTML content? At the moment it is replacing them inside the stylesheet as well (basically wherever it finds these characters). A PCRE of this complexity is beyond my experience, I am afraid. Basically, it needs to replace only those that should be visible, i.e. within the body tags. I need to get to bed now; I have an appointment to make tomorrow morning.

PS: I have updated the PCRE to match the format of the tags more explicitly:

function populateflags($str, $br) {
    // replace flags used in string (parse entire template contents' as string),
    // populating flags individually and sometimes recursively in callback
    // function.
    while (preg_match("/%([A-Z]+[\sA-Za-z|]*)%/",$str)) {
        $str = preg_replace_callback("/%([A-Z]+[\sA-Za-z|]*)%/",idflags,$str);
    }
    return ($br)?preg_replace(array("/\r\n/","/\n/","/\r/"),"<br />",$str):$str;
}

PPS: This is not an attempt at a replacement for HTML. It is merely to define the various data components in the mails sent, as configurable in the database. :)

Edited by Pierre 'Greywacke' du Toit
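
On the question of delimiters and quotes raised above: in preg_replace only the pattern argument is a regular expression and needs delimiters such as /.../, while the replacement is an ordinary string. Either quote style works for the pattern. Inside double quotes PHP converts \r and \n to control characters before PCRE sees them, and inside single quotes the backslash sequences reach PCRE, which interprets them itself. The short sketch below uses made-up sample text to show this, along with str_replace as a regex-free alternative for fixed strings.

```php
<?php
$text = "line one\r\nline two\nline three\rline four";

// Only the pattern is a regex and carries delimiters; the replacement
// is a plain string, so no slashes are wrapped around <br />.
$viaRegex = preg_replace('/\r\n|\r|\n/', '<br />', $text);

// Wrapping the replacement in slashes inserts them literally.
$withSlashes = preg_replace('/\n/', '/<br />/', "a\nb");

// No regex is needed for fixed strings. "\r\n" is listed first so that
// a Windows line ending becomes one <br /> rather than two.
$viaStrReplace = str_replace(["\r\n", "\r", "\n"], '<br />', $text);

echo $viaRegex, PHP_EOL;
echo $withSlashes, PHP_EOL;
var_dump($viaRegex === $viaStrReplace);
```

The second call reproduces the stray slashes seen in the e-mail output earlier in the thread.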

You can use a positive lookbehind assertion to make sure the text you want to match comes after something else, or a negative lookahead assertion to make sure the text you want to match does not come before something else, like a body tag. http://www.php.net/manual/en/regexp.reference.assertions.php
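
As a small illustration of how assertions behave, the sketch below uses made-up sample text. A lookbehind, written (?<=...) or (?<!...), inspects what precedes the current position, and a lookahead, written (?=...) or (?!...), inspects what follows it; neither consumes any characters. Note that PCRE has traditionally required lookbehind assertions to have a fixed length, so a lookbehind cannot reach back over an arbitrary amount of text to an opening body tag.

```php
<?php
$text = "<p>\nDear John,\nthank you.\n</p>";

// Negative lookbehind (?<!>): replace a newline only when it is NOT
// directly preceded by ">".
$notAfterTag = preg_replace('/(?<!>)\n/', '<br />', $text);

// Adding a negative lookahead (?!<): the newline must also NOT be
// directly followed by "<".
$notNextToTag = preg_replace('/(?<!>)\n(?!<)/', '<br />', $text);

echo $notAfterTag, PHP_EOL;
echo $notNextToTag, PHP_EOL;
```

Both assertions test only the characters immediately adjacent to the newline, which is why they stay within the fixed-length restriction.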


Okay, I've read through it and come up with the following:

function populateflags($str, $br) {
    // replace flags used in string (parse entire template contents' as string),
    // populating flags individually and sometimes recursively in callback
    // function.
    while (preg_match("/%([A-Z_0-9]+[\sA-Za-z|]*)%/",$str)) {
        $str = preg_replace_callback("/%([A-Z_0-9]+[\sA-Za-z|]*)%/",idflags,$str);
    }
    return ($br)?
        preg_replace(
            array(
                "/(?=\<body[^>]+\>)\r\n(?=\<\/body\>)/",
                "/(?=\<body[^>]+\>)\r(?=\<\/body\>)/",
                "/(?=\<body[^>]+\>)\n(?=\<\/body\>)/"
            ),
            "<br />",
            $str
        )
        :
        $str;
}

Now I just need to upload and test it, to make sure it works as expected.


But unfortunately, the loop is still not replacing all the flags.

function populateflags($str, $br) {
    // replace flags used in string (parse entire template contents' as string),
    // populating flags individually and sometimes recursively in callback
    // function.
    while (preg_match("/%([A-Z_0-9]+[\sA-Za-z\|\(\)]*)%/",$str)) {
        $str = preg_replace_callback("/%([A-Z_0-9]+[\sA-Za-z\|\(\)]*)%/",idflags,$str);
    }
    return ($br)?
        preg_replace(
            array(
                "/(?=\<body[^>]+\>)[^\>]\r\n(?=\<\/body\>)/",
                "/(?=\<body[^>]+\>)[^\>]\r(?=\<\/body\>)/",
                "/(?=\<body[^>]+\>)[^\>]\n(?=\<\/body\>)/"
            ),
            "<br />",
            $str
        )
        :
        $str;
}

It replaces the %FLAGNAME optional | parameters | passed% flags on the first pass, but why would it cop out and leave the rest of the flags?

Edited by Pierre 'Greywacke' du Toit
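
One way to structure the loop described in this thread is sketched below. It is an illustration under stated assumptions, not the code actually used here: the function expand_flags, the $lookup callback, and the sample texts are all hypothetical stand-ins for the database and page variables. The loop repeats the replacement until a pass changes nothing, leaves unknown flags visible instead of blanking them, and caps the number of passes so that a flag whose text contains itself cannot loop forever.

```php
<?php
// Hypothetical sketch: expand %FLAG optional parameter% repeatedly.
function expand_flags(string $str, callable $lookup, int $maxPasses = 20): string
{
    // Flag name, then optionally one space and a parameter without "%".
    $pattern = '/%([A-Z_0-9]+)(?: ([^%\r\n]*))?%/';

    $callback = function (array $m) use ($lookup): string {
        $value = $lookup($m[1], $m[2] ?? '');
        // Unknown flag: return the original text so it stays visible.
        return $value ?? $m[0];
    };

    for ($pass = 0; $pass < $maxPasses; $pass++) {
        $expanded = preg_replace_callback($pattern, $callback, $str);
        // Stop on a regex error or when a full pass changed nothing.
        if ($expanded === null || $expanded === $str) {
            break;
        }
        $str = $expanded;
    }
    return $str;
}

// Made-up data standing in for page variables and configurable texts.
$values = [
    'GREETING'     => 'Dear %CONSUMERNAME%, %CONFIGTEXT Email | Signature (CX)%',
    'CONSUMERNAME' => 'Pierre',
    'TEAM'         => 'CanopyXchange Team',
];
$configTexts = ['Email | Signature (CX)' => 'Regards, %TEAM%'];

$lookup = function (string $name, string $param) use ($values, $configTexts): ?string {
    if ($name === 'CONFIGTEXT') {
        return $configTexts[$param] ?? null;
    }
    return $values[$name] ?? null;
};

$result = expand_flags('%GREETING% (ref %UNKNOWN%)', $lookup);
echo $result, PHP_EOL;
```

Because the parameter part is matched with [^%\r\n]*, characters such as parentheses and digits inside a parameter do not stop a flag from being recognised, which a narrower character class can silently do.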

Okay, the multiple replacements have been resolved: it does a while (preg_match_all... now. :)

But I can't seem to wrap my head around replacing the new lines with <br /> tags after the body tag but before the end body tag. The populated string that needs to be parsed is at the bottom of the post.

This is my current regular expression to try to match all carriage returns, line feeds, or both after the body tag and before the end body tag, to replace with <br />: Regular Expression Tester

However, this does not seem to work; it matches nothing. The PCRE that I have compiled is based on what I understood from that assertion page you gave me, using both lookahead and lookbehind, but something is wrong here, it gets nothing! I can't seem to see the forest for the trees :) Any help, even if it is a nudge in the right direction, would be immensely appreciated.

<!--ACTIVE QUOTE REQUEST MAILER TEMPLATEVersion 2.2.4--><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><title>[CX] GMW - LWB - lowline_luxury quotation for Pierre Du Toit (Ref.##11004557)</title><style type="text/css"><!--body {	background:				#ffffff url('http://www.intellisource.co.za/images/mail_bg.jpg') repeat-x fixed top left;}* {	margin:				0px;	padding:				0px;}h1, h2, div {	font:					10pt Tahoma, Verdana, Arial, Helvetica, sans-serif;}p {	margin:				12px;}a {	font-weight:				bold;	text-decoration:			none;}div.gw000 { /* container */	width:					90%;	margin:				0 auto;}div.gw001 { /* header */	background:				transparent url('http://www.intellisource.co.za/images/canopyxchange.gif') no-repeat scroll center left;	height:				96px;}div.gw002 { /* tools */	text-align:				right;}div.gw003 { /* data listing */	clear:					both;}A.gw000:hover {	color:					#008000;}A.gw001:hover {	color:					#800000;}div.gw004 { /* footer */	padding:				12px;	font-size:				9pt;	text-align:				right;	border-top:				#b0b0b0 solid 1px;}div.gw005 { /* lead cost / balance */	float:					left;	text-align:				left;}//--></style></head><body><div class="gw000">  <div class="gw001"> </div>  <div class="gw002"><a href="mailto:Jaap Venter<jaap.venter@ananzi.co.za>?subject=Blacklist request for pierre@greywacke.co.za" class="gw000">REPORT LEAD</a></div>  <div class="gw003">	<p>Hello John Cox,Continental Canopies (Cape Town) matched this lead  for GMW - LWB (new_colour_coded) in Western Cape - Cape Town region.This is a Category A lead costing 30.  Continental Canopies (Cape Town) current credit balance is 870.</p></body></html>

Edited by Pierre 'Greywacke' du Toit

It matches without errors according to that PCRE testing page and inserts all the values, but it does not replace the line feed, carriage return, or both characters with the <br /> tag. Why would this be? What is wrong with the regular expression in the return statement? I've written the lookbehind assertion to match inside the enclosing paragraph (<p></p>), no matter what the attributes. This is where the <br /> tags should be used.

function populateflags($str, $br) {
    // replace flags used in string (parse entire template contents' as string),
    // populating flags individually and sometimes recursively in callback
    // function.
    while (preg_match_all("/%([A-Z_0-9]+[^%]*)%/",$str,$arr)) {
        $str = preg_replace_callback("/%([A-Z_0-9]+[^%]*)%/",idflags,
            $str);
    }
    return ($br)?preg_replace("/(?=\<p[^\>]*\>)\r\n|\r|\n(?=\<\/p\>)/","<br />",$str):$str;
}

Edited by Pierre 'Greywacke' du Toit

I really don't think I am getting the assertions right here. Could somebody please help? In the text below, all carriage returns, new lines, or both need to be found and replaced by the <br /> tag. I seem to have exhausted my resources and can't figure out why for the life of me. I'm starting to think I didn't quite get the assertions, but the more I look at the results on that testing site, the less sense it seems to make. :) Please help, someone :/

The PCRE at the moment is: /(?=<p[^>]*\>[^\r\n]*)\r\n|\r|\n(?=[^\r\n]*\<\/p\>)/

The testing page seems to be telling me something, but I'll be damned if I know what :)

The string it is running this PCRE against is below.

	<p>If you would like to discontinue receiving similar leads in future (e.g. you dont manufacture or stock GMW - LWB canopies), please let us know so we can adjust the settings of our supplier matching & lead distribution engine accordingly.Regards,Client Success ManagerCanopyXchange TeamCopyright © 2010 CanopyXchange.  All rights reserved.</p>

Edited by Pierre 'Greywacke' du Toit

You're using 2 lookahead assertions; you need 1 lookahead and 1 lookbehind. The testing page indicates that it is matching the last newline and nothing else. You can use replacement text other than an underscore to see that more clearly.
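
A further detail that may explain the "last newline only" result: the | alternation splits the whole pattern into branches, so in a pattern of the form A\r\n|\r|\nB the first assertion applies only to \r\n and the last only to \n. The sketch below first reproduces that behaviour with a simplified pattern, then shows one possible alternative that sidesteps lookbehind altogether by matching each paragraph and converting the line endings inside a callback. The function name br_inside_paragraphs and the sample markup are assumptions for illustration.

```php
<?php
// Simplified version of the pattern from the thread. Branch 1 can never
// match (a position cannot start with "<p>" and "\r\n" at once), branch 2
// matches any "\r", and branch 3 matches "\n" only right before "</p>".
$sample    = "<p>a\nb\nc\n</p>";
$ungrouped = preg_replace('/(?=<p>)\r\n|\r|\n(?=<\/p>)/', '_', $sample);

// Hypothetical alternative: convert line endings inside <p>...</p> only.
function br_inside_paragraphs(string $html): string
{
    $result = preg_replace_callback(
        // "s" lets "." span newlines; ".*?" is lazy, so each paragraph
        // is matched on its own. "\b" keeps <pre> from matching.
        '/(<p\b[^>]*>)(.*?)(<\/p>)/s',
        function (array $m): string {
            // Grouped alternation: all three line endings treated alike.
            $inner = preg_replace('/(?:\r\n|\r|\n)/', '<br />', $m[2]);
            return $m[1] . $inner . $m[3];
        },
        $html
    );
    // Fall back to the input if the regex engine reports an error.
    return $result ?? $html;
}

$html = "<style>\nbody { margin: 0; }\n</style>\n"
      . "<p>Regards,\nClient Success Manager\r\nCanopyXchange Team</p>\n";

echo $ungrouped, PHP_EOL;
echo br_inside_paragraphs($html);
```

The stylesheet keeps its original line breaks because only text captured between the paragraph tags is passed to the inner replacement.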


Well, I have decided to go with the TinyMCE JavaScript WYSIWYG editor, so the configurable texts will contain HTML from the word go. This issue has now been resolved.

Edited by Pierre 'Greywacke' du Toit
