
RegExp Help


Krupa


Let's say I have something like this as text.

[x=param1,param2,param3]text[/x]

How could I grab those 'param' values (without the commas) and place them in an array so that I can use them later for comparison? This is the pattern I have so far:

/\[x\=(.+?)\](.+?)\[\/x\]/gi

Thank you!


Does the pattern work? Do you have any code other than the pattern? You would first run the match to get the attributes, then break the attributes up on commas to put them in an array. You probably need to change your pattern, instead of using the dot to match any character, in one place you want to match any character besides "]", and in the other you want to match any character besides "[".


I have this pattern working (along with the rest of the code so far).

if (/blahblah/.test(location.href)) {
    $('selector').each(function() {
        if (this.innerHTML.match(/\[x\=(.+?)\](.+?)\[\/x\]/gi)) {
            this.innerHTML = this.innerHTML.replace(/\[x\](.+?)\[\/x\]/gi, "REPLACED");
        }
    });
}

The reason I used the dot was because the user has the option of inputting any character they want and I need the pattern to grab all of that input. Is there a better way to do this?


You can use a character class like this:
[^a]
to match any character that is not "a". Most of the time when you parse HTML tags, for example, you would match a tag like this:
/(<[^>]+>)/
That looks for "<", then one or more characters that are not ">", then ">". The ^ character at the start of a character class says that you want everything but what you list. It's like a "not" operator for character classes. So it sounds like you want to find "[", then any character that is not "]", then "]".
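Applied to the [x=...]text[/x] tag from the first post, the negated character class idea might look like the sketch below. It assumes that neither the parameter list nor the enclosed text contains "[" or "]"; the variable names are only illustrative.

```javascript
// Sketch: match [x=params]text[/x] with negated character classes.
// Assumption: no "[" or "]" inside the parameter list or the text.
var tagPattern = /\[x=([^\]]+)\]([^\[]+)\[\/x\]/i;

var sample = "[x=param1,param2,param3]text[/x]";
var tagMatch = tagPattern.exec(sample);

// tagMatch[1] is the raw parameter list, tagMatch[2] is the enclosed text.
var tagParams = tagMatch ? tagMatch[1].split(",") : [];

console.log(tagParams);   // the three parameters, without commas
console.log(tagMatch[2]); // the enclosed text
```

The pattern has no g flag here, so exec() always starts from the beginning of the string.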


Since the user is able to input any character, I don't want any characters to be omitted from the search. For example, I would still want something like this to be detected:

[x=[hey,[there],how,are],[]you[]]TEXT[/x]


Something like that would be incredibly difficult to parse. You need to have a way to clearly distinguish the end of your [x=...] block from the parameters. You should probably rethink your pattern or consider making certain characters illegal to use (the '[' and ']' for example).

Just re-read the previous couple of posts and yes, the pattern does need to be changed. Once that is done, however, how could I grab those parameters and assign them to an array (assuming that they are separated by commas)?

I did read the previous posts. By pattern I meant the [x=...]TEXT[/x] format, not the regex you're using. The regex to parse the input you posted would be incredibly complex, if not impossible. Once you get it figured out, though, all you'd have to do is use the split function to get those parameters into an array.
var paramStr = "param1,param2,param3";
var params = paramStr.split(",");

If your tags will not be nested then you can look into using an ungreedy search to find [x...[/x]. If you can nest tags within other tags, like this:
[x] ... [x] ... [/x] [/x]
then you need to parse it manually, which would probably involve several hundred lines of code instead of the regexp, so that you can keep track of how deeply each tag is nested.
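To give a rough idea of what keeping track of nesting depth means, here is a minimal sketch. It only handles bare [x] and [/x] tags (no parameters) and only pulls out the outermost blocks; findTopLevelBlocks is a made-up helper name, and a full parser for real input would need considerably more than this.

```javascript
// Sketch: find top-level [x]...[/x] blocks by tracking nesting depth.
// Assumption: tags are written exactly as "[x]" and "[/x]" (no attributes).
function findTopLevelBlocks(input) {
    var tokenPattern = /\[x\]|\[\/x\]/g; // fresh regex per call, so lastIndex starts at 0
    var blocks = [];
    var depth = 0;
    var start = -1;
    var token;

    while ((token = tokenPattern.exec(input)) !== null) {
        if (token[0] === "[x]") {
            if (depth === 0) {
                start = token.index; // remember where the outermost tag opens
            }
            depth++;
        } else if (depth > 0) {
            depth--;
            if (depth === 0) {
                // the outermost tag just closed: capture the whole block
                blocks.push(input.substring(start, token.index + token[0].length));
            }
        }
        // a stray "[/x]" at depth 0 is simply ignored
    }
    return blocks;
}

var nestedSample = "a [x] one [x] two [/x] [/x] b [x] three [/x]";
var topLevel = findTopLevelBlocks(nestedSample);
console.log(topLevel); // two blocks: the nested pair and the single tag
```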


If your tags will not be nested then you can look into using an ungreedy search to find [x...[/x].
That would still leave you with something like this to parse (using the example provided by the OP):
[hey,[there],how,are],[]you[]]TEXT
where the trailing TEXT would not be part of the parameter list. I suppose you could use the lastIndexOf method to search for the last occurrence of ']' and use substr to extract everything up until that location. But even that may not work if the 'TEXT' portion can also contain ']' characters.
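The lastIndexOf idea could be sketched roughly as follows. splitAtLastBracket is a hypothetical helper, and the sketch assumes the text portion contains no ']' character, which is exactly the limitation mentioned above.

```javascript
// Hypothetical helper: split the inside of [x=...]...[/x] at the last "]".
// Assumption: the text portion itself contains no "]" character.
function splitAtLastBracket(inner) {
    var end = inner.lastIndexOf("]");
    if (end === -1) {
        return null; // no closing bracket for the parameter block
    }
    return {
        paramStr: inner.substring(0, end), // everything before the last "]"
        text: inner.substring(end + 1)     // everything after it
    };
}

var parts = splitAtLastBracket("[hey,[there],how,are],[]you[]]TEXT");
console.log(parts.paramStr); // [hey,[there],how,are],[]you[]
console.log(parts.text);     // TEXT
```

Splitting paramStr on commas afterwards would still leave brackets inside the individual values, so this only solves the question of where the parameter list ends.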

I've taken into account the difficulty of matching certain characters, so I've made a few changes so that the input will only accept a-z, A-Z and 0-9. However, I still can't seem to remove the commas from the user-entered parameters. This is the last attempt I made (the replace portion of the code is currently testing the value of 'z', which still returns commas).

(function() {
    if (/blahblah/.test(location.href)) {
        $('selector').each(function() {
            if (this.innerHTML.match(/\[x\=(.+?)\](.+?)\[\/x\]/gi)) {
                var y = RegExp.$1, z = y.split(',');
                this.innerHTML = this.innerHTML.replace(/\[x\=(.+?)\](.+?)\[\/x\]/gi, z);
            }
        });
    }
})();

An example of the input would now be

[x=John,Dave,Joe]Hello[/x]
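With the parameters restricted to a-z, A-Z and 0-9, the pattern itself could enforce that restriction instead of using the dot. The following is only a sketch: strictPattern is an illustrative name, and it additionally assumes the text between the tags contains no "[" character.

```javascript
// Sketch: parameters limited to letters and digits, separated by commas.
// Assumption: the text portion contains no "[" character.
var strictPattern = /\[x=([a-z0-9]+(?:,[a-z0-9]+)*)\]([^\[]+)\[\/x\]/i;

var strictMatch = strictPattern.exec("[x=John,Dave,Joe]Hello[/x]");
var strictNames = strictMatch ? strictMatch[1].split(",") : [];
console.log(strictNames); // John, Dave and Joe as separate array elements

// An empty parameter (two commas in a row) is rejected by this pattern.
var rejectsEmpty = !strictPattern.test("[x=John,,Joe]Hello[/x]");
console.log(rejectsEmpty); // true
```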


Ok, I'm new to jQuery so your code is really confusing me. I'll show you a simplified example which you should be able to apply to your code:

var input = "[x=John,Dave,Joe]Hello[/x]";
var regex = /\[x\=(.+?)\](.+?)\[\/x\]/gi;
var matches = regex.match(input); //The match function is a method of the RegExp object, not String
/*
This will return an array with matches for the whole string and each subpattern.
So in this example you'd have an array like this:
["[x=John,Dave,Joe]Hello[/x]", "John,Dave,Joe", "Hello"]
where your parameter list is at index 1.
*/
var paramStr = matches[1]; //Get the parameter list
var params = paramStr.split(","); //Split the parameter list on the comma


Firebug is telling me regex.match is not a function.

var input = "[x=John,Dave,Joe]Hello[/x]";
var regex = /\[x\=(.+?)\](.+?)\[\/x\]/gi;
var matches = regex.match(input);
var paramStr = matches[1];
var params = paramStr.split(",");
alert(params);

Anyhow, the jQuery portion of the script is not what's failing; the following snippet is the code I'm having trouble with.

if (this.innerHTML.match(/\[x\=(.+?)\](.+?)\[\/x\]/gi)) {
    var y = RegExp.$1, z = y.split(',');
    this.innerHTML = this.innerHTML.replace(/\[x\=(.+?)\](.+?)\[\/x\]/gi, z);
}


Firebug is telling me regex.match is not a function.
Doh! Yeah, my bad there, sorry! :) match() is indeed a method of the String object. The function I was thinking of was exec(). You'll still need to use the RegExp method (exec) though, because with the g flag on your pattern, match() will only return the whole matches and not the subpatterns. So your code should be like this:
var input = "[x=John,Dave,Joe]Hello[/x]";
var regex = /\[x\=(.+?)\](.+?)\[\/x\]/gi;
var matches = regex.exec(input);
var paramStr = matches[1];
var params = paramStr.split(",");
alert(params);

Anyhow, the jQuery portion of the script is not what's failing...
I know. I was just getting confused with the jQuery mixed in there. :)
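For anyone comparing the two methods, the difference between match() and exec() on this pattern comes down to the g flag. A small sketch, using the same sample input as above (the variable names are just for illustration):

```javascript
var sampleInput = "[x=John,Dave,Joe]Hello[/x]";

// With the g flag, String.prototype.match returns only the full matches.
var globalMatches = sampleInput.match(/\[x=(.+?)\](.+?)\[\/x\]/gi);

// Without the g flag, match returns the full match plus each capture group.
var singleMatch = sampleInput.match(/\[x=(.+?)\](.+?)\[\/x\]/i);

// exec returns the capture groups whether or not the g flag is set.
var execMatch = /\[x=(.+?)\](.+?)\[\/x\]/gi.exec(sampleInput);

console.log(globalMatches.length); // 1 (just the whole tag)
console.log(singleMatch[1]);       // John,Dave,Joe
console.log(execMatch[1]);         // John,Dave,Joe
console.log(execMatch[2]);         // Hello
```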

What you posted works in firebug, however, what I applied to my code seemed to work at first, but does not now. Perhaps you can pick out an error that I just don't see.

var reg_w = /\[x\=(.+?)\](.+?)\[\/x\]/gi;
if (this.innerHTML.match(reg_w)) {
    var a = this.innerHTML.match(reg_w), b = reg_w.exec(a)[1].split(","), c = b.length, x = $('selector').text();
    while (c--) {
        b[c].toLowerCase() === x.toLowerCase() ? this.innerHTML = this.innerHTML.replace(reg_w, function() { return "<span class='testing'>" +RegExp.$2+ "</span>"; }) : this.innerHTML = this.innerHTML.replace(reg_w, '[HIDDEN MESSAGE]');
    }
}


What is it doing or not doing? Do you get any errors in Firebug when you run it? The only thing that looks odd to me is this:
RegExp.$2
but maybe that's some kind of jQuery magic, so I don't know if that's causing the issue.


The RegExp.$2 is just referring to the second stored value of the RegExp match. Firebug does not throw any errors. I also set up alerts to test if anything was coming back null or undefined, but everything is coming through fine. It appears that unless the value I'm matching from var x = $('selector').text(); is the first or only value to be looped through in the while loop, the first condition doesn't occur (the function returning RegExp.$2). For example:

x = 'Matt'; // let's say x holds the string Matt
b[c] = 'John', 'Matt', 'Joe'; // let's say that these are the values that are looped through in that order
b[c].toLowerCase() === x.toLowerCase() ? this.innerHTML = this.innerHTML.replace(reg_whisper, function() { return "<span class='whisper'>" +RegExp.$2+ "</span>"; })
/* This part of the if statement won't happen even though b[c] does hold a value equal to x. It will however work if 'Matt' is the first value to be looped through. */


Actually, now that I look at it again, I think the issue could be here:
var a = this.innerHTML.match(reg_w), b = reg_w.exec(a)[1].split(",")
The string.match() function returns an array of all the matches, so a is an array. Try something like this:
var a = this.innerHTML.match(reg_w), b = reg_w.exec(a[0])[1].split(",")

It still seems to be doing the same thing. :) I'm really stumped as to what the problem could be.

Does Firebug show any errors when you run your code? Try placing alerts in various places in your code or step through your code in Firebug to check the values of your variables. Are they what you expect? Here's a quick test I did; each of the names was alerted perfectly fine:
var reg_w = /\[whisper\=(.+?)\](.+?)\[\/whisper\]/gi;
var testStr = "[whisper=Fred,Joe,John]This is a test[/whisper]";

if (testStr.match(reg_w)) {
    var a = testStr.match(reg_w);
    var b = reg_w.exec(a[0])[1].split(",");
    var c = b.length;

    while (c--) {
        alert(b[c]);
    }
}


Everything seems to run perfectly. It's just that for some reason unless the value of x is equal to the first value of b[c] the code doesn't work as it should. Even if another index of b[c] is equal to x, it won't work unless it's the first value to be matched. It's supposed to work if any of the parameters match x. Here's the code so you don't need to go back a page :)

var reg_w = /\[x\=(.+?)\](.+?)\[\/x\]/gi;
if (this.innerHTML.match(reg_w)) {
    var a = this.innerHTML.match(reg_w), b = reg_w.exec(a)[1].split(","), c = b.length, x = $('selector').text();
    while (c--) {
        b[c].toLowerCase() === x.toLowerCase() ? this.innerHTML = this.innerHTML.replace(reg_w, function() { return "<span class='testing'>" +RegExp.$2+ "</span>"; }) : this.innerHTML = this.innerHTML.replace(reg_w, '[HIDDEN MESSAGE]');
    }
}


Oooh, I see the problem! You're changing your input string throughout the code.

Your input string is the innerHTML of some element (this.innerHTML), which equals:
"[whisper=Fred,Joe,John]This is a test[/whisper]"

Your regex correctly splits out the parameter list, so you end up with the following array:
b = ["Fred", "Joe", "John"]

Now we'll assume that x = "Joe" and loop through your array.

The first time through, b[c] will equal "John". Test this against x:
b[c].toLowerCase() === x.toLowerCase() ? ...
Not equal, so execute the false block:
this.innerHTML = this.innerHTML.replace(reg_w, '[HIDDEN MESSAGE]');
which means that this.innerHTML is now equal to "[HIDDEN MESSAGE]" instead of the original string "[whisper=Fred,Joe,John]This is a test[/whisper]".

The second time through the loop, b[c] equals "Joe", which is equal to x, so we execute the true block of your conditional:
this.innerHTML = this.innerHTML.replace(reg_w, function() { return "<span class='testing'>" +RegExp.$2+ "</span>"; })
The only problem now is that the replace function is still looking for "[whisper=Fred,Joe,John]This is a test[/whisper]" in this.innerHTML, but the string doesn't contain that because it was changed to "[HIDDEN MESSAGE]" the first time through the loop.

You need to have a second variable to store the output from the replace function, e.g.
tempOutput = this.innerHTML.replace(reg_w, '[HIDDEN MESSAGE]');


Oh! That makes sense :) I understand the problem but I'm not exactly clear on how to implement your solution. How would my if statement look with that new variable being used?
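One way the suggestion above might be applied is sketched below. Note that it goes a step further than just adding a temporary variable: instead of replacing inside the loop, it decides once per tag whether the viewer is in the parameter list and then does a single replace, using the arguments passed to the replace callback rather than RegExp.$2. renderWhispers and viewerName are made-up names (viewerName stands in for $('selector').text()), and the function works on a plain string so it can be tried outside the page.

```javascript
// Sketch: one replace per tag, with the decision made inside the callback.
// renderWhispers and viewerName are hypothetical names for illustration.
function renderWhispers(html, viewerName) {
    var pattern = /\[x=(.+?)\](.+?)\[\/x\]/gi;
    var viewer = viewerName.toLowerCase();

    // The callback receives the full match followed by each capture group.
    return html.replace(pattern, function (whole, paramStr, text) {
        var names = paramStr.split(",");
        for (var i = 0; i < names.length; i++) {
            if (names[i].toLowerCase() === viewer) {
                return "<span class='testing'>" + text + "</span>";
            }
        }
        return "[HIDDEN MESSAGE]"; // viewer is not in the parameter list
    });
}

var originalHtml = "[x=John,Matt,Joe]Hello[/x]";
var tempOutput = renderWhispers(originalHtml, "Matt");
var hiddenOutput = renderWhispers(originalHtml, "Dave");

console.log(tempOutput);   // <span class='testing'>Hello</span>
console.log(hiddenOutput); // [HIDDEN MESSAGE]
```

Inside the each() callback this would presumably be used as this.innerHTML = renderWhispers(this.innerHTML, x); so the original string is read once and written once.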

