Need some help with Node Iteration, Filtering, etc.

Fmdpa · June 29, 2011

I'm working on a project in which I want to iterate through all text nodes in the page (excluding empty text nodes), and then wrap any words that match an element of an array foo in tags (like when matched keywords are highlighted on a search results page). First, I'm having trouble filtering out nodes that don't match my criteria.

allNodes = document.getElementsByTagName('*')
skippedTags = /(nofilter|style|meta)/im
allNodes = allNodes.filter(function(node) {
	// only retain nodes that are text nodes and are not empty nodes
	return (node.nodeType == 3 && node.nodeValue.match(/^\w*?$/gm) && !node.nodeName.match(skippedTags))
});

This is giving me the error, "allNodes.filter is not a function".

thescientist · June 29, 2011

I take it allNodes is showing up as an array after the getElementsByTagName call? The other thing is that, although it seems to be supported in Opera, filter is a fairly recent addition to the standard (ECMAScript 5), so it isn't necessarily available in every browser. The MDC has an article on how to get around that. https://developer.mozilla.org/en/JavaScript...ts/Array/filter

Fmdpa · June 29, 2011

I tested the filter() function in Opera's JS console with an array of numbers and it worked fine. But when I tried to filter the array of HTML nodes, it gave me the error. Maybe the nodes aren't actually in an Array object but rather in a NodeList object. ...Actually, I just tested some more and found that NodeList is a function, not an object. If I could determine what object allNodes is, then I could extend it with the function on the MDC documentation page for filter().

Edit: This is continued work on my Opera extension, so I don't have to worry about cross-browser compatibility.

thescientist · June 29, 2011

ah yes, good call catching nodeList vs. an array.
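
For anyone following along: getElementsByTagName returns an array-like collection rather than a true Array, so it has a length and numeric indexes but no Array methods such as filter. Here is a minimal sketch of one common workaround that borrows the Array methods with call instead of extending anything. The helper names toArray and filterList are made up for illustration, and a plain object stands in for the collection so the snippet also runs outside a browser.

```javascript
// Hypothetical helpers (not from the thread): borrow Array methods for array-like objects.
function toArray(list) {
  // slice copies the indexed items of any array-like object into a real Array
  return Array.prototype.slice.call(list);
}

function filterList(list, predicate) {
  // filter only needs `length` and numeric indexes on its `this` value
  return Array.prototype.filter.call(list, predicate);
}

// Stand-in for a NodeList: array-like, but with no filter method of its own
var fakeList = {
  0: { nodeType: 1, nodeName: 'P' },
  1: { nodeType: 3, nodeName: '#text' },
  2: { nodeType: 1, nodeName: 'STYLE' },
  length: 3
};

// Keep only the element-like entries (nodeType 1)
var elementsOnly = filterList(fakeList, function (node) {
  return node.nodeType === 1;
});
```

In a page, the same call would presumably look like filterList(document.getElementsByTagName('*'), someTest), and the result is an ordinary Array.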

Fmdpa · June 29, 2011

Alright, I was able to successfully extend the NodeList object with this sample test function:

NodeList.prototype.getLength = function() {
	return this.length
}

This worked. But then I tried to add the filter() function to the NodeList object...

NodeList.prototype.filter = function(fun /*, thisp*/) {
	var len = this.length >>> 0;
	if (typeof fun != "function") {
		throw new TypeError();
	}
	var res = [];
	var thisp = arguments[1];
	for (var i = 0; i < len; i++) {
		if (i in this) {
			var val = this[i]; // in case fun mutates this
			if (fun.call(thisp, val, i, this)) {
				res.push(val);
			}
		}
	}
	return res;
}

When I called this:

allNodes = allNodes.filter(function(node) {
	return node.nodeType == 3 // filter out all nodes that are not text nodes
});

It emptied the array. Apparently all of the nodes in allNodes were element nodes (nodeType == 1). How can I fetch all text nodes on the page?

Ingolme · June 30, 2011

It's because text nodes aren't elements, and document.getElementsByTagName('*') only returns elements. An element is one type of node (nodeType 1). If you want to traverse all the nodes of the document, you're going to have to use a recursive function.

Fmdpa · July 1, 2011

I should have thought of that. Here's the function I came up with. It should fetch all text nodes on a page that are not empty/whitespace and are not inside any of the given elements. I can't tell whether it is doing what it should. Any comments or suggestions?

var allElems = document.getElementsByTagName('*')
var skippedTags = /(nofilter|style|meta)/im
var i, j, child
for (i = 0; i < allElems.length; i++) {
	// iterate through the childNodes of the current element node in the loop IF it isn't a skipped tag
	if (!allElems[i].nodeName.match(skippedTags) && allElems[i].hasChildNodes()) {
		for (j = 0; j < allElems[i].childNodes.length; j++) {
			child = allElems[i].childNodes[j]
			// accept *text* nodes that are not empty or plain whitespace (\s matches whitespace characters)
			if (child.nodeType == 3 && !child.nodeValue.match(/^\s*$/)) {
				child.nodeValue = child.nodeValue.replace(window.dirtyList, replacement)
			}
		}
	}
}
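
The replace call above depends on window.dirtyList and replacement, neither of which is shown in the post. Here is a sketch of one way they might look, assuming dirtyList is a regular expression built from an array of words. The names escapeRegExp and buildWordPattern, the sample words, and the replacement string are all invented for illustration.

```javascript
// Escape characters that have a special meaning inside a regular expression
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one global, case-insensitive pattern that matches any listed word as a whole word
function buildWordPattern(words) {
  var escaped = words.map(escapeRegExp);
  return new RegExp('\\b(' + escaped.join('|') + ')\\b', 'gi');
}

var dirtyList = buildWordPattern(['foo', 'bar']); // sample words (assumed)
var replacement = '***';                          // assumed replacement string

// Plain string standing in for a text node's nodeValue
var cleaned = 'Foo went to the bar, not the barn.'.replace(dirtyList, replacement);
// cleaned === '*** went to the ***, not the barn.'
```

One thing to keep in mind: nodeValue is plain text, so assigning a string that contains markup would show the tags literally. Wrapping the matches in an element would need the text node to be split and new element nodes inserted instead.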

Ingolme · July 1, 2011

Well, that is one way to do it. Can't you perform any tests to see if it's working? Check the new values or something.

I, personally, would make a recursive function that goes through every node inside the body element.

function returnAllTextNodes(node) {
  var i, list = [];
  for(i = 0; i < node.childNodes.length; i++) {
    if(node.childNodes[i].nodeType == 3) {
      list.push(node.childNodes[i]);
    } else if(node.childNodes[i].nodeType == 1) {
      list = list.concat( returnAllTextNodes(node.childNodes[i]) );
    }
  }
  return list;
}
textNodes = returnAllTextNodes(document.body);

You can do whatever you want with the array of nodes returned from the function. You'll have to loop through it and check the nodeValue of each element to test that it's not just whitespace.

Also, this does not return a NodeList, it returns an Array, so you can't use NodeList methods or properties on the returned result.
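
The follow-up loop described above might look something like this. It's only a sketch and hasn't been run against a real page: isBlank, hasSkippedAncestor, and keepTextNodes are made-up names, the skipped tag list is borrowed from the earlier posts (anchored here so it only matches whole tag names, which is an assumption about the intent), and plain objects stand in for DOM nodes so it runs without a browser.

```javascript
// Tag names whose text should be left alone (list taken from the earlier posts)
var skippedTags = /^(nofilter|style|meta)$/i;

// True when the text is empty or whitespace only.
// \s matches whitespace characters; \w would match word characters instead.
function isBlank(text) {
  return /^\s*$/.test(text);
}

// Walk up the parentNode links looking for a skipped tag name
function hasSkippedAncestor(node) {
  for (var p = node.parentNode; p; p = p.parentNode) {
    if (skippedTags.test(p.nodeName)) {
      return true;
    }
  }
  return false;
}

// Keep only non-blank text nodes that are not inside a skipped tag
function keepTextNodes(nodes) {
  var kept = [];
  for (var i = 0; i < nodes.length; i++) {
    if (!isBlank(nodes[i].nodeValue) && !hasSkippedAncestor(nodes[i])) {
      kept.push(nodes[i]);
    }
  }
  return kept;
}

// Stand-in nodes with just the properties the helpers read
var body = { nodeName: 'BODY', parentNode: null };
var style = { nodeName: 'STYLE', parentNode: body };
var sample = [
  { nodeType: 3, nodeValue: 'Hello world', parentNode: body },
  { nodeType: 3, nodeValue: '  \n\t', parentNode: body },
  { nodeType: 3, nodeValue: 'p { color: red }', parentNode: style }
];
var kept = keepTextNodes(sample);
```

In a page this would presumably be called as keepTextNodes(returnAllTextNodes(document.body)).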
