Fmdpa Posted June 29, 2011

I'm working on a project in which I want to iterate through all text nodes in the page (excluding empty text nodes) and then wrap any words that match an element of array foo in tags (like when you highlight matched keywords on a search results page). First, I'm having trouble filtering out nodes that don't match my criteria.

    allNodes = document.getElementsByTagName('*')
    skippedTags = /(nofilter|style|meta)/im
    allNodes = allNodes.filter(function(node) {
        // only retain nodes that are text nodes and are not empty nodes
        return (node.nodeType == 3 && node.nodeValue.match(/^\w*?$/gm) && !node.nodeName.match(skippedTags))
    });

This is giving me the error, "allNodes.filter is not a function".

thescientist Posted June 29, 2011

I take it allNodes is showing up as an array after the getElementsByTagName call? The other thing is that, although it seems to be supported in Opera, filter is not necessarily available everywhere: it was only added to the standard in ECMAScript 5, so older implementations may lack it. The MDC has an article on how to get around that.

https://developer.mozilla.org/en/JavaScript...ts/Array/filter

Fmdpa (Author) Posted June 29, 2011

I tested the filter() function in Opera's JS console with an array of numbers and it worked fine. But when I tried to filter the array of HTML nodes, it gave me the error. Maybe the nodes aren't actually held in an Array object but rather in a NodeList object. ...Actually, I just tested some more and found that NodeList is a function, not an object. If I could determine what object allNodes is, then I could extend it with the function from the MDC documentation page for filter().

Edit: This is continued work on my Opera extension, so I don't have to worry about cross-browser compatibility.

thescientist Posted June 29, 2011

Ah yes, good call catching NodeList vs. an array.

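As a side note on the NodeList vs. Array point, here is a minimal sketch of one way to use filter() on an array-like collection without touching its prototype: borrow the Array methods with call(). The names toArray and filterList are made up for illustration, and the plain object below is only a stand-in for a real NodeList so the snippet can run outside a browser.

```javascript
// Copy any array-like object (numeric keys plus a length) into a real Array.
function toArray(list) {
  return Array.prototype.slice.call(list);
}

// Borrow Array.prototype.filter for an array-like object; the result is a real Array.
function filterList(list, predicate) {
  return Array.prototype.filter.call(list, predicate);
}

// Stand-in for a NodeList (assumption: only nodeType and nodeName matter here).
var fakeList = {
  0: { nodeType: 1, nodeName: 'P' },
  1: { nodeType: 3, nodeName: '#text' },
  2: { nodeType: 1, nodeName: 'STYLE' },
  length: 3
};

// Keep only the element nodes (nodeType 1).
var elementsOnly = filterList(fakeList, function (node) {
  return node.nodeType === 1;
});
```

In a page, the equivalent call would presumably be filterList(document.getElementsByTagName('*'), someTest). This does not change what the collection contains, so it still yields only element nodes.
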
Fmdpa (Author) Posted June 29, 2011

Alright, I was able to successfully extend the NodeList object with this sample test function:

    NodeList.prototype.getLength = function() {
        return this.length
    }

This worked. But then I tried to add the filter() function to the NodeList object...

    NodeList.prototype.filter = function(fun /*, thisp*/) {
        var len = this.length >>> 0;
        if (typeof fun != "function") {
            throw new TypeError();
        }
        var res = [];
        var thisp = arguments[1];
        for (var i = 0; i < len; i++) {
            if (i in this) {
                var val = this[i]; // in case fun mutates this
                if (fun.call(thisp, val, i, this)) {
                    res.push(val);
                }
            }
        }
        return res;
    }

When I called this:

    allNodes = allNodes.filter(function(node) {
        return node.nodeType == 3 // filter out all nodes that are not text nodes
    });

It emptied the array. Apparently all of the nodes in allNodes were element nodes (nodeType == 1). How can I fetch all text nodes on the page?

Ingolme Posted June 30, 2011

It's because text nodes aren't elements, and document.getElementsByTagName('*') only returns elements. An element is one type of node, with node type 1. If you want to traverse all the nodes of the document, you're going to have to use a recursive function.

Fmdpa (Author) Posted July 1, 2011

I should have thought of that. Here's the function I came up with that should fetch all text nodes on a page that are not empty/whitespace and are not inside any of the given elements. I can't tell whether it is doing what it should. Any comments or suggestions?

    allElems = document.getElementsByTagName('*')
    skippedTags = /(nofilter|style|meta)/im
    for (i = 0; i < allElems.length; i++) {
        if (!allElems[i].nodeName.match(skippedTags) && allElems[i].hasChildNodes()) {
            // iterate through the childNodes of the current element node in the loop IF it isn't a skipped tag
            for (j = 0; j < allElems[i].childNodes.length; j++) {
                if (allElems[i].childNodes[j].nodeType == 3 && !allElems[i].childNodes[j].nodeValue.match(/^\w*?$/m)) {
                    // accept *text* nodes that are not empty or plain whitespace
                    allElems[i].childNodes[j].nodeValue = allElems[i].childNodes[j].nodeValue.replace(window.dirtyList, replacement)
                }
            }
        }
    }

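One thing that can be checked without a page is the whitespace test itself. The sketch below compares the pattern from the post with a \s-based pattern, on the assumption that "empty or plain whitespace" is what was intended; isBlank, blankPattern and the sample strings are made up for illustration. Since \w matches word characters rather than whitespace, the posted check appears to do roughly the opposite of its comment: a single-word value like 'hello' matches (and is therefore skipped by the !match test), while a value of three spaces does not match (and is therefore processed).

```javascript
// The pattern from the post: \w matches letters, digits and underscore, not whitespace.
var wordPattern = /^\w*?$/m;

// Assumed intent: a value that is empty or consists only of whitespace.
var blankPattern = /^\s*$/;

function isBlank(value) {
  return blankPattern.test(value);
}

// Illustrative inputs: empty, whitespace only, one word, two words.
var samples = ['', '   ', 'hello', 'hello world'];

// Record how each pattern classifies each sample.
var report = samples.map(function (value) {
  return {
    value: value,
    matchesWordPattern: wordPattern.test(value),
    isBlank: isBlank(value)
  };
});
```

With a helper like this, the inner condition could be written as nodeType == 3 && !isBlank(nodeValue), if skipping whitespace-only nodes is indeed the goal.
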
Ingolme Posted July 1, 2011

Well, that is one way to do it. Can't you perform any tests to see if it's working? Check the new values or something.

I, personally, would make a recursive function that goes through every node inside the body element.

    function returnAllTextNodes(node) {
        var i, list = [];
        for(i = 0; i < node.childNodes.length; i++) {
            if(node.childNodes[i].nodeType == 3) {
                list.push(node.childNodes[i]);
            } else if(node.childNodes[i].nodeType == 1) {
                list = list.concat( returnAllTextNodes(node.childNodes[i]) );
            }
        }
        return list;
    }
    textNodes = returnAllTextNodes(document.body);

You can do whatever you want with the array of nodes returned from the function. You'll have to loop through it and check the nodeValue of each node to test that it's not just whitespace.

Also, this does not return a NodeList, it returns an Array, so you can't use NodeList methods or properties on the returned result.

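The follow-up loop described above could look something like the sketch below. It is only an illustration under a few assumptions: the tag names are taken from the earlier posts, the pattern is anchored so that only exact tag names match (the original was unanchored), and only the direct parent of each text node is checked, not every ancestor. The helper names hasVisibleText, insideSkippedTag and keepUsefulTextNodes are invented here.

```javascript
// Tag names taken from the earlier posts; anchored and case-insensitive (assumption).
var skippedTags = /^(nofilter|style|meta)$/i;

// True when the text node holds something other than whitespace.
function hasVisibleText(node) {
  return !/^\s*$/.test(node.nodeValue);
}

// True when the text node's direct parent is one of the skipped tags.
function insideSkippedTag(node) {
  return !!node.parentNode && skippedTags.test(node.parentNode.nodeName);
}

// Loop over an array of text nodes and keep the ones worth processing.
function keepUsefulTextNodes(textNodes) {
  var kept = [];
  for (var i = 0; i < textNodes.length; i++) {
    if (hasVisibleText(textNodes[i]) && !insideSkippedTag(textNodes[i])) {
      kept.push(textNodes[i]);
    }
  }
  return kept;
}
```

In a page this would presumably be called as keepUsefulTextNodes(returnAllTextNodes(document.body)). Because it only reads nodeValue, parentNode and nodeName, it can also be tried on plain objects outside a browser.
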