
Simple HTML DOM Parser: how to get the nodeValue?



<?php
// Simple HTML DOM Parser
include('simple_html_dom.php');
$html = file_get_html('http://lifelearning.xtreemhost.com/');
foreach ($html->getElementsByTagName("ul",0)->getElementsByTagName("li") as $li) {
    echo $li->plaintext; // works, but returns the text of the whole node (from "Point 1" to "Reference")
    echo $li->childNodes(0)->nodeValue; // returns nothing
    echo "<br />";
}
?>

<ul>
<li>Point 1
    <ul>
        <li>Point 1.1</li>
        <li>Point 1.2</li>
        <pre class="CodeDisplay">
        some codes
        </pre>
        <li>Reference: <a href="link.html" target="_blank">link</a></li>
    </ul>
</li>
</ul>

Using PHP, I just want to get the plain text of the first li ("Point 1") by itself. How can I do that?


It's true that nodeValue and textContent concatenate the values of all text nodes in a node. But text nodes do exist in the object and can be accessed the old-fashioned way. You simply forgot the correct syntax. Try this:

echo $li->childNodes->item(0)->textContent;

Of course, if you don't know the exact location of the text node, you can loop through the childNodes and test the nodeType or nodeName.
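
A minimal sketch of that loop, in case it helps. It assumes PHP's built-in DOM extension (DOMDocument), not Simple HTML DOM. The sample markup and the helper name firstOwnText are made up for illustration.

```php
<?php
// Hypothetical sample markup, modeled on the list in the question.
$markup = '<ul><li>Point 1 <ul><li>Point 1.1</li><li>Point 1.2</li></ul></li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($markup);

/**
 * Return the trimmed text of the first non-empty text node that is a
 * direct child of $node, or null if there is none.
 * (firstOwnText is an illustrative helper, not part of the DOM extension.)
 */
function firstOwnText(DOMNode $node): ?string
{
    foreach ($node->childNodes as $child) {
        // XML_TEXT_NODE is the node type constant for DOMText nodes.
        if ($child->nodeType === XML_TEXT_NODE && trim($child->nodeValue) !== '') {
            return trim($child->nodeValue);
        }
    }
    return null;
}

$li = $doc->getElementsByTagName('li')->item(0);
echo firstOwnText($li); // Point 1
```

Testing the node type rather than hard-coding index 0 also covers the case where the element starts with a child element or with whitespace only.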

Edited by Deirdre's Dad
  • 3 weeks later...

It returns the error:
Fatal error: Call to a member function item() on a non-object in /home/a6016920/public_html/example_basic_selector.php on line 11
In PHP's DOMDocument there are no text nodes. The node value belongs to the <li> element itself.
echo $li->nodeValue;

The PHP page returns nothing (a blank page).

Did you try var_dump($li) to see what it actually is?

It returns a large chunk of unreadable text that I can't make sense of.
<body>
<ul id="ul1">
<li id="pt1">Point 1
    <ul id="ul2">
        <li id="pt11">Point 1.1</li>
        <li id="pt12">Point 1.2</li>
        <pre class="CodeDisplay">
        some codes
        </pre>
        <li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>
    </ul>
</li>
</ul>
</body>
<script>
alert(document.getElementsByTagName("li")[0].childNodes[0].nodeValue); // Point 1
</script>

In JS, this returns what I want. Is there an equivalent in Simple HTML DOM for PHP?


The following code works. And it does not require an external library.

$url = 'http://www.someplace.com';
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
$els = $doc->getElementsByTagName('li');
echo $els->item(0)->childNodes->item(0)->textContent;

Note: if your document really looks like this, PHP will issue a warning. A <pre> element is not permitted between <li> elements as a direct child of a list.

<li id="pt12">Point 1.2</li>
<pre class="CodeDisplay">
  some codes
</pre>
<li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>
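
If the markup can't be fixed at the source, one possible way to keep parser complaints off the page is to have libxml collect them internally. This is only a sketch: the markup below is a copy of the list from this thread, and whether libxml reports anything for it may depend on the libxml version, so the loop simply prints whatever was collected.

```php
<?php
// Sample markup modeled on the thread's list, including the misplaced <pre>.
$markup = <<<HTML
<ul id="ul1"><li id="pt1">Point 1
  <ul id="ul2">
    <li id="pt11">Point 1.1</li>
    <li id="pt12">Point 1.2</li>
    <pre class="CodeDisplay">some codes</pre>
    <li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>
  </ul>
</li></ul>
HTML;

// Collect libxml parse problems internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);

// Report whatever the parser complained about (possibly nothing).
foreach (libxml_get_errors() as $error) {
    echo 'libxml: ', trim($error->message), ' (line ', $error->line, ")\n";
}
libxml_clear_errors();
libxml_use_internal_errors($previous); // restore the earlier setting

// The first child of the first <li> is its leading text node.
$firstLi = $doc->getElementsByTagName('li')->item(0);
$text = trim($firstLi->childNodes->item(0)->textContent);
echo $text, "\n"; // Point 1
```

The trim() call is there because the text node also contains the line break and indentation that follow "Point 1" in the source.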

Yes, you're right, and you've made a good point. First, the pre tag is not accepted by PHP. After removing the tags, the warning generated by PHP disappears. Second, the DOMDocument method achieves what I want, so I don't need an external library like Simple HTML DOM. By the way, I can't look this up online. I wonder if I can do this in PHP:
<?php
echo $html->getElementsByTagName('li',0)->childNodes[0]->textContent; // it generates warnings
// instead of using item()
?>
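
A hedged sketch of how the two forms compare, assuming $html is meant to be a DOMDocument. With DOMDocument, getElementsByTagName() takes only the tag name; the extra index argument is Simple HTML DOM syntax and may well be what triggers the warnings. As far as I know, current PHP versions also allow array-style reads on a DOMNodeList, but item() is the form shown in the manual, so treat the bracket form as an assumption to verify on your PHP version. The markup is a made-up stand-in for the page.

```php
<?php
// Hypothetical markup standing in for the page in the question.
$doc = new DOMDocument();
$doc->loadHTML('<ul><li>Point 1 <ul><li>Point 1.1</li></ul></li></ul>');

// DOMDocument::getElementsByTagName() takes the tag name only, no index.
$items = $doc->getElementsByTagName('li');

// Documented form: DOMNodeList::item().
$viaItem = trim($items->item(0)->childNodes->item(0)->textContent);

// Bracket form: assumed to work on PHP versions where DOMNodeList
// supports array-style reads; fall back to item() if it does not.
$viaBrackets = trim($items[0]->childNodes[0]->textContent);

echo $viaItem, "\n";
echo $viaBrackets, "\n";
```

If the code has to run on an older or unknown PHP version, sticking to item() is the safer choice.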
