
DOM


niche


I'm learning how PHP handles the DOM (http://php.net/manual/en/book.dom.php). I'd like to display the contents of $items using a learning script that I created from the example at http://php.net/manua...elist.item.php.

<?php
$file = file_get_contents('http://www.imdb.com/find?q=star+trek&s=all');
$DOM = new DOMDocument();
$DOM->load($file);

$items = $DOM->getElementsByTagName('a');

for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . "\n";
}
?>

There are plenty of anchors on this IMDb page, and there are no warnings or errors. Why do you suppose nothing is displayed?


I think the load() method is for XML files, and it expects a file path or URL rather than the page contents. For HTML pages, try $DOM->loadHTMLFile(). You don't need the file_get_contents() function; just pass the address to loadHTMLFile() as a string, e.g.: $DOM->loadHTMLFile('http://www.w3schools.com');
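
If the markup is already in a string (for example, the result of file_get_contents()), loadHTML() is the string counterpart of loadHTMLFile(). Here is a minimal sketch of that, assuming a made-up inline $html sample instead of a live page so it runs without network access:

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = '<html><body><a href="/one">First</a> <a href="/two">Second</a></body></html>';

$dom = new DOMDocument();
// loadHTML() parses an HTML string; loadHTMLFile() takes a path or URL instead.
$dom->loadHTML($html);

// Collect the text of every anchor element.
$texts = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $texts[] = $anchor->nodeValue;
}

echo implode("\n", $texts), "\n";
```

Swapping $html for the output of file_get_contents() should work the same way, although real pages are more likely to trigger parser warnings.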


Thanks Don E. That function worked much better, but it threw seven warnings. They all said the same thing with different line refs:

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Unexpected end tag : a in http://www.w3schools.com, line: 616 in E:\wamp\www\test_121209.php on line 3

I used the w3schools site. Do you suppose they have a few unclosed anchors, or does something else come to mind? I know I can turn the warnings off. Just interested is all.

Current script:

<?php
$DOM = new DOMDocument();
$DOM->loadHTMLFile('http://www.w3schools.com');
$items = $DOM->getElementsByTagName('a');

for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . '<br/>';
}
?>


You're welcome. I got the same warnings when I tried www.w3schools.com. I'm not entirely sure whether unclosed anchors are the cause, but it may be so. I just ran their site through the W3C validator and got some warnings and errors. If you try http://www.w3.org ( $DOM->loadHTMLFile('http://www.w3.org'); ) though, you will get no warnings and get the results you're looking for.


Very interesting. Thanks again. You made this a very good topic. It's time to take the garbage to the curb and hit the rack.


http://in3.php.net/manual/en/function.libxml-use-internal-errors.php This function stops the HTML parsing errors from being emitted as warnings; libxml keeps them internally instead. You can also pass the file to tidy (http://php.net/tidy) to clean up the page.
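
As a rough sketch of the libxml_use_internal_errors() approach: the snippet below parses a made-up $html string with a stray closing tag (an assumption for illustration, not the w3schools markup), collects whatever the parser reports instead of letting it surface as warnings, and then lists the anchors as before. The exact messages depend on the installed libxml version.

```php
<?php
// Hypothetical malformed markup: the stray </a> has no matching opening tag.
$html = '<html><body><p>Intro</a></p><a href="/home">Home</a></body></html>';

// Buffer libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Fetch the buffered errors, clear the buffer, and restore the old setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Report any parser complaints on our own terms.
foreach ($errors as $error) {
    echo 'Line ', $error->line, ': ', trim($error->message), "\n";
}

// The document is still usable despite the malformed input.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->nodeValue;
}
echo implode("\n", $links), "\n";
```

Restoring the previous setting afterwards keeps the change local to this one parse, which seems safer than leaving error buffering switched on for the rest of the script.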