niche Posted December 10, 2012 (edited)

I'm learning how PHP's DOM extension works (http://php.net/manual/en/book.dom.php). I'd like to display the contents of $items using a learning script that I adapted from the example at http://php.net/manua...elist.item.php.

<?php
$file = file_get_contents('http://www.imdb.com/find?q=star+trek&s=all');
$DOM = new DOMDocument();
$DOM->load($file);
$items = $DOM->getElementsByTagName('a');
for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . "\n";
}
?>

There are plenty of anchors on this IMDb page, and there are no warnings or errors. Why do you suppose nothing is displayed?
Don E Posted December 10, 2012

I think the load() method is for XML files. For HTML pages, try $DOM->loadHTMLFile(). You don't need the file_get_contents() call; just pass the address to loadHTMLFile() as a string, e.g.:

$DOM->loadHTMLFile('http://www.w3schools.com');
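Editor's note: the distinction above can be tried without touching the network. The sketch below is only an illustration, not the posters' code: the inline markup and the variable names ($html, $texts) are made up for the example. It uses DOMDocument::loadHTML(), which parses an HTML string, whereas loadHTMLFile() takes a path or URL.

```php
<?php
// Hypothetical inline markup standing in for a fetched page.
$html = '<html><body><a href="/one">First</a> <a href="/two">Second</a></body></html>';

$dom = new DOMDocument();
// loadHTML() parses an HTML string; loadHTMLFile() would take a path or URL instead.
$dom->loadHTML($html);

// Collect the text content of every anchor element.
$texts = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $texts[] = $anchor->nodeValue;
}

echo implode("\n", $texts), "\n";
```

If the page has already been fetched with file_get_contents(), passing that string to loadHTML() should work the same way; load() expects the name of an XML file rather than markup.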
niche Posted December 10, 2012 (edited)

Thanks, Don E. That function worked much better, but it threw seven warnings. They all said the same thing with different line references:

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Unexpected end tag : a in http://www.w3schools.com, line: 616 in E:\wamp\www\test_121209.php on line 3

I used the w3schools site. Do you suppose they have a few unclosed anchors, or does something else come to mind? I know I can turn the warnings off; I'm just curious.

Current script:

<?php
$DOM = new DOMDocument();
$DOM->loadHTMLFile('http://www.w3schools.com');
$items = $DOM->getElementsByTagName('a');
for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . '<br/>';
}
?>
Don E Posted December 10, 2012 (edited)

You're welcome. I got the same warnings when I tried www.w3schools.com. I'm not entirely sure they are caused by unclosed anchors, but that may be it. I just ran their site through the W3C validator and got some warnings and errors. If you try http://www.w3.org instead ( $DOM->loadHTMLFile('http://www.w3.org'); ), you will get no warnings and the results you're looking for.
niche Posted December 10, 2012

Very interesting. Thanks again. You made this a very good topic. It's time to take the garbage to the curb and hit the rack.
birbal Posted December 10, 2012

See http://in3.php.net/manual/en/function.libxml-use-internal-errors.php. Calling this function with true stops the HTML parse errors from being shown as warnings; libxml keeps them internally so you can inspect them if you want. You can also pass the page through Tidy (http://php.net/tidy) to clean up the markup.
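Editor's note: a minimal sketch of the suppression approach mentioned above, using libxml_use_internal_errors(), libxml_get_errors() and libxml_clear_errors(). The markup is a made-up fragment with a stray closing anchor tag, chosen as an assumption about what might trigger the same kind of message seen in the thread; the exact messages reported depend on the libxml version installed.

```php
<?php
// Deliberately malformed, hypothetical markup: a stray </a> with no opening tag.
$html = '<html><body><p>Intro</a></p><a href="/x">Link</a></body></html>';

// Ask libxml to collect parse errors internally instead of raising PHP warnings.
// The return value is the previous setting, restored further down.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Fetch whatever the parser reported, then empty the error buffer.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Each entry is a LibXMLError object with message and line properties.
foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}

// The document is still usable despite the malformed input.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->nodeValue;
}
echo implode("\n", $links), "\n";
```

Restoring the previous setting keeps the change local to this one parse, so warnings elsewhere in the script behave as before.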
niche Posted December 10, 2012

Very good post, birbal. Thanks as always.