son Posted May 19, 2010

I'm trying to get a list of all web pages in a website with:

if ($dirHandle = opendir("http://www.site.co.uk/")) {
    while (false !== ($file = readdir($dirHandle))) {
        print($file . "<br>");
    }
    closedir($dirHandle);
}

The web server seems to be configured so that I cannot run this script (hosted website). Is there a different way to generate the list of all web pages?

Son

Synook Posted May 19, 2010

You can't run opendir() on an absolute URL - think of the security risk! If you use a local filesystem path instead (e.g. opendir(".")) you may have better results. You could also try using glob().

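As a rough illustration of the glob() suggestion above, here is a minimal sketch that lists the regular files directly inside a local directory. The helper name listLocalFiles() and the directory passed to it are assumptions for the example, not something from the thread, and it only works for paths on the machine the script runs on.

```php
<?php
// Hypothetical helper: list the names of regular files directly inside a
// local directory. Works on filesystem paths only, not on http:// URLs.
function listLocalFiles(string $dir): array
{
    // glob() returns matching paths sorted alphabetically by default;
    // it can return false on error, so fall back to an empty array.
    $paths = glob(rtrim($dir, '/') . '/*') ?: [];

    // Keep regular files only (drop subdirectories), then strip the directory part.
    $files = array_filter($paths, 'is_file');

    return array_values(array_map('basename', $files));
}
```

Calling listLocalFiles(__DIR__) from a script would return the files sitting next to that script, which could then be printed in a loop the same way as in the original question.
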
justsomeguy Posted May 19, 2010

If you're trying to list the contents of a remote web server, you can't. Web servers don't have a general way to return a directory listing of files; they will only do that if you request a directory that has no default document and the server is configured to output a directory listing in that case. If you can connect to the server through FTP, you can list the files that way.

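For the FTP route mentioned above, a minimal sketch using PHP's FTP extension might look like the following. The helper name listFtpFiles(), the host, the credentials and the path are all placeholders, and it assumes the FTP extension is enabled and the host actually offers FTP access.

```php
<?php
// Hypothetical helper: list entries in a remote directory over FTP.
// Host, user, password and path are placeholders supplied by the caller.
function listFtpFiles(string $host, string $user, string $pass, string $path = '.'): array
{
    $conn = ftp_connect($host);
    if ($conn === false) {
        return []; // could not reach the server
    }

    $files = [];
    if (ftp_login($conn, $user, $pass)) {
        // Passive mode is often needed when the client sits behind a firewall.
        ftp_pasv($conn, true);
        // ftp_nlist() returns false on failure, so fall back to an empty array.
        $files = ftp_nlist($conn, $path) ?: [];
    }

    ftp_close($conn);
    return $files;
}
```

Note that this lists files on disk, which is not necessarily the same set as the pages a visitor can reach through the website.
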
son Posted May 20, 2010

But how is it then that Google, for example, can create a sitemap of a website in no time? Surely they are somehow accessing the files...

Son

justsomeguy Posted May 20, 2010

No, they just follow links. Some sites also provide a sitemap for Google to start with.

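To make "following links" concrete, here is a rough sketch of the core step of a crawler: pulling the same-site links out of one page's HTML. The helper name extractInternalLinks() is hypothetical, the regex-based href matching is a simplification (a real crawler would use a proper HTML parser), and every relative link is treated as relative to the site root, which is an assumption made to keep the example short.

```php
<?php
// Hypothetical helper: collect the same-site links found in one page's HTML.
// $baseUrl is assumed to be the site root, e.g. "http://www.site.co.uk/".
function extractInternalLinks(string $html, string $baseUrl): array
{
    $base  = rtrim($baseUrl, '/');
    $host  = parse_url($baseUrl, PHP_URL_HOST);
    $links = [];

    // Rough href extraction; fragments (#...) are cut off by the pattern.
    preg_match_all('/<a\s[^>]*href\s*=\s*["\']([^"\'#]+)/i', $html, $matches);

    foreach ($matches[1] as $href) {
        if (preg_match('#^https?://#i', $href)) {
            // Absolute URL: keep it only if it points at the same host.
            if (parse_url($href, PHP_URL_HOST) !== $host) {
                continue;
            }
            $url = $href;
        } elseif (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href) || strpos($href, '//') === 0) {
            // mailto:, javascript: and protocol-relative links are skipped here.
            continue;
        } else {
            // Simplification: resolve every relative link against the site root.
            $url = $base . '/' . ltrim($href, '/');
        }
        $links[$url] = true; // array keys de-duplicate repeated links
    }

    return array_keys($links);
}
```

A crawler would fetch a start page (for instance with file_get_contents(), where the host allows URL access), pass the HTML to a function like this, and repeat for each new link it has not visited yet. Pages that nothing links to are never discovered this way.
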
Synook Posted May 21, 2010

If you don't link to a file anywhere on your server, Google won't be able to find it. HTTP just doesn't include a method of getting a directory's contents (unlike, say, FTP).

son Posted June 10, 2010

Now I get it: interlinking the pages is the key ;-)

Son