Check URL address with curl/get_headers


WesleyA



    <html>
    <center>
    <br><br><br>Checks with get_headers

    <form action="#" method="post">
    <input type="text" name="webadres" value=" " required>
    <br>
    <input type="submit" value="Send"><br><br><br>
    </form>

    <?php
    if ($_SERVER['REQUEST_METHOD'] == 'POST') {

        // Value submitted from the form field above
        $sites = $_POST['webadres'];
        var_dump($sites);

        if (isset($sites)) {
            // Dump whatever headers the URL returns
            var_dump(get_headers($sites));
        }
    ?>

    <br><br>Checks with curl get info

    <?php
        // Second check: fetch the same URL with cURL and dump the transfer info
        $url = curl_init($sites);
        curl_setopt($url, CURLOPT_RETURNTRANSFER, 1);
        $result = curl_exec($url);
        $info = curl_getinfo($url);
        print_r($info);
        curl_close($url);
    }
    ?>

    </center>
    </html>

I made the above script to test whether a web address exists or not.

 

I'm very surprised that a lot of inputs don't give any result; they actually give a PHP error message.

 

Using a site like http://www.facebook.com does not work, while Twitter runs fine. Pinterest only works over http, not https. I also found sites like http://www.bbc.co.uk or http://www.cnn.com and more that give no result at all.

 

Can someone explain why this happens, and how to securely check the existence of a website?

OK, this is the output of a URL that gives an error message:

string ' http://www.facebook.com' (length=24)

( ! ) Warning: get_headers(): This function may only be used against URLs in ......\WebSiteValid\test-get-headers.php on line 30

Call Stack
#   Time     Memory   Function          Location
1   0.0401   245968   {main}( )         ..\test-get-headers.php:0
2   0.0420   246104   get_headers ( )   ..\test-get-headers.php:30

boolean false


Checks with curl get info

Array
(
    [url] => http://www.facebook.com/
    [content_type] =>
    [http_code] => 0
    [header_size] => 0
    [request_size] => 0
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 0
    [namelookup_time] => 0
    [connect_time] => 0
    [pretransfer_time] => 0
    [size_upload] => 0
    [size_download] => 0
    [speed_download] => 0
    [speed_upload] => 0
    [download_content_length] => -1
    [upload_content_length] => -1
    [starttransfer_time] => 0
    [redirect_time] => 0
    [redirect_url] =>
    [primary_ip] =>
    [certinfo] => Array
        (
        )
    [primary_port] => 0
    [local_ip] =>
    [local_port] => 0
)

string ' http://www.facebook.com' (length=24)
Notice that there is a space at the beginning of the URL, probably because you set the field's default value to a space and then typed or pasted after it. The URL itself is 23 characters long; the extra character is the space. You should trim the values you get from $_POST to remove excess whitespace, or validate them in some way to ensure they are a valid URL. You can use parse_url to parse and rebuild the URL if it is valid.
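
Something along these lines should work (untested sketch, reusing your webadres field; filter_var with FILTER_VALIDATE_URL is just one way to do the validation):

    <?php
    // Untested sketch: trim the submitted value and validate it before
    // handing it to get_headers()
    $sites = isset($_POST['webadres']) ? trim($_POST['webadres']) : '';

    if ($sites !== '' && filter_var($sites, FILTER_VALIDATE_URL) !== false) {
        // parse_url() splits the URL into scheme, host, path, etc. if you
        // want to inspect or rebuild it first
        $parts = parse_url($sites);

        var_dump(get_headers($sites));
    } else {
        echo 'That does not look like a valid URL';
    }
    ?>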

Trimming the input first works fine. Thanks for the suggestion.

 

Now I want to use get_headers to test whether a website exists, but testing only for status code 200 doesn't seem to be enough. Sometimes a URL returns code 301 or 302 instead.

 

So how exactly should get_headers be used for this? What is the recommended way, I mean? Is it possible to check all the status codes, and how is that done?
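
Something like this is what I have in mind, just as a rough sketch (not tested): read the status line that get_headers returns as its first element and accept anything in the 2xx or 3xx range.

    <?php
    $sites = isset($_POST['webadres']) ? trim($_POST['webadres']) : '';

    // get_headers() returns false when the request fails completely
    $headers = @get_headers($sites);

    if ($headers === false) {
        echo 'Site does not exist or is unreachable';
    } elseif (preg_match('#^HTTP/\S+\s+(\d{3})#', $headers[0], $m)) {
        // First element is the status line, e.g. "HTTP/1.1 301 Moved Permanently"
        $code = (int) $m[1];
        if ($code >= 200 && $code < 400) {
            echo "Site exists (status $code)";
        } else {
            echo "Site returned status $code";
        }
    }
    ?>

Would that be a reasonable way to do it, or is there a better approach?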
