capturing unexpected file requests SOLVED with thanks in the final post


niche

I know how to track keystrokes, but I can't imagine how to capture unexpected file requests at my site. How do I get an unexpected URL request into a PHP variable?

Edited by niche

Using mod_rewrite or the ErrorDocument directive, you can redirect all requests (or all requests that are not to known files) to a single PHP file. From within this PHP file, $_SERVER['REQUEST_URI'] will contain the full URI that the file was accessed with. Other $_SERVER variables contain some more specific data that you could analyze.
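
As a rough sketch of what that single PHP file gets to look at, the function below pulls a few values out of a $_SERVER-style array. The function name describeRequest and the choice of fields are assumptions made for illustration; only the $_SERVER keys themselves are standard.

```php
<?php
// Hypothetical helper: summarise the request from a $_SERVER-style array.
// Passing the array in (rather than reading the superglobal directly)
// keeps the function easy to try from the command line.
function describeRequest(array $server): array
{
    $uri = $server['REQUEST_URI'] ?? '';
    return [
        'uri'        => $uri,                                   // path plus query string
        'path'       => (string) parse_url($uri, PHP_URL_PATH), // path only
        'method'     => $server['REQUEST_METHOD'] ?? '',
        'user_agent' => $server['HTTP_USER_AGENT'] ?? '',
        'remote_ip'  => $server['REMOTE_ADDR'] ?? '',
    ];
}

// In the catch-all script you would call: describeRequest($_SERVER)
```

Calling it with $_SERVER inside the catch-all script gives you the requested URI in a plain PHP variable, which you can then log or analyze.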


Guest So Called

You should describe what you're doing in more detail. My site works like WordPress. All files that cannot be served from static resources are routed to the index.php file. (Static resources are like images, scripts, style sheets...) It could work the same if you just want an error handler (script). Here's an .htaccess example:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /error.php

(My file has "/" instead of "/error.php")


So, Apache can be made to send unexpected file requests to a location that can be accessed by a PHP script. If so, what would that output look like?

No... it actually runs the PHP file itself (which PHP file? The one you specify when you configure those), but within the PHP file, your superglobals ($_SERVER, $_GET, $_POST, etc.) are populated with what would've been received if this was a real file. The output looks like whatever you make that PHP script output.

Guest So Called
So, Apache can be made to send unexpected file requests to a location that can be accessed by a PHP script. If so, what would that output look like?
The unexpected file request ends up at whatever script you send it to, perhaps error.php. You can process it there however you want. In my case I have multiple error sources directed there. My file first identifies which kind of error (404 or a couple dozen other possible errors) and formats a message including a header (like "404 Not Found"). It also captures the URL ($_SERVER['REQUEST_URI']). Then it sends the error message and logs all the relevant things to an error log. (Time, date, URI, user agent, remote IP, client address, etc.) Here's how you steer your other errors there. You may need to reverse engineer it a bit: keep in mind that all my requests go to index.php, and at the end of index.php, if the request cannot be handled by any other method, it goes to my error.php:
ErrorDocument 400 /error/400
ErrorDocument 401 /error/401
ErrorDocument 402 /error/402
ErrorDocument 403 /error/403
ErrorDocument 404 /
ErrorDocument 405 /error/405
ErrorDocument 406 /error/406
ErrorDocument 407 /error/407
ErrorDocument 408 /error/408
ErrorDocument 409 /error/409
ErrorDocument 410 /error/410
ErrorDocument 411 /error/411
ErrorDocument 412 /error/412
ErrorDocument 413 /error/413
ErrorDocument 414 /error/414
ErrorDocument 415 /error/415
ErrorDocument 500 /error/500
ErrorDocument 501 /error/501
ErrorDocument 502 /error/502
ErrorDocument 503 /error/503
ErrorDocument 504 /error/504
ErrorDocument 505 /error/505

The /error/ is a pseudo-directory. I have only one directory under my web root, for images. The main script analyzes $_SERVER['REQUEST_URI'] and handles what it can by other means. Files and directories not found end up getting handled by error.php. Accesses to /error/ are also sent to error.php, where the /400 through /505 are decoded to provide the exact type of error. Note that not all of them are possible, e.g. 500 Internal Server Error. If you have a real internal server error it will prevent your script from executing. Most of my error documents can't really happen in real life, or if they do they can't really reach my script. I can't really say why I've left some of those in, maybe just because I wrote the code and found out later that they were "can't happen" situations, and didn't feel like unwriting the code.
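
To make the decoding step concrete, here is a minimal sketch of how an error script might turn a /error/NNN path into a status code and header text. This is not the actual error.php from the post above; the function name decodeErrorPath, the shortened reason table, and the fall-back to 404 are all assumptions.

```php
<?php
// Hypothetical decoder for the "/error/NNN" pseudo-directory described above.
// The reason phrases are the standard HTTP ones; the table is deliberately short.
function decodeErrorPath(string $requestUri): array
{
    $reasons = [
        400 => 'Bad Request',
        401 => 'Unauthorized',
        403 => 'Forbidden',
        404 => 'Not Found',
        405 => 'Method Not Allowed',
        410 => 'Gone',
        500 => 'Internal Server Error',
        503 => 'Service Unavailable',
    ];

    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    // "/error/403" carries the status in the path; anything else is treated as 404.
    if (preg_match('#^/error/(\d{3})$#', $path, $m) && isset($reasons[(int) $m[1]])) {
        $code = (int) $m[1];
    } else {
        $code = 404;
    }

    return ['code' => $code, 'header' => $code . ' ' . $reasons[$code]];
}
```

In an error script the result could be passed to http_response_code($result['code']) before the message is printed and the request is logged.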


Guest So Called

Once again BR types faster but I type more, and include examples (although I'm not sure if Niche will want to use my ErrorDocument .htaccess stuff).

I've been on both sides of the typing thing too. Don't worry :P .

That means that I need a simple if statement at the top of my index that checks for errors in the $_SERVER array. Right? I think you've said words to that effect already. I'm using my words to confirm the communication.


That means that I need a simple if statement at the top of my index that checks for errors in the $_SERVER array. Right?
If you redirect both good and bad requests to the same file, yes. If you have a dedicated file for bad requests, you don't need that check, since you already know it's a bad request. The only thing left for the script to do is figure out why this is a bad request, and act accordingly.
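
For the dedicated-file case, one possible way to figure out the "why" is to read the variables Apache sets when it performs an ErrorDocument internal redirect. REDIRECT_STATUS and REDIRECT_URL are set by Apache; the function name whyBadRequest and the default of 404 are assumptions for this sketch.

```php
<?php
// Hypothetical helper for a dedicated error script.
// On an ErrorDocument internal redirect, Apache exposes the original status
// as REDIRECT_STATUS and the originally requested path as REDIRECT_URL.
function whyBadRequest(array $server): array
{
    $status = (int) ($server['REDIRECT_STATUS'] ?? 0);

    // Assumption: a missing or non-error status is reported as 404.
    if ($status < 400) {
        $status = 404;
    }

    return [
        'status' => $status,
        'url'    => $server['REDIRECT_URL'] ?? ($server['REQUEST_URI'] ?? ''),
    ];
}
```

The script would call whyBadRequest($_SERVER) and branch on the status to pick the message and decide what to log.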

Guest So Called
That means that I need a simple if statement at the top of my index that checks for errors in the $_SERVER array. Right? I think you've said words to that effect already. I'm using my words to confirm the communication.
Well, not sure. That's ambiguous to me. If you're directing all your traffic to index.php including errors (this is what I do) then yes, you look at $_SERVER['REQUEST_URI'] and you process that request if it's valid. If it's not valid then handle it as an error and do whatever you want with that (e.g. log the error and send a 404 header and a file-not-found HTML message).
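
A minimal sketch of that check at the top of index.php might look like the following. The list of known paths and the function name routeRequest are made up for the example; a real site would decide validity in its own way.

```php
<?php
// Hypothetical front-controller check: everything arrives at index.php,
// so the script decides whether the requested path is one it knows.
function routeRequest(string $requestUri, array $knownPaths): array
{
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    if (in_array($path, $knownPaths, true)) {
        return ['status' => 200, 'handler' => $path];
    }

    // Not valid: the caller would log it and send a 404 header and message.
    return ['status' => 404, 'handler' => 'error'];
}

// The "if statement at the top of the index", using placeholder paths.
$result = routeRequest($_SERVER['REQUEST_URI'] ?? '/', ['/', '/contact', '/search']);
if ($result['status'] === 404) {
    http_response_code(404);
    echo 'File not found';
}
```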

Boen, maybe you can come over and watch my sick dog so I can type more replies... ;) ;) ;):P
He won't like me anyway... I have a (healthy) cat.

Guest So Called

Well he doesn't like me today either. Stuff coming out both ends, I have to be ready to take him outside at any moment, can't go anywhere, can't do anything except fiddle around on the Internet. I like the challenge of trying to handle PHP questions, probably my best area of expertise relevant to this forum. My own website is pretty much done at the moment until I get another feature idea (and I already have way too many features, as is obvious from some of my topics on "experiments" elsewhere in the forum). I dread it when I once again get the idea, "Hey, it's not broken, so let's fix it anyway!" :):P


What's one way you sort the good from the bad requests not including an if statement at the top of the index?

When I say "good" above, I mean "valid" - it's by an authorized IP, contains a valid Host header, points to an existing file or matches a rewrite rule. "Bad" is everything else: requests from unauthorized IPs, an unknown Host header, a non-existent file, a forbidden file, no matching rewrite rule, etc. Whether the request was "bad" as in "malicious" or "bad" as in "accidental mistype" is something the PHP script could try to determine... or not.

If you redirect both good and bad requests to the same file, yes. If you have a dedicated file for bad requests, you don't need that check, since you already know it's a bad request. The only thing left for the script to do is figure out why this is a bad request, and act accordingly.
Where else could you check $_SERVER['REDIRECT_STATUS'] except in the index?
Edited by niche

Guest So Called
What's one way you sort the good from the bad requests not including an if statement at the top of the index?
I don't know what you mean by good or bad. I was assuming your main concern was stuff like 404 URLs. That's where most of my annoying site requests end up, because most of the hackers are lamer script kiddies and they're just running probes on as many sites as they can. Their probes don't match any valid files on my site because I'm not running any standard software with known vulnerabilities. My site would have to be hacked as one of a kind, kind of like trying to break into Ft. Knox when the only thing in the vault is a bag of stale peanuts.

But there are other reasons to analyze all your input. In the past I've found broken links on my own site by analyzing 404 traffic. Sometimes I find URLs that used to be valid, and I changed them for some reason, and then years later one of the search engines wants that file. A few years ago I added a redirect feature, a MySQL table that lists old URI vs. new URI. If the requested URI matches the old URI entry I 301 redirect the request to the new URI.

My processing goes pretty much like this:

1. Connect to MySQL server. (If I can't get the database then I won't be able to serve the request or log it.)
2. Collect all interesting logging information, all the way from time and date to requested URI, user agent string, etc.
3. Examine the URI for validity. For example this will 400 Bad Request: "example.com/directory//somefile.php" because a double slash in the indicated place is not valid.
4. Analyze the URI, dividing it into directory, file, etc. For example: "example.com/directory/file" ... Keep in mind that in my case the site doesn't actually have any sub-directories other than /images. Actually I've kind of followed WordPress URI formation, so it's "example.com/category/title"
5. I test for admin access. I'm the only person that uses this section of code, and I don't want the bans that follow to be able to affect me. If it's an admin request it's sent to the appropriate admin script.
6. At this point I test for all the various bans I can set. If any fit, the visitor is appropriately disposed of and logged.
7. Then one by one as my script proceeds it tries various fits for the request. Is it for my root index? Is it for a style sheet? Is it for robots.txt or sitemap.xml? Is it for the contact form? Is it for the search form? Is it something I can redirect? Is it a request for content? For each of these an appropriate script is included, the script handles the request and logs the result.
8. Finally if none of the above applies then it's some kind of error. The script calls (includes) my error.php and processing ends there after sending an error header and error message and logging the error.

Well that's mostly it. I'm sure I missed some parts, maybe a few important ones. To sum it up, I catch the bans early in execution, then I try to find a valid way to handle the request, and finally I send an error message if nothing else can be done.
Edited by So Called
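
A heavily simplified sketch of the URI handling in that list (validity check, redirect lookup, splitting into category and title, and the final fall-through to an error) could look like this. It is not the real script: the function name classifyUri is invented, a plain array stands in for the MySQL redirect table, and the database, logging, admin, and ban steps are left out.

```php
<?php
// Hypothetical sketch of the URI checks from the list above.
// $redirects stands in for the old-URI => new-URI MySQL table.
function classifyUri(string $requestUri, array $redirects): array
{
    // Drop the query string; only the path is examined here.
    $path = explode('?', $requestUri, 2)[0];

    // Validity: a double slash inside the path is rejected as 400 Bad Request.
    if ($path === '' || $path[0] !== '/' || strpos($path, '//') !== false) {
        return ['status' => 400];
    }

    // Root index.
    if ($path === '/') {
        return ['status' => 200, 'handler' => 'index'];
    }

    // Old URI with a known replacement: 301 redirect.
    if (isset($redirects[$path])) {
        return ['status' => 301, 'location' => $redirects[$path]];
    }

    // Split "/category/title" into its parts.
    $parts = explode('/', trim($path, '/'));
    if (count($parts) === 2) {
        return ['status' => 200, 'category' => $parts[0], 'title' => $parts[1]];
    }

    // Nothing matched, so it falls through to the error handler.
    return ['status' => 404];
}
```

The order of the checks is a choice made for the sketch; the point is that each request is tried against one handler after another and only becomes an error when nothing fits.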

Very good explanation So Called. For me, good would mean a request was made for a valid file and bad would mean the file doesn't exist. Anyway, I was trying out $_SERVER['REQUEST_URI'] on my localhost like this:

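<?php
// Capture the requested URI (path plus query string).
$item = $_SERVER['REQUEST_URI'];
include "connect_to_mysql.php";
// REQUEST_URI is supplied by the client, so escape it before building the query.
$item = mysql_real_escape_string($item);
mysql_query("INSERT INTO url_log (req_file) VALUES ('" . $item . "')") or die(mysql_error());
?>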

It successfully executes with "localhost" in the url, but gives me a "file not found" error and fails to execute the index.php script when I typed "localhost/xx" in the url. Do I need to change the setting on my localhost server or change my script?
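
Separately from the "file not found" question, the script above builds its SQL from the raw URI, and the mysql_* functions it uses were removed in PHP 7. A hedged alternative is a PDO prepared statement, sketched below. The table and column names (url_log, req_file) come from the script above; the function name logRequestedUri, the DSN, and the credentials are placeholders.

```php
<?php
// Hypothetical rewrite of the logging step using PDO and a prepared statement.
// The URI is bound as a parameter, so it is never spliced into the SQL text.
function logRequestedUri(PDO $db, string $uri): void
{
    $stmt = $db->prepare('INSERT INTO url_log (req_file) VALUES (:uri)');
    $stmt->execute([':uri' => $uri]);
}

// Example wiring (DSN and credentials are placeholders):
// $db = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
// $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// logRequestedUri($db, $_SERVER['REQUEST_URI']);
```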

Edited by niche

Guest So Called

I don't know much about the setups people use for localhost but are you running Apache? If not then the .htaccess files probably won't work. If you are running Apache you'll need to add the code I suggested in post #3 to your .htaccess file. Keep in mind that everything I know is based on a shared hosting LAMP setup (which I'm sure you know is Linux, Apache, MySQL, PHP). Maybe along the way in this topic I might learn something about local hosts (although my current setup suits me just fine and I'm not going to change). If you're not running Apache there is probably a way to do the same thing with your HTTP server.


Guest So Called
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /

You probably need the additional first line if you didn't have any .htaccess file. Assuming you use Apache at all. Also, remember my Apache came pre-configured. I don't know if there are any configs you need in your setup, again assuming you're using Apache. I've changed the example to assume you're handling it in index.php. BTW (I could be wrong) but I think the !-f means "file not found" and the !-d means "directory not found." Conceivably you could handle them differently. I should probably quit posting until you confirm your HTTP server.

Edited by So Called
-f means "is file" and -d means "is directory", therefore !-f means "is not file" and !-d means "is not directory." RewriteCond implies a condition, so together the lines translate to: if REQUEST_FILENAME is not a file and REQUEST_FILENAME is not a directory, then rewrite the URL.
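
If it helps to see the same logic in PHP terms, the two conditions amount to something like the check below. This is only an analogy: Apache evaluates the conditions itself, and the function name wouldRewrite is invented for the example.

```php
<?php
// Hypothetical PHP restatement of the two RewriteCond lines:
// rewrite only when the path is neither an existing file nor an existing directory.
function wouldRewrite(string $filename): bool
{
    return !is_file($filename) && !is_dir($filename);
}
```

Requests for real files and directories (images, style sheets, scripts) are served as usual, and everything else is handed to the rewrite target.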

Guest So Called

Well okay I got a bit sloppy. I should have described what I said as what the entire line does. Anyway I'm glad you understand the codes. I'm always happy to help. :)


My localhost is a WAMP server.

Edited by niche
