
Weird site accesses... Referrer makes no sense...



Guest LH91325

I've got a small number of weird accesses on my site, and almost all of them involve one content page. The weird part is that every one of these visitors sends a referrer that is the same page being served. Let's say my page is /this-is-the-page-that-is-being-served. The referrer is also /this-is-the-page-that-is-being-served. (I don't use .PHP or .HTM/.HTML URLs on my site.) There aren't any links on the page that point to itself, and there are no log entries showing that the visitor arrived there earlier by any means. They come out of the blue, and they're all lying when they claim the referrer is my own site!

The IP addresses mostly indicate the visitors are in CN (China), and many if not most are coming from large IP blocks (CIDR /18, /19 or /20, thereabouts; really big blocks that belong to telecom companies). Many of the 'scrapers and 'jackers use small web hosts, like /22, /23 or even a few /24 hosts: small companies that have small blocks and might even be in collusion with scammers.

I have the feeling I'm being 'scraped or 'jacked in some way, but I just can't figure it out. The traffic has to be coming from ordinary web surfers because it's too spread out to be organized from the visitor's end. I've run a few experiments that indicate that at least some of the visitors seem to be human. (I can describe my observations if you want.)

I'm trying to figure out where this traffic is coming from. Why do all these visitors claim the page they're visiting is the page that referred them there? That makes no sense to me. Everybody has to come from somewhere. There were no eggs before there were chickens. Could this be some kind of weird proxy that Chinese people use? I can see that, with their oppressive government, Chinese citizens might use browser add-ons and/or proxies to mask information from the sites they visit. A fair number of them use this user-agent string:

Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0 ; .NET CLR 2.0.50215; SL Commerce Client v1.0; Tablet PC 2.0

But there are a lot of others using various other user agents: Opera, MSIE 6, etc. The answer can't just be defective Tablet PC referrer strings. They're not all Chinese either. One of them even came from Gabon. (I never heard of that country. It's in Africa.)
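For anyone wanting to pull these requests out of their own logs, here is a minimal sketch of a self-referral check. It assumes the server writes the common "combined" access log format and that the site is reachable as example.com; the regex, the function names and the host are illustrative assumptions, not anything taken from the poster's logging software.

```python
import re
from urllib.parse import urlsplit

# Assumed layout (combined log format):
# ip ident user [time] "METHOD path proto" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def is_self_referral(line, site_host):
    """Return True if the Referer names the same host and path as the request."""
    m = LOG_RE.match(line)
    if not m:
        return False  # line does not look like the assumed format
    ref = urlsplit(m.group("referer"))
    # A missing referer is logged as "-", which has no hostname at all.
    if ref.hostname != site_host.lower():
        return False
    # Compare paths only, ignoring any query string on either side.
    return ref.path == urlsplit(m.group("path")).path


def self_referrals(lines, site_host):
    """Yield (ip, path, user_agent) for every self-referring request."""
    for line in lines:
        if is_self_referral(line, site_host):
            m = LOG_RE.match(line)
            yield m.group("ip"), m.group("path"), m.group("agent")
```

Feeding the log file's lines to `self_referrals(open_file, "example.com")` gives the IP, page and user agent of each hit, which can then be grouped by IP block or compared against earlier requests from the same address.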


It could be a proxy that is changing the referer, or it's also possible that the user is reloading the same page; in that case the referer and the request URI will be the same. The user agents should not be an issue.


There is a national firewall or proxy for China. There are ways around it, but it's there. Since a client can set the referer to anything it wants, they're probably just setting it to the same page rather than not sending it at all.
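To illustrate how little it takes for a client to do that, here is a small sketch using Python's standard urllib. The URL is a placeholder and the helper name is made up for this example; nothing is actually sent over the network, the request object is only built and inspected.

```python
from urllib.request import Request


def build_self_referring_request(url):
    """Build a GET request whose Referer header is the very URL being requested."""
    # The Referer header is ordinary client-supplied data: the server has no
    # way to verify it, so any client or proxy can put whatever it likes here.
    return Request(url, headers={"Referer": url})


req = build_self_referring_request(
    "http://example.com/this-is-the-page-that-is-being-served"
)
print(req.get_method(), req.full_url)
print("Referer:", req.get_header("Referer"))
```

A server receiving this request would log exactly the pattern described in the first post: a first-time visitor whose referrer is the page itself.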


Guest LH91325
Well, as I said in the OP, I've been wondering if this is some kind of proxy. I'm pretty sure that reloading a page will send the same referrer as the original page load. I've even seen duplicate log entries (in other situations) where two loads are separated by a few seconds or minutes. Anyway, I've never seen an earlier request for that page or any other. (I can view previous accesses of any visitor IP address, including wildcards, using my logging analysis software.)

I've examined my log (going back about 4 years) looking at visitors who sent no user-agent string, and at least 70%-80% of them are involved in some negative activity, like looking for a WordPress or phpMyAdmin login page, or looking for WordPress or other platform vulnerabilities. Many of them make several or more attempts for pages with the word "admin" in them. Visitors with no user-agent hardly ever look at any of my content pages. I finally just added a few lines to drop the connection on visitors without user-agent strings. That hardly ever happens. Most real people and legitimate search engines have user-agent strings.

Anyway, I think it's more likely than not that the referrer = self accesses are some proxy, although it's still possible it's some sort of framing (which I'm pretty sure could produce the same result). If so, I don't have any clicks to 'jack. It would be just simple content scraping via framing.
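The post doesn't say which server or language those "few lines" were written in, so the following is only a sketch of the same idea as a Python WSGI middleware. The function name is hypothetical, and because WSGI cannot simply drop a connection, this version answers with a 403 instead.

```python
def reject_missing_user_agent(app):
    """Wrap a WSGI app so requests with no User-Agent header get a 403."""
    def middleware(environ, start_response):
        # WSGI exposes the User-Agent request header as HTTP_USER_AGENT.
        user_agent = environ.get("HTTP_USER_AGENT", "").strip()
        if not user_agent:
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Otherwise hand the request to the wrapped application unchanged.
        return app(environ, start_response)
    return middleware
```

Wrapping an existing application is then a single line, `application = reject_missing_user_agent(application)`. A blank or whitespace-only header is treated the same as a missing one.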
