
Robots.txt


trevor-

I'm not sure where to put this, but I really need help with robots.txt. I thought I wrote a correct one, but it's been months now and Google and other search engines still aren't indexing what I want or removing what I don't want.

This is mine:

User-Agent: *
Disallow: /*

What I want to be viewed, but NOT cached:
- Only my index (so the default http://domain.com, aka http://domain.com/htm)

What I don't want to be viewed or cached:
- Everything else. This includes all subdomains; I don't want them showing up in any searches.
- I also want to uncache deleted subdomains and undeleted directories.

Does anyone know how I can do this? Also, I need something that will let only my first page (what I want to be viewed) be viewed and cached. Lastly, does Google de-list you or make your site show up less when you have caching on your main page turned off? It seems to have knocked my site down a few ranks when I search for it exclusively now that the caching for that page was turned off... Thanks!


This is mine:

User-Agent: *
Disallow: /*

To disallow everything, you need to do this:

User-Agent: *
Disallow: /
You're asking for two different things: you're asking for your index page to be cached and not to be cached. It's either one or the other. :)

If you're able to run PHP, then stick this code in before anything else on the page. It basically sends a Cache-Control header, as well as an Expires header three days in the future:
<?php
header("Cache-Control: must-revalidate");
$offset = 60 * 60 * 24 * 3; // three days in seconds
header("Expires: " . gmdate("D, d M Y H:i:s", time() + $offset) . " GMT");
?>

Look into meta tags too:

<meta http-equiv="cache-control" content="no-cache">

And there's no reason for Google to penalise your site for whichever caching method and setting you use. The only reason you might be dropping in the listings is that Google can't crawl your site. If you block access to the site for search engines, then how do they know you're not hosting illegal content or spreading viruses around?
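One more option for the specific "indexed but not cached" goal: the major engines, Google included, recognise a robots meta tag value that asks them not to keep a cached copy of a page while still indexing it. A minimal sketch, placed in the index page's <head>:

```html
<!-- Lets the page be crawled and indexed, but asks engines
     not to show a cached copy in search results -->
<meta name="robots" content="noarchive">
```

This only affects the search engine's cache link, not HTTP caching by browsers or proxies, so it can be combined with the headers above.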

Edited by real_illusions

Sorry, I meant that I wanted one version where it is cached, and one where it isn't (the main page). But I don't want to block everything; I still want the main page indexed no matter what. So how do I allow this?


  • 2 weeks later...

Google refreshed again, and it still won't let me block everything BUT my index. Am I supposed to be setting an Allow instead? Or should I be manually entering every subdomain in the Disallow list? (I don't want to do this, since someone could then find them all from robots.txt.)


Have a look at this: http://www.searchtools.com/robots/robots-t...ents.html#allow

It might work if you set Disallow for everything, then Allow just the index page.

Actually... you can also put a robots.txt file in each subdomain's root folder to disallow everything.
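Putting those two suggestions together, the setup might look like the sketch below. Note that Allow (and the $ end-of-URL anchor) are nonstandard extensions to the original robots.txt protocol, though Google honours them; Google resolves conflicts by preferring the most specific (longest) matching rule, so the Allow line wins for the root URL:

```text
# robots.txt at http://domain.com/robots.txt
# Block everything except the front page itself
User-Agent: *
Allow: /$
Disallow: /
```

```text
# Separate robots.txt in each subdomain's root folder
# Subdomains are separate hosts, so the main domain's
# robots.txt never applies to them
User-Agent: *
Disallow: /
```

This also avoids listing every subdomain in one file, so nobody can enumerate them by reading robots.txt.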

Edited by real_illusions
