Sometimes we work hard to welcome Google and get it to index a website’s pages, but at other times we don’t want it to crawl certain pages at all. What do we do in those cases? Can we control Google the way we want?
Well, sometimes yes, sometimes no. Let’s see how we can effectively keep Google from crawling selected parts of a website.
1. Lock and Key Method
The most effective, sure-shot method is a lock and key mechanism: keep the pages you don’t want Google to access locked away. The recommended way is to put a username/password login gateway in front of the pages or sections of the website that you don’t want Google to crawl. Googlebot cannot log in, so it never gets to see the content behind the gateway.
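To make the idea concrete, here is a minimal sketch of such a gateway using HTTP Basic authentication and Python’s standard library. The credentials, the /private/ path prefix and the names is_authorized, GatedHandler and run are all made up for illustration; a real site would normally lean on its web server or framework for this, and would serve it over HTTPS.

```python
import base64
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical credentials and path prefix, for illustration only.
USERNAME = "editor"
PASSWORD = "change-me"
PROTECTED_PREFIX = "/private/"


def is_authorized(auth_header):
    """Return True if the Authorization header carries the expected Basic credentials."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(auth_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through timing differences.
    user_ok = hmac.compare_digest(user.encode(), USERNAME.encode())
    password_ok = hmac.compare_digest(password.encode(), PASSWORD.encode())
    return user_ok and password_ok


class GatedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        locked = self.path.startswith(PROTECTED_PREFIX)
        if locked and not is_authorized(self.headers.get("Authorization")):
            # No valid credentials: answer 401 and ask the client to log in.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="Private area"')
            self.end_headers()
            return
        body = b"Hello from " + self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(host="127.0.0.1", port=8000):
    """Start the demo server; call this yourself to try it out."""
    HTTPServer((host, port), GatedHandler).serve_forever()
```

Calling run() and opening a URL under /private/ should bring up the browser’s login prompt, while every other path is served normally. A crawler that sends no credentials only ever gets the 401 response.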
2. The Robots Meta Tags (noindex, nofollow)
The next best thing to the Lock and Key method is the versatile robots meta tag. Pick the pages that you don’t want Google to index, and add a robots meta tag to the head of each one, for example <meta name="robots" content="noindex, nofollow">. Strictly speaking, it is the noindex value that makes Google “skip” indexing the contents of the page, while nofollow tells it not to follow the links on that page. However, the tag does not stop Google from fetching the page: if external websites link extensively to the URL, Google can still find and “see” the page somehow, even though it does not index it. Also, when you have a large number of pages to screen out of Google, adding the tag to every page by hand becomes tedious and error-prone.
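For a large batch of pages, the tagging can be scripted. The following is a rough sketch, assuming the pages are static HTML files with an ordinary <head> element; the function name add_robots_meta is made up for illustration, and the regular expressions are a shortcut that will not cope with every possible page.

```python
import re

# The tag to insert: noindex keeps the page out of the index,
# nofollow asks crawlers not to follow the links on it.
ROBOTS_META = '<meta name="robots" content="noindex, nofollow">'

_HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
_HAS_ROBOTS = re.compile(r"""<meta\b[^>]*\bname\s*=\s*["']robots["']""", re.IGNORECASE)


def add_robots_meta(html):
    """Return html with the robots meta tag placed right after the opening <head> tag.

    Pages that already carry a robots meta tag are returned unchanged.
    """
    if _HAS_ROBOTS.search(html):
        return html
    match = _HEAD_OPEN.search(html)
    if match is None:
        raise ValueError("no <head> element found")
    return html[:match.end()] + "\n" + ROBOTS_META + html[match.end():]
```

To process a whole folder, you could read each file, pass its text through add_robots_meta and write the result back. Because pages that already have a robots meta tag are left alone, running the script twice does no harm.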
3. Blocking the links with NoFollow tags
The third and easiest way to keep Google from crawling certain pages/URLs is to add the rel="nofollow" attribute to the links that point to that particular URL. A nofollow link tells Google that its bots needn’t follow the link and crawl the page behind it, as the content may not be useful. Even though this sounds technically okay, sometimes Google can “see through” nofollow, for instance when it reaches the same URL through another link that is not marked nofollow.
In my opinion, adding nofollow to links is like blocking entry to a room with a glass wall. Google’s bots won’t enter the room or index the contents there, but they can very well see through the glass wall and will have an idea about what’s in there.
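As an illustration of the link-level approach, here is a small sketch that adds rel="nofollow" to every link in a piece of HTML whose href exactly matches a given URL. The function name nofollow_links and the example URLs are hypothetical, and the regex-based matching is a simplification that assumes quoted attribute values.

```python
import re

_ANCHOR = re.compile(r"<a\b[^>]*>", re.IGNORECASE)
# The lookbehind keeps attributes such as data-href or data-rel from matching.
_HREF = re.compile(r"""(?<![\w-])href\s*=\s*(["'])(.*?)\1""", re.IGNORECASE | re.DOTALL)
_REL = re.compile(r"""(?<![\w-])rel\s*=\s*(["'])(.*?)\1""", re.IGNORECASE | re.DOTALL)


def nofollow_links(html, target_url):
    """Add rel="nofollow" to every <a> tag in html whose href equals target_url."""

    def fix(match):
        tag = match.group(0)
        href = _HREF.search(tag)
        if href is None or href.group(2) != target_url:
            return tag  # link points somewhere else: leave it alone
        rel = _REL.search(tag)
        if rel is None:
            # No rel attribute yet: append one before the closing ">".
            return tag[:-1].rstrip() + ' rel="nofollow">'
        values = rel.group(2).split()
        if "nofollow" in (value.lower() for value in values):
            return tag  # already marked
        values.append("nofollow")
        quote = rel.group(1)
        new_rel = "rel=" + quote + " ".join(values) + quote
        return tag[:rel.start()] + new_rel + tag[rel.end():]

    return _ANCHOR.sub(fix, html)
```

For example, nofollow_links(page_html, "/private/report.html") would mark only the links to that one page and keep any rel values a link already has. This covers only the links on your own pages; links from other websites stay as they are, which is one reason the glass wall stays see-through.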
So there you have it. Three effective (and ineffective) ways to block Google from crawling parts/sections of your website. Each one works well under the right conditions and circumstances, and sometimes you even have to use all three together, based on the nature of the pages you’re dealing with.
Do you know of any other possible ways?