
Duplicate content? Canonical tag to your rescue


Duplicate content has been discussed and fought over on almost every panel I’ve attended.
SEOs seem to like the subject very much; I have no clue as to why.

We’ve discussed duplicate content issues here.

Today Google, Yahoo and MSN have come up with a new “idea” to help webmasters fight duplicate content.


First off – How does duplicate content occur on a website?

- When more than one page has the same content.

- When more than one page is similar in content.

- When a page is repeated on the website due to technical glitches.

- When dynamically generated pages repeat the same content over various events.
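The first case in the list, several pages carrying the same content, is the easiest one to check for yourself. Here is a minimal sketch, assuming you already have the text of each page in hand; the URLs and the `find_duplicate_groups` helper are made up for illustration, and it only catches identical text, not pages that are merely similar.

```python
import hashlib
from collections import defaultdict

def find_duplicate_groups(pages):
    """Group URLs whose text is identical after case and whitespace normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        # Collapse runs of whitespace and ignore case before hashing.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]

# Hypothetical pages, used only to illustrate the idea.
pages = {
    "http://example.com/post": "Welcome to my page.",
    "http://example.com/post?print=1": "Welcome  to my page.",
    "http://example.com/about": "About me.",
}
print(find_duplicate_groups(pages))
# → [['http://example.com/post', 'http://example.com/post?print=1']]
```

Search engines use far more sophisticated comparisons than this, so treat it as a quick self-audit rather than a picture of what Google does.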

In such cases, the search engines, on seeing the same content on different pages, “suspend” the value of those pages and take time before showing either of them on the results page for a live search.

Ex: Let’s say we have two pages on a website.

URL 1 – http://yoursite.com/yourpage

URL 2 – http://yoursite.com/yourpage?bgcolor=blue

Let’s assume that the second page is the same as the first except for the background color, which is controlled dynamically through CSS styles.
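To see that the two URLs really point at the same content, you can strip the parameters that only affect presentation and compare what is left. This is a rough sketch using Python's standard `urllib.parse`; the `PRESENTATION_PARAMS` set and the function name are my assumptions, and you would have to decide for your own site which parameters are safe to ignore.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameters change how the page looks, not what it says.
PRESENTATION_PARAMS = {"bgcolor"}

def strip_presentation_params(url):
    """Return the URL without presentation-only query parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    ]
    # Rebuild the URL; the fragment is dropped because crawlers ignore it.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

url_1 = "http://yoursite.com/yourpage"
url_2 = "http://yoursite.com/yourpage?bgcolor=blue"
print(strip_presentation_params(url_2) == url_1)  # → True
```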

Now, when one or more other websites link to the two URLs with similar or identical anchor text, Google will find it difficult to decide which page to show.

In such situations, Google might take its own time to decide which page to list in the results for a related search. It’s more like a confused state. (Bots aren’t always smart, you see.)

That is why a website should contain as little duplicate content as possible.

It might not be possible to avoid duplicate content completely, but the idea is to keep it to a minimum so that it causes the least confusion.

 

How to curb duplicate content?

In the above example, there is more than one way for Google to work out that one page is better than the other.

1 – Google runs its own calculations to analyze the content and decide which page makes more sense.

2 – Google can also check external factors such as incoming links, anchor text and the content surrounding the links, and decide which of the two pages is more “popular” or “preferred”.

 

How do Search Engines deal with Duplicate content?

Search engines take their own time until they have evidence of why one page is better than the other before they actually display it in the live search results. Meanwhile, they simply carry on with the other results in the queue and hold back the “possible duplicate content” from the live results.

So what is a Canonical tag? How does it help in dealing with duplicate content?

A canonical tag is a simple piece of HTML code (a <link> element with rel="canonical") that you insert into the <head> section of a duplicate page. It lets the search engines know that they are on a duplicate page and points them to the URL where the original content lives.

So let’s pick an example.

Page 1 – http://www.google.com/duplicate-content.html (Original source content)

Page 2 – http://www.google.com/duplicate-content-800x600.html (Duplicate content)

Now, you add the canonical link tag to the duplicate page, Page 2.

<link rel="canonical" href="http://www.google.com/duplicate-content.html" />

So what happens now? When Google’s bots land on the duplicate page (the page with the canonical tag), they do not give weight to the content on that page; instead they treat the original URL named in the canonical tag as the preferred version.
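If you want to check which URL a page declares as its original, you can read the tag back out of the HTML. Below is a small sketch using Python's built-in `html.parser`; the `CanonicalFinder` class and `find_canonical` function are names I made up, and this is only a way to verify your own pages, not a description of how any search engine parses them.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            # Strip stray whitespace around the URL.
            self.canonical = attributes["href"].strip()

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

page_2 = (
    "<html><head><title>Duplicate</title>"
    '<link rel="canonical" href="http://www.google.com/duplicate-content.html" />'
    "</head><body>Same content</body></html>"
)
print(find_canonical(page_2))
# → http://www.google.com/duplicate-content.html
```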

Where/which pages should you add a canonical tag?

Technically, any page that you think will repeat the content of a different page.
For example, http://www.yoursite.com/page1.php?sessionid=12+author=ben should be canonically tagged to http://www.yoursite.com/page1.php.
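For a page like this one, the canonical URL is just the address with its query string removed, so the tag can be built in a couple of lines. This is a sketch under one big assumption: that none of the query parameters selects different content. If a parameter such as a post ID does change the content, dropping the whole query string would be wrong. The `canonical_link_tag` name is hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

def canonical_link_tag(url):
    """Build a canonical <link> tag that points at the URL without its query string."""
    parts = urlsplit(url)
    # Assumption: every query parameter is noise (session IDs and the like).
    clean_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="%s" />' % escape(clean_url, quote=True)

print(canonical_link_tag("http://www.yoursite.com/page1.php?sessionid=12+author=ben"))
# → <link rel="canonical" href="http://www.yoursite.com/page1.php" />
```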

How does the Canonical Tag help WordPress blogs?

In my opinion, canonical tags should not be automated on WordPress blogs. Although WordPress produces several kinds of possible duplicate content, canonical tags may not work efficiently there because they require some manual checking.

For example, on WordPress blogs, tags and archives create possible duplicate content, but neither can be effectively controlled by canonical tags. In such situations, meta noindex tags are far more effective.

But in cases like series posts (101-tips-part1.html and 101-tips-part2.html), where the content is strikingly similar, one may manually insert canonical tags to good use.

Otherwise, I’d stay away from automation, at least for now.
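The manual approach can still be written down as a small lookup that you maintain by hand. The sketch below is in Python rather than WordPress PHP, purely to show the decision: noindex for tag and archive pages, a canonical tag only for pairs you have checked yourself. The domain, the page type names, the choice of part 1 as the original and the `head_tag_for` helper are all assumptions of mine.

```python
from html import escape

# Page types that get a meta noindex tag instead of a canonical tag.
NOINDEX_PAGE_TYPES = {"tag", "archive"}

# Hand-checked pairs: duplicate path -> original URL (hypothetical domain).
MANUAL_CANONICALS = {
    "/101-tips-part2.html": "http://yoursite.com/101-tips-part1.html",
}

def head_tag_for(page_type, path):
    """Return the extra <head> tag for a page, or an empty string if none applies."""
    if page_type in NOINDEX_PAGE_TYPES:
        return '<meta name="robots" content="noindex" />'
    original = MANUAL_CANONICALS.get(path)
    if original:
        return '<link rel="canonical" href="%s" />' % escape(original, quote=True)
    return ""

print(head_tag_for("tag", "/tag/seo/"))
# → <meta name="robots" content="noindex" />
print(head_tag_for("post", "/101-tips-part2.html"))
# → <link rel="canonical" href="http://yoursite.com/101-tips-part1.html" />
```

Because the mapping is edited by hand, nothing gets a canonical tag unless someone has actually compared the two pages.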


Written by Mani Karthik

Blogger, Web / Social Media Enthusiast & SEO with Flip Media. I'm always on the learning curve. Love to meet new people, feel free to befriend me.


22 Responses

  1. Fantastic Mani, thanks. On our site we often have to duplicate content to fit in with how users would logically search through menus; it’s great to see that we can still keep user-friendliness without harming our rankings!!!

  2. Duplicate content within a site is discounted nowadays.

    For WordPress, the plugin HeadSpace2 has no other matches in terms of SEO it provides. You can configure tags, archives, categories, subpages, login pages, author pages or anything you want.

  3. Cool… Has been hearing a lot on these canonical tags and this is an eye opener.

    Thanks for the info

  4. I think the canonical tag will prevent search engines from getting confused. But hopefully they have put safeguards in place to prevent spamming. I’m afraid that spammers could use such a new convention as this to their advantage.

  5. It will certainly have a pretty huge impact, most profit from it will be for script based sites and people running affiliate programs with IDs at the end of the URL…looks a bit like Noindex 2.0.

  6. Mani,

    The All in One SEO Plugin can add a noindex to the category and tag pages, so what is the problem with automating the canonical tag? Would a combination of noindex via the All in One SEO Pack and an auto-canonical tag solve the issue of duplicate content to a good extent?

    Recently, I found my search results page coming up in Google results, will the canonical tag solve such issues too?

    see this link:
    http://omninoggin.com/wordpress-posts/automatically-deal-with-duplicate-content-in-wordpress/

  7. Mani,

    users can use the plugin available @ http://yoast.com/canonical-url-links/ for curbing duplicates using canonical tag automatically.

    It supports WordPress and is also available for other CMSs.

  8. Hi Mani.

    Nice post, well explained.
    I read this news on searchengineland but was confused about what it is and what the blog writer wanted to say, but you solved all my problems with it.

    Nice Post. You are really posting nice information.

  9. What is the difference between using “canonical” and using the “noindex meta tag” on duplicate pages?

  10. I didn’t have any knowledge of this tag, but this is definitely helpful for me.

    Lynda

  11. Mani,
    I am using “nofollow” for that kind of duplicate content. Do I need to include “canonical” also to the same link?

    • Mani Karthik

      Aloy, it would be helpful if you use the canonical tag, because nofollow and meta noindex tags can still mean that the content is being scanned.

  12. I believe canonical tag comes by default with the latest WP upgrade but I can still see a few new pages indexed.. site.com/page-101/comment-page-1

    Is it what you are talking?
    That the new page will not be indexed by default.. It shows up in site:search though…
    :P

  13. I used to publish my articles, but now I wonder whether I should stop doing this because of the risk of a duplicate content penalty. Should I stop publishing my articles on article directories?

  14. Ashik

    I’ve found something wrong in your example file. The tag is supposed to contain the duplicate-content-800x600.html file (page 2).

    like the one below

    thanks.
