Dec 11 / Nick Teel

The Haunting Tales of Duplicate Content

If you are a brokerage or agent in today’s real estate web landscape, there is something terrifying looming over your head. It’s creeping under your bed, lurking in the darkest corners of your room, hiding in the depths of your closets, and that thing is Duplicate Content. We may not know it exists, but it haunts our every web move. For real estate professionals, duplicate content is almost a guarantee. Unless you can confine each of your listings to a single web page, duplicate content could be harming your search engine performance.

Until recently, it wasn’t much of a problem, and it only spooked us once in a while when we thought we caught a glimpse of it. Then Google had to go and rile us up with its haunting tales of websites being negatively affected by duplicate content issues. With the release of Google’s Penguin algorithm update, everyone’s attention has shifted to making sure their site does not violate Google’s “quality guidelines,” and our Boogey Man, duplicate content, is a big part of that. Relying on search engines to determine the intent behind your duplicate pages may not be the best way to go about it, especially when third-party sites, multiple agent sites, and IDX systems all generate content very similar to your own. Where do the lines start to blur?

So, What’s the Big Deal?

Duplicate content, as loosely defined by Google, occurs when different URLs on your site display overwhelmingly similar content (the same keywords, text, photos, video, etc.). Search engines like Google are getting smarter about how they index your web pages, and when presented with multiple pages of very similar content, they make an algorithmic determination as to which page is preferred. That choice may not put your best foot forward on the search results pages; the page they choose may not be the page you want displayed. Plus, the Penguin update set a precedent that sites may start to be penalized for carrying copious amounts of the same content throughout the site. Google suggests that Penguin only applies to sites that use duplicate content to attempt to manipulate the search results; however, I would rather err on the side of caution.

Common Reasons Behind Duplicate Content

Duplicate content gets generated in various ways. For example, many brokerages still dynamically generate URLs with scripts, which produces long, character-ridden query strings in the URL.

See samples below:

www.example.com/products.php?id=546416&title=Page_Name&action=blank

www.example.com/products/item.php?id=546417&title=Page_Name2&action=blank

Both URLs show the exact same information but are different addresses. A single listing page could have four or five different URLs, depending on how the consumer navigated to that page through the brokerage website.
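To see how this happens, here is a minimal sketch, assuming a Python/Flask stack purely for illustration (the script-driven sites in question are often PHP-based), in which two script-style routes render the identical listing page. To a crawler, these are two separate URLs carrying the same content.

from flask import Flask

app = Flask(__name__)

# One listing's markup, reachable at two different addresses.
LISTING_PAGE = "<h1>123 Main St</h1><p>4 bed / 3 bath colonial...</p>"

# Both routes below return identical content, so a search engine sees
# two distinct URLs with the same page -- classic duplicate content.
@app.route("/products.php")
@app.route("/products/item.php")
def listing():
    return LISTING_PAGE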

Even if you don’t use scripts to generate your URLs, you can still run into duplicate content concerns. Whether it is poor organization of your site, or multiple pages being created for the same property through your agent pages and your main framework, content gets duplicated in many ways, often without you being privy to it. For example, the “index.html” page of a website is usually the same page displayed when a visitor accesses the site without specifying a filename. So, “http://www.example.com/index.html” and “http://www.example.com/” are usually the same page, showing the same content.

So, How Do We Remedy Duplicate Content?

There are a few steps you can take to help address duplicate content issues, and ensure that consumers visiting your real estate site see the content you want them to.

  1. One way to solve Duplicate Content issues is by using 301 redirects:

If you have the time and resources, you can set 301 redirects (SEO-friendly redirects) on your duplicate pages that send visitors and crawlers to the correct URL. A 301 redirect means that the content of the page has permanently moved somewhere else; the search engine will follow the path to the new URL and de-index the previous URL from its cache. 301s are typically a little tougher to implement, and depending on your CMS (Content Management System) or web provider, they may cost more than some other techniques.
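As a rough illustration, again assuming Flask (on a PHP/Apache stack you would more likely use an .htaccess rewrite rule), a 301 redirect from a duplicate script-style URL to a hypothetical preferred address might look like this:

from flask import Flask, redirect

app = Flask(__name__)

@app.route("/products/item.php")
def duplicate_listing_url():
    # Permanently redirect the duplicate address to the preferred URL
    # ("/listings/123-main-st" is a made-up example). Crawlers follow
    # the 301, index the destination, and drop this URL over time.
    return redirect("/listings/123-main-st", code=301)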

  2. Quickly take care of Duplicate Content using the “Rel=Canonical” Link:

In the past few years, Google declared that it would recognize a new suggestive tag which, when inserted into your web page code, lets you state which URL you want to be the “preferred” version for that subject matter. This tag is called rel=canonical. A rel=canonical tag is fairly easy to implement because it doesn’t involve making server-side changes; you merely add the link element to the <head> section of any suspect duplicate content pages.

This is what the rel=canonical should look like in your <head> section:

<link rel="canonical" href="http://www.example.com/Preferred_Page.html" />

If you have multiple URLs that resolve to the same content, Google and Bing will use the one declared as canonical as the actual URL for indexing purposes. Typically, when presented with multiple pages carrying the same content, a search engine will attempt to pick the URL it feels is the authority for that information. The canonical designation takes that determination out of the search engine’s hands and puts the power in yours. Search results will then display the defined canonical URL instead of all the variants found on your website.


Why we like Canonicals

Not only is the canonical tag typically easier to implement, it is also an SEO-friendly alternative when dealing with duplicate content. Much like the 301 redirect, the rel=canonical passes the same amount of ranking power from one page to the other: the content value and link metrics of the duplicate pages flow back to the specified preferred page. Ridding the search engine results of multiple copies of your same content also makes for a better user experience, and the better the experience a site offers users, the better the site usually does on the search engines. Just remember that Google and Bing treat the canonical as a suggestion, not a directive like the 301 redirect.

What About Duplicate Content on Other Domains?

Even when you have duplicate content on multiple domains, such as www.example.com/stuff, www.example2.com/stuff, or stuff.example2.com, canonical links have an answer for that! Several months after Google announced the canonical link tag, they also announced support for it across other domains. Google had this to say:

There are situations where it’s not easily possible to set up redirects. This could be the case when you need to migrate to a new domain name using a web server that cannot create server-side redirects. In this case, you can use the rel=”canonical” link element to specify the exact URL of the domain preferred for indexing. While the rel=”canonical” link element is seen as a hint and not an absolute directive, we do try to follow it where possible.
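As a concrete sketch of the cross-domain case, a page served from the secondary domain can simply emit a canonical link pointing at the primary domain. The Flask framing below is an assumption for illustration; the domains and paths come from the examples above.

from flask import Flask

app = Flask(__name__)

@app.route("/stuff")
def stuff():
    # Served from www.example2.com; the link element points across
    # domains to the preferred URL on www.example.com, so search
    # engines consolidate indexing on the primary domain.
    return '''<html>
  <head>
    <link rel="canonical" href="http://www.example.com/stuff" />
  </head>
  <body>...the same content as www.example.com/stuff...</body>
</html>'''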

The Problem Solved?

This is surely a step in the right direction for solving duplicate content issues, and it puts the power of resolving them into the hands of the webmasters themselves, rather than letting the search engine, which usually does not have enough information, try to figure out the correct URL. In an industry where listing content is duplicated twentyfold, we need all the help we can get. One question remains: is Google smart enough to distinguish your content from the third-party sites republishing your listings? Which raises another: should your listings on third-party sites be canonically linked to your property details pages? You are the authority for that information, and you deserve the value that your listings provide.