If you are dedicated to creating original, unique content for your website, one of the biggest disappointments you will encounter is people who steal or scrape it. Rewriters also target original-content websites, and sometimes they benefit from it. In other words, someone else could profit from your work.
Offsite duplicate content
Most of the time, the problem behind duplicate content is spam. Scrapers create auto-blogs where they repost your content as soon as it is published. If the Google bots crawl and index their copies before yours, you will lose ranking in the SERPs. You will be lucky if your version is indexed first.
The best course of action, therefore, is to be proactive about your website's content. Here are some of the things you can do to protect its authenticity with a duplicate content checker such as PlagSpotter.
Contact the spammer
Spammers are annoying: they scrape your content and then take the credit for it. When you discover that someone has taken your content, consider contacting him or her as the first line of confrontation. Some scrapers are new bloggers who have no idea of the rules, and if you kindly ask them to remove your content from their website, some may be willing to oblige.
Contact the host
Some bloggers are simply selfish or rude: they will either ignore your takedown request or keep the content anyway. Many of the platforms where these sites are hosted have rules against spammers and content thieves. If your takedown request is ignored, contact the host, and they can take the offending site down for you.
Go to court
Copyright law protects all original published works, and Google supports legal action in such matters. Through the court system, you can have a scraper punished for these unethical practices. Bear in mind that litigation can be long and expensive; if your business interests are at stake, however, it may be worth pursuing.
Onsite duplicate content
Duplicate content does not only come from outside. In some instances, it appears within the same website.
Canonicalization
Canonicalization is a big problem on most websites. Simply put, it means users and search engines can reach the same page from multiple domains or URLs. For instance, your landing page could be accessible from the root of the domain, from the home folder, from an index file, and so forth.
To the search engines, these are duplicate content: the variants compete for relevance to the same queries, and none of them wins. You can fix this by writing 301 redirects that send every variant to the preferred version of each page throughout the website.
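As an illustration, here is a minimal redirect sketch, assuming an Apache server with mod_rewrite enabled; example.com is a placeholder domain, so adjust the host names and preferred version to match your own site.

    # .htaccess at the site root: send every URL variant to one canonical version.
    # Assumes Apache with mod_rewrite; example.com is a placeholder domain.
    RewriteEngine On

    # Permanently (301) redirect the bare domain to the www version.
    RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
    RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]

    # Collapse /index.html onto the root URL.
    RewriteRule ^index\.html$ / [R=301,L]

With rules like these in place, a visitor or crawler requesting example.com/index.html ends up at https://www.example.com/, so the search engines see a single page rather than several competing copies.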
Printer friendly pages
Many site owners create printer-friendly versions of their pages to give visitors a better experience. If the search engines can index these pages, they become duplicates of the main site. The same applies to mobile websites built on older frameworks like WAP.
The best way to deal with this kind of duplicate content is to create a robots.txt file covering those sites and pages. This tells the search engines not to crawl them while still giving your visitors the user-friendliness they need.
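For example, here is a minimal robots.txt sketch; it assumes the printer-friendly pages live under a /print/ directory and the old WAP site under /wap/, both placeholder paths you should adjust to your own site layout.

    # robots.txt, placed at the root of the domain.
    # /print/ and /wap/ are assumed locations; adjust to your site's layout.
    User-agent: *
    Disallow: /print/
    Disallow: /wap/

Any crawler that honors the robots exclusion standard will skip those directories, while your visitors can still reach the printer-friendly and mobile versions directly.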