What does Google consider duplicate content?

Find and avoid duplicate content


What is duplicate content and why is it dangerous?

According to Google, duplicate content is a block of content that is identical, or at least very similar in large parts, to other content on the same domain (internal) or on other domains (external). This is a problem for the search engine because it wants to answer each query with a broad range of relevant results, not with several copies of the same text.

Google's algorithm resolves this by picking one of the versions and showing only that one to the user. The danger for you as the website operator is that the result shown may not be the page you wanted to lead potential customers to. In the worst case, too much duplicate content costs you a significant number of ranking positions.

Find duplicate content

The easiest way to check whether your domain has duplicate content is to run a Google search with search operators.

Option 1: Duplicate content due to URL structure

Enter site:domain.de in the Google search field (no space after the colon and no quotation marks). Then go to the last page of the search results. A notice there saying that entries very similar to those already displayed have been omitted is a first indication that your site has duplicate content.

Option 2: Duplicate content due to the same text passages

Take a text passage from your website and paste it into the Google search field in quotation marks. If the notice about similar entries appears, this is a sign of duplicate content.

Tip: With both options you can display the results that Google filtered out. Click on "repeat the search with the omitted results included". This gives you an initial indication of which pages are causing duplicate content.
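The two queries above can also be assembled programmatically, which helps when many domains or passages have to be checked. The following is a minimal sketch: the function names are hypothetical, and the only external assumption is the public Google search address with its q parameter. It only builds the query strings and links; it does not fetch or evaluate results.

```python
from urllib.parse import urlencode


def site_query(domain: str) -> str:
    """Option 1: the site: operator, written without a space after the colon."""
    return f"site:{domain}"


def phrase_query(passage: str) -> str:
    """Option 2: an exact-phrase query, i.e. the passage in quotation marks."""
    # Collapse whitespace and drop inner quotes so the phrase stays one unit.
    cleaned = " ".join(passage.split()).replace('"', "")
    return f'"{cleaned}"'


def search_url(query: str) -> str:
    """Turn a query string into a link that can be opened in the browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(search_url(site_query("domain.de")))
    print(search_url(phrase_query("A text passage taken from your website")))
```

Open the printed links in a browser and look for the notice about omitted similar entries as described above.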

Causes of Duplicate Content

Duplicate content often arises when content is expanded or restructured. Typical sources of error include numbered overview pages or shop systems that add filter parameters to the URL. But duplicate content can also arise outside of your own domain, for example when content is distributed in social networks.

Different URLs

A major cause of duplicate content is inconsistent URL handling and other technical problems. For Google this means: a given piece of content exists once, but it can be reached under different domain names and/or different URLs.

  1. The page can be reached both with and without "www".
  2. The start page can be reached at several addresses, for example domain.de, domain.de/home, domain.de/index.php, etc.
  3. Appended parameters and differences in upper and lower case are not intercepted technically and do not produce an error page. Instead, these new URLs serve the content of the original page.
    Original page: www.domain.de/inhalt
    Duplicate content:
    - www.domain.de/inhalt?any%20text
    - www.domain.de/inhalt?Sign_After_select
  4. Inserting additional directories that do not exist still delivers the page instead of an error.
    Example: www.domain.de/directory/content
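The URL variants in points 1 to 3 can be detected in a crawl or log export by reducing every URL to a normalized key and grouping URLs that share the same key. The following is a minimal sketch under stated assumptions: the names INDEX_NAMES, normalize and group_variants are hypothetical, the list of start-page names is only an example, and it assumes that query parameters and letter case do not change the content (which is exactly the faulty behaviour described above, not a general rule). Point 4 cannot be detected from the URL alone and is not covered.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Assumption: path names that serve the start page on this (hypothetical) site.
INDEX_NAMES = {"home", "index.php", "index.html"}


def normalize(url: str) -> str:
    """Reduce a URL to a key that ignores www, case, parameters and index names."""
    if not url.startswith(("http://", "https://", "//")):
        url = "//" + url  # lets urlsplit recognise the host of scheme-less URLs
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # The query string (parts.query) is deliberately dropped.
    segments = [s for s in parts.path.lower().split("/") if s]
    if segments and segments[-1] in INDEX_NAMES:
        segments.pop()
    return host + "/" + "/".join(segments)


def group_variants(urls):
    """Return only those normalized keys that are reached by more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    crawl = [
        "www.domain.de/inhalt",
        "domain.de/Inhalt?any%20text",
        "http://domain.de/index.php",
        "domain.de",
    ]
    for key, members in group_variants(crawl).items():
        print(key, "<-", members)
```

Each printed group lists URLs that probably serve the same content and are therefore candidates for a redirect or for consolidation onto one address.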

Session IDs

Session IDs are another common cause of duplicate content. They identify a user and link that user to data stored on the server (e.g. the shopping cart). Each user receives their own identification number. If this ID is appended to the URL, the same page becomes reachable under a different URL for every visitor.
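To see how session IDs multiply URLs, it helps to strip them and compare what remains. The sketch below assumes the session ID is carried as a query parameter; the parameter names in SESSION_PARAMS and the function name strip_session_id are hypothetical examples and have to be adapted to the shop system in use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: typical parameter names; the real name depends on the system.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_id(url: str) -> str:
    """Return the URL without session parameters; other parameters are kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    visitor_a = "https://www.domain.de/cart?sid=abc123&item=42"
    visitor_b = "https://www.domain.de/cart?sid=xyz789&item=42"
    # Both visitors see the same page, only the session ID differs.
    print(strip_session_id(visitor_a) == strip_session_id(visitor_b))
```

If two URLs become identical after stripping, they differ only in the session ID and serve the same content to a crawler.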

Print versions

Some website owners offer a print version of their pages. Here, too, duplicate content tends to creep in, because the print version repeats the same content under a separate URL.

Paginated categories

Webmasters like to use paginated categories to present the products on their website clearly. This is certainly useful from the user's point of view. The disadvantage of pagination, however, is that the successive pages are usually very similar. For Google this already amounts to duplicate content, especially since most webmasters also display the same text (e.g. the category description) on every pagination page.
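How similar two pagination pages are can be estimated locally before relying on Google's judgment. The following sketch compares the visible text of two pages using word shingles and the Jaccard coefficient. This is a common technique for near-duplicate detection chosen here for illustration; it is not a statement about how Google measures similarity. The function names and the shingle size of three words are assumptions.

```python
def shingles(text: str, size: int = 3) -> set:
    """Split text into overlapping word groups (shingles) of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(text_a: str, text_b: str, size: int = 3) -> float:
    """Share of shingles common to both texts: 0.0 = disjoint, 1.0 = identical."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0  # two empty texts count as identical
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Hypothetical page texts: same category description, different products.
    page_1 = "Our shoes are made from fine leather. Product A. Product B."
    page_2 = "Our shoes are made from fine leather. Product C. Product D."
    print(round(jaccard(page_1, page_2), 2))
```

A high value for two pagination pages suggests that the shared category description dominates the page text. Which value counts as too high is a judgment call, since no threshold is given in this article.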

Duplication on external sites

So far, only internal factors for duplicate content have been discussed. Another point that should not be underestimated, however, is the duplication of your own page content on other websites.

Tip: Avoid posting texts from your website unchanged on Facebook and similar platforms. Even though the content is technically on a different domain, from Google's point of view this is duplicate content.

Avoid penalties for duplicate content

Not every form of duplicate content is penalized by Google. The search engine clearly differentiates between relevant and less relevant duplicate content. However, to optimize your website as well as possible and exploit its full potential, it is advisable to eliminate identical content wherever you can.

There is no one-size-fits-all recipe for avoiding duplicate content. On the content side, unique texts are what serve the user: instead of simply copying product descriptions from the manufacturer, shop operators should offer their customers unique texts with real added value. Technical problems, by contrast, which may even prevent a good ranking in search engines, can only be identified and corrected through a comprehensive on-page analysis.