Have you found duplicate content on your site? No worries, it’s all right. There are many reasons why a website ends up with several URLs that lead to the same page, or with duplicate content on different URLs. Moreover, it's common practice for online stores to have content like that. Although this is not unusual and will not necessarily hurt your Magento SEO, you still need to resolve the problem. Here are a few reasons why.
Why Solve the Magento Duplicate Content Issues?
- To point out a URL that you want users to see in search results
- To help search engines assign all the duplicate URLs to a canonical one
- To simplify tracking metrics for a product by consolidating its duplicate pages under a single URL
- To manage the syndicated content posted on other resources and assign a preferred URL
- To exclude the duplicate pages from crawling and let Google spend time crawling new pages
So, how do you solve the duplicate issue for good?
The easiest and most powerful way is to set canonical URLs in your Magento.
What is a Magento Canonical URL?
A Magento 2 canonical URL is the address chosen as the ‘preferred’ one for search engine indexing. You may also hear users talk about Magento canonical tags: a canonical tag is a <link> element with the rel="canonical" attribute, placed in the <head> of a page, that tells search engines which URL should receive the search value.
Using canonical URLs in Magento is necessary for pages with duplicate or very similar content. This way, you can indicate which page is the main one.
Let’s say you have these pages for the same product:
- site.com/dresses/blackzaradress.html
- site.com/occasions/blackzaradress.html
- site.com/color/black/blackzaradress.html
- site.com/blackzaradress.html
If you don’t add canonical URLs in Magento 2, search engines will automatically choose one of them as canonical considering it the most relevant. This means you won’t be able to control the choice unless you make some changes.
Therefore, you need to tell search engines which one is canonical. For duplicate URLs that no longer need to be reachable, you can also set 301 redirects to the canonical one.
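To make the idea concrete, here is a minimal Python sketch that produces the canonical tag every duplicate page from the example above would carry. The https scheme and the choice of site.com/blackzaradress.html as the preferred URL are assumptions made for illustration, and `canonical_link_tag` is a hypothetical helper, not part of Magento.

```python
from html import escape

# Assumption: the shortest URL is the one we want in search results.
CANONICAL_URL = "https://site.com/blackzaradress.html"

# The duplicate addresses from the example above (https scheme assumed).
DUPLICATE_URLS = [
    "https://site.com/dresses/blackzaradress.html",
    "https://site.com/occasions/blackzaradress.html",
    "https://site.com/color/black/blackzaradress.html",
]


def canonical_link_tag(canonical_url: str) -> str:
    """Return the <link> element to place in the <head> of a duplicate page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


if __name__ == "__main__":
    # Every duplicate page points at the same preferred URL.
    for url in DUPLICATE_URLS:
        print(url, "->", canonical_link_tag(CANONICAL_URL))
```

Each duplicate carries the same tag, so search engines receive one consistent signal about which address to index.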
How to Add Canonical URLs in Magento 2
Log in to the Admin Panel, go to Stores > Settings > Configuration:
Expand the Catalog drop-down menu and choose Catalog. Then open the Search Engine Optimization section:
Make the next changes:
If you need Google to index the pages with complete category URL paths only:
Use Canonical Link Meta Tag for Categories – ‘Yes’;
Use Canonical Link Meta Tag for Products – ‘No’;
If you want Google to index the product pages only:
Use Canonical Link Meta Tag for Categories – ‘No’;
Use Canonical Link Meta Tag for Products – ‘Yes’;
If you want Google to index categories and products, enable both options:
Use Canonical Link Meta Tag for Categories – ‘Yes’;
Use Canonical Link Meta Tag for Products – ‘Yes’;
Don’t forget to save the changes and clear the cache at the end. Or you can try out one of the Magento 2 canonical plugins.
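After saving the settings and clearing the cache, it is worth checking that the tag really appears in the page source. The sketch below is one possible way to do that with Python's standard library: it parses an HTML string and returns the href of the first rel="canonical" link. The `find_canonical` helper and the sample markup are illustrative assumptions; in practice you would feed it the HTML of your own product or category page.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    sample = (
        '<html><head><link rel="canonical" '
        'href="https://site.com/blackzaradress.html" /></head><body></body></html>'
    )
    print(find_canonical(sample))
```

If the function returns None for a page where you enabled the option, the cache is a likely first suspect.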
How Else You Can Add a Canonical URL in Magento 2
In addition to the methods described above, there are several other options for how to mark a link as canonical:
- rel=canonical <link> tag. Add this tag with the canonical link in the code for duplicate pages.
- rel=canonical HTTP header. Send a rel=canonical header in your page response.
- Sitemap. Define canonical URLs in your sitemap.
- 301 redirect. Set up a 301 redirect to indicate the canonical page for Google if the duplicate page is outdated.
- Leverage SEO toolkits for Magento. The tools offer easy detection and resolution of all Magento duplicate content issues.
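The HTTP header option from the list above is useful for files that have no HTML <head>, such as PDFs. As a rough sketch, the header is a Link header with the URL in angle brackets followed by rel="canonical". The helper name and the PDF URL below are hypothetical; how you attach the header depends on your web server or application.

```python
from typing import Tuple


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Return the (name, value) pair of a rel=canonical HTTP response header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


if __name__ == "__main__":
    # Hypothetical downloadable catalog that also exists under other URLs.
    name, value = canonical_link_header("https://site.com/downloads/catalog.pdf")
    print(f"{name}: {value}")
```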
Can Google Choose the Magento Canonical URL for You?
You can specify a certain Magento page as canonical yourself. But in some cases, Google can choose another page as canonical instead. Google uses various tools to determine canonical URLs, selecting the most comprehensive and valuable content when multiple similar pages exist. Apart from doing content analysis, Google evaluates the page's security protocols (https), sitemaps, and "rel=canonical" labels.
A Magento 2 CMS page that was designated as the canonical URL by Google tends to be crawled more frequently. Typically, Google search results prominently feature canonical URLs. But there are exceptions, such as when Google opts for the mobile version (e.g., https://m.example.com/news/) over the canonical URL (e.g., https://example.com/news/), prioritizing the user experience for mobile users.
How to Learn Which Magento URL is Considered Canonical?
Google's URL Inspection tool can provide you with information about your URL canonicalization. Here are a few notes that you should know:
- You have to own the URL that you want to test.
- Please make sure that you are using the right account.
- If the testing page has duplicates, you will see the info about the canonical URL in the report.
- It is possible to test both AMP and non-AMP URLs.
Sometimes the canonical URL is in a property that you don't own. Here are some reasons why this issue can appear:
- Mistakes in site content localization. Check the official localization guidelines.
- Incorrect canonical tags. See the section above on how to add canonical URLs in Magento 2.
- Incorrect server settings. Contact your hosting to solve this problem.
- Hacker attack. Sometimes attackers use a 301 redirect or insert a cross-domain rel="canonical" link into the HTML <head> to mark a malicious URL as canonical.
- External websites copy your content. If you are sure that a third-party website is hosting a full or partial duplicate of your content, please leave a request to Google.
Types of Magento Duplicate Content
Speaking of full and partial duplicates, let's explore when they can appear on your own website and how you can address these situations without canonicalization.
Full Duplicates in Magento
In the case of full duplication, the content on two or more pages is almost identical. The most common example of full duplicates in Magento is when you include the same product in different categories. For example:
http://www.site.com/jewellery/necklace.html
http://www.site.com/for-her/necklace.html
http://www.site.com/gifts/necklace.html
There’s only one necklace but three different URLs, and search engines see them as three competing copies of the same page. Here's what you can do:
- Remove the category path from the URL, so that each product has only one address no matter how many categories it appears in: http://www.site.com/necklace.html
- If you have a red T-shirt in two categories at once, T-Shirts and New, you can choose which category to use in the URL: either the longest one (T-Shirts) or the shortest one (New). This is possible with the Unique Product URL extension.
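The first option can be illustrated with a small Python sketch that reduces each category-based address to a single category-free one. This only demonstrates the URL transformation; in Magento itself the behavior is normally controlled through configuration rather than custom code, and `without_category_path` is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def without_category_path(url: str) -> str:
    """Keep only the last path segment (the product URL key) of a product URL."""
    parts = urlsplit(url)
    product_key = parts.path.rsplit("/", 1)[-1]
    return urlunsplit(
        (parts.scheme, parts.netloc, "/" + product_key, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    duplicates = [
        "http://www.site.com/jewellery/necklace.html",
        "http://www.site.com/for-her/necklace.html",
        "http://www.site.com/gifts/necklace.html",
    ]
    # All three addresses collapse into one.
    print({without_category_path(url) for url in duplicates})
```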
Note: If your Magento 2 website supports multiple languages, pages with the same content in different languages are not considered duplicates as long as they are marked with rel="alternate" hreflang tags. For instance, if you have two store views in English and Italian, the Italian version should be announced with a tag such as <link rel="alternate" href="https://example.com" hreflang="it" />, where href is the URL of the Italian store view. Note that hreflang="en-it" would mean English content aimed at Italy, not Italian content. Apply the same method to all the localized store views, and have each version list every alternate, including itself.
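As a rough illustration, the following Python sketch generates the full set of hreflang tags for a group of store views. The store view URLs (/en/ and /it/) are assumptions made for the example; substitute the real addresses of your own store views.

```python
from html import escape
from typing import Dict, List

# Assumption: hypothetical URLs of an English and an Italian store view.
STORE_VIEWS: Dict[str, str] = {
    "en": "https://example.com/en/",
    "it": "https://example.com/it/",
}


def hreflang_tags(store_views: Dict[str, str]) -> List[str]:
    """Build one rel="alternate" tag per language version.

    The same full list is meant to be placed in the <head> of every version.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
        f'href="{escape(url, quote=True)}" />'
        for lang, url in store_views.items()
    ]


if __name__ == "__main__":
    for tag in hreflang_tags(STORE_VIEWS):
        print(tag)
```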
Partial Duplicates in Magento
In partial content duplicates, only a minor part of the Magento content or its layout is unique. The most common issues related to partial content duplicates are sorting tags, pagination, and product variation.
Solving Partial Magento Duplicate Content Issues
Product sorting issue
Users love it when they can sort the products in your store by bestsellers, by newest, with the Magento 2 price filter, number of reviews, etc. It’s even better if people can decide how many products should be displayed on the page: 20? 50? 100? But all these sorting options create pages with different characters (?, =, |) in the URLs:
http://site.co.uk/category/products.htm?sortby=total_reviews|desc
http://site.co.uk/category/products.htm?sortby=total_reviews|asc
http://site.co.uk/category/products.htm?sortby=relevance|desc
The problem appears when sorting pages get indexed and even cached by Google. Imagine how many such pages can exist. And Google crawlers spend time indexing them while they could concentrate their resources on indexing more important pages of your site: categories, products, etc.
Solution:
- First, go to your product pages and sort them by any option. Now you can see the parameters added to the URL after sorting (e.g., dir, sortby). Go to Google and search for site:yourdomain.com inurl:dir. Most likely, you’ll see a message telling you that some results were omitted.
- Click to include the omitted results and you’ll see the pages in your store containing “dir” in the URLs. It’s bad when these pages with parameters are indexed.
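- Go to Google Webmaster Tools (now Google Search Console) => Crawl => URL Parameters. Here you will see the parameters Google has found in the URLs of your store and how it crawls them. “Let Google decide” is the default option there. Note that Google retired the URL Parameters tool in 2022, so if you no longer see it in your account, rely on canonical tags for these pages instead.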
- But when it comes to crawling your Magento store, it’s you, not Google, who should decide which pages should be indexed. So if you haven’t decided this before, it’s high time you did it. Click “edit”, choose “Yes” in the dropdown menu, and then – “No URLs”.
You can also add parameters that are not listed in GWT and set crawling options for Google. But be careful and check twice (or even three times) before blocking the URLs with these parameters.
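Whichever tool you use, the underlying idea is that a sorted page should resolve to the same canonical address as the unsorted one. Here is a minimal Python sketch of that mapping. The parameter names `sortby` and `dir` come from the examples above, while `limit` is an assumed name for the products-per-page option; check the real parameters in your own URLs before relying on the list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that only change ordering or page size, not content.
SORT_PARAMS = {"sortby", "dir", "limit"}


def strip_sort_params(url: str) -> str:
    """Return the URL without sorting parameters, keeping all other parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SORT_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


if __name__ == "__main__":
    print(strip_sort_params(
        "http://site.co.uk/category/products.htm?sortby=total_reviews|desc"
    ))
```

The result is the URL you would put into the canonical tag of every sorted variant of the category page.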
Pagination duplicates issue
Your Magento store is big as you have lots of great products there. But even if you have only a few products, they are still placed on the pages with pagination options. This can result in paginated duplicates. To address this, you need to identify paginated pages and implement effective strategies for controlling search engine crawlers.
Solution:
- Begin by locating paginated pages within your Magento store. Run a Google search using the query "site:yoursite.com inurl:page". This search finds pages containing "page" in the URL, which usually indicates paginated content.
- To keep paginated pages out of Google's index, use the following meta tag within the head section of the duplicate content pages: <meta name="robots" content="noindex, nofollow">. This tag instructs search engines not to index the page or follow the links on it. Adjust the values to your needs; for example, "noindex, follow" still lets crawlers reach the products linked from those pages.
- Avoid blocking these URLs in the robots.txt file. A blocked URL can still end up in the index if other websites link to it, and because Google cannot crawl the page, it never sees the tags on it, so you may lose the SEO benefit of those backlinks.
- Instead of relying on robots.txt, mark pages with duplicate content using the 'canonical' tag. This tag designates the preferred version of a page when duplicate content exists, which addresses the SEO issue while keeping the advantages of backlinks.
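The decision of which pages receive the robots meta tag can be sketched in a few lines of Python. This example assumes the pagination parameter is called `page`, as in the search query above, and treats only pages after the first as paginated duplicates; both are assumptions to adapt to your store, and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: the query parameter that carries the page number.
PAGE_PARAM = "page"


def is_paginated(url: str) -> bool:
    """True when the URL points at page 2 or later of a listing."""
    values = parse_qs(urlsplit(url).query).get(PAGE_PARAM, [])
    return any(value.isdigit() and int(value) > 1 for value in values)


def robots_meta(url: str) -> str:
    """Return the robots meta tag for paginated pages, or an empty string."""
    if is_paginated(url):
        return '<meta name="robots" content="noindex, nofollow">'
    return ""


if __name__ == "__main__":
    for url in ("https://yoursite.com/mugs.html", "https://yoursite.com/mugs.html?page=3"):
        print(url, "->", robots_meta(url) or "(no robots tag)")
```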
Variations of the same product issue
Imagine you sell mugs and have landing pages for each color:
The characteristics are the same, the description is the same, the layout is the same… So what’s new? Just the color in the picture! Unfortunately, it’s too little for Google to treat such pages as unique. This means that all product variations found on different pages are partial duplicates.
Solution:
As a Magento shop owner, you probably know all your products and can make a list of their variations. Alternatively, you can search in Google for: site:yoursite.com “Here comes a short excerpt from the product description”. This way you will find all the pages of your site containing this very excerpt. After this, you have two options:
- The hardest but most effective way to solve duplication issues with product variations is to make each variation page unique. You will have to add different product descriptions and meta info. Yes, this is extremely time-consuming. But our AI content master generator can assist!
- An easier way out is to create a single page for a particular product and list all its variations there. This way you have one unique page instead of several duplicate ones.
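If you have an export of your product pages, you can also find variation duplicates offline instead of searching Google for an excerpt. The sketch below groups URLs whose descriptions are identical after normalizing case and whitespace. The mug URLs and descriptions are made-up sample data, and `group_by_description` is a hypothetical helper; it catches exact repeats only, not near-duplicates.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_description(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of URLs that share the same product description."""
    groups = defaultdict(list)
    for url, description in pages.items():
        # Normalize case and whitespace so trivial differences do not hide duplicates.
        key = " ".join(description.lower().split())
        groups[key].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    # Made-up sample data: two color variations share one description.
    pages = {
        "https://yoursite.com/mug-red.html": "A sturdy ceramic mug, 350 ml.",
        "https://yoursite.com/mug-blue.html": "A sturdy  ceramic mug, 350 ml.",
        "https://yoursite.com/teapot.html": "A cast-iron teapot, 1 l.",
    }
    print(group_by_description(pages))
```

Each group it returns is a candidate for either unique descriptions or a single page that lists all variations.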
Takeaways
Magento duplicate content issues are pretty common for any business and aren't necessarily a bad practice. But it's still beneficial to guide Google properly around your content and to signal to its algorithms why the content is partially or fully similar on certain pages. We hope that our tips on setting up Magento canonical URLs will help you do it with ease, but feel free to drop us a line if you still have issues.