If you are reading this article, it is likely you have a regional site and are trying to figure out how to solve some of the complications it causes with Google. Over the past week, I have talked to a number of technical SEOs and prospective clients who have seen an increase in canonical errors in their Google Search Console Index Coverage report. Specifically, these two items have generated the most errors:
- Duplicate without user-selected canonical: This page has duplicates, none of which is marked canonical. We think this page is not the canonical one.
- Duplicate, Google chose different canonical than user: This page is marked as canonical for a set of pages, but Google thinks another URL makes a better canonical. Google has indexed the page that we consider canonical rather than this one.
As they dug deeper, the problems appeared to be caused by a regional website, especially one in English, and in most cases they were either not using the hreflang element or had implemented it incorrectly for a regional site. They then set out to solve this and, of course, the Interwebs are full of articles that are out of date, only partially touch on the issue, or give incorrect information. A few, rather than answering the question, scream at you for having a regional site to begin with.
Please note that this article is not meant to validate or debate the effectiveness of the use of regional sites or single language sites (which is not the best option) but to help mitigate this growing problem by CORRECTLY using hreflang under real-world conditions.
What is a Regional Website?
For the purpose of this article, a regional site is one that was deployed to target a specific geographical area and not a specific market. One of the most common practices is for companies to localize the site into Spanish and drop it into a /LatAm folder with the hope that it will fill in for all markets in the Americas (that is, South and Central America, for Americans) that are not large (important) enough to warrant a dedicated site.
Equally common is a single English language site to represent all of Asia Pacific using /APAC, or a token Arabic language site in a /MEA folder for the Middle East and Africa. In the tech B2B and consumer electronics verticals, I am starting to see more sites using other regional designators:
- European Union – a block of countries in Europe that is described below in more detail
- Levant – representing a region in the Eastern Mediterranean/Western Asia including Jordan, Lebanon and Syria
- CEMA – to more broadly cover Central and Eastern Europe, Middle East and Africa
- Stans – this one targets all of the mineral-rich, rapidly growing markets in countries ending in “Stan” including Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan
In most cases, these regions cover multiple if not dozens of different languages which is exactly why Google and the hreflang standard do not support regions. Google’s Gary Illyes sums up the challenge of using regional sites very well in this Twitter post when asked if they support regional hreflang elements.
European Union Region
No, I did not forget about the EU. This is actually the catalyst for this article and what a number of those I spoke to were dealing with. In addition to my discussions, there have been a number of articles and Twitter posts about handling sites that were trying to target the European Union. After a few reports were published about the potential of the Euro Zone as a gold mine of opportunity, with 343 million people using the Euro, a number of eCommerce platforms tied into this region’s potential for riches, offering easy access with the click of a button.
The designation EU has a few key elements that cause additional complications, especially for duplication. The EU or European Union is a group of 28 countries that share a set of common agreements related to trade and laws. The Euro Zone is a monetary union of a subset of 19 countries that all accept the Euro as currency.
The offer is simple: with the click of a button, the platform would convert the sizes and currency on your US-centric, English language site, drop in a Euro symbol and push it out to either a /eu folder or a .eu domain, and bam… you are now covering 19 cash-rich markets in Europe.
Many jumped into this new opportunity for riches, but when the massive traffic and sales they expected from “Europe” did not materialize, they wanted help. Many reached out to global SEOs and posted questions in forums and directly to Google asking if the mighty hreflang tag might save them from disaster.
Country Sites as Regional Sites
One trend we see more and more is webmasters taking a country site like South Africa or Singapore and using it to target the region. Unlike the /mea or /apac versions, which are not typically country-specific and lack a specific currency or local contact information, these country sites are either cloned to be used across the region or the country version uses hreflang tags to set it for multiple countries.
Challenges of Regional & Country Specific Sites
Taking a moment to read and, more importantly, understand the reason for the two errors from GSC, noted above, we can clearly identify the problem(s). In these specific cases, Google had identified the regional site AND made the decision to consolidate the regional sites under the master global or US site. The reverse has happened in the US where the EU or EMEA page is ranking. This also happens with sites from Germany, Austria, and Switzerland where a dominant page originated.
That is the first challenge, the “dominant page.” Since the original page(s) were indexed first and have more backlinks, they may be considered the original content when an exact or near-exact version is discovered. From a pure algorithmic scoring perspective, it makes sense that Google’s algorithm may then see any newer same-language page(s) as duplicates and not rank them, let alone consider them relevant to a specific market.
Let’s address the second part of the regional problem specifically in the context of these EU regional pages. For discussion purposes, the EU page is the blue widget page targeting “the EU.” With the exception of the currency symbol and the metric sizes, it is exactly the same as any of the brand’s existing English language pages. Do we really expect Google to automatically assign this version to the 28 EU countries, or to assume that it is only for the 19 that take the Euro? And how does it relate to the 24 official languages of the member countries? I assume most would expect Google to understand it, but it does not, and in the real world another dominant page may be shown. That English searcher in Germany who sees the UK page in British Pounds may not want to purchase.
Since the early days of global websites, we have asked for a method to signal that pages of a site are unique to their respective markets. Most settled for ccTLDs as a solution. To help minimize this problem, Google introduced the hreflang element. It is a method to tell Google that there are alternative versions of the page for specific languages.
But that is not how HREFLang Works
The moment Google said that hreflang was the single most complex function of SEO, it became a challenge for the SEO community, especially those who tout themselves as global or technical SEOs. It seemed like a game to churn out the next “ultimate” or “best” guide to hreflang. Unfortunately, most of what is written is wrong or totally confusing to people. For example:
To solve the EU problem, a number of global SEO experts told companies the solution was simple: just use hreflang and assign these pages to the ISO country code EU. Unfortunately, there is no active ISO 3166 country code for the EU. You can go to the official ISO standards site and review the official list of ISO 3166 codes, which is the exact standard that hreflang accepts. It is NOT on the list.
Of course, the blog post they read on the internet cannot be wrong. I have had many SEOs and developers argue with me that it is correct, with some sharing the IBAN banking codes while another shared their source as the highly relevant European Cuisines site. The hands-down winner among those with exceptional reading comprehension skills is the always correct Wikipedia. Note that Wikipedia has it in the Exceptional Reservations section, meaning it has been reserved but NOT assigned as an ISO code. And just because it is on the page with ISO codes does not make it correct.
Another approach used by many is to try to set a regional site using two-letter ISO codes. There are hundreds of sites that use the ISO code “LA” to set a hreflang element of es-LA for their Latin America site and then want to know why it does not appear in regional searches. Well, you told Google it is a Spanish dialect for Laos, which is not even in the region. My other favorite is ar-ME, which sets a regional Arabic language site to Montenegro.
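To make the point concrete, here is a minimal Python sketch that shows what the region half of an hreflang value actually points to. The lookup table is a tiny hand-picked subset of ISO 3166-1 codes that I typed in for illustration, and the function name is hypothetical; a real check should load the full official list.

```python
# Illustrative subset of assigned ISO 3166-1 alpha-2 codes (NOT the full list).
ASSIGNED_COUNTRY_CODES = {
    "LA": "Laos",
    "ME": "Montenegro",
    "MX": "Mexico",
    "AU": "Australia",
    "GB": "United Kingdom",
}

# Codes that are exceptionally reserved in ISO 3166-1 but not assigned to a country.
RESERVED_NOT_ASSIGNED = {"EU", "UK"}


def describe_region(hreflang_value):
    """Return what the region part of an hreflang value refers to."""
    parts = hreflang_value.split("-")
    if len(parts) < 2:
        return "no region set"
    region = parts[1].upper()
    if region in RESERVED_NOT_ASSIGNED:
        return "reserved, not an assigned country code"
    return ASSIGNED_COUNTRY_CODES.get(region, "not in this illustrative subset")


for value in ("es-LA", "ar-ME", "en-EU"):
    print(value, "->", describe_region(value))
```

Running it against the examples above shows es-LA resolving to Laos and ar-ME to Montenegro, while en-EU is flagged as reserved rather than assigned.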
A few experts even argued that in 2019 Google is smart enough to know what a /latam or /eu folder is targeting, or that you can set it in GSC. Good luck trying to set a /eu folder to multiple countries so that it shows across the region.
Let’s take a step back and look at something even more fundamental about hreflang that most people misunderstand. The sole purpose of the attribute is to designate the dialect of the written content on that page. It is NOT to set countries; it is right there in the name “hreflang.” This is why Google’s guidelines have a big yellow warning box telling you that you cannot use the ISO country code alone.
Fortunately, many of these dialects are spoken in a specific country, which is why we can leverage it to designate that a page in Spanish can be set to target Spanish-speaking Mexico or a specific English page can be set to target Australia.
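The “language first, country optional” rule can be sketched as a small check. This is only an illustration under my own assumptions: the language set is a short hand-picked subset of ISO 639-1 codes, the function name is hypothetical, and the sketch ignores script subtags, which hreflang also allows.

```python
import re

# Illustrative subset of ISO 639-1 language codes; extend for real use.
LANGUAGES = {"en", "es", "ar", "de", "fr"}

# A two-letter language, optionally followed by a hyphen and a two-letter region.
HREFLANG_PATTERN = re.compile(r"^([a-z]{2})(?:-([a-z]{2}))?$", re.IGNORECASE)


def check_hreflang(value):
    """Classify an hreflang value as ok or invalid, with a short reason."""
    if value.lower() == "x-default":
        return "ok: x-default"
    match = HREFLANG_PATTERN.match(value)
    if not match:
        return "invalid: expected language or language-region"
    language, region = match.group(1).lower(), match.group(2)
    if language not in LANGUAGES:
        return "invalid: first part must be a language code"
    if region:
        return f"ok: language {language}, region {region.upper()}"
    return f"ok: language {language}"


for value in ("es-MX", "en-AU", "en", "us"):
    print(value, "->", check_hreflang(value))
```

A country code on its own, such as us, fails because the first part is not a language, while es-MX and en-AU pass as a language paired with a country.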
Country Sites as Regional Market Sites
One trend we see more and more is a country site like South Africa or Singapore being used to target multiple countries in the region. Unlike the /eu, /mea or /apac versions, which are without currency or local contact information, these country sites are either cloned to be used across the region or the country version uses hreflang tags to set it for multiple countries. This does not work as well with hreflang as it does with a true regional site. Since Google can detect language and currency, trying to make a Kazakh website work in Mongolia is not an optimal experience, and Google tends to default to the global site, especially for English language queries.
What are we trying to fix?
If your main objective is to remove the potential for canonical dilution, duplication demotions, and incorrect page ranking, and to prevent hreflang errors in GSC, you simply need to do one of the following, with the last option being the best:
- Set the regional version to a specific language or;
- Set it to any one specific market or;
- Set the URL to represent multiple markets.
Implementing either of the first two options will solve the duplicate and canonical problems. This is what many agencies do using manual methods. They just need to account for the duplicate English language page by setting it to a language or country. That may solve your SEO error problems but will not give you the full benefit of hreflang.
Correctly Use HREFlang for Regional Markets
If your main objective is to prevent the wrong page ranking and to prevent hreflang errors in GSC you need to map the Euro pages to all applicable countries. Yes, that is correct… your new English language-only Euro currency site does need to be set for all represented markets. Does Google support this nonsense? Yes, they have suggested it a number of times as a solution to this problem.
Did you catch the mistake John made in his response? It is one of the most common. The UK is en-GB not en-UK.
To accomplish this for the 19 markets of the Euro Zone using in-page hreflang elements, we need a block of code like below. We used the following assumptions. Obviously, if you have other versions of the site they will need to be added as well and, yes, you can set x-default to a specific host market.
Assumptions:
- We have a global English site
- We want coverage for the 19 Euro Zone markets
- We do not have any other country site.
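Under these assumptions, the in-page block can be generated rather than typed by hand. The Python sketch below is only an illustration: the example.com URLs and the function name are hypothetical, and the country list is my own listing of the 19 Euro Zone members counted in this article, so verify it before relying on it.

```python
from html import escape

# ISO 3166-1 alpha-2 codes for the 19 Euro Zone markets counted in this article.
EURO_ZONE = [
    "AT", "BE", "CY", "DE", "EE", "ES", "FI", "FR", "GR", "IE",
    "IT", "LT", "LU", "LV", "MT", "NL", "PT", "SI", "SK",
]


def hreflang_links(global_url, euro_url, language="en"):
    """Build the in-page link elements: x-default plus one per Euro Zone market."""
    lines = [
        f'<link rel="alternate" hreflang="x-default" href="{escape(global_url)}" />'
    ]
    for country in EURO_ZONE:
        lines.append(
            f'<link rel="alternate" hreflang="{language}-{country}" '
            f'href="{escape(euro_url)}" />'
        )
    return "\n".join(lines)


# Hypothetical URLs: a global English home page and its /eu counterpart.
print(hreflang_links("https://www.example.com/", "https://www.example.com/eu/"))
```

The output is 20 link elements: one x-default for the global site and 19 entries such as en-DE and en-FR that all point at the same /eu URL. The same block has to appear on both the global page and the /eu page so the references are reciprocal.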
Now I assume you are saying to yourself “that is a lot of code to put on the page” and on every page. This is after you just got your developers to remove hundreds of lines of unnecessary code to make it load faster so asking them to put this in is more crazy talk. This is exactly why Google suggests putting them in XML site maps.
Managing Multiple Versions of Same URL using XML Site Maps
Using HREFLang XML site maps allows for the expansion of the multiple references without adding weight to the page. In the example below, we represent the multiple Euro Zone markets. It would be similar for any of the other regions. Since this is a very specific example, we have to state some assumptions:
- We have a global English site which we set to X-Default
- We want coverage for the 19 Euro Zone markets so we need 19 hreflang elements
- We do not have any other country sites that need to be represented.
Obviously, if you have other versions of the site they will need to be added as well and, yes, you can also set x-default to a specific market like the US or UK, whichever it represents.
One of the first major features of HREFLang Builder was the site clone feature. Nearly every major site we worked with was using regional sites that almost never appeared in the SERPs due to stronger country sites and duplicate suppression.
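For anyone building the file by script, here is a minimal Python sketch of such an XML site map using only the standard library. The example.com URLs and the function name are hypothetical, the country list is my own listing of the 19 Euro Zone members counted above, and this is not how HREFLang Builder generates its files.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# ISO 3166-1 alpha-2 codes for the 19 Euro Zone markets counted in this article.
EURO_ZONE = [
    "AT", "BE", "CY", "DE", "EE", "ES", "FI", "FR", "GR", "IE",
    "IT", "LT", "LU", "LV", "MT", "NL", "PT", "SI", "SK",
]


def build_hreflang_sitemap(global_url, euro_url, language="en"):
    """Return sitemap XML where both URLs list the same set of alternates."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)

    # x-default goes to the global site; every Euro Zone market goes to /eu.
    alternates = [("x-default", global_url)]
    alternates += [(f"{language}-{country}", euro_url) for country in EURO_ZONE]

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in (global_url, euro_url):
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page
        for code, href in alternates:
            ET.SubElement(
                url, f"{{{XHTML_NS}}}link",
                rel="alternate", hreflang=code, href=href,
            )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs: a global English home page and its /eu counterpart.
print(build_hreflang_sitemap("https://www.example.com/", "https://www.example.com/eu/"))
```

Each of the two url entries carries the full set of 20 alternates (x-default plus 19 markets), so the references stay reciprocal without adding a single line to the pages themselves.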
Using the HREFLang Builder replication process, we can take a regional site like the /latam and assign it to as many countries as the client wants. For example, in the screen capture below this client wanted the /latam site to be assigned to Ecuador as the primary target and in our system, it is represented as Spanish Ecuador.
To create the replication for the other versions we just need to click the blue “replicate” button for the newly created Ecuador site and the system will take all of the URLs from /latam and add them to the additional target(s) you designate. In this case the client wanted the regional site to represent a total of 16 markets. The system will then do the mapping as it would for an individual site and then create 16 entries in the HREFLang XML file to indicate the /latam regional site is for each of these countries.
Multiple URL Effectiveness
Does setting regional sites to multiple language versions for countries actually work? We manage the HREFLang Elements for some extremely large websites with some having as many as 20 clones of their Latin America and Asia Pacific sites with no negative impact. In one of our first major projects, the traffic from Latin America increased over 200% when we deployed the cloned versions designating them for each of the markets.
This reduced the negative impact on the markets by presenting the Spain site in Euros and the US Spanish site in USD. It allowed them to better represent the other markets by using IP detection and dynamically inserting local contact information and pricing into the regional site which they could not do effectively on the other Spanish language sites.
In a perfect world of abundant resources, Marketers would not just clone sites and change the currency but build totally unique sites localized specifically for each market. But here in the real world that may not happen so the best we can do is think smartly and effectively deploy hreflang on regional sites. Feel free to contact us to review your implementation or give HREFLang Builder a try.