SEO 101: Applying Hreflang for Multilingual Websites

Last year SEMrush published findings that 75 percent of websites have hreflang implementation errors.

If you then layer localization and usability issues on top, such as an Arabic website that doesn’t read from right to left, that figure probably rises from three in four sites to four in five.

Having worked with a number of companies, varying in both sector and size, I’ve come across a lot of weird and wonderful interpretations and implementations of the hreflang framework.

Selecting Target Countries & Core Structure

When planning your online international expansion and deciding on target markets, you also need to consider how you’re going to target them.

From experience, there are four main ways in which the URL structure can reflect internationalization:

 Implementation | Description
 Different ccTLD | Using different ccTLD domains. This is considered best practice for targeting Russia and China in particular. An example of this in practice is Hartley Botanic.
 Subdomain | Using a single domain, typically a gTLD, with language-targeted subdomains. An example of this in practice is CNN, which uses a subdomain to differentiate between its US and UK English sites.
 Subdirectory | Again using a single domain, typically a gTLD, with different language and content zones targeted through subdirectories. An example of this in practice is BeatsByDre.
 Parameter | I don’t recommend implementing this method, but I do see it a lot. This is where the URL is appended with a ?lang=de parameter or similar.
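To make the four structures concrete, here is a minimal Python sketch of how the same German page might be addressed under each one. The function name localized_url, the domain example.com, and the assumption that the language code doubles as the country TLD are all hypothetical and only there for illustration.

```python
def localized_url(strategy: str, domain: str, code: str, path: str = "/") -> str:
    """Return an illustrative URL for `code` under one of the four structures."""
    if strategy == "cctld":
        # Assumption: the code is also a valid country TLD ("de" works, "en" would not).
        name = domain.rpartition(".")[0]  # "example.com" -> "example"
        return f"https://{name}.{code}{path}"
    if strategy == "subdomain":
        return f"https://{code}.{domain}{path}"
    if strategy == "subdirectory":
        return f"https://{domain}/{code}{path}"
    if strategy == "parameter":
        # Shown for completeness only; this is the structure to avoid.
        return f"https://{domain}{path}?lang={code}"
    raise ValueError(f"unknown strategy: {strategy}")


for strategy in ("cctld", "subdomain", "subdirectory", "parameter"):
    print(strategy, localized_url(strategy, "example.com", "de"))
```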

Other important things to remember are:

  • Don’t use IP redirects, as they can break Google’s indexing (also remember that Google crawls primarily from the U.S.).
  • If you’re using a .com and you’ve implemented one of the above, don’t redirect your root domain to your “main website”. Google will use the hreflang to point users to the correct site.
  • Only use x-default to point to a language selector page or default page for users worldwide. A great example of this in practice is IKEA, which behaves as a language selector, but x-default can also be used to indicate a default fallback version of the website for global users.
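As a rough illustration of how x-default sits alongside the language-specific versions, the Python sketch below builds the link elements that would go in a page’s head. The helper name hreflang_links and the example.com URLs are assumptions of mine, not taken from any real site.

```python
from html import escape
from typing import Dict, List, Optional


def hreflang_links(alternates: Dict[str, str], x_default: Optional[str] = None) -> List[str]:
    """Build <link rel="alternate"> tags for each hreflang value and its URL."""
    links = [
        f'<link rel="alternate" hreflang="{escape(tag)}" href="{escape(url)}" />'
        for tag, url in alternates.items()
    ]
    if x_default is not None:
        # x-default points at the language selector or global fallback page.
        links.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
        )
    return links


# Hypothetical site with UK and German versions plus a global selector page.
for link in hreflang_links(
    {"en-gb": "https://example.com/uk/", "de-de": "https://example.com/de/"},
    x_default="https://example.com/",
):
    print(link)
```

Every version of the page should carry the same full set of tags, including one that references itself.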

How to Structure Hreflang Tags

An hreflang value always starts with the language subtag, but it can then include further subtags such as:

  • Language: “en”, “es”, “zh”, or a registered value
  • Script: “Latn”, “Cyrl”, or other ISO 15924 codes
  • Region: ISO 3166 codes, or UN M.49 codes
  • Variant: Such as “nedis” or “1901”
  • Extension: Single letter followed by additional subtags

The combination that most of us will be familiar with is {language}-{region}.

However, if you do a lot of work in Chinese-speaking countries, you’re more likely to use {language}-{script}-{region}, such as zh-Hans-CN (Simplified Chinese for the Chinese mainland).
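To show how the {language}-{script}-{region} pattern breaks down, here is a small Python sketch that splits a value into its parts. It is deliberately simplified: it only handles language, an optional script, and an optional region, and will reject values that use extended language, variant, or extension subtags. The names LangTag and parse_tag are my own.

```python
import re
from typing import NamedTuple, Optional


class LangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


# language (2-3 letters), optional script (4 letters),
# optional region (2 letters or a 3-digit UN M.49 code)
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def parse_tag(tag: str) -> LangTag:
    """Split a {language}[-{script}][-{region}] value into its subtags."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not a simple language-script-region value: {tag!r}")
    return LangTag(match["language"], match["script"], match["region"])


print(parse_tag("en-GB"))
print(parse_tag("zh-Hans-CN"))
```

Note that this only checks the shape of the value. It does not confirm that each subtag is actually registered.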

The Internet Engineering Task Force (IETF) specification for language tags is BCP 47 (RFC 5646).

Language Tag

The supported language code comes from the ISO 639-1 classification list. However, in some instances, the extended language tag {extlangtag} can be used on its own.

Extended Language Tag

{extlangtag} tags are subtags that can be used to specify selected languages that are closely identified with an existing primary language subtag. Examples of these are:

  • zh-yue: Cantonese Chinese
  • ar-afb: Gulf Arabic

The extended language tags come from the ISO 639-3 classification list.

This classification list also contains the code eng for English, which is why en-eng, when implemented to mean English for England, works (but not as intended).

Script

The script subtag was introduced in RFC 4646, and its values come from the ISO 15924 classification list. Only one script subtag can be used per hreflang tag. Examples of these include:

  • uz-Cyrl: Uzbek in the Cyrillic script
  • uz-Latn: Uzbek in the Latin script
  • zh-Hans: Chinese in the simplified script
  • zh-Hant: Chinese in the traditional script
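Language tags are case-insensitive, but the convention in RFC 5646 is lowercase for the language, title case for the script, and uppercase for the region. The Python sketch below applies that convention. The function name normalize_case is hypothetical, and the sketch guesses each subtag’s type purely from its length, which is a simplification.

```python
def normalize_case(tag: str) -> str:
    """Apply the conventional casing: language lower, script Title, region UPPER."""
    parts = tag.split("-")
    normalized = [parts[0].lower()]  # the language subtag always comes first
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            normalized.append(part.title())  # four letters: treated as a script
        elif len(part) == 2 and part.isalpha():
            normalized.append(part.upper())  # two letters: treated as a region
        else:
            normalized.append(part.lower())  # anything else is left lowercase
    return "-".join(normalized)


print(normalize_case("UZ-CYRL"))
print(normalize_case("zh-hans-cn"))
```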

Region

Region codes come from the ISO 3166-1 alpha-2 list and are used along with the language tag. Common mistakes include attempting to target “RW” as the rest of the world when it’s the country code for Rwanda, and “LA” as Latin America when it’s the code for Laos.
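A simple audit check can catch these region mistakes before they go live. The Python sketch below flags the two codes mentioned above. The lookup table and the function name region_warning are my own illustration, and a real audit would check against the full ISO 3166-1 list rather than this short one.

```python
from typing import Optional

# Codes that are often misread, mapped to what they actually denote.
MISREAD_REGIONS = {
    "RW": "Rwanda, not 'rest of the world'",
    "LA": "Laos, not 'Latin America'",
}


def region_warning(hreflang: str) -> Optional[str]:
    """Return a warning if the final subtag is a commonly misread region code."""
    parts = hreflang.split("-")
    if len(parts) < 2:
        return None  # language only, so there is no region to check
    region = parts[-1].upper()
    if region in MISREAD_REGIONS:
        return f"{hreflang}: {region} is {MISREAD_REGIONS[region]}"
    return None


for value in ("en-RW", "es-la", "en-GB"):
    print(value, "->", region_warning(value))
```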

Variant

The variant subtag can be used to indicate dialects, or script variations, not covered by the language, extended language tag, or region tag.

It’s highly unlikely that you’ll come across variant subtags unless you work in very niche and specialized areas. Examples of these variants are:

  • sl-SI-nedis: The Nadiza dialect of Slovenian, as spoken in Slovenia.
  • de-DE-1901: The variant of German orthography dating from the 1901 reforms, as used in Germany.

Extension

Extension subtags allow for extensions to the language tag, such as the extension tag “u”, which has been registered by the Unicode Consortium to add information about the language or locale behaviour. It’s highly unlikely you will ever need to use these.

Source: Search Engine Journal | Implementing Hreflang on Multilingual Websites
