How to Support Multiple Languages on a Jekyll Blog with Polyglot (1) - Applying Polyglot Plugin & Implementing hreflang alt Tags, Sitemap, and Language Selection Button

This post introduces the process of implementing multilingual support on a Jekyll blog based on 'jekyll-theme-chirpy' using the Polyglot plugin. As the first in the series, this post covers applying the Polyglot plugin and modifying the HTML header and sitemap.

Overview

About four months ago, in early July 2024, I added multilingual support to this Jekyll-based blog, which is hosted on GitHub Pages, by applying the Polyglot plugin. This series shares the bugs I encountered while applying Polyglot to the Chirpy theme, how I solved them, and how to write the HTML header and sitemap.xml with SEO in mind. The series consists of two posts, and this is the first.

Requirements

  • The built result (web pages) should be provided in language-specific paths (e.g., /posts/ko/, /posts/ja/).
  • To minimize the additional time and effort required for multilingual support, the language should be recognized automatically during build from the local path of the original markdown file (e.g., /_posts/ko/, /_posts/ja/), without having to specify the ‘lang’ and ‘permalink’ attributes in the YAML front matter of each file.
  • The header part of each page on the site should include appropriate Content-Language meta tags and hreflang alternate tags to meet Google’s multilingual search SEO guidelines.
  • The sitemap.xml should provide links to all pages supporting each language on the site without omission, and the sitemap.xml itself should exist only once in the root path without duplication.
  • All functions provided by the Chirpy theme should work normally on each language page, and if not, they should be modified to work properly.
    • ‘Recently Updated’, ‘Trending Tags’ functions working normally
    • No errors occurring during the build process using GitHub Actions
    • Post search function in the upper right corner of the blog working normally

Applying the Polyglot Plugin

Since Jekyll does not natively support multilingual blogs, an external plugin must be used to implement a multilingual blog that meets the above requirements. After searching, I found that Polyglot is widely used for multilingual website implementation and can satisfy most of the above requirements, so I adopted this plugin.

Installing the Plugin

As I use Bundler, I added the following content to the Gemfile:

group :jekyll_plugins do
   gem "jekyll-polyglot"
end

Then, running bundle update in the terminal will automatically complete the installation.

If you’re not using Bundler, you can directly install the gem by running gem install jekyll-polyglot in the terminal, and then add the plugin to _config.yml as follows:

plugins:
  - jekyll-polyglot

Configuration

Next, open the _config.yml file and add the following content:

# Polyglot Settings
languages: ["en", "ko", "es", "pt-BR", "ja", "fr", "de"]
default_lang: "en"
exclude_from_localization: ["javascript", "images", "css", "public", "assets", "sitemap"]
parallel_localization: false
lang_from_path: true
  • languages: List of languages you want to support
  • default_lang: Default fallback language
  • exclude_from_localization: Regular expressions matching the root-level file/folder paths to exclude from localization
  • parallel_localization: Boolean value specifying whether to parallelize multilingual processing during the build process
  • lang_from_path: Boolean value. If set to ‘true’, Polyglot automatically recognizes and uses the language code when the path string of a post’s markdown file contains it, so the ‘lang’ attribute does not need to be specified explicitly in the YAML front matter
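
To make the idea behind lang_from_path more concrete, here is a minimal Ruby sketch of path-based language detection. It is only an illustration and not Polyglot's actual implementation; the helper name lang_from_path and the sample file names are assumptions I made up for this example.

```ruby
# Hypothetical helper illustrating path-based language detection.
# This is NOT Polyglot's real code, only a sketch of the concept.
LANGUAGES = ["en", "ko", "es", "pt-BR", "ja", "fr", "de"].freeze
DEFAULT_LANG = "en"

def lang_from_path(path, languages = LANGUAGES, default_lang = DEFAULT_LANG)
  # Return the first path segment that is a configured language code,
  # falling back to the default language when none is found.
  path.split("/").find { |segment| languages.include?(segment) } || default_lang
end

puts lang_from_path("_posts/ko/2024-07-01-example-post.md")    # ko
puts lang_from_path("_posts/pt-BR/2024-07-01-example-post.md") # pt-BR
puts lang_from_path("_posts/2024-07-01-example-post.md")       # en
```

With this kind of rule, placing a file under a language-specific folder is enough to assign its language, which is what keeps the front matter of each post free of per-language settings.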

The official Sitemap protocol documentation states:

“The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/.”

“It is strongly recommended that you place your Sitemap at the root directory of your web server.”

To comply with this, add an entry that matches sitemap.xml (‘sitemap’ in the configuration above) to the ‘exclude_from_localization’ list, so that only one sitemap.xml file exists in the root directory rather than one per language as in the incorrect example below.

Incorrect example (the content of each file is identical, not different for each language):

  • /sitemap.xml
  • /ko/sitemap.xml
  • /es/sitemap.xml
  • /pt-BR/sitemap.xml
  • /ja/sitemap.xml
  • /fr/sitemap.xml
  • /de/sitemap.xml
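
As a rough illustration of how entries in exclude_from_localization can cover sitemap.xml, the following Ruby sketch treats each entry as a regular expression anchored at the start of the root-relative path. The helper excluded_from_localization? is hypothetical, and the anchoring rule is my assumption for this example rather than a copy of Polyglot's internal logic.

```ruby
# Hypothetical sketch: treat each entry as a regex anchored at the start
# of the root-relative path. Assumption for illustration only; this is
# not Polyglot's actual matching code.
EXCLUDE_FROM_LOCALIZATION = ["javascript", "images", "css", "public", "assets", "sitemap"].freeze

def excluded_from_localization?(path, patterns = EXCLUDE_FROM_LOCALIZATION)
  relative = path.sub(%r{\A/}, "") # drop the leading slash, if any
  patterns.any? { |pattern| relative.match?(/\A#{pattern}/) }
end

puts excluded_from_localization?("/sitemap.xml")       # true
puts excluded_from_localization?("/assets/img/a.png")  # true
puts excluded_from_localization?("/posts/example/")    # false
```

Under this assumption, a path that matches stays at the root only, so no /ko/sitemap.xml or /ja/sitemap.xml copies are produced.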

Setting ‘parallel_localization’ to ‘true’ can significantly reduce build time. However, when I enabled it for this blog in July 2024, there was a bug where the link titles in the ‘Recently Updated’ and ‘Trending Tags’ sections of the right sidebar were not processed correctly and got mixed with other languages. The feature does not seem fully stable yet, so test whether it works properly before applying it to your site. It is also not supported on Windows, so it should be left disabled there.

Also, in Jekyll 4.0, you need to disable CSS sourcemap generation as follows:

sass:
  sourcemap: never # In Jekyll 4.0, SCSS source maps will generate improperly due to how Polyglot operates

Points to Note When Writing Posts

When writing multilingual posts, keep the following in mind:

  • Proper language code designation: You should specify the appropriate ISO language code using either the file path (e.g., /_posts/ko/example-post.md) or the ‘lang’ attribute in the YAML front matter (e.g., lang: ko). Refer to the examples in the Chrome developer documentation.

However, while the Chrome developer documentation uses the format ‘pt_BR’ for region codes, you should actually use ‘pt-BR’ with a hyphen instead of an underscore for it to work properly when adding hreflang alternate tags to the HTML header later.

  • File paths and names should be consistent.
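
Regarding the language code format mentioned above, a small check like the following could catch underscore-style codes such as ‘pt_BR’ before they end up in a path or in front matter. This is a sketch under my own assumptions: the helper name hreflang_code is hypothetical, and it simply maps a code onto the list configured in _config.yml.

```ruby
# Hypothetical helper: map a possibly underscore-style or differently
# cased code onto the hyphenated form configured for the site.
LANGUAGES = ["en", "ko", "es", "pt-BR", "ja", "fr", "de"].freeze

def hreflang_code(code, languages = LANGUAGES)
  normalized = code.tr("_", "-") # pt_BR -> pt-BR
  # Compare case-insensitively; returns nil when the code is not configured.
  languages.find { |lang| lang.casecmp?(normalized) }
end

puts hreflang_code("pt_BR")         # pt-BR
puts hreflang_code("KO")            # ko
puts hreflang_code("zh_CN").inspect # nil
```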

For more details, please refer to the README of the GitHub untra/polyglot repository.

Modifying HTML Header and Sitemap

Now, for SEO purposes, we need to insert Content-Language meta tags and hreflang alternate tags in the HTML header of each page on the blog.

HTML Header

As of version 1.8.1, the latest release as of November 2024, Polyglot provides a feature that performs this task automatically when the {% I18n_Headers %} Liquid tag is called in the page header section. However, it assumes that the ‘permalink’ attribute has been explicitly specified for the page, and it does not work properly otherwise.

Therefore, I brought in the Chirpy theme’s head.html and added the following content to it directly. I based it on the SEO Recipes page of the official Polyglot blog, but modified it to use the page.url attribute when page.permalink is not available. Also, following the official Google Search Central documentation, I specified x-default rather than site.default_lang as the hreflang value for the site’s default-language page, so that the link to that page is treated as the fallback when the visitor’s preferred language is not among the languages the site supports or cannot be determined.

  <meta http-equiv="Content-Language" content="{{site.active_lang}}">

  {% if site.default_lang %}<link rel="alternate" hreflang="x-default" href="{{site.url}}{{page.url}}" />{% endif %}
  {% for lang in site.languages %}{% if lang == site.default_lang %}{% continue %}{% endif %}
  <link rel="alternate" hreflang="{{lang}}" href="{{site.url}}/{{lang}}{{page.url}}" />
  {% endfor %}
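
To show what the Liquid snippet above is expected to emit for a single page, here is a small Ruby sketch that builds the same set of link tags. The site URL https://example.com and the page URL /posts/example/ are placeholders, and the helper hreflang_links is hypothetical; it only mirrors the template logic for illustration.

```ruby
# Hypothetical mirror of the Liquid template: one x-default link for the
# default language, plus one link per non-default language.
SITE_URL = "https://example.com" # placeholder, not the real blog URL
LANGUAGES = ["en", "ko", "es", "pt-BR", "ja", "fr", "de"].freeze
DEFAULT_LANG = "en"

def hreflang_links(page_url, site_url: SITE_URL, languages: LANGUAGES, default_lang: DEFAULT_LANG)
  links = [%(<link rel="alternate" hreflang="x-default" href="#{site_url}#{page_url}" />)]
  languages.each do |lang|
    next if lang == default_lang # the default language is covered by x-default
    links << %(<link rel="alternate" hreflang="#{lang}" href="#{site_url}/#{lang}#{page_url}" />)
  end
  links
end

puts hreflang_links("/posts/example/")
```

For the seven configured languages this yields seven tags per page: the x-default entry pointing at the unprefixed URL, and six entries pointing at /ko/, /es/, /pt-BR/, /ja/, /fr/, and /de/ variants of the same path.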

Sitemap

Since the sitemap automatically generated by Jekyll during build does not properly support multilingual pages, create a sitemap.xml file in the root directory and enter the following content:

---
layout: content
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
{% for lang in site.languages %}

    {% for node in site.pages %}
        {% comment %}<!-- very lazy check to see if page is in the exclude list - this means excluded pages are not gonna be in the sitemap at all, write exceptions as necessary -->{% endcomment %}
        {% unless site.exclude_from_localization contains node.path %}
            {% comment %}<!-- assuming if there's not layout assigned, then not include the page in the sitemap, you may want to change this -->{% endcomment %}
            {% if node.layout %}
                <url>
                    <loc>{% if lang == site.default_lang %}{{ node.url | absolute_url }}{% else %}{{ node.url | prepend: lang | prepend: '/' | absolute_url }}{% endif %}</loc>
                    {% if node.last_modified_at and node.last_modified_at != node.date %}<lastmod>{{ node.last_modified_at | date: '%Y-%m-%dT%H:%M:%S%:z' }}</lastmod>{% elsif node.date %}<lastmod>{{ node.date | date: '%Y-%m-%dT%H:%M:%S%:z' }}</lastmod>{% endif %}
                </url>
            {% endif %}
        {% endunless %}
    {% endfor %}

    {% comment %}<!-- This loops through all site collections including posts -->{% endcomment %}
    {% for collection in site.collections %}
        {% for node in site[collection.label] %}
            <url>
                <loc>{% if lang == site.default_lang %}{{ node.url | absolute_url }}{% else %}{{ node.url | prepend: lang | prepend: '/' | absolute_url }}{% endif %}</loc>
                {% if node.last_modified_at and node.last_modified_at != node.date %}<lastmod>{{ node.last_modified_at | date: '%Y-%m-%dT%H:%M:%S%:z' }}</lastmod>{% elsif node.date %}<lastmod>{{ node.date | date: '%Y-%m-%dT%H:%M:%S%:z' }}</lastmod>{% endif %}
            </url>
        {% endfor %}
    {% endfor %}

{% endfor %}
</urlset>
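
The nested loops in the template above boil down to "every page URL, once per language". The Ruby sketch below reproduces that expansion and the lastmod timestamp format so the expected entries are easier to check by eye. The helpers sitemap_locs and lastmod, the placeholder site URL, and the sample paths are all assumptions for illustration.

```ruby
# Hypothetical sketch of the URL expansion performed by the sitemap template.
LANGUAGES = ["en", "ko", "es", "pt-BR", "ja", "fr", "de"].freeze

def sitemap_locs(node_urls, site_url:, languages: LANGUAGES, default_lang: "en")
  languages.flat_map do |lang|
    node_urls.map do |url|
      # The default language lives at the unprefixed path; others get /<lang>.
      lang == default_lang ? "#{site_url}#{url}" : "#{site_url}/#{lang}#{url}"
    end
  end
end

def lastmod(time)
  # Same format string as the template's date filter (W3C datetime with offset).
  time.strftime("%Y-%m-%dT%H:%M:%S%:z")
end

locs = sitemap_locs(["/about/", "/posts/example/"], site_url: "https://example.com")
puts locs.size # 14 (2 URLs x 7 languages)
puts locs.first # https://example.com/about/
puts lastmod(Time.new(2024, 7, 1, 12, 0, 0, "+09:00")) # 2024-07-01T12:00:00+09:00
```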

Adding Language Selection Button to Sidebar

Create a _includes/lang-selector.html file and enter the following content:

<p>
{%- for lang in site.languages -%}
  {%- if lang == site.default_lang -%}
<a href="{{ page.url }}" style="display:inline-block; white-space:nowrap;">
    {%- if lang == site.active_lang -%}
      <b>{{ lang }}</b>
    {%- else -%}
      {{ lang }}
    {%- endif -%}
</a>
  {%- else -%}
<a href="/{{ lang }}{{ page.url }}" style="display:inline-block; white-space:nowrap;">
    {%- if lang == site.active_lang -%}
      <b>{{ lang }}</b>
    {%- else -%}
      {{ lang }}
    {%- endif -%}
</a>
  {%- endif -%}
{%- endfor -%}
</p>

Then, add the following three lines to the “sidebar-bottom” class section of Chirpy theme’s _includes/sidebar.html to make Jekyll load the content of _includes/lang-selector.html during page build:

    <div class="lang-selector">
      {%- include lang-selector.html -%}
    </div>

Further Reading

Continued in Part 2

This post is licensed under CC BY-NC 4.0 by the author.