Managing Slugs for Multiple Languages in Kontent.ai

Mike Edwards - Technical Director

14 Dec 2022


On a recent project, we were tasked with translating a website's content in Kontent.ai into 9 different languages. English would be our base language, and we needed to translate the content into 8 other languages. A few of the languages contain non-ASCII characters, such as letters with diacritics and Cyrillic characters, which was an added challenge.

Translation was going to be automated using a script and a third-party service. The script itself should have been straightforward: use the Kontent.ai management API and execute the following steps:

  • Get the list of items
  • For each item, get the English item version
  • Run translations
  • Push the updates
  • Publish the items

However, things are rarely this simple and we quickly ran into a few things that we needed to account for:

  1. Before you can update a content item, it needs to be created. This was an extra call.
  2. We needed to copy non-translated fields across, such as checkboxes and links to other content. The best way we found to do this was to serialize the elements to a JSON string and then deserialize them again in our code.

This allowed the code to create a deep copy of the elements and break the in-memory references.
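
As a minimal sketch of that JSON round trip, here it is in Python. The element shape and the names `copy_elements`, `english`, and `french` are hypothetical and only there for illustration; the real structure comes from the management API.

```python
import json


def copy_elements(elements):
    """Deep-copy JSON-compatible data by serializing and deserializing it."""
    return json.loads(json.dumps(elements))


# Hypothetical element data for the English version of an item.
english = [
    {"codename": "show_in_nav", "value": [{"codename": "yes"}]},
    {"codename": "related_items", "value": [{"id": "abc"}]},
]

french = copy_elements(english)

# Changing the copy leaves the English original untouched,
# because no nested list or dict is shared between the two.
french[0]["value"][0]["codename"] = "no"
```

The round trip only works for data that is already JSON-serializable, which is the case for elements read from a JSON API. In Python, `copy.deepcopy` would give the same result.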

  3. Requests needed to be rate limited. We quickly found that our script would "max out" our requests per second, so we introduced a 200ms delay between every request to Kontent.ai.
  4. Our auth token expired mid-stream. Our script would run for several hours, but after about two hours the auth token would expire. Sigh! So, make sure you have a way to refresh it, because starting the process over again after it has been running for 2 hours is very frustrating.
  5. A 404 will be returned if Kontent.ai can’t find an item version in a specific language. This is expected, so make sure your code doesn’t fail if a 404 error is returned.
  6. Create a configuration mechanism that specifies which fields you want to translate. There will be text fields that you shouldn’t translate.
  7. Finding the publish status of an item was difficult, so we decided that we would just publish everything each time we ran the tool rather than check if it needed to be published.
  8. Duplicate your Kontent.ai environment and practice your steps there. Duplicating an environment will take several hours.
  9. Set the slug after running the Title through our cleaner script. See below.
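
Two of the points above, the delay between requests and the tolerance for 404 responses, can be sketched together. This is an illustration only, not the script we ran. `ThrottledClient` and `get_or_none` are hypothetical names, and `send` is an assumed callable that performs the HTTP request and returns a `(status_code, body)` pair. It would also be the natural place to attach and refresh the auth token.

```python
import time

MIN_INTERVAL_SECONDS = 0.2  # the 200ms delay between requests


class ThrottledClient:
    """Wraps a request function with a minimum gap between calls."""

    def __init__(self, send, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self._send = send              # assumed: send(url) -> (status, body)
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None         # time of the previous request, if any

    def _wait(self):
        """Sleep just long enough to keep min_interval between requests."""
        if self._last_call is not None:
            remaining = self._min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()

    def get_or_none(self, url):
        """Return the response body, or None when the server answers 404."""
        self._wait()
        status, body = self._send(url)
        if status == 404:
            # Expected when a language version does not exist yet.
            return None
        if status >= 400:
            raise RuntimeError(f"Request to {url} failed with status {status}")
        return body
```

Passing the clock and sleep functions in makes the delay easy to test without waiting in real time.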

Slugs

Since the titles of content were being translated, we wanted the slugs to also be translated to match the titles. We use the slugs to generate the URLs to each page and this would mean we would have language-specific URLs.

We needed the script to generate the slugs and match the auto-generation rules that Kontent.ai uses.

According to Kontent.ai’s documentation they use these two simple rules:

  • All letters are changed to lowercase.
  • All special characters are replaced by a dash (-) to prevent problems when used in URLs.

https://kontent.ai/learn/tutorials/develop-apps/optimize-your-app/seo-friendly-urls/
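
Read literally, those two rules could be sketched as below. This is our interpretation, not Kontent.ai's implementation: `documented_slug` is a hypothetical name, and treating every character that is not a letter or digit as "special" is an assumption.

```python
def documented_slug(title: str) -> str:
    """Apply the two documented rules literally."""
    lowered = title.lower()  # rule 1: all letters are changed to lowercase
    # Rule 2: replace each special character with a dash. "Special" is assumed
    # here to mean anything that is not a letter or a digit.
    return "".join(ch if ch.isalnum() else "-" for ch in lowered)


print(documented_slug("Shopping Cart"))  # shopping-cart
print(documented_slug("Crème Brûlée"))   # crème-brûlée
```

Under this reading, accented and Cyrillic letters count as letters and would be kept. The documented rules do not say what actually happens to them.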

We wanted to be sure these rules were correct, so we went into Kontent.ai to check the real behaviour. For each test, we entered a title and then let the Slug auto-generate.

For English, everything worked as documented. The rules have been followed.

What happens if we try diacritic or Cyrillic characters?

Okay, this is interesting. This isn’t really what we were expecting. The diacritic and Cyrillic characters have been removed, which wasn’t mentioned in the documentation.

After a quick conversation with Kontent.ai support, it turns out that Slugs can only support ASCII characters. This is a little disappointing since browsers can now display Unicode characters in URLs (this is both good and bad, think phishing scams that use look-alike characters).

So back to our translation scripts and time to update them to solve this problem.

We found a couple of great libraries online to handle this problem for us:

A quick function and now we have a clean slug:
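
The sketch below shows the general idea using only the Python standard library, not the libraries we used. `clean_slug` and `CYRILLIC_TO_LATIN` are hypothetical names, the transliteration table is a simplified Russian-only mapping included as an assumption, and collapsing repeated dashes is our choice rather than a documented Kontent.ai rule.

```python
import re
import unicodedata

# Simplified, illustrative mapping for Russian Cyrillic letters (an assumption;
# a real transliteration library covers far more alphabets and edge cases).
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def clean_slug(title: str) -> str:
    """Turn a translated title into a lowercase, ASCII-only slug."""
    # Compose first so letters like "й" match the table as single characters.
    lowered = unicodedata.normalize("NFC", title).lower()
    transliterated = "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in lowered)
    # Split accented letters into base letter + combining mark, then drop
    # everything that is not ASCII (which removes the marks).
    decomposed = unicodedata.normalize("NFKD", transliterated)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of remaining special characters with a single dash.
    dashed = re.sub(r"[^a-z0-9]+", "-", ascii_only)
    return dashed.strip("-")


print(clean_slug("Crème Brûlée"))     # creme-brulee
print(clean_slug("Корзина покупок"))  # korzina-pokupok
```

Letters that have no Unicode decomposition, such as ø or ß, are simply dropped by this sketch, which is one reason to prefer a dedicated transliteration library for real content.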

Translation Service

Initially, we planned to use Google Translate to perform the translations for us. This made the most sense because the previous tool we used, provided by a third party, used it.

A bit of experimentation with Google Translate gave us some very weird results, and the API gave us different results from the Google Translate website.

A simple example was the word “Cart” as in shopping cart. Google would translate this to “Chariot” in French rather than “Panier”.

We decided to switch to the translation service in Azure Cognitive Services and were pleasantly surprised by how easy it was to set up and by the quality of the translations.

Summary

Automation of translation is a quick way to translate a website and translation services are getting a lot better. The Kontent.ai management API made reading and writing data very simple, with the only caveat being the rate limit.

The translation services themselves are good but not always 100% accurate, and some translations may not make sense. You will have to decide if the small inaccuracies are worth the time and money saved by not using human translators.

If you are interested in automated translation of your website, and you have thousands of pages that need to be processed, please reach out to us at Konabos.



Tags:

Composable DXP
Kontent
Content


Mike Edwards

With 18 years of IT development experience, Mike has worked across government, not for profit, and commercial sectors. He has delivered large-scale multinational websites, desktop and mobile applications, and mission-critical health apps. He works closely with the client and delivery teams to ensure that projects deliver business benefits and not just a technical solution.

Mike was named a Sitecore MVP from 2011 to 2019 and is the founder of the very popular Glass.Mapper.Sc ORM, which has over 1 million downloads.

Outside of work, Mike can be found exploring the British countryside, riding his motorbike, and learning the piano.

