2. The Training Corpus

What is a training corpus and how does it differ from translation memories? Learn to select the best resources and use best practices to convert them to a training corpus.

Build and Manage Corpus

How to build an inventory of translation memories (and other sources) as a training corpus.

Data Science Behind the Scenes

What happens when you click the Build now button?

When Can We Blame The Data?

This article describes a data cleaning challenge with a TMX file that the European Union published as “clean” for machine translation purposes.

Working With Huge Corpora

The first Slate Desktop support ticket included this comment: “build a test engine based on one large TM (as an easy start)…” I remembered a brilliant computational linguist’s comment. Kenneth Heafield created a critically important…


Importing From TMX and XLIFF Files

Slate Desktop applies a set of language code rules when extracting translation units from TMX and XLIFF files, to compensate for inconsistent implementations of the two standards.
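
As an illustration of the kind of compensation involved, here is a minimal Python sketch. It does not reproduce Slate Desktop's actual rules, which are not described on this page. It assumes two common inconsistencies in TMX files: the language code may appear in either the xml:lang attribute or the older lang attribute, and codes vary in case, separator and region (EN-US, en_us, en). The function names extract_units and primary_subtag are hypothetical, and matching on the primary subtag alone is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predeclared xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def primary_subtag(code):
    """Reduce 'EN-US', 'en_us' or 'en' to 'en' (assumed matching rule)."""
    return code.replace("_", "-").split("-")[0].lower()


def extract_units(tmx_text, source_lang, target_lang):
    """Yield (source, target) text pairs from a TMX document string."""
    src = primary_subtag(source_lang)
    tgt = primary_subtag(target_lang)
    root = ET.fromstring(tmx_text)
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            # Accept xml:lang first, then fall back to the older lang attribute.
            code = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if code is None or seg is None:
                continue
            text = "".join(seg.itertext()).strip()
            if text:
                # Keep the first variant seen for each primary language.
                segments.setdefault(primary_subtag(code), text)
        # Skip units that lack either side of the language pair.
        if src in segments and tgt in segments:
            yield segments[src], segments[tgt]


SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="EN-US"><seg>Hello world</seg></tuv>
<tuv xml:lang="fr_FR"><seg>Bonjour le monde</seg></tuv></tu>
<tu><tuv lang="en"><seg>Goodbye</seg></tuv>
<tuv lang="FR-CA"><seg>Au revoir</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>No translation here</seg></tuv></tu>
</body></tmx>"""

pairs = list(extract_units(SAMPLE, "en", "fr"))
```

Collapsing every code to its primary subtag treats en-US and en-GB as one language, which may or may not suit a given corpus; a stricter rule would compare region subtags as well.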