Appendix 4 – Moses Core

The Moses Decoder is one of many open source components in Slate Desktop™.

These videos demonstrate the academic fundamentals of statistical machine translation (SMT), including the tedious Moses command lines on a Linux operating system.

Basic SMT

Principles of Machine Translation

This presentation provides a brief overview of the history of machine translation and the approaches that were developed during that history. It then focuses on statistical machine translation including its different flavors, the process of training an SMT system with training data and the decoding process to perform translations.

Training Data – Bilingual and Monolingual

Training data is the essential ingredient for statistical MT systems. This presentation describes parallel and monolingual data, where to obtain training data, and how to combine and select data to achieve the highest quality MT output.

Training Data – Conversion and Corpus Preparation

This presentation and screencast describe the training data format required by the Moses SMT system and show how to convert data into this format. They also show how to align text from translated documents and how to convert TMX files to obtain more data for SMT training.
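As a rough illustration of the TMX conversion step, the sketch below extracts sentence pairs from a TMX document and writes them as two plain-text files with one segment per line, aligned line by line. It is not the tool used in the screencast. The function names, the exact-match handling of language codes and the whitespace normalization are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs found in a TMX document.

    Assumption: language codes are compared exactly (case-insensitive),
    so "en" does not match "en-US".
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Older TMX versions use a plain "lang" attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Drop inline markup and collapse whitespace to a single line.
                segs[lang] = " ".join("".join(seg.itertext()).split())
        src = segs.get(src_lang.lower())
        tgt = segs.get(tgt_lang.lower())
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


def write_parallel(pairs, src_path, tgt_path):
    """Write pairs to two files; line i of each file forms one pair."""
    with open(src_path, "w", encoding="utf-8") as fs, \
         open(tgt_path, "w", encoding="utf-8") as ft:
        for src, tgt in pairs:
            fs.write(src + "\n")
            ft.write(tgt + "\n")
```

Translation units that lack either language are skipped, so the two output files always have the same number of lines.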

Training Data – Cleaning and Tokenization

Once data is converted into the right format, it needs to be tokenized and cleaned before it can be used to train an SMT system. This presentation explains tokenization and word segmentation for East Asian languages and outlines the cleaning options for SMT training data that many MT vendors use. The presentation provides guidance on which […]
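One common cleaning step is to drop sentence pairs that are empty, overly long, or badly mismatched in length. The sketch below shows that idea on already tokenized, whitespace-separated text. It is a minimal illustration rather than the cleaning procedure from the presentation, and the function name and threshold values are assumptions chosen for the example.

```python
def clean_pairs(pairs, min_len=1, max_len=80, max_ratio=9.0):
    """Keep only sentence pairs that pass simple length checks.

    pairs     -- iterable of (source, target) strings, already tokenized
    min_len   -- minimum number of tokens per side (at least 1)
    max_len   -- maximum number of tokens per side
    max_ratio -- maximum allowed ratio between the longer and shorter side
    All thresholds are illustrative defaults, not recommended values.
    """
    min_len = max(1, min_len)  # guarantees no division by zero below
    kept = []
    for src, tgt in pairs:
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if not (min_len <= n_src <= max_len and min_len <= n_tgt <= max_len):
            continue  # empty, too short or too long
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue  # lengths too different; likely misaligned
        kept.append((src, tgt))
    return kept
```

Both sides of a pair are removed together, which keeps the source and target files aligned line by line.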

Introduction to Moses

This presentation gives an overview of the Moses machine translation system and its associated components, and explains how to obtain and run the system. It also describes the history of Moses and the larger open-source Moses ecosystem, including the development process, support and opportunities to contribute.

Training a Moses MT System

This screencast shows how to train a small Moses SMT system with the training data prepared in earlier screencasts, how to tune the trained system using a tuning set, and finally how to perform translations with the trained system.

Advanced SMT

Bulk Translation and MT System Optimization

This screencast uses the Moses SMT system trained earlier to bulk translate a set of test data and then calculates the BLEU score against the available reference translations. In the second part of the screencast, the trained Moses system is optimized for lower memory use and faster translation.

Evaluating MT Systems – Automatic Metrics

This presentation contrasts automated evaluation with human evaluation of machine translation output. It explains how automated evaluation is useful in the development of MT systems and then describes the automated metrics BLEU, TER, GTM and Meteor.
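To make the BLEU metric more concrete, the sketch below computes a simplified corpus-level BLEU score: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. It assumes a single reference per sentence, whitespace tokenization and no smoothing, so its scores will not necessarily match those of the scoring tools used in the videos. TER, GTM and Meteor are not covered here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU with one reference per hypothesis."""
    matches = [0] * max_n   # clipped n-gram matches per order
    totals = [0] * max_n    # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Counter intersection clips each match at the reference count.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes output that is shorter than the reference.
    penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return penalty * math.exp(log_precision)
```

A hypothesis identical to its reference scores 1.0, and a hypothesis that shares no 4-gram with its reference scores 0.0 under this unsmoothed variant.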

Evaluating MT Systems – Human Evaluation

This presentation describes different strategies for human evaluation of MT output, how to use them in error analysis to improve MT systems, and how to apply them in an industry setting to achieve the desired project goals.

Document Translation and Integration Scenarios

Translation of complex document formats is common in the language industry. This presentation explains how the Okapi Framework and the Moses for Localization open source project can be used to translate these file formats using machine translation. We also address how to translate web pages with Moses and how to integrate Moses MT systems into […]

Document Translation and Web API Demo

Previous demos showed how to translate single sentences and collections of sentences. This demo shows how to translate complex document formats using a combination of the Okapi Framework, Moses for Localization and Moses. The second half demonstrates the use of two web APIs that are available for Moses – the Moses Server XML-RPC API and […]
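For orientation, a client call to an XML-RPC translation server might look like the sketch below. It assumes the server exposes a `translate` method that takes a struct with a `text` field and returns a struct with a `text` field. The endpoint URL in the comments is a placeholder, and the details should be checked against the server version in use.

```python
import xmlrpc.client


def make_proxy(url):
    """Create an XML-RPC proxy; no connection is opened until a call is made."""
    return xmlrpc.client.ServerProxy(url)


def translate_text(proxy, text):
    """Send one sentence to the server and return the translated string.

    Assumption: the server offers translate({"text": ...}) and answers
    with a struct that contains the translation under the key "text".
    """
    result = proxy.translate({"text": text})
    return result["text"]


# Example use (requires a running server; the URL is a placeholder):
#   proxy = make_proxy("http://localhost:8080/RPC2")
#   print(translate_text(proxy, "this is a test"))
```

Passing the proxy in as an argument keeps the translation call easy to exercise with a stand-in object when no server is available.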

The European Commission’s Grant Number 288487 of the 7th Framework Programme sponsored the 3-year MosesCore Project, which included the Moses statistical machine translation system and the production of these video tutorials about SMT. TAUS produced and published the videos linked here.