Slate™ Corpus is a specialized product for those who need to clean translation memories and convert them into a training corpus without creating custom machine translation models. If you don’t know what any of this means, you can skip this product.
Slate™ Corpus is one of the applications in the Slate™ Desktop suite. It’s the application that organizes translation memories by client, subject matter and project type in an inventory. It also cleans, prepares and converts the translation memories into the training corpora that build translation models.
The Slate™ Desktop and Slate™ Desktop Pro suites include this application. The stand-alone version is for customers who want to organize or clean translation memories on another computer in addition to their Slate™ Desktop computer.
These features and functions come with Slate™ Corpus.
Support for all languages, including new languages as they are released in maintenance updates. With 34 languages, it supports a total of 1,122 directional language pairs (34 × 33).
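The pair count follows from simple arithmetic: each of the 34 languages can be a source for each of the other 33. The short sketch below reproduces the figure. The function name is illustrative only and is not part of Slate™ Corpus.

```python
def directional_language_pairs(language_count: int) -> int:
    """Count source->target pairs where source and target differ."""
    return language_count * (language_count - 1)


if __name__ == "__main__":
    # 34 languages, each paired with the 33 others as a target.
    print(directional_language_pairs(34))
```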
Privacy & Confidentiality
Slate™ Corpus runs on your PC. It needs no Internet connection, and it doesn’t log your activities the way online subscription services do. You’re fully in control of confidential work.
Build Engines (Models)
Organize Translation Memories
Tools that organize an inventory of translation memories by client, subject matter and project type.
Create Training Corpora
Tools that clean, prepare and convert translation memories into training corpora to build translation engines.
Sample Translation Memories
Sample translation memories and other files to help you practice and learn.
Tools that automate repetitive and complicated Slate™ Desktop tasks so you can process large projects efficiently. They run from a command-line terminal or integrate into your third-party applications.
You need to provide the following before working with Slate™ Corpus.
Hardware System Requirements
- Intel Core i3 (i7 recommended) or AMD Athlon 64 CPU (4-core x86-64, 2.4 GHz or faster)
- 4 GB of RAM (8 GB recommended)
- 2 GB of free hard-disk space for installation
- 250 GB (or more) of additional free space on a high-performance drive is required after installation
Windows Operating System (Option)
- Windows 7 64-bit Edition with Service Pack 1
- Windows 8 or 8.1 64-bit Edition
- Windows 10 64-bit Edition
- 32-bit Editions not supported
Linux Operating System (Option)
- Ubuntu 16.04 or newer (other Debian-based on request)
- CentOS/RHEL-based (other RPM-based on request)
MacOS Operating System (Option)
- To be determined, currently unsupported
- 70,000 to 150,000 sentence segments
- One full-time translator’s work for 3 to 4 years
- 200,000 to 500,000 sentence segments
- Support a team of translators
- There’s no upper limit to the number of segments
- Too many segments can introduce variety that degrades performance
Slate™ Desktop reads and writes these standards-based localization file types:
- Text files with UTF-8 character encoding, Linux or Windows new line separators
- Tab-delimited files are specialized text files (as above) with one tab per line. Text left of the tab is the source language. Text right of the tab is the target language.
- TMX – translation memory exchange up to version 1.4b
- XLIFF – XML Localization Interchange File Format version 1.2 (.xlf, .xliff, .sdlxliff, .mxliff, .mqxliff)
- Gettext .po and .mo files
You can also work with file types supported by your computer-assisted translation (CAT) tool, such as .docx, .xlsx, etc.
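To make the tab-delimited and TMX layouts concrete, here is a minimal Python sketch that reads each format into (source, target) pairs. It is not how Slate™ Corpus reads files, which this page does not describe. The function names are hypothetical, and the TMX reader assumes exact language-code matches and plain-text segments.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def parse_tab_delimited(text):
    """Return (source, target) pairs from tab-delimited text.

    Accepts Linux (\n) or Windows (\r\n) line endings and expects
    exactly one tab per non-empty line.
    """
    pairs = []
    lines = text.replace("\r\n", "\n").split("\n")
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines, such as a trailing newline
        if line.count("\t") != 1:
            raise ValueError(f"line {line_number}: expected exactly one tab")
        source, target = line.split("\t")
        pairs.append((source, target))
    return pairs


def tmx_to_pairs(tmx_text, source_lang, target_lang):
    """Return (source, target) pairs from a TMX document string.

    Simplification: language codes must match exactly (case-insensitive),
    so "en" does not match "en-US".
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None:
                # itertext() also collects text nested inside inline tags.
                segments[lang.lower()] = "".join(seg.itertext()).strip()
        source = segments.get(source_lang.lower())
        target = segments.get(target_lang.lower())
        if source and target:
            pairs.append((source, target))
    return pairs


if __name__ == "__main__":
    sample = "Hello\tBonjour\r\nThank you\tMerci\n"
    for source, target in parse_tab_delimited(sample):
        print(f"{source} -> {target}")
```

Writing the pairs back out as `source + "\t" + target`, one per line in UTF-8, produces the tab-delimited layout described in the list above.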
The installer installs and manages the following required dependencies.
Perl Scripting Runtime
Perl 64-bit version 5.28 or newer is a free open source scripting runtime environment.
Python Scripting Runtime
Python 64-bit version 3.7.2 or newer is a free open source scripting runtime environment. Required dependency libraries include:
License and support are included with your purchase of Slate™ Corpus.
End User License Agreement
A perpetual, royalty-free end-user-license agreement (EULA) to use on your machines. No subscriptions or hidden fees.
Install and activate on any supported operating system. Today’s support includes MS Windows and Linux. MacOS is planned.
Install, activate and work on one machine. Build engines and work on the same computer.
Maintenance updates are published occasionally with new languages, enhanced features and bug fixes.
Access to priority technical support during the period between major version updates via our online support portal, https://support.slate.rocks.
Slate™ Corpus distributes these components under their respective open source licenses.
Slate Toolkit Language Tokenizers
The language tokenizer scripts from the Slate™ Toolkit support data cleaning, conversion and processing.