This is an integrated application suite configured for single-language vendors to build and use customized machine translation engines. It includes Slate Corpus™ to prepare and manage MT training corpora and Slate Connect™ to integrate engines into CAT tools.
The machine translation engine is responsible for your post-editing experience. To maximize your post-editing satisfaction and bottom line, you need to stop post-editing MT with other translators’ words. You need an engine that generates MT with your words in your style.
Slate Desktop™ is a suite of these three powerful applications to build customized machine translation engines and post-edit with your CAT.
Slate Corpus™ – Every machine translation engine starts as a training corpus. Corpus management directly impacts the quality of MT you post-edit. Slate Corpus™ converts translation memories into training corpora, organizes training corpora in a reusable inventory and prepares the corpora before building an engine.
Slate Toolkit™ – Advanced machine learning technology learns to translate by studying a training corpus. It learns vocabulary, grammar, syntax, terminology and preferences from the corpus. Slate Desktop™ saves the corpus knowledge and translating style in customized machine translation engines. When you convert your translation memories to the training corpus, the engine has no choice but to learn to translate with the words and style in your TMs.
Slate Connect™ – The advanced artificial intelligence in Slate Connect™ uses your customized machine translation engine to generate the MT you post-edit in your CAT. It doesn’t make the typical mistakes that Ready-Made™ and online engines make. It knows that “embarazada” in Spanish is not “embarrassed” in English. It knows you prefer “my e-mail”, not “my email” or “my eMail.”
Slate Desktop™ is perfect for individual translators and single-language vendors. You can build an unlimited number of customized translation engines in both directions for your language pair.
The build cycle depends on the size of your translation memories and your PC spec. With 80,000 to 100,000 segments in your translation memories and the modern PC that runs your CAT tools, the typical build cycle finishes overnight.
When the build cycle is complete, the new engine automatically appears in your CAT tool’s Slate Desktop™ plugin configuration. You can also export your engines to use on another PC with Slate Connect™.
Multi-language vendors can browse Slate Desktop Pro™ features.
Languages – Supports two languages from among the 48 possible languages. Your translation memories and Slate Desktop™ create engines that translate between the two languages in either direction.
Organize Translation Memories – Tools that organize an inventory of translation memories by client, subject matter and project type. More tools prepare translation memories as training corpus to build translation engines.
Build Customized Engines – Tools that build custom machine translation engines.
Evaluate Engines – Estimate how engines will perform as you work.
Weighted Engine Builds – Build or update Slate Desktop™ engines prioritizing some translation memories to have more influence over others.
Pre-Translate Files – Tools to pre-translate source segments and save TMX and XLIFF files to include in any CAT tool project.
Supported CAT Tools
- memoQ by memoQ Technologies – Version 7.8 build 124 (April 19, 2016) and newer
- Trados Studio by SDL – Versions 2015, 2017, 2019
- CafeTran Espresso by Cafetran – Versions since January 2017
- OmegaT by the OmegaT Project – Versions since April 2016
Forced Terminology – Add global, per-engine or per-language source-target terminology files that Slate Desktop™ engines use to force translations of known source language terms.
Terminology On-The-Fly – Add new source-target terminology with just a couple of keystrokes while working. The engine adapts and enforces the new pair in the next segment.
Deploy Engines – Tools that copy engines to another work environment.
Backup & Restore – Tools that back up and restore both the engines and the translation memories that created them.
Sample Translation Memories – Sample translation memories and other files to help you practice and learn.
Privacy & Confidentiality – Applications run on your PC. There’s no Internet connection. They don’t log your activities like online subscription services. You’re fully in control of confidential work.
Hardware System Requirements
- Intel Core i5 (i7 recommended) or AMD Athlon 64 CPU (4-core x86-64, 2.4 GHz; more cores and faster clock speeds are better)
- 8 GB of RAM (16 GB is better; 4 GB is possible as a toy)
- 2 GB of free hard drive space for base application
- 20 GB minimum free space during install. 100 GB (and more) on a high-performance drive is strongly recommended.
Operating System Requirements
Windows 7 64-bit with SP1
Windows 8 or 8.1 64-bit
Windows 10 64-bit
Ubuntu 16.04 or newer, 64-bit
CentOS/RHEL, kernel 3.2+, 64-bit
Other Linux on request
macOS – To be determined, currently unsupported.
Training Corpus – Translation memories to convert to corpus or publicly available corpus.
- 70,000 to 150,000 sentence segments
- One full-time translator’s work for 3 to 4 years
- 200,000 to 500,000 sentence segments
- Supports a team of translators
Using translation memories with only specialized segments yields better custom machine translation. Specializations such as financial & regulatory reports, clinical trials & pharmaceuticals, technical manuals, legal contracts, etc. yield more consistent and accurate translations.
There’s no upper limit on the number of segments, but too many segments may degrade the engine for specific, specialized use.
Supported File Types
- Text files with UTF-8 character encoding, Linux or Windows new line separators
- Tab-delimited files are specialized text files (as above) with one tab per line. Text left of the tab is the source language. Text right of the tab is the target language.
- TMX – translation memory exchange up to version 1.4b
- XLIFF – XML Localization Interchange File Format version 1.2 (.xlf, .xliff, .sdlxliff, .mxliff, .mqxliff)
- Gettext .po and .mo files
Other file types (.docx, .xlsx, etc.) are supported through your computer-assisted translation (CAT) tool.
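The tab-delimited layout described above can be read with a few lines of Python. This is only a minimal sketch for checking your own files before import, not Slate Desktop™ code; the function name `read_tab_delimited` and the sample sentences are assumptions made for illustration.

```python
import io


def read_tab_delimited(stream):
    """Yield (source, target) pairs from a tab-delimited corpus stream."""
    for line_number, raw in enumerate(stream, start=1):
        line = raw.rstrip("\r\n")  # accept Linux or Windows line endings
        if not line:
            continue  # ignore blank lines
        if line.count("\t") != 1:
            # The format expects exactly one tab per line.
            raise ValueError(f"line {line_number}: expected exactly one tab")
        source, target = line.split("\t")
        yield source, target


# Hypothetical sample data: one Windows-style and one Linux-style line.
sample = io.StringIO("Good morning\tBuenos días\r\nThank you\tGracias\n")
pairs = list(read_tab_delimited(sample))
```

For a file on disk, pass `open(path, encoding="utf-8", newline="")` in place of the in-memory sample so the text is decoded as UTF-8.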
License Agreement – A one-time payment, royalty-free end-user license agreement (EULA) to use the software on your machine in perpetuity without subscriptions or usage fees.
Platforms – Install and activate on any supported operating systems. Today’s support includes MS Windows and Linux. MacOS is planned.
Activation – Install and activate on one machine. Build engines and work on the same computer.
Maintenance Updates – Maintenance updates are published occasionally with new languages, enhanced features and bug fixes.
Technical Support – Access to priority technical support during the period between major version updates via our online support portal, https://www.slate.rocks/support/.
Money Back Guarantee – Buy Slate Desktop™, experience great technical support and learn how your translation memories perform as machine translation. If you’re not fully satisfied, uninstall Slate Desktop™ and request a refund within 30 days of your purchase. We will refund your full purchase price immediately – no questions asked.
Expert Review – We’ll help you understand how your first engine performs. Send us the automated evaluation report. We’ll review the performance with you.
Open Source – Slate Desktop™ distributes open source components under their respective licenses via the Slate Toolkit™.
May 13, 2020
- New languages:
- 2 Indic languages (bn, pa)
- Asian language (tl)
- Tidy other languages.
March 31, 2020
- Bugfix. Fixed edge case terminal error with small test corpora.
March 20, 2020
- Bugfix. Fixed an extreme edge-case condition that caused the build process to hang early in the TM-to-corpus conversion when it encountered a specific character at the end of a segment.
November 14, 2019
- New feature supports regex pattern matching for source segments during translation
- New feature supports regex pattern matching for target output from engine
- New feature removes bullet points and numbered list numbers from source when translating. Engine only sees the linguistic content. Automatically restores bullets and numbers bypassing the MT engine.
- New languages:
- 8 Indic languages (as_in, gu_in, kn_in, ml_in, mr_in, mni_in, or_in, te_in)
- Norwegian, Turkish, Ukrainian, Interlingua, Indonesian, Persian languages
- Updated nonbreaking_prefix files for better tokenization
- New Python tokenizer removes dependency on Perl runtime for tokenization
- New corpus cleaning functions:
- Triage segments by best quality match
- Remove duplicates keeps the best matching translation
- New feature for compound splitting for all languages
- New compound splitting models for Finnish, Dutch, Italian and German
- New split prefix/suffix feature for all languages. Not enabled by default.
- New support to merge/blend new foundation corpora to supplement small customer TM sizes
- New weightings feature to translation model similar to weighting feature for the language model
- New option to build reverse direction engine during the same session
- Updated `import-europarl` graph to apply regex filters when downloading the corpus
- Updated to use new Python optimized regex library
- Improved support for multiple engines running in parallel
- Improved enforcement of Moses technical limits and runs faster
- Various bug fixes
- GUI support on Linux
- Edge-case cross-platform crash fixed
- UI summary display update and spelling corrections
- Fixed TMX language tags to correctly use name space
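The bullet and list-number handling in this release (the marker is removed before translation and restored afterwards) might look roughly like the sketch below. It is an illustration under stated assumptions, not the shipped implementation: the `LIST_MARKER` pattern, both function names and the `translate` callable are hypothetical.

```python
import re

# Hypothetical pattern: a leading bullet character, or a list number
# such as "1." or "2)", followed by whitespace.
LIST_MARKER = re.compile(r"^(\s*(?:[-*•]|\d+[.)])\s+)")


def split_marker(segment):
    """Return (marker, text); marker is empty when none is found."""
    match = LIST_MARKER.match(segment)
    if match is None:
        return "", segment
    return match.group(1), segment[match.end():]


def translate_preserving_marker(segment, translate):
    """Translate only the linguistic content, then restore the marker."""
    marker, text = split_marker(segment)
    return marker + translate(text)


# str.upper stands in for a real engine call.
result = translate_preserving_marker("2) press the button", str.upper)
```

Requiring whitespace after the marker keeps values such as "-5" or "1.5 liters" from being mistaken for list items.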
July 20, 2019
- Fixed installer falsely detecting Python installation on Windows 10 updates
- Fixed crash from Chinese jieba tokenizer logging after Windows 10 updates
June 8, 2019
- Fixed Windows Explorer custom file association to import/delete engines
- Disabled tmx export to avoid edge case illegal XML characters
- Added autohotkey script
- Added split-tmx.py
- Custom support for normal-cased translation model
April 3, 2019
- Fixed edge-case crash near end of build
Mar 14, 2019
- Added Persian language support (fa, fa-ir)
- Updated Python dependency libraries
- Removed a Perl dependency library
Jan 6, 2019
- New support for Trados Studio 2019 plugin
- Perl 18.104.22.168 for new installations
- Python 3.7.2 for new installations
- Python libraries version bumps
- Installer support for UNC paths
- Installer UI enhancements
- Various bugfixes
- Tidy code & consistency tweaks
Dec 14, 2018
- Python 3.7.1 for new installations
- Optimize Python library installations
- Install Microsoft foundation classes if missing
- Improve temp folder configuration
- improved curate absolute long segments
- tmx curate update. append-creationdate adds -yyyymm to label1 value
- bugfix write-txt plugin that accumulated temp source files
- bugfix screen scraper skipped terminate subprocess on error
- bugfix mert loop to address edge-case terminations
- bugfix TMX read namespace for xpath
- bugfix error handler in TMX reader
Sept 30, 2018
- installer update supporting change to UTF-8 setup.ini file
- installer update windows pip from wheel
- updated windows shell scripts
- engine summary display update
- revert to use Python 2.x’s ‘IOError:’ error type
- add ‘apt install build-essential’
Aug 31, 2018
- Fixed Linux copystat() error
- Fixed edge-case race screen scraper failure at end of subprocess
Aug 17, 2018
- Version bump to 1.6.0
- Changed to Python 3.7 as default installation version
- Python 3.x support bugfixes
- Improved forced terminology file creation when building engines
- Improved evaluation scoring and summary reporting
- Improved handling TMs with predominantly short segment lengths, less than 7 words – can improve engine quality
- Improved weighted/prioritized TM processing when building engines
- Improved all uppercase segments when building engines
- Bugfix loss of short segments when cleaning – can improve engine quality
- Bugfix creating size files for future GUI update
- Bugfix in regex support
- Various optimizations and refactoring source code
May 24, 2018
- fixed custom file association setup to reduce overhead of Windows Defender
- fixed moses demo scripts sample data layout
- fixed trial edition installer support for --setup
- fixed edge case error cleaning/reorganizing functions
Apr 6, 2018
- Preparation support for version 2.x
- Tweak Slate Connect™ to run in background without opening terminal window
- add watchdog timer that shuts down Slate Connect™ when unused for 20 minutes
- Temporarily removed logging (to be re-added)
- Edge case bugfix for Unicode SEP characters
- Edge-case bugfix for Python 3 file sizes
Mar 6, 2018
- Support for Python 2.7 and Python 3.6
- Updated to latest Python libraries on 2.7 and 3.6.
- Installs Python 3.6 for new installations.
- Uses Python 2.7 on updates.
- Edge-case installer fixes.
- No functional Slate changes.
- Removed expired installer code signing certificate.
Feb 11, 2018
- Linux version – added basic neural network probabilistic language model (NPLM) support
- Linux version – added basic neural network bilingual language model (BLM) support
- Windows & Linux versions – minimal tweaks to progress display
- Tweaks to demo shell (Bash and Batch) scripts
Jan 11, 2018
- optimize performance by excluding Slate processing from Windows Defender bottleneck
- cross-compiler, cross-platform binaries updated and optimized
- updated demo shell (Bash and Batch) scripts for macOS and consistency
- edge-case error trap
- various preparation updates for macOS and Moses 4
Dec 21, 2017
- MAJOR BUGFIX: The Windows 10 1709 update caused an open source component to fail, truncating one of an engine’s databases without warning. This problem may also have manifested itself in larger engines before this update, causing poor translation output quality. Everyone should download this update and rebuild their engines using the “Base on engine…” feature to make sure the new engine has the same TMs as the older engine.
Dec 14, 2017
- UPDATED SDL Trados Studio plugin: Fix bugs from Trados and Windows updates
- UPDATED SDL Trados Studio plugin: Added option to remove the ‘AT’ flag
- UPDATED SDL Trados Studio plugin: Signed by SDL
- mydutchpal.com support for en-nl and nl-en
- update BUILDS manifests for birds example
- improved formatting/display of progress during MERT
- consistent creation of .json manifest files
- optimize progressbar updates during train-tm
- update branding URL references to https://www.slate.rocks
- add simple exclude option to corpustypes
- update cleaning to extract short terminology segments
- consistent script processing between command-line and GUI
- updated scrub-tm to save dupe pairs in tm-DUPE tree
- updated scrub-tm to save identical in tm-SAME tree
- clean exit GUI
- Added instructions to setup and configure CafeTran MT plugin
Oct 26, 2017
- Attempt to fix edge-case connector error on some localized Windows systems
Oct 23, 2017
- fixed windows setWindowsACL of folders/files errors
- cleaned __slate_prefix__ global variables
- refactored command-line executables for consistency
- added adaptive support to `xslate` executable
- added mydutchpal.com `xslate` graph to MT connector
- move set_up() out of highlevel to setup.py module
- always add Python prefix to beginning of path
- bugfix edge case for clean-tm RATIO tree
- added option to continue batch queue if an input file fails
- changed slate-daemon* to xslate-daemon*
- bugfix installing Perl Date::Format module
- improved Windows file association to support delete-engine from double-click
- replaced windows batch files with mklink symlinks
Sep 28, 2017
- reverse edge-case unknown train-model.perl errors
- full support for Fedora, Redhat, CentOS, Ubuntu and Debian Linux
Sep 23, 2017
- bugfixes regression tests for Linux version
- release Slate Connect™
Sep 14, 2017
- bugfix for edge-case unknown train-model.perl errors
- Added Croatian language support
Sep 02, 2017
- EULA updates
- Trados Studio connector README update
- branding updates to Slate/Slate Rocks
- OS path consistency updates
- edge-case bugfix installer %COMSPEC% update
- edge-case bugfix character encoding error with wx GUI display updates.
Jul 10, 2017
- Updates to use secure socket (https) with the license authentication server.
IMPORTANT: THIS UPDATE DEPRECATES ALL PREVIOUS INSTALLERS. YOU MUST UPDATE TO THIS INSTALLER TO ACCESS OUR NEW HTTPS SECURED WEBSITE.
- Bugfix: installer failed to update when activation count was at max. – fixed.
- Branding updates
Jun 20, 2017
- Bugfix “translate file” button was case-sensitive with file extension (.txt was ok but .TXT failed). Fixed.
- Copyright and branding updates.
- Major packaging update to support upcoming “starter” and other editions.
- User-choice component selection during installations
- Modular installer packaging. Now users download small (10 MB) executable and the installer downloads only the components that the user selects.
Apr 27, 2017
- Removed aggressive cleanup of backed up changes from installer that deleted a terminology.tab file.
- Implemented a less aggressive cleanup of backed up changes in the installer.
- Added protection to block reusing engine names that were deleted.
Apr 11, 2017
- Bugfix/work-around UTF-8 character decoding error with unidentified illegal UTF-8 characters. The work-around uses a nuclear option to ignore the error and drop the illegal character.
Mar 17, 2017
- Added extra error trap to Moses’ train-model.perl processing to terminate processing when open source MGIZA++ fails to report the error.
- Refined error handling logic where a secondary error masked the original fatal error.
- New vocabulary logs under the BUILDS\lm and BUILDS\tm subfolders. Previous vocabulary files were a simple list of vocabulary. The new format is a tab-delimited file with vocabulary in the left column and the count in the right column.
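As an illustration of the new vocabulary log layout (word in the left column, count in the right, separated by a tab), the sketch below builds such lines from a handful of segments. It is not the code Slate Desktop™ uses: the function name, the whitespace tokenization and the sort order are assumptions.

```python
from collections import Counter


def vocabulary_lines(segments):
    """Return 'word<TAB>count' lines, most frequent words first."""
    # Assumption: tokens are separated by whitespace.
    counts = Counter(token for segment in segments for token in segment.split())
    # Assumption: sort by descending count, then alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{word}\t{count}" for word, count in ordered]


lines = vocabulary_lines(["the red bird", "the blue bird sings"])
```

Keeping the count next to each word makes it easy to spot rare vocabulary that the engine has seen only once or twice.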
Mar 10, 2017
- bugfix – the change from processing with memory buffers to on-disk allows processing of huge data files, but created a bug. If there were HTML escape codes for newline and line separator characters, the ‘scrub-tm’ graph converted them to unicode characters in the text. Processing on-disk meant these changes shifted the number of lines and broke source-target alignment. This fix converts newline and separators to spaces that preserve alignment. This restores the functionality that was used in the in-memory buffer processing.
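The idea behind this fix can be sketched in Python: decode the HTML escape codes, then turn any resulting line-breaking character into a space so each segment still occupies exactly one line. This is a minimal sketch, not the actual ‘scrub-tm’ code; the function name and the exact set of characters in `LINE_BREAKS` are assumptions.

```python
import html
import re

# Assumed set of characters that readers may treat as line boundaries:
# newline, carriage return, next line, line separator, paragraph separator.
LINE_BREAKS = re.compile(r"[\n\r\u0085\u2028\u2029]+")


def unescape_keeping_one_line(segment):
    """Decode HTML escapes without letting the segment span several lines."""
    decoded = html.unescape(segment)
    return LINE_BREAKS.sub(" ", decoded)


cleaned = unescape_keeping_one_line("First line&#10;second line")
```

Because every segment stays on one line, line N of the source file still pairs with line N of the target file.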
Mar 8, 2017
- bugfix `merge-original` functionality in writer-xlf.py. This bug caused SD to create a new XLIFF output file instead of merging translations into the original input XLIFF file. Fixed.
Mar 3, 2017
- Bugfix – failure to translate mqxliff files from the Dashboard’s “Translate a File” button – fixed
- Bugfix – removed UI enhancement that displayed progress bars for filter chains per TM file processed. When processing large numbers of TMs (sets of over 1,200 files), Slate Desktop™ crashed and disappeared — fixed.
- Bugfix – target language exceptions not saved in workbench tree for use in language model — fixed.
- Enhancement – ratio exceptions saved in new ‘tm-RATIO’ tree under ready-workbench for detailed review.
- Quality enhancement – removed conservative configuration settings that reduced hardware requirements for early users, but potentially also reduced the quality of some engines. These restored configuration values consume slightly more hardware resources but, based on actual customer reports about their hardware, most users will not notice the difference. This change does not affect current engines, but potentially improves the quality of new engines.
Feb 15, 2017
- Bugfix – the overhaul that enabled huge TM files created a “too many files open” error when processing hundreds of normal-sized TMs. This update fixes that bug such that one huge TM and many average sized TMs process equally. If you’re merging hundreds of TMs into one engine, you need this update.
Feb 8, 2017
- Fixed math calculation for invalid, edge-case data
- Improved resume-after-error support
Feb 8, 2017
- New Korean language support
- New Hindi language support
Feb 6, 2017
- Added new evaluation report metrics to the engine details
(i) Report additional BLEU calculated without BLEU 1.0 segments. This indicates the amount of work necessary to edit non-correct segments
(ii) Report average edit-distance per non-correct segment. It’s another indicator of the amount of work necessary to edit non-correct segments.
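To make the second metric concrete, here is a minimal sketch of an average edit distance computed only over segments where the MT differs from the reference. It is an illustration, not the evaluation report’s code: counting edits in words rather than characters, and both function names, are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def average_edit_distance(pairs):
    """Average word-level distance over (mt, reference) pairs that differ."""
    distances = [edit_distance(mt.split(), ref.split())
                 for mt, ref in pairs if mt != ref]
    return sum(distances) / len(distances) if distances else 0.0


pairs = [
    ("the cat sat", "the cat sat"),  # exact match, excluded from the average
    ("the cat sat", "the dog sat"),  # one substitution
    ("a b", "a b c d"),              # two insertions
]
average = average_edit_distance(pairs)
```

Leaving exact matches out of the average keeps a large share of perfect segments from hiding how much editing the remaining segments need.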
Feb 03, 2017
- scrapped SoMaJo German tokenizer
- reverted to Moses built-in tokenizer
- supplemented tokenizer with true unsupervised compound splitting for any language – ask us if you want to build a custom compound splitting dictionary for your language.
- ships with German compound splitting dictionary (136,450 unique word lexicon with 3 or more instances in EuroParl German corpus)
WARNING TO ALL GERMAN USERS:
This update requires that you rebuild all of your engines. Please plan carefully and apply this update only when you have time to rebuild all of your engines. Please contact us for instructions if you need to gradually migrate engines over time.
Jan 31, 2017
- major code refactoring for cleanup and consistency
- migrated to consistent internal object model
- clean/remove legacy files/code
- migrated to file-based corpus processing – eliminate need to chunk huge files
Jan 22, 2017
- installer update to force deep cleaning during update
Jan 20, 2017
- bugfix xslate graph URL escape for source language
- bugfix error when restoring archive and importing engine
- bugfix engine details sometimes not displayed
- faster scrub-tm performance
- better error reporting with malformed XML
- updated evaluation report to save tokenized source with retoken markers
Jan 9, 2017
- Significant code refactoring. reorganized source code libraries in preparation to add new features
- No functional difference. All works the same as before. No practical reason to apply this update
Dec 28, 2016
- Bugfix optimized MT connector daemon failed on some systems.
- added --content command-line argument to xslate to show the translated content instead of the path to the file that contains it.
- Updated evaluation report to show BLEU 1.0 instead of 0.0* for exact match segments shorter than 4 words.
Dec 20, 2016
- Bugfix optimized MT connector daemon auto-launch error on localized version of Windows
- Bugfix data buffering problem with optimized MT connector that caused no input to the CAT.
- Changed to the optimized daemon as the default for all MT connectors.
- RELEASED Linux update with all functionality now parallel to the current Windows version.
Thank you Igor for letting me use your Russian Windows system to troubleshoot the bug. I couldn’t have done it so quickly and painlessly without you.
Also, see Igor’s performance report for the optimized MT connector daemon. It looks like there’s an average 2 second per segment speed improvement with the new connector. Again, thanks Igor for sharing!
Dec 8, 2016
- Bugfix installer failure to recognize/authenticate valid license key
Dec 5, 2016
- Bugfix for German compound splitting tokenizer
- New icons
Dec 3, 2016
- Improved auto-start for MT connector optimized daemon
- Support additional features with daemon
Nov 30, 2016
- Optimize MT connector daemon
- Support for TM weighting
- New German compound splitting tokenizer
- Added Slate Desktop™ version to engine description
- Vocabulary files added to builds – see what vocabulary is in your corpus builds
(BUILDS\lm\$BUILDNAME and BUILDS\tm\$LANGS-$BUILDNAME)
- Allow dots in labels
- Various bugfixes
Oct 23, 2016 (documented Dec 4, 2016)
- Automatic terminology file added to Engine
- Curate TUs in your TMX
- Regex replacement support for corpus preparation
- Various bugfixes
Sep 21, 2016
- Added “birds” examples and changes to allow generating an engine with only 10 sentences in the parallel training data.
- Preparations for Optimized CAT MT connectors.
Sep 14, 2016
- Added terminology file support to evaluation of new engines.
- Fixed error with UTF-8/ASCII characters in displays.
- Some optimization updates.
Aug 10, 2016
- Moved management functions from the GUI to graphs. Changes shouldn’t affect users.
- Resume failed engine generation. You can fix the problem and SD will resume and continue at the point of failure.
- Preparations to optimize CAT MT connectors
Jul 31, 2016
- Affects legacy DoMT customers. With this update, DoMT customers can use these instructions to convert their old BUILD sets to generate a new Slate Desktop™ engine.
July 27, 2016
Fixes a bug in the conversion from legacy engine configurations to the new configuration.
When you apply this update to 1.1.5 or before, you will not have any problems. If you applied 1.1.6, your engine configurations are corrupted and need to be fixed. Unfortunately, the legacy configuration files were deleted and can not be used for a fix. There are 2 options:
- If you have an export package of that engine, simply import the engine over your existing engine. The import process will convert the old configuration file from that package and you’re ready.
- If you do not have an export package, please contact me directly and we will manually convert your engine configurations for you. For us to fix the configuration files, we need you to send the following:
(a) go to C:\xslate\User\graphs
(b) zip all of your engine folders. They start with “xslate-” these files only contain configuration settings, not any data
(c) Forward the zip file(s) to us. We’ll repair the corruption and send back to you
Jul 25, 2016
Updates 1.1.1 to 1.1.5 fixed several minor bugs. This 1.1.6 update adds some new features:
- Linux installer bugfixes (non-English localized folder names caused problems. This is a work-around, not a true fix. If you’re installing on non-English Ubuntu, please contact us.)
- Better terminology support:
(a) target terms output with same casing as in the terminology file
(b) engine-specific terminology files
- New encapsulated engine configurations. This enables better terminology support above.
(a) The installer converts your legacy engine configuration formats to the new format. If you experience problems, please let me know right away.
(b) When you import existing engine packages, the process automatically converts them to the new engine configuration format.
Jul 25, 2016
Fixes that require you to regenerate your engines. Note that your current engines will run without rebuilding but these changes might degrade quality if you don’t regenerate:
- Fixed tokenizer escaping error that caused punctuation and symbol errors (most prominent in French text)
- Fixed tokenizer unescape error that caused %(#93)s and %(#91)s to leak
- New GUI look.
- Split engine build process and added UI pages for each.
- New page to add TMs to SD’s inventory.
- New page to select TMs from inventory to generate an engine.
- UI remembers the most recently entered language pair and file labels (no need to retype).
- Added backup for engines & TM inventory – Saves more than “export” feature.
- Support to delete inventory files.
- Added Base on engine button to use existing engine as template for new engine.
- Single, shared error box for text input fields (cleaner UI).
- Auto-select an engine in the dashboard list after generation is complete.
- Enhanced the Engine Details panel.
- Extended keyboard control through added menus with shortcuts.
- Fixed Error running msiexec /package installer problem.
- Various other installer updates and fixes, including updating to InstallBuilder 16.4.0.
- Prevent Windows suspend mode while generating engines and batch translating files (Windows shouldn’t sleep).
- Installer associates Slate Desktop™ icons to identify files in Windows Explorer.
- Added uppercase feature. When source segment is all uppercase text, SD sets target segment to uppercase.
- Tuned preparation to stop removing TUs with only one token.
- UI remembers the languages and labels from the last session and between new pages.
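The uppercase feature listed above can be pictured with a short Python sketch. It is an assumption about the behavior described in the release note, not SD’s implementation, and the function name is hypothetical.

```python
def match_source_uppercase(source, target):
    """Uppercase the target when the source segment is all uppercase text."""
    # str.isupper() is True only if the segment has at least one cased
    # character and every cased character is uppercase.
    return target.upper() if source.isupper() else target


heading = match_source_uppercase("WARNING: HOT SURFACE",
                                 "Advertencia: superficie caliente")
```

Segments with no letters at all, such as "123", are left untouched because `isupper()` returns False for them.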
Feb 18, 2016
- Now available.