Sublanguages in memoQ
Posted by Péter Botta on 15 February 2016 01:12 PM
This article describes how memoQ uses translation memories, term bases, LiveDocs corpora, and the spelling checker if the current project or the resource(s) use a sublanguage, for example, English (US) instead of just English. The article also explains what happens if the sublanguage setting of the project is different from the sublanguage setting of the resource.
In memoQ, as a rule, different sublanguages count as the same language: for example, you can use translation memories in English (US) in projects from or into English (no sublanguage) or English (UK). The same goes for term bases and LiveDocs corpora. In the spelling checker, memoQ will choose the dictionary for the sublanguage, or a default if no sublanguage is specified. The sections below describe each resource type in detail.
1 Translation memories
memoQ completely ignores sublanguages when it is working with translation memories. With variants of English, here is what this means:
In a project from English into German, translation memories from English, English (US), and English (UK) will all return matches. The match rate will not be different if the same segment occurs in two TMs with different sublanguages. When it lists the results, memoQ will not even give priority to the TM with the matching sublanguage. Instead, TMs that return the same match will be listed in alphabetical order of their names.
In a project from English (US) into German, the same happens.
In a project where the target language is English, memoQ will not distinguish TMs by the sublanguage of their target languages.
In addition, memoQ will not have a problem updating or adding segments to a translation memory that has a different sublanguage either for the source or the target language.
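The rule above can be sketched in a few lines of Python. This is only an illustration of the behavior described in this section, not memoQ code: the function names, the tuple layout, and the use of codes such as "en-US" for sublanguages are all assumptions made for the example.

```python
def main_language(code: str) -> str:
    """Return the main-language part of a code, e.g. 'en-US' -> 'en' (hypothetical coding)."""
    return code.split("-")[0].lower()


def usable_tms(project_source: str, project_target: str, tms: list) -> list:
    """Return the names of TMs usable in the project.

    tms is a list of (name, source_code, target_code) tuples (hypothetical layout).
    Sublanguages are ignored on both sides, and the result is sorted by TM name,
    because a matching sublanguage gives no priority.
    """
    names = [
        name
        for name, source, target in tms
        if main_language(source) == main_language(project_source)
        and main_language(target) == main_language(project_target)
    ]
    return sorted(names)


tms = [
    ("Zeta", "en-US", "de"),
    ("Alpha", "en-GB", "de"),
    ("Mid", "en", "de"),
    ("Other", "fr", "de"),
]
print(usable_tms("en-US", "de", tms))
```

In this sketch, the English (US) project gets all three English TMs, and the English (US) TM is listed last simply because of its name.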
2 Term bases
Term bases are different from TMs because a single term base can contain a main language and several of its sublanguages. Normally, memoQ will return hits from the term base regardless of the sublanguages.
However, when it adds terms, memoQ will look for the exact same sublanguage as the one in the project. For example, if you have a project from German into English (US), and you add a TB that contains German and English (UK), memoQ will use it in the project without a problem -- until you try to add a new term. Then memoQ will offer to add the new sublanguage to the term base. If you do not allow memoQ to do that, the new term will not be added.
Starting with memoQ 2014, you can tell memoQ to be less tolerant of mismatching sublanguages. In the Languages section of the project settings (Project Home / Settings), you can check the "Treat sublanguages as different languages in TB lookup" check box. If this is checked, memoQ will only return TB hits if they come from the same sublanguage as either the source or the target language of the project.
For more information, see memoQ help: http://kilgray.com/memoq/2015-100/help-en/index.html?strict_sublanguage_matching.html
Because of this behavior, a memoQ term base may end up having several sublanguages for the same main language. This often becomes a nuisance, and when it does, you can clean it up. For more information, see memoQ help: http://kilgray.com/memoq/2015-100/help-en/index.html?cleaning_up_sublanguages.html
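To make the difference between the default lookup and the strict lookup concrete, here is a minimal Python sketch. It only models the rules described in this section: the function names, the "en-GB"-style codes, and the boolean `strict` flag (standing in for the check box) are assumptions for the example, not part of memoQ.

```python
def main_language(code: str) -> str:
    """Return the main-language part of a code, e.g. 'en-GB' -> 'en' (hypothetical coding)."""
    return code.split("-")[0].lower()


def tb_hit_allowed(entry_language: str, project_languages: tuple, strict: bool) -> bool:
    """Decide whether a term in entry_language may be returned as a hit.

    project_languages holds the source and target language of the project.
    strict=False models the default: only the main language has to match.
    strict=True models the check box: the sublanguage has to match exactly.
    """
    if strict:
        return entry_language.lower() in {lang.lower() for lang in project_languages}
    return main_language(entry_language) in {main_language(lang) for lang in project_languages}


def needs_new_sublanguage(tb_languages: list, project_language: str) -> bool:
    """Adding a term always requires the exact sublanguage to exist in the term base."""
    return project_language.lower() not in {lang.lower() for lang in tb_languages}


project = ("de", "en-US")
print(tb_hit_allowed("en-GB", project, strict=False))
print(tb_hit_allowed("en-GB", project, strict=True))
print(needs_new_sublanguage(["de", "en-GB"], "en-US"))
```

With a German to English (US) project and a term base that holds German and English (UK), the sketch returns hits by default, returns none for English (UK) in strict mode, and reports that English (US) would have to be added before a new term can be stored.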
3 LiveDocs corpora
A LiveDocs corpus can hold documents or document pairs for several different languages or language pairs. As a result, a LiveDocs corpus can return matches for several different sublanguages of the same language.
If you have a project where the target language is a sublanguage, a LiveDocs corpus will give you matches from the same sublanguage only. If there are no documents in the LiveDocs corpus for the exact same sublanguage, memoQ will return matches from documents that have the main language as the target language. If the LiveDocs corpus has documents in the same sublanguage, but there's no match from those documents, the LiveDocs corpus will not return any matches. In other words, a LiveDocs corpus will never return matches from a different sublanguage, only the same sublanguage or the main language.
If your project has a main language as the target language (with no sublanguage specified), LiveDocs will return matches from all sublanguages and the main language, too.
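The LiveDocs rules are easier to follow as a small decision procedure. The Python sketch below restates them under some assumptions: documents are modelled as (name, target language) pairs, sublanguages are written as codes such as "en-US", and the function names are invented for the example.

```python
def main_language(code: str) -> str:
    """Return the main-language part of a code, e.g. 'en-US' -> 'en' (hypothetical coding)."""
    return code.split("-")[0].lower()


def eligible_documents(project_target: str, documents: list) -> list:
    """Return the names of documents that may supply matches.

    documents is a list of (name, target_code) tuples (hypothetical layout).
    """
    target = project_target.lower()

    # Main language in the project: every sublanguage and the main language qualify.
    if "-" not in target:
        return [name for name, code in documents if main_language(code) == target]

    # Sublanguage in the project: documents in the exact sublanguage win,
    # even if they later turn out to contain no match.
    exact = [name for name, code in documents if code.lower() == target]
    if exact:
        return exact

    # No document in that sublanguage: fall back to main-language documents only.
    main = main_language(target)
    return [name for name, code in documents if code.lower() == main]


docs = [("us.docx", "en-US"), ("uk.docx", "en-GB"), ("plain.docx", "en")]
print(eligible_documents("en-US", docs))
print(eligible_documents("en-AU", docs))
print(eligible_documents("en", docs))
```

Note that the English (UK) document is never used for an English (US) or English (Australia) project in this sketch, which mirrors the statement that a corpus never returns matches from a different sublanguage.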
4 Spelling checker
In Options / Spelling and Grammar, you need to set up the spelling checker separately for each sublanguage. If you configure the spelling checker for the main language, say, English, that will not affect the settings for a sublanguage (English (US), for example).
For each main language and sublanguage, you have two choices: either you check spelling through Microsoft Word, or you use Hunspell.
If you use Word for spell checking, it will always choose the matching sublanguage. For the main language, it will use a default sublanguage: for English (no sublanguage), the Word spelling checker will use American English spelling.
If you use Hunspell for spell checking, memoQ will use the dictionaries you specify. For each sublanguage, you need to check the check box of each dictionary you want to use. You can select several dictionaries, or even all of them, if you like.
If you select multiple dictionaries, memoQ will look up words in all of them at once. memoQ will mark a word as misspelled only if it's not found in any of the selected dictionaries. So, for the English main language, you can choose the British and the American dictionary: memoQ will accept both the British and the American variants of words (both 'travelling' and 'traveling' will be accepted).
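The multi-dictionary rule can be illustrated with a short Python sketch. Plain word sets stand in for Hunspell dictionaries here, which is a simplification (real Hunspell dictionaries also use affix rules), and the function name and the sample word lists are made up for the example.

```python
def is_misspelled(word: str, dictionaries: list) -> bool:
    """A word is flagged only if none of the selected dictionaries contains it."""
    return not any(word.lower() in dictionary for dictionary in dictionaries)


# Tiny stand-in word lists (assumed for illustration only).
british = {"travelling", "colour"}
american = {"traveling", "color"}

selected = [british, american]
for word in ("travelling", "traveling", "travveling"):
    print(word, is_misspelled(word, selected))
```

With both dictionaries selected, the sketch accepts 'travelling' and 'traveling' and flags only the genuinely misspelled word. With the British set alone, 'traveling' would be flagged as well.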