2. Basic level: First steps into migration
You are not familiar with translation management systems, your experience is a bit rusty, or you just want a short overview?
We've got your back!
In this part of our guide you will learn about some basic concepts and how to use them during your data migration process.
First, you need to know that memoQ has two kinds of resources:
heavy resources - translation memories, term bases, LiveDocs corpora, and muses,
light resources - for example, segmentation rules, TM settings, QA settings, non-translatable lists etc.
Resources are information used regularly during translation or translation management. They make your work in memoQ more efficient or enhance its quality. You and your colleagues can reuse them in different projects.
Want to learn more? Read the documentation page about all types of resources.
Let's start with heavy resources
Heavy resources contain a large amount of language data, for example, monolingual or bilingual text, pairs of segments, glossary entries, or a large amount of statistical data. Here, we will talk about the two that are most useful when migrating data:
Translation memories (TMs) are the key to every modern CAT tool (Computer-Aided Translation tool) or TMS (Translation Management System).
When you confirm a segment in the translation editor, memoQ saves the source text and its corresponding target text into a translation memory. So the TM contains pairs of equivalent sentences in both source and target languages.
When you translate or pre-translate a document, memoQ will scan every translation memory attached to the project, and it will show the segments where the source text is the same or similar enough to the segment you need to translate, so you can re-use your previous translation. This is how you can recycle your work and save time.
Translation memories in memoQ are bilingual. This means that each TM includes one language combination only. If they are online TMs (they usually are), every user in a project can see the changes made to the TM in real time.
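The lookup described above can be pictured with a minimal sketch. It is not memoQ's matching algorithm: the `tm` dictionary, the `lookup` function, and the 0.75 threshold are hypothetical, and the similarity score comes from Python's generic `difflib` module rather than a real match rate.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory TM for one language pair: source segment -> target segment.
tm = {
    "Open the file menu.": "Öffnen Sie das Dateimenü.",
    "Save the document.": "Speichern Sie das Dokument.",
}

def lookup(segment, memory, threshold=0.75):
    """Return (score, source, target) tuples at or above the threshold, best first."""
    hits = []
    for source, target in memory.items():
        # ratio() is a generic string similarity, not memoQ's match rate.
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold:
            hits.append((round(score, 2), source, target))
    return sorted(hits, reverse=True)

exact = lookup("Save the document.", tm)     # identical source text
similar = lookup("Save the documents.", tm)  # similar enough to be offered
nothing = lookup("Print", tm)                # nothing close enough in the TM
```

The point of the sketch is only the behavior: identical source text gives the best hit, a near-identical sentence still returns the stored translation for reuse, and unrelated text returns nothing.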
Why are they important for my migration?
Before migration, you need to think about how you want to use your current resources in memoQ.
TMs can be used for leverage, but also as a reference only, to allow for a fresh start. memoQ allows you to use several translation memories in one single project.
There are three different TM types:
Working TM - stores all of the segments translated during the project.
Master TM - saves reviewed translations only. It can be updated only by the project manager (or through an automated process) after the last step of the project is done. However, its translations can be seen and used by all translators during translation.
Reference TM(s) - they can be used for pre-translation and reference but cannot be overwritten.
For example, a TM with UI translations that could be used when translating documentation but should not be changed, or, a TM that stores very old translations that were never reviewed.
Every project must have a master translation memory for every target language, and one working TM (which can serve as the master TM at the same time if no review is necessary).
You can have a single TM that plays both the working and the master role. Such a TM saves all translations, whether reviewed or not.
Actions to consider for your migration:
What you need to do? | Why?
Decide what translation memories should be part of the pilot phase. | This will allow you to see how TMs work in memoQ, so you can check the different possibilities before the final migration.
Decide which translation memories will be used as Master TMs (if any). | You might want to have a closer look at them during migration and treat them more in depth.
Decide which TMs are for reference or are outdated. | They can be merged*, or not even migrated in the beginning.
*Talk to your memoQer to discuss merging. Make sure the TM size is manageable.
Before migration: Choose a small set of translation memories (or bilingual reference documents) and send them to memoQ for the pilot phase. They should allow you to test the various setups in a project.
Context TMs - what are they and why are they important for my migration?
There are different types of translation memories, distinguished by their behavior and the way they store content. The most commonly used translation memories (default setting) are context TMs - they store the actual translation and the local context of a segment.
Most translation memories you will use are context TMs. If possible, your current TMs will be migrated as context TMs.
In case you are translating a structured document like software strings, Excel tables or XML documents, it is possible to define a key or identifier as the context for the translation. The specific context for a segment is defined by the actual translation document.
If no context rule is applied or useful (for example, for MS Word documents), the context is the segments before and after the translated segment. This helps to ensure, for example, that a pronoun is used in the correct gender in the flow of the text.
If, in a new translation, a match is found with the very same context, it is marked as a 101% match. Often these translations can be pre-translated safely and locked before translation starts.
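To make the difference between a 100% and a 101% match concrete, here is a minimal sketch. The entry layout and the `match_rate` function are assumptions made for illustration; memoQ's actual context handling (including identifier-based context for structured documents) is richer than this.

```python
# Hypothetical context TM: each entry keeps the source, its neighbours, and the target.
entries = [
    {
        "prev": "Dear customer,",
        "source": "Thank you.",
        "next": "Best regards",
        "target": "Vielen Dank.",
    },
]

def match_rate(prev, source, next_, entries):
    """Return 101 for a text-plus-context match, 100 for text only, 0 otherwise."""
    best = 0
    for entry in entries:
        if entry["source"] != source:
            continue  # this sketch only looks at exact text matches
        same_context = entry["prev"] == prev and entry["next"] == next_
        best = max(best, 101 if same_context else 100)
    return best

in_context = match_rate("Dear customer,", "Thank you.", "Best regards", entries)
out_of_context = match_rate("Hello,", "Thank you.", "Bye", entries)
```

Here `in_context` is 101 and `out_of_context` is 100. If the surrounding segments are lost during migration, former 101% matches come back as plain 100% matches.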
If you are already using context TMs, the context may not be fully migrated. The way it is represented differs from TMS to TMS. In some cases, we can help you migrate most of it, but sometimes it is impossible.
Actions to consider for your migration:
What you need to do? | Why?
Check how your context is preserved during migration. Do this during the pilot phase. | This will allow you to later check with memoQ if there are options to improve migration.
Before migration: Send memoQ sample exports of your translation memories as soon as possible. We will analyze them and discuss with you what is possible during migration.
A LiveDocs corpus is a collection of documents. It can contain monolingual documents, bilingual documents, as well as aligned document pairs. Thanks to LiveDocs corpora, you can reuse existing translations (look up phrases and segments) without going through translation memories.
You can also use it to add background material to the project - documents and document pairs of any language. A LiveDocs corpus offers matches in all the languages of the documents in it.
Why are they important for my migration?
If you don’t have any TMs, you can build them through LiveDocs. The same works for your term bases.
LiveDocs serve three main purposes within memoQ:
They can store monolingual reference materials for a given project to make them available to translators.
They can hold previously translated documents to preserve the document context as a whole and enable the translator to check the corresponding document context for a suggested match.
LiveDocs can be used to align existing translated documents that were not translated within memoQ or another CAT tool. These alignments will allow you to reuse previous translations without going through a translation memory.
In most cases translation memories are more efficient and powerful, but LiveDocs give you another option to approach specific needs for your project.
Actions to consider for your migration:
What you need to do? | Why?
Check the possibilities of LiveDocs. Do this in the pilot phase. | They can give you an option to organise your migration differently from the original plan.
Metadata in memoQ is all the information that is stored alongside a translation memory entry (or a term in a term base), besides the actual source segment, the target segment, and the context of a segment.
Metadata also appears on several other levels within memoQ, for example, in projects and in translation memory and term base properties. The translator can also see it for a given match. The main purpose of metadata is to define relations between all of these levels, but also to describe the content of a resource.
Why is it important for my migration?
You can use metadata for different purposes, one of them being to let project templates pick up TMs and TBs.
For example, metadata for a segment is shown during translation and can be used as a filter in the TM settings.
We can divide metadata into three groups:
Technical metadata: all data that the tool can determine by itself, for example, the creation date of a segment in the TM, the user ID of the creator, the date of the last change to the segment, the user ID of the person who changed it, and the name of the last document in which the segment was created or changed.
This type is often used by translators to determine how reliable a segment translation is. If a translation is very old, some terminology might have changed in the meantime.
Built-In project-based metadata: All data that can be taken from the project, where a translation is performed. These fields are Client, Project, Domain, and Subject. Although the names suggest a certain use, it is not restricted to that usage (for example, Client can contain a specific product line of a client, or a specific department).
This type is often used to organize resources and projects, especially when you want to use project templates.
Custom project-based metadata: You can define and add custom metadata. It will be handled like the built-in metadata and stored alongside the translation. However, the options to use custom fields as relations or for filtering are more limited. In general, it is better to map your metadata to the built-in fields.
As memoQ supports multiple translation memories and term bases in a project, it might be better to use multiple reference translation memories, instead of using custom metadata for differentiation.
For example:
You want to translate the Terms and conditions for a particular client or department. You can set the Domain metadata to Legal and store the translations in the normal Client TM, but then the matches coming from this TM would also appear in all other translations for this client.
The best practice would be to create a separate TM for legal translations for this client. In case the phrase Terms and conditions needs to be updated, you can edit this separate TM again, and in case another client or department wants to translate their Terms and conditions as well, you could also use this TM as a reference.
Actions to consider for your migration:
What you need to do? | Why?
Check your metadata. | You will use them, but there is no need to migrate everything. Decide which data you really use and need.
Define the metadata you need to keep. | Because after the full migration, a change here can result in a lot of effort.
Decide if all metadata has to be kept as metadata or can be expressed otherwise. | Metadata is not the only or the best way to express a relation or function. Maybe it is better to separate translations in different TMs?
Before migration: If possible, inform memoQ before the pilot phase what kind of metadata you are using, or plan to use. During the pilot phase, check how your metadata is displayed and if it’s used as expected, and let memoQ know if there are issues.
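If your current tool can export translation memories as TMX, a short script can help you take stock of the metadata you actually have before deciding what to keep. The sketch below is an illustration under assumptions: the sample content and the property names (`client`, `domain`) are made up, and real exports from your TMS will use their own property names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up TMX sample; a real export would be read from a file instead.
SAMPLE_TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu creationdate="20190101T120000Z" creationid="anna">
      <prop type="client">ACME</prop>
      <prop type="domain">Legal</prop>
      <tuv xml:lang="en"><seg>Terms and conditions</seg></tuv>
      <tuv xml:lang="de"><seg>Allgemeine Geschäftsbedingungen</seg></tuv>
    </tu>
    <tu creationdate="20210505T090000Z" creationid="ben">
      <prop type="client">ACME</prop>
      <tuv xml:lang="en"><seg>Save</seg></tuv>
      <tuv xml:lang="de"><seg>Speichern</seg></tuv>
    </tu>
  </body>
</tmx>"""

def metadata_inventory(tmx_text):
    """Count how often each <prop> type and each attribute occurs on translation units."""
    root = ET.fromstring(tmx_text)
    props, attrs = Counter(), Counter()
    for tu in root.iter("tu"):
        attrs.update(tu.attrib.keys())  # technical metadata such as creationdate
        props.update(p.get("type") for p in tu.findall("prop"))  # project-based metadata
    return props, attrs

props, attrs = metadata_inventory(SAMPLE_TMX)
```

A field that appears on only a handful of units, or always carries the same value, is a candidate for dropping or for expressing through a separate TM instead.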
It's time for the light resources!
What are light resources?
Light resources are settings that don't contain actual text or large amounts of data. They are here to help memoQ make your work more efficient.
We will focus on those that are most related to your workflows (not to specific types of content) and are rarely changed:
When you add a translation document to a project, it is segmented during the import process - the text is split into segments, or translation units. The translator must then create a translation for every unit, creating a translation pair that can be stored in a translation memory.
Segmentation rules define how this text is split. In most cases, segmentation is performed on the sentence level. This means punctuation is used to mark the end of a segment (there are special rules to handle abbreviations or number formats).
Why is this important for my migration?
You need to make sure the segmentation rules you choose for future projects will match the ones you used and have stored in your TM.
This way you can get as much leverage and as many matches as possible.
Segmentation rules are applied to every document as it is imported into memoQ (to the source text, not the target language), so changing them afterwards has no effect on documents that are already imported. Remember, though, that the segments are stored in the translation memory, so changing segmentation rules can reduce the reusability of your translations.
This is why it is very important that you adjust them as needed before getting started or even during a pilot phase, and why you should not touch your default rule set afterwards.
When it comes to punctuation and abbreviations, they are specific to each source language, but memoQ comes with a default segmentation rule set for all supported source languages (they will be suitable for most cases).
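The effect of segmentation rules on leverage can be seen in a small sketch. It is a simplified stand-in, not memoQ's rule engine: the `segment` function and the abbreviation list are hypothetical, and real rule sets are language-specific and far more detailed.

```python
import re

# Hypothetical abbreviation list; real rule sets are language-specific and much longer.
ABBREVIATIONS = {"e.g.", "i.e.", "Dr.", "No."}

def segment(text, abbreviations=ABBREVIATIONS):
    """Split text after sentence-ending punctuation, except after listed abbreviations."""
    segments, start = [], 0
    for m in re.finditer(r"[.!?]\s+", text):
        candidate = text[start:m.end()].strip()
        last_word = candidate.split()[-1]
        if last_word in abbreviations:
            continue  # not a real sentence end, keep scanning
        segments.append(candidate)
        start = m.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

with_rules = segment("See Dr. Smith today. He is in room No. 5. Thanks!")
without_rules = segment("See Dr. Smith today.", abbreviations=set())
```

With the abbreviation exceptions, the first sentence stays whole ("See Dr. Smith today."). Without them, it is split into "See Dr." and "Smith today.", and neither fragment matches a TM entry that stores the whole sentence. This is why the rules used for new projects should match the segmentation already stored in your TM.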
Actions to consider for your migration:
What you need to do? | Why?
Open your TM to see what the segmentation looks like. | To know how to best apply segmentation rules in memoQ, you need to know how segmentation was done before.
Estimate how you will use segmentation rules in memoQ. | Segmentation rules are key to getting the highest leverage from your TM, so always make sure hits are accurate.
Remove segment fragments in your TM, if needed. | If you leave them, the translation hits you get won’t be usable. Concordance wouldn’t always show sentence fragments.
Remove mismatched fragments caused by syntax changes. | If you leave them, the translations for the affected segments will be incorrect.
memoQ is not responsible for any issues caused by removing content from your translation memory.
After migration: After migration and during implementation, make sure you have the appropriate segmentation rules. Your memoQ contact person or trainer can assist you on this matter.
memoQ can check many things in the translation automatically. These automatic checks are called quality assurance checks.
A QA settings resource tells memoQ what to check and how. For example, you can choose to check terminology, consistency, and segment length, or simply the inline tags only.
Why is this important for my migration?
QA settings will use your translation memories and term bases for consistency, so you need to make sure your information has the required quality so that those checks will work for you and not against you. QA checks depend on accuracy, cleanliness, and consistency of your data.
If you want to use QA, you may need different settings for different project types. In a scientific project, number checks and the right terminology are key.
By contrast, in an article you might want the number 6 written as 'six', or you might not want to use the exact same phrase every time a specific term appears. Using all checks in every project may create so-called false positives: segments marked as problematic when they are correct. Although this cannot be completely avoided, it can be significantly reduced by choosing the relevant checks for each project.
QA rules can mark, for example:
misused terminology
incorrectly placed tags
extra spaces in source and target
number formatting
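A few of these checks can be sketched in a handful of lines. This is an illustration only, not memoQ's QA engine: the `qa_check` function is hypothetical, and inline tags are assumed to look like `<b>` and `</b>`.

```python
import re

def qa_check(source, target):
    """Return a list of warnings for one source/target pair."""
    warnings = []
    if "  " in target:
        warnings.append("double space in target")
    # The same numbers must occur in source and target (order may differ).
    if sorted(re.findall(r"\d+", source)) != sorted(re.findall(r"\d+", target)):
        warnings.append("numbers differ")
    # Inline tags are assumed to look like <b> or </b> in this sketch.
    if sorted(re.findall(r"</?\w+>", source)) != sorted(re.findall(r"</?\w+>", target)):
        warnings.append("inline tags differ")
    return warnings

clean = qa_check("Press <b>OK</b> 2 times.", "Drücken Sie 2-mal <b>OK</b>.")
spelled_out = qa_check("Chapter 6", "Kapitel sechs")
broken = qa_check("Click <b>Save</b>.", "Klicken Sie auf  <b>Speichern.")
```

The second call shows a false positive: writing 6 as "sechs" may be exactly what the style guide asks for, yet a strict number check still flags it. This is why the set of active checks should depend on the project type.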
Actions to consider for your migration:
What you need to do? | Why?
Terminology:
Check correct spelling of terms. | memoQ will not verify it for you, and will run the QA check with the incorrect spelling.
Remove term duplicates. | If you have duplicated terms, memoQ can’t check term consistency.
Remove paragraphs. | Term bases are optimized for single words and short expressions. Entries that are too long might not be shown to the translator.
Remove mismatched fragments caused by syntax changes. | Matching works differently in a TM (it is for longer segments) and a TB (it is optimized for single words and short expressions).
Add forbidden terms. | This is a common pain, and memoQ can help you check for those. You could even create term bases with forbidden terms only.
Translation memories:
Remove duplicates. | If you have duplicates, memoQ can’t check term consistency.
Check consistent use of terminology in the TM. | Otherwise those inconsistencies will keep happening and multiplying throughout your translations, and memoQ won’t be able to detect them.
Remove segments that were mistakenly left in the source language. | The QA check will show a warning or error if you translate those.
memoQ can’t be held responsible for removing or fixing errors in your translation memories or term bases.
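Before cleaning a translation memory by hand, it can help to list the suspicious entries first. The following sketch only reports candidates and deletes nothing; the `rows` data and the `audit` function are hypothetical, and a segment whose source equals its target may be legitimate (for example, a product name), so every finding still needs human review.

```python
from collections import defaultdict

# Hypothetical TM rows as (source, target) pairs.
rows = [
    ("Save", "Speichern"),
    ("Save", "Speichern"),  # exact duplicate
    ("Save", "Sichern"),    # same source, different target
    ("Cancel", "Cancel"),   # possibly left in the source language
]

def audit(rows):
    """Report exact duplicates, inconsistent translations, and untranslated rows."""
    targets = defaultdict(set)
    seen, duplicates, untranslated = set(), [], []
    for source, target in rows:
        if (source, target) in seen:
            duplicates.append((source, target))
        seen.add((source, target))
        targets[source].add(target)
        if source == target:
            untranslated.append(source)
    inconsistent = {s: sorted(t) for s, t in targets.items() if len(t) > 1}
    return duplicates, inconsistent, untranslated

duplicates, inconsistent, untranslated = audit(rows)
```

For the sample data, the report contains one exact duplicate, one source with two competing translations ("Sichern" and "Speichern"), and one entry that may have been left untranslated.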
TM settings (and LiveDocs settings)
A TM settings or LiveDocs settings resource tells memoQ how to get matches from a translation memory or a LiveDocs corpus.
This means two things:
match thresholds - set the minimum match rate of any match that memoQ shows, and tell memoQ what counts as a good match.
penalties - used in pre-translation when a match is not reliable. If the translations in a TM are bad or inadequate, a penalty makes its matches score lower than their actual match rate.
Why is this important for my migration?
These settings can help you use translation memories that do not have the best quality and should be checked but are still good enough for leverage.
TM settings define the matching behavior of translation memory hits during translation and pre-translation. They define the level of matches shown and what is recognized as a good match (relevant for pre-translation). They also define how strict the matching of tags between the current segment and the TM must be.
It is possible to define penalties for specific TMs, so that if a segment shows a 100% match against such a TM, that percentage will be reduced.
In practice you would rarely need to change these settings after having defined them.
However, special settings might be needed per project, where you want to penalize a specific (old/unreliable) TM. If a client sends you a translation memory that you don’t know, you may want to penalize it.
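The interplay of thresholds and penalties comes down to simple arithmetic, sketched below. The function names, the 60% minimum, and the 5-point penalty are hypothetical values chosen for illustration, not memoQ defaults.

```python
def effective_match(raw_rate, penalty=0):
    """Match rate after subtracting a TM-level penalty, never below zero."""
    return max(raw_rate - penalty, 0)

def visible_hits(hits, penalties, min_rate=60):
    """Apply per-TM penalties and drop hits below the minimum threshold."""
    result = []
    for tm_name, raw_rate in hits:
        rate = effective_match(raw_rate, penalties.get(tm_name, 0))
        if rate >= min_rate:
            result.append((tm_name, rate))
    # Highest rate first; hits with equal rates keep their original order.
    return sorted(result, key=lambda hit: hit[1], reverse=True)

hits = [("Master TM", 95), ("Old client TM", 100), ("Old client TM", 62)]
penalties = {"Old client TM": 5}
shown = visible_hits(hits, penalties)
```

In this example, the 100% match from the penalized TM is shown as 95%, so it is no longer treated as an exact match during pre-translation, and the 62% match drops to 57% and falls below the threshold.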
Actions to consider for your migration:
What you need to do? | Why?
Review the quality of the translation memories you decide to migrate. | There may be translation memories with lower quality that you may want to penalize, to still be able to use them for leverage but without compromising quality.
After migration: After migration and during implementation, make sure you discuss with your memoQ contact person or trainer which TMs you want to penalize before you start using them.