Filtering your files to extract text for translation
When you upload the file(s) that you want translated (your "source files"), we filter them to extract the text that needs to be translated, while protecting content that should be left untouched (e.g. styles and formatting, in-line code, numbers and placeholders).
For example, if you upload a Microsoft Word (.docx) file, we extract all of the visible, editable text from that file, including the headings, paragraphs, and the text in table cells.
- Visible text is any text that is shown in your file regardless of the text colour, but does not include content that is hidden. For example, any content on hidden tabs, rows or columns in a spreadsheet would not be extracted for translation. Visible text also includes editable content from embedded files. For example, if a PowerPoint presentation includes an embedded Excel spreadsheet, then the editable text from the embedded Excel spreadsheet would also be extracted for translation.
- Editable text is text that can be changed when editing the document. It includes, for example, the labels on an inserted chart, but it does not include text within an image.
- Content from embedded assets such as images, videos, etc. is not translated by default.
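The visibility rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Sheet` and `Column` classes and the `extract_visible_text` function are hypothetical names invented for this example and do not describe our actual filters.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    # Hypothetical model of one spreadsheet column.
    cells: list
    hidden: bool = False


@dataclass
class Sheet:
    # Hypothetical model of one spreadsheet tab.
    name: str
    columns: list = field(default_factory=list)
    hidden: bool = False


def extract_visible_text(sheets):
    """Collect non-empty cell text from visible sheets and columns only."""
    extracted = []
    for sheet in sheets:
        if sheet.hidden:
            continue  # content on hidden tabs is not extracted
        for column in sheet.columns:
            if column.hidden:
                continue  # content in hidden columns is not extracted
            extracted.extend(cell for cell in column.cells if cell.strip())
    return extracted


workbook = [
    Sheet("Products", [
        Column(["Name", "Chair"]),
        Column(["SKU", "A-100"], hidden=True),
    ]),
    Sheet("Internal notes", [Column(["Do not publish"])], hidden=True),
]
visible = extract_visible_text(workbook)
print(visible)
```

In this sketch only the "Name" column of the "Products" sheet is extracted; the hidden column and the hidden tab are skipped.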
A key benefit of working with us is our ability to control what should be translated and what should not – we automate this for clients who order regularly with us. This simplifies and speeds up the process by removing the need for you to specify non-translatable content every time you place an order.
The extracted text is then separated into smaller chunks called “segments” – often these are individual sentences. Segments are checked for repetitions and then checked against your Translation Memory to detect whether they have previously been translated.
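As a rough illustration of segmentation and matching, the Python sketch below splits text into sentences, flags repeated segments, and looks up the rest in a translation memory. It rests on simplifying assumptions: the function names and status labels are hypothetical, the sentence splitter is deliberately naive, and the memory is a plain dictionary with exact matching only, whereas real CAT tools typically also score partial (fuzzy) matches.

```python
import re


def split_into_segments(text):
    """Naive split: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def classify_segments(segments, translation_memory):
    """Label each segment as 'repetition', 'tm_match' or 'new' (hypothetical labels)."""
    seen = set()
    classified = []
    for segment in segments:
        if segment in seen:
            status = "repetition"   # already occurred earlier in this file
        elif segment in translation_memory:
            status = "tm_match"     # previously translated, can be reused
        else:
            status = "new"          # needs a fresh translation
        seen.add(segment)
        classified.append((segment, status))
    return classified


text = "Open the lid. Press the button. Open the lid. Wait ten seconds."
tm = {"Press the button.": "Appuyez sur le bouton."}
classified = classify_segments(split_into_segments(text), tm)
for segment, status in classified:
    print(f"{status:10} {segment}")
```

Here the second "Open the lid." is flagged as a repetition, and "Press the button." is found in the example memory.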
Your segments define the scope of your order by telling us what needs to be freshly translated and what translations can be reused (we call this “leveraging”) from previous projects – a practice that saves you money.
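To show how segments can define the scope of an order, the sketch below totals word counts per segment status. The status labels ("new", "tm_match", "repetition") and the `summarise_scope` function are hypothetical, and counting words by splitting on whitespace is an assumption made for brevity; it says nothing about how any particular category is priced.

```python
from collections import Counter


def summarise_scope(classified_segments):
    """Sum whitespace-separated word counts for each status label."""
    words = Counter()
    for segment, status in classified_segments:
        words[status] += len(segment.split())
    return dict(words)


# Hypothetical input: (segment, status) pairs produced by an earlier analysis step.
classified_segments = [
    ("Open the lid.", "new"),
    ("Press the button.", "tm_match"),
    ("Open the lid.", "repetition"),
    ("Wait ten seconds.", "new"),
]
scope = summarise_scope(classified_segments)
print(scope)
```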
What if your file contains visible editable content that you do not want translated?
You may have content that needs to remain in the source language in your final returned document. For example:
- In a Microsoft Word (.docx) file, you may not wish to translate the header and footer content.
- In a Microsoft Excel (.xlsx) file, you may not wish to translate the content in all columns or sheets.
- In a Microsoft PowerPoint (.pptx) file, you may not wish to translate the slide notes.
If you do not want all of the text content in your files translated, you have two options:
Option 1 – Edit your file prior to uploading it for a quote
You can edit your file and delete or hide any content that you do not want translated before uploading your file to your order.
- For example, you can hide columns or tabs using the hide feature in Microsoft Excel.
Option 2 – Request our file filtering service when adding your file to your order
Our File Engineering team can filter out content that you do not want translated before the file is counted for your quote. When you request this service, we recommend that you style the content you do not want translated consistently, and that you tell us clearly which styles (heading 1, subtitles, etc.) or content types (headers, footers, slide notes, etc.) should be excluded.
To request file filtering for your file, simply follow the steps to request file engineering in our step-by-step guide to creating an order.
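Consistent styling matters because exclusion rules are easiest to apply when they can be expressed as a list of styles or content types. The Python sketch below illustrates the idea; the `Block` class, its fields, and `select_translatable` are hypothetical names for this example and not a description of the tooling our File Engineering team uses.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Hypothetical model of one piece of extracted content.
    text: str
    style: str         # e.g. "Heading 1", "Subtitle", "Normal"
    content_type: str  # e.g. "body", "header", "footer", "slide_notes"


def select_translatable(blocks, excluded_styles=(), excluded_types=()):
    """Keep only blocks whose style and content type are not excluded."""
    styles = {style.lower() for style in excluded_styles}
    types = {content_type.lower() for content_type in excluded_types}
    return [
        block for block in blocks
        if block.style.lower() not in styles
        and block.content_type.lower() not in types
    ]


document = [
    Block("Annual Report", "Heading 1", "body"),
    Block("Sales grew this year.", "Normal", "body"),
    Block("Confidential", "Normal", "header"),
    Block("Page 1", "Normal", "footer"),
]
kept = select_translatable(
    document,
    excluded_styles=["Heading 1"],
    excluded_types=["header", "footer"],
)
print([block.text for block in kept])
```

If the headings in this example were formatted by hand instead of with the "Heading 1" style, a rule like this could not single them out, which is why clear and consistent styling helps.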
Your files are then quoted and are ready for translation
Once your file has been filtered and your text has been extracted, we send it to the Computer-Assisted Translation (CAT) tool component of our Translation Management System (TMS), where we leverage your Translation Memory and then provide you with your automatic custom translation quote.
Your content is then ready for translation by our professional translators directly within the CAT tool.