Unbabel supports several file types. Each one is handled by a dedicated filter that extracts the relevant content for translation and delivers the translated file in its original format.
However, this process is not foolproof, and results can vary depending on the file type and the complexity of the file's formatting.
The table below lists the filter description (including expected behaviour) and the risk rating for each file type. The risk rating indicates how likely it is that some of the original formatting will be lost during translation.
File Type | Filter Description | Risk Rating |
.NET Managed Resource (.resx) | Extracts the content of the values. Embedded HTML and generic placeholders are protected. | Low |
Adobe FrameMaker Interchange format (.mif) | Extracts variables, index markers, body pages, and master pages. | Medium |
Apple Stringsdict (.stringsdict) | Extracts the content of <string> elements (excluding certain keys). Generic placeholders are protected. | Low |
Comma Separated Values (.csv) | Extracts all table data from all columns. Generic placeholders and embedded HTML are protected. | Low |
Configuration File (.properties) | Extracts the content of the values. Embedded HTML and generic placeholders are protected. | Low |
Darwin Information Typing Architecture (.dita) | Accepts well-formed XML documents adhering to specific DITA syntax rules. Generic placeholders are protected. | Medium |
Darwin Information Typing Architecture Map (.ditamap) | Accepts well-formed documents adhering to specific syntax rules. | Medium |
Document Type Definition XML (.dtd) | Accepts well-formed documents adhering to specific syntax rules. | Medium |
Extensible Markup Language (.xml) | Accepts well-formed documents adhering to specific syntax rules. HTML is protected in CDATA. | Medium |
HyperText Markup Language (.htm) | Extracts all content from the file. Generic placeholders are protected. | Low |
InCopy Markup Language (.icml) | Extracts all content from the file. | Low |
InDesign Markup Language (.idml) | Extracts all content from the file, except for XML structures. | Low |
JavaScript Object Notation (.json) | Extracts all values. Embedded HTML and generic placeholders are protected. | Low |
Markdown (.markdown) | Extracts all content from the file. Embedded HTML and generic placeholders are protected. | Medium |
Microsoft Excel (.xlsx) | Extracts all content from the file, excluding document properties, comments, hidden rows/columns, etc. | Low |
Microsoft Excel (.xltx) | Extracts all content from the file, excluding document properties, comments, hidden rows/columns, etc. | Low |
Microsoft PowerPoint (.potm, .potx, .ppsm, .ppsx, .pptm, .pptx) | Extracts all content, excluding document properties, comments, notes, and placeholders in the Master slide. | Low |
Microsoft Visio (.vsdx) | N/A | N/A |
Microsoft Word (.docm, .docx) | Extracts everything except document properties, comments, and graphical metadata. | Low |
OpenDocument Presentation (.odp) | Extracts everything from the file, treating embedded files as sub-documents. | Medium |
OpenDocument Spreadsheet (.ods) | Extracts everything from the file, treating embedded files as sub-documents. | Medium |
OpenDocument Text Document (.odt) | Extracts everything from the file, treating embedded files as sub-documents. | Medium |
Plain Text (.txt) | Extracts all content from the file. Generic placeholders are protected. | Low |
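
If you want to check a batch of files before submitting them, the ratings in the table can be transcribed into a small lookup. The following is a minimal sketch only: `RISK_BY_EXTENSION` and `risk_rating` are hypothetical names chosen for illustration, not part of any Unbabel API, and the values are simply copied from the table above.

```python
from pathlib import Path

# Risk ratings transcribed by hand from the table above.
# None marks a type whose rating is listed as "N/A".
RISK_BY_EXTENSION = {
    ".resx": "Low", ".mif": "Medium", ".stringsdict": "Low", ".csv": "Low",
    ".properties": "Low", ".dita": "Medium", ".ditamap": "Medium",
    ".dtd": "Medium", ".xml": "Medium", ".htm": "Low", ".icml": "Low",
    ".idml": "Low", ".json": "Low", ".markdown": "Medium",
    ".xlsx": "Low", ".xltx": "Low",
    ".potm": "Low", ".potx": "Low", ".ppsm": "Low",
    ".ppsx": "Low", ".pptm": "Low", ".pptx": "Low",
    ".vsdx": None,
    ".docm": "Low", ".docx": "Low",
    ".odp": "Medium", ".ods": "Medium", ".odt": "Medium", ".txt": "Low",
}


def risk_rating(filename: str) -> str:
    """Return the table's risk rating for a file name (hypothetical helper).

    Returns "Low", "Medium", "N/A", or "Not listed" when the extension
    does not appear in the table.
    """
    ext = Path(filename).suffix.lower()  # case-insensitive match on the extension
    if ext not in RISK_BY_EXTENSION:
        return "Not listed"
    return RISK_BY_EXTENSION[ext] or "N/A"
```

Under these assumptions, calling `risk_rating` on each file in a folder gives a quick way to flag the Medium-rated formats for a closer formatting review after delivery.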