This document describes a proposal for introducing extra attributes to the existing XLIFF 1.1 specification for the treatment of special cases that come under the scope of 'Segmentation'.
This document is designed to complement the document produced by the XLIFF Segmentation sub-committee that deals directly with the issue of how segmentation should be encoded within an XLIFF document.
There are two special cases that fall under the general topic of text segments that need to be addressed by the XLIFF standard:
How are <trans-unit> elements that have been incorrectly segmented to be translated?
How can it be signalled that a <trans-unit> should not be regarded as a direct equivalent for translation memory purposes?
It is inevitable that, for whatever reason, individual XLIFF <trans-unit> elements may not represent a piece of text that can be translated without reference to one or more surrounding <trans-unit> elements. The causes for this may be incorrect segmentation or bad document design. A mechanism is required that stipulates in the translated XLIFF document that specific <trans-unit> elements need to be 'grouped' together to provide a distinct, accurate and unified translation.
Example:
<trans-unit id="t1">
  <source>The German acronym v.</source>
  <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
</trans-unit>
<trans-unit id="t2">
  <source>OT signifies the top dead center position for an engine.</source>
  <target/>
</trans-unit>
In addition, linguistically complete text may have to be broken into a number of segments due to message size constraints. In these instances the translator is not providing an equivalent translation for each <trans-unit>, but rather fitting the target language text over a number of <trans-unit> elements to meet the requirements of the target application.
Example:
<trans-unit id="t1">
  <source>Constrained text for limited</source>
  <target>Tekst angielski dla</target>
</trans-unit>
<trans-unit id="t2">
  <source>display for English</source>
  <target>ograniczonego pola</target>
</trans-unit>
There may be other circumstances where, for whatever reason, the translation provided in the <target> element is not a direct translation of the <source> element. A mechanism is required to signal within the <trans-unit> element that the translation is not a direct equivalent. This is important during further processing of the XLIFF document, for example when loading translation memory.
After careful consideration, the XLIFF Segmentation Sub-Committee has come to the conclusion that the following additions are required to the XLIFF 1.1 standard:
The changes for this proposal would be as follows:
There are two changes required to the XLIFF XSD file:
A new "equivalent-translation" attribute for the <trans-unit> element. The default value of this attribute will be "yes". The other possible value will be "no", to indicate that the translation for this <trans-unit> is not a direct linguistic equivalent of the source language text. The following example demonstrates the use of the "equivalent-translation" attribute:
<trans-unit id="t1" equivalent-translation="no">
  <source>Constrained text for limited</source>
  <target>Tekst angielski dla</target>
</trans-unit>
<trans-unit id="t2" equivalent-translation="no">
  <source>display for English</source>
  <target>ograniczonego pola</target>
</trans-unit>
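To illustrate why the attribute matters for translation memory loading, the following is a minimal Python sketch of how a tool might skip units that are not direct equivalents. It is not part of the proposal: the function name tm_pairs, the <body> wrapper, the second sample unit ("Save"/"Zapisz") and the absence of an XML namespace are all assumptions made for illustration. A missing attribute is treated as the proposed default "yes".

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: the <body> wrapper, the second unit and the
# lack of a namespace are assumptions, not part of the proposal.
SAMPLE = """
<body>
  <trans-unit id="t1" equivalent-translation="no">
    <source>Constrained text for limited</source>
    <target>Tekst angielski dla</target>
  </trans-unit>
  <trans-unit id="t2">
    <source>Save</source>
    <target>Zapisz</target>
  </trans-unit>
</body>
"""


def tm_pairs(xml_text):
    """Return (source, target) pairs that are direct equivalents."""
    root = ET.fromstring(xml_text)
    pairs = []
    for unit in root.iter("trans-unit"):
        # A missing attribute takes the proposed default value "yes".
        if unit.get("equivalent-translation", "yes") != "yes":
            continue
        source = unit.findtext("source", default="")
        target = unit.findtext("target", default="")
        # Untranslated or empty units are of no use to a translation memory.
        if source and target:
            pairs.append((source, target))
    return pairs


print(tm_pairs(SAMPLE))
```

Only the unit without equivalent-translation="no" is returned, so the constrained-text fragment never reaches the translation memory.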
A new "merged-translations" attribute for the <group> element. This new attribute has two possible values: "yes" or "no". The default value is "no". A value of "yes" indicates that the <trans-unit> elements contained within this <group> element are to be treated together for linguistic purposes. All <trans-unit> elements that are encompassed by a <group> element that has its "merged-translations" attribute set to "yes" normally have their "equivalent-translation" attribute set to the value of "no". The text of all of the <source> and <target> elements taken together forms one linguistic whole. No requirements are made regarding the distribution of the translation in the <target> elements. This will be governed by the requirements of the individual applications. The translated text may be placed within the first <target> element, leaving the following <target> elements blank, or distributed among the <target> elements contained within the "merged-translations" <group> element.
The following example demonstrates the use of the "merged-translations" attribute for the <group> element:
<group merged-translations="yes">
  <trans-unit id="t1" equivalent-translation="no">
    <source>The German acronym v.</source>
    <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
  </trans-unit>
  <trans-unit id="t2" equivalent-translation="no">
    <source>OT signifies the top dead center position for an engine.</source>
    <target/>
  </trans-unit>
</group>
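As an illustration of treating a merged group as one linguistic whole, the following Python sketch joins the <source> and <target> texts of all units in such a group. It is a sketch under stated assumptions rather than a prescribed algorithm: the function name merged_text is hypothetical, the fragment is un-namespaced, inline markup inside <source> and <target> is ignored, and joining the pieces with a single space is a choice made here, since the proposal places no requirement on how text is distributed.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<group merged-translations="yes">
  <trans-unit id="t1" equivalent-translation="no">
    <source>The German acronym v.</source>
    <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
  </trans-unit>
  <trans-unit id="t2" equivalent-translation="no">
    <source>OT signifies the top dead center position for an engine.</source>
    <target/>
  </trans-unit>
</group>
"""


def merged_text(group):
    """Return the combined (source, target) text of a merged <group>."""
    # A missing attribute takes the proposed default value "no".
    if group.get("merged-translations", "no") != "yes":
        raise ValueError("group is not marked merged-translations=\"yes\"")
    units = group.findall("trans-unit")
    sources = [u.findtext("source", default="").strip() for u in units]
    targets = [u.findtext("target", default="").strip() for u in units]
    # Blank <target> elements are skipped; the space separator is an assumption.
    return (" ".join(s for s in sources if s),
            " ".join(t for t in targets if t))


source, target = merged_text(ET.fromstring(SAMPLE))
print(source)
print(target)
```

The combined pair, rather than the individual units, is what a downstream tool could then treat as a single translation.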
The following changes will be required:
Add a description of the "equivalent-translation" concept, including relevant examples.
Add a description of the "merged-translations" concept, including relevant examples.
Add the "equivalent-translation" attribute regarding <trans-unit> elements.
Add the "merged-translations" attribute for <group> elements to section 3.2 (Elements).
===== Start of proposed new entry (all inserted) =====
Linguistically complete text may have to be broken into a number of <trans-unit> elements due to message size constraints or other reasons. In these instances the translator is not providing an equivalent translation for each <trans-unit>, but rather fitting the target language text over a number of <trans-unit> elements to meet the requirements of the target application.
Example:
<trans-unit id="t1">
  <source>Constrained text for limited</source>
  <target>Tekst angielski dla</target>
</trans-unit>
<trans-unit id="t2">
  <source>display for English</source>
  <target>ograniczonego pola</target>
</trans-unit>
In this circumstance the "equivalent-translation" attribute for the <trans-unit> element is used to denote that the translation should not be regarded as a direct translation of the <source> element. The default value of this attribute is "yes". The other possible value is "no", to indicate that the translation for this <trans-unit> is not a direct linguistic equivalent of the source language text. The following example demonstrates the use of the "equivalent-translation" attribute:
<trans-unit id="t1" equivalent-translation="no">
  <source>Constrained text for limited</source>
  <target>Tekst angielski dla</target>
</trans-unit>
<trans-unit id="t2" equivalent-translation="no">
  <source>display for English</source>
  <target>ograniczonego pola</target>
</trans-unit>
<trans-unit> elements
It is inevitable that individual XLIFF <trans-unit> elements may not represent a piece of text that can be translated without reference to one or more following <trans-unit> elements. The causes for this may be incorrect segmentation or bad document design.
Example:
<trans-unit id="t1">
  <source>The German acronym v.</source>
  <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
</trans-unit>
<trans-unit id="t2">
  <source>OT signifies the top dead center position for an engine.</source>
  <target/>
</trans-unit>
In these cases the "merged-translations" attribute for the <group> element can be used to denote that the individual <trans-unit> elements cannot be regarded as direct translations, but rather need to be treated linguistically as a merged group. This attribute has two possible values: "yes" or "no". The default value is "no". A value of "yes" indicates that the <trans-unit> elements contained within this <group> element are to be treated together for linguistic purposes. All <trans-unit> elements that are encompassed by a <group> element that has its "merged-translations" attribute set to "yes" normally have their "equivalent-translation" attribute set to the value of "no". The text of all of the <source> and <target> elements taken together forms one linguistic whole. No requirements are made regarding the distribution of the translation in the <target> elements. This will be governed by the requirements of the individual applications. The translated text may be placed within the first <target> element, leaving the following <target> elements blank, or distributed among the <target> elements contained within the "merged-translations" <group> element.
The following example demonstrates the use of the "merged-translations" attribute for the <group> element:
<group merged-translations="yes">
  <trans-unit id="t1" equivalent-translation="no">
    <source>The German acronym v.</source>
    <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
  </trans-unit>
  <trans-unit id="t2" equivalent-translation="no">
    <source>OT signifies the top dead center position for an engine.</source>
    <target/>
  </trans-unit>
</group>
This section lists the various attributes used in the XLIFF elements. An attribute is never specified more than once for each element. Some of the attributes are accompanied by a list of their possible values.
equivalent-translation - Indicates if the target language translation is a direct equivalent of the source text.
Value description: yes, or no.
Default value: yes.
Used in: <trans-unit>.
merged-translations - Indicates if the group element contains merged trans-unit elements.
Value description: yes, or no.
Default value: no.
Used in: <group>.
===== End of proposed new entry =====
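The attribute definitions above lend themselves to a simple consistency check. The following Python sketch is one possible reading, not a normative validator: the function name check, the message wording and the sample fragment are hypothetical, and the fragment is un-namespaced for brevity. It applies the proposed defaults, rejects values other than "yes" and "no", and notes units inside a merged group that are still marked as direct equivalents (which the proposal describes as unusual, not forbidden).

```python
import xml.etree.ElementTree as ET

ALLOWED = {"yes", "no"}

# Hypothetical sample with one invalid value and one unusual combination.
SAMPLE = """
<body>
  <group merged-translations="yes">
    <trans-unit id="t1" equivalent-translation="no">
      <source>a</source><target>b</target>
    </trans-unit>
    <trans-unit id="t2">
      <source>c</source><target/>
    </trans-unit>
  </group>
  <trans-unit id="t3" equivalent-translation="maybe">
    <source>d</source><target>e</target>
  </trans-unit>
</body>
"""


def check(root):
    """Return a list of human-readable findings for the two attributes."""
    findings = []
    for unit in root.iter("trans-unit"):
        value = unit.get("equivalent-translation", "yes")  # default "yes"
        if value not in ALLOWED:
            findings.append(
                f"{unit.get('id')}: invalid equivalent-translation {value!r}")
    for group in root.iter("group"):
        value = group.get("merged-translations", "no")  # default "no"
        if value not in ALLOWED:
            findings.append(f"group: invalid merged-translations {value!r}")
        elif value == "yes":
            # Units in a merged group normally carry "no"; flag the others.
            for unit in group.findall("trans-unit"):
                if unit.get("equivalent-translation", "yes") == "yes":
                    findings.append(
                        f"{unit.get('id')}: in merged group but marked equivalent")
    return findings


for finding in check(ET.fromstring(SAMPLE)):
    print(finding)
```

A real tool would more likely enforce the permitted values through the XSD itself; the sketch only shows how the defaults and the "normally no" convention interact.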
-end-