Subject: Applying <mrk> on examples
Hi everyone,
Here is my homework: trying to apply the <mrk> element to segment the text of the example file.
Cheers,
-yves

Title: Applying segmentation using <mrk>
Notes:
If it is required to do an adjustment of <bpt>/<ept>, <bx>/<ex>-type tags, it should be done without
knowledge of the enclosed code.
It may not be necessary to adjust paired tags broken by segmentation if not
adjusting them does not result in invalid XML. Other translation tools for
example do not try to adjust an inline <b>...</b> element that is
broken by interactive segmentation.
<trans-unit id="TagExample1">
<!--Sloppy HTML starting with <B>bold. Fancy <I>bold italic text</B> in the
middle. Only italic</I> last.-->
<!-- Version 1 (bpt and ept): how are matching <bpt> and <ept> handled if they
need to go into different segments? -->
<source>Sloppy HTML starting with <bpt id="pt1" rid="pt1">&lt;B&gt;</bpt>bold.
Fancy <bpt id="pt2" rid="pt2">&lt;I&gt;</bpt>bold italic text<ept id="pt1"
rid="pt1">&lt;/B&gt;</ept> in the middle. Only italic<ept id="pt2" rid="pt2">&lt;/I&gt;</ept>
last.</source>
</trans-unit>
<source><mrk mtype='x-seg' mid='1'>Sloppy HTML starting with <bpt id="pt1">&lt;B&gt;</bpt>bold.</mrk> <mrk mtype='x-seg' mid='2'>Fancy <bpt id="pt2">&lt;I&gt;</bpt>bold italic text<ept id="pt1">&lt;/B&gt;</ept> in the middle.</mrk> <mrk mtype='x-seg' mid='3'>Only italic<ept id="pt2">&lt;/I&gt;</ept> last.</mrk></source>
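To decide whether any adjustment is needed at all, a tool first has to know which paired codes were broken by the segmentation. Here is a minimal sketch of that check, assuming the segment content has already been tokenized into plain strings and (kind, id) tuples. The token model and the function name are mine, for illustration only; nothing here comes from the XLIFF specification.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical token model: a code is (kind, id), where kind is "open"
# for <bpt>/<bx> and "close" for <ept>/<ex>. Plain text is a str.
Code = Tuple[str, str]
Token = Union[str, Code]


def broken_pairs(segments: List[List[Token]]) -> Dict[str, Tuple[int, int]]:
    """Map each code id whose opening and closing codes land in different
    segments to (opening segment index, closing segment index)."""
    opened_in: Dict[str, int] = {}
    broken: Dict[str, Tuple[int, int]] = {}
    for index, segment in enumerate(segments):
        for token in segment:
            if isinstance(token, str):
                continue  # plain text, nothing to track
            kind, code_id = token
            if kind == "open":
                opened_in[code_id] = index
            elif kind == "close" and code_id in opened_in:
                start = opened_in.pop(code_id)
                if start != index:
                    broken[code_id] = (start, index)
    return broken


# TagExample1 after segmentation; segments are indexed from 0 here.
example = [
    ["Sloppy HTML starting with ", ("open", "pt1"), "bold."],
    ["Fancy ", ("open", "pt2"), "bold italic text", ("close", "pt1"), " in the middle."],
    ["Only italic", ("close", "pt2"), " last."],
]
print(broken_pairs(example))
```

Only the ids are compared; the content of the codes (the original HTML tags) is never inspected.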
<trans-unit id="TagExample2">
<!--Sloppy HTML starting with <B>bold. Fancy <I>bold italic text</B> in the
middle. Only italic</I> last.-->
<!-- Version 2 (bx and ex): how are matching <bx> and <ex> handled if they need
to go into different segments? -->
<source>Sloppy HTML starting with <bx id="pt1" rid="pt1"/>bold. Fancy <bx
id="pt2" rid="pt2"/>bold italic text<ex id="pt1" rid="pt1"/> in the middle. Only
italic<ex id="pt2" rid="pt2"/> last.</source>
</trans-unit>
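Codes shown in green in the original message (the <ex id="pt1"/> ending segment 1, the <bx id="pt1"/> starting segment 2, the <ex id="pt2"/> ending segment 2, and the <bx id="pt2"/> starting segment 3) are added to get paired codes in each segment (I don't think we HAVE to do this).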
<source><mrk mtype='x-seg' mid='1'>Sloppy HTML starting with <bx id="pt1"/>bold.<ex id="pt1"/></mrk> <mrk mtype='x-seg' mid='2'><bx id="pt1"/>Fancy <bx id="pt2"/>bold italic text<ex id="pt1"/> in the middle.<ex id="pt2"/></mrk> <mrk mtype='x-seg' mid='3'><bx id="pt2"/>Only italic<ex id="pt2"/> last.</mrk></source>
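The adjustment above can be done mechanically, using only the ids of the codes and not what they enclose. The following is a rough sketch under my own assumptions: segments are lists of strings and ("bx" | "ex", id) tuples, and the names balance_segments and render are made up for the illustration.

```python
from typing import List, Tuple, Union

# Hypothetical token model: ("bx", id) or ("ex", id) for codes, str for text.
Code = Tuple[str, str]
Token = Union[str, Code]


def balance_segments(segments: List[List[Token]]) -> List[List[Token]]:
    """Add <bx>/<ex> codes so every segment holds complete pairs.
    Only the kind and id of each code are looked at."""
    open_ids: List[str] = []  # ids opened and not yet closed, in opening order
    result: List[List[Token]] = []
    for segment in segments:
        # Re-open every code that was still open when the previous segment ended.
        balanced: List[Token] = [("bx", code_id) for code_id in open_ids]
        for token in segment:
            balanced.append(token)
            if isinstance(token, str):
                continue
            kind, code_id = token
            if kind == "bx":
                open_ids.append(code_id)
            elif kind == "ex" and code_id in open_ids:
                open_ids.remove(code_id)
        # Close whatever is still open at the end of this segment.
        balanced.extend(("ex", code_id) for code_id in reversed(open_ids))
        result.append(balanced)
    return result


def render(segment: List[Token]) -> str:
    """Serialize a token list back to inline XLIFF text."""
    return "".join(
        t if isinstance(t, str) else f'<{t[0]} id="{t[1]}"/>' for t in segment
    )


# TagExample2, split into three segments before any adjustment.
segments = [
    ["Sloppy HTML starting with ", ("bx", "pt1"), "bold."],
    ["Fancy ", ("bx", "pt2"), "bold italic text", ("ex", "pt1"), " in the middle."],
    ["Only italic", ("ex", "pt2"), " last."],
]
balanced = balance_segments(segments)
for seg in balanced:
    print(render(seg))
```

On TagExample2 this produces the same content as the three <mrk> elements shown above. The order in which several still-open codes are closed at a segment end is an arbitrary choice in this sketch.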
<trans-unit id="TagExample3">
<!--Sloppy HTML starting with <B>bold. Fancy <I>bold italic text</B> in the
middle. Only italic</I> last.-->
<!-- Version 3 (balanced g): how are <g> elements handled if a segment spans
either the start or the end tags, but not both? -->
<source>Sloppy HTML starting with <g id="g1">bold. Fancy </g><g id="g2">bold
italic text</g><g id="g3"> in the middle. Only italic</g> last.</source>
</trans-unit>
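Codes shown in green in the original message (the </g> closing g1 in segment 1, the <g id="g1"> opening segment 2, the </g> closing g3 in segment 2, and the <g id="g3"> opening segment 3) are added to get paired codes in each segment. This is required with <g>.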
<source><mrk mtype='x-seg' mid='1'>Sloppy HTML starting with <g id="g1">bold.</g></mrk> <mrk mtype='x-seg' mid='2'><g id="g1">Fancy </g><g id="g2">bold italic text</g><g id="g3"> in the middle.</g></mrk> <mrk mtype='x-seg' mid='3'><g id="g3">Only italic</g> last.</mrk></source>
A thought: maybe we would need an attribute in <g> (and <bpt>/<ept>, etc.) to indicate that one of the tags of the element has been added during the segmentation? Not sure: just thinking aloud.
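To illustrate why such a marker might help: if each code carried a flag saying it was added during segmentation, the unsegmented content could be recovered by dropping the flagged codes when the segments are joined again. This is only a sketch. The boolean flag stands in for the hypothetical attribute, the token model is my own, and I use the <bx>/<ex> example for brevity.

```python
from typing import List, Tuple, Union

# Hypothetical token model: (kind, id, added), where `added` plays the role
# of the attribute suggested above. No such attribute exists in XLIFF today.
Code = Tuple[str, str, bool]
Token = Union[str, Code]


def merge_segments(segments: List[List[Token]], separator: str = " ") -> List[Token]:
    """Join segments back into one token list, dropping codes that were
    added during segmentation."""
    merged: List[Token] = []
    for index, segment in enumerate(segments):
        if index > 0:
            merged.append(separator)  # assumed inter-segment whitespace
        merged.extend(t for t in segment if isinstance(t, str) or not t[2])
    return merged


def render(tokens: List[Token]) -> str:
    """Serialize a token list back to inline XLIFF text."""
    return "".join(
        t if isinstance(t, str) else f'<{t[0]} id="{t[1]}"/>' for t in tokens
    )


# Segmented TagExample2; True marks the codes added to balance each segment.
segmented = [
    ["Sloppy HTML starting with ", ("bx", "pt1", False), "bold.", ("ex", "pt1", True)],
    [("bx", "pt1", True), "Fancy ", ("bx", "pt2", False), "bold italic text",
     ("ex", "pt1", False), " in the middle.", ("ex", "pt2", True)],
    [("bx", "pt2", True), "Only italic", ("ex", "pt2", False), " last."],
]
print(render(merge_segments(segmented)))
```

Without the flag, a tool joining the segments cannot tell an added <bx id="pt1"/> from one that was in the source to begin with.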
<trans-unit id="AltTransExample1">
<!-- how are <alt-trans> elements for the entire <trans-unit> handled if the
<trans-unit> content is split into multiple segments? -->
<source>This paragraph has two sentences. It illustrates alt-trans
handling.</source>
<alt-trans match-quality="90%">
<source>This paragraph has two sentences. It almost illustrates alt-trans
handling.</source>
<target>Det här stycket har två meningar. Det visar nästan hur alt-trans ska
skötas.</target>
</alt-trans>
</trans-unit>
The <mrk> elements can be added in the <alt-trans> too if needed.
<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk> <mrk mtype='x-seg' mid='2'>It illustrates alt-trans handling.</mrk></source> <alt-trans match-quality="90%"> <source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk> <mrk mtype='x-seg' mid='2'>It almost illustrates alt-trans handling.</mrk></source> <target><mrk mtype='x-seg' mid='1'>Det här stycket har två meningar.</mrk> <mrk mtype='x-seg' mid='2'>Det visar nästan hur alt-trans ska skötas.</mrk></target> </alt-trans>
But I think we could most likely also leave the <alt-trans> without any segments
if the text comes from some source other than a TM (like the
result of a leveraging).
<trans-unit id="AltTransExample2">
<!-- can single or multiple <alt-trans> elements be used to match single
segments inside the <trans-unit>, and if so can we show which part they match? -->
<source>This paragraph has two sentences. It illustrates alt-trans
handling.</source>
<alt-trans match-quality="100% for first sentence">
<source>This paragraph has two sentences.</source>
<target>Det här stycket har två meningar.</target>
</alt-trans>
<alt-trans match-quality="85% for second sentence">
<source>It almost illustrates alt-trans handling.</source>
<target>Det visar nästan hur alt-trans ska skötas.</target>
</alt-trans>
</trans-unit>
<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk> <mrk mtype='x-seg' mid='2'>It illustrates alt-trans handling.</mrk></source> <alt-trans match-quality="100% for first sentence"> <source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk></source> <target><mrk mtype='x-seg' mid='1'>Det här stycket har två meningar.</mrk></target> </alt-trans> <alt-trans match-quality="85% for second sentence"> <source><mrk mtype='x-seg' mid='2'>It almost illustrates alt-trans handling.</mrk></source> <target><mrk mtype='x-seg' mid='2'>Det visar nästan hur alt-trans ska skötas.</mrk></target> </alt-trans>
The same notes as for AltTransExample1 apply here: I'm not sure we want to
have segment markers in the <alt-trans> if they are proposals.
Yes if they are the history of the translation (after an edit, for example), but for TM
matches it does not seem necessary.
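If the segment markers are kept in the <alt-trans>, the mid values answer the "which part do they match" question directly. Below is a small sketch using Python's standard xml.etree.ElementTree. The function name is mine, the input is the segmented AltTransExample2 wrapped in its <trans-unit>, and I assume no XML namespace, as in the examples here (real XLIFF files declare one).

```python
import xml.etree.ElementTree as ET

# Segmented AltTransExample2, without a namespace (an assumption of this sketch).
TRANS_UNIT = """<trans-unit id="AltTransExample2">
<source><mrk mtype="x-seg" mid="1">This paragraph has two sentences.</mrk> <mrk mtype="x-seg" mid="2">It illustrates alt-trans handling.</mrk></source>
<alt-trans match-quality="100% for first sentence">
<source><mrk mtype="x-seg" mid="1">This paragraph has two sentences.</mrk></source>
<target><mrk mtype="x-seg" mid="1">Det här stycket har två meningar.</mrk></target>
</alt-trans>
<alt-trans match-quality="85% for second sentence">
<source><mrk mtype="x-seg" mid="2">It almost illustrates alt-trans handling.</mrk></source>
<target><mrk mtype="x-seg" mid="2">Det visar nästan hur alt-trans ska skötas.</mrk></target>
</alt-trans>
</trans-unit>"""


def proposals_by_segment(trans_unit_xml: str) -> dict:
    """Group <alt-trans> target segments by the mid of their <mrk> marker.
    Returns {mid: [(match-quality, target text), ...]}."""
    unit = ET.fromstring(trans_unit_xml)
    proposals: dict = {}
    for alt in unit.findall("alt-trans"):
        for mrk in alt.findall("target/mrk[@mtype='x-seg']"):
            entry = (alt.get("match-quality"), "".join(mrk.itertext()))
            proposals.setdefault(mrk.get("mid"), []).append(entry)
    return proposals


for mid, entries in proposals_by_segment(TRANS_UNIT).items():
    print(mid, entries)
```

An <alt-trans> without any <mrk> simply contributes nothing here, which fits the case where the proposal applies to the whole <trans-unit>.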
-end-