A Model for Fine-Grained Alignment of Multilingual Texts

Authors: Cyrus, Lea; Feddes, Hendrik
Department/Institution: FB 09: Philology
Document type: Article
Media type: Text
Publication date: 2004
Published in MIAMI: 21.06.2006
Last modified: 06.04.2022
Edition statement: [Electronic ed.]
Source: Proc. COLING 2004 Workshop on Multilingual Linguistic Resources (MLR2004). Geneva, August 28 (2004), pp. 15-22
Keywords: corpus linguistics; computational linguistics; syntactic annotation; semantic annotation
Subject (DDC): 400: Language
License: InC 1.0
Language: English
Format: PDF document
URN: urn:nbn:de:hbz:6-92619507330
Permalink: https://nbn-resolving.de/urn:nbn:de:hbz:6-92619507330
Online access: 0408_coling.pdf

Alignment of texts at the sentence level is often seen as too coarse, and word alignment as too fine-grained; bi- or multilingual texts aligned at a level in between are a useful resource for many purposes. Starting from a number of examples of non-literal translations, which tend to make alignment difficult, we describe an alignment model that copes with these cases by explicitly coding them. The model is based on predicate-argument structures and thus covers the middle ground between sentence and word alignment. It is currently used in FuSe, a recently initiated project to build a parallel English-German treebank, which can in principle be extended with additional languages.