Text-sign parallel corpus study to start designing an automatic translation system

Michael Filhol, Line Patris and Pierre Guitteny


This paper presents a new project whose goal is to design an automatic system that translates French text into Sign Language using a symbolic approach. After stating two essential properties of Sign Language that make such a system quite different in terms of internal representational models, we present the 2006 Websourd AFP news corpus we chose for our data-driven design process. It is a parallel corpus consisting of journalistic texts in French and their video translations in Sign Language. We then present our methodology, which first analyses the video descriptions and the text annotations separately, then compares the two. The idea is to annotate the entities in the texts thought to trigger the observed signed structures. As a start, we focused on three structures emerging from the video corpus: comparisons, oppositions and geographic localisations. They all involve a use of signing space, an essential notion in Sign Language with no equivalent in written text. Using the highly abstract AZee model to represent Sign Language rules, the ultimate goal is to build a set of translation mechanisms from annotated text to AZee operations, usable as input to a virtual signer animation system.




Use of corpora to inform translation


