Research Group of Computational Linguistics, University of Tartu

Development and implementation of formalisms and efficient algorithms of natural language processing for the Estonian language

Target financed research theme SF0180078s08 (2008-2013, principal investigator Mare Koit)

See also projects


Research problems

Problem 1. Changes on the lexical level and their modeling; how natural language processing tools cope with lexical changes in actual language use

Goal: to develop algorithms for recognizing new words entering the language and words changing their paradigm, and for identifying the derivational paradigm of these words.


Problem 2. Fixed expressions as lexical units with their own meaning, government and argument structure

Goal: to study the relationships between the government and argument structure of fixed expressions and the government and argument structure of the 'simple' verb that acts as the nucleus of the given fixed expression. To clarify the possibilities for automatic detection of government.


Problem 3. The deep syntactic analysis of the sentence

Goal: to find a suitable formalism for representing the deep structure of the Estonian sentence, as well as efficient methods both for morphological disambiguation and for the transition from the flat structure of Constraint Grammar used to date to a tree-shaped structure. To adapt the rules of morphological disambiguation to the task of automatically annotating the Estonian speech corpus. To detect disfluencies automatically so that phrases which do not conform to grammar rules can be excluded from syntactic analysis.
See also Computational Syntax.


Problem 4. The semantic analysis of the sentence

Goal: to develop the conceptual and formal means necessary for constructing the semantic representation of Estonian sentences and discourse.


Problem 5. Dialogue modeling

Goal: to develop a formal model of dialogue that would take into account the general rules of human-human communication, as well as the peculiarities of the Estonian language and culture.


Problem 6. A language with rich morphology and free word order as the source and/or target language in machine translation

Goal: to identify the special needs of a free word order language with rich morphology in machine translation, and to develop formalisms and methods for successful machine translation from and to such a language.
See also Machine Translation.


References

Northern European Association for Language Technology, NEALT
Northern European Journal of Language Technology, NEJLT
Publications of the NEALT at DSpace of the University of Tartu

National program Estonian language technology (2011-2017)
National program Estonian language technology (2006-2010)
Plan of development of the Estonian language (2011-2017)
Strategy of development of the Estonian language (2004-2010)

Centre of Excellence in Estonian Computer Science (2008-2015)
The Graduate School of ICT (2009-2014)
The Graduate School of Linguistics, Philosophy and Semiotics (2009-2014)



Last modified 28.11.2011
mare.koit at ut.ee