Research areas

The main lines of research of the BerbaTek project coincide with the main divisions of the language industry, with an additional line for generating general, basic resources:

  • Basic resources: resources and technologies that can be used across the language industry and that provide the raw material for developing tools in one or more of the other areas, for example text or voice corpora, lexicons, dictionaries, ontologies, computational grammars, morphosyntactic analysers, voice recognition, voice synthesis, dialogue systems...
  • Translation: systems that can be applied in the translation sector, such as machine translation, translation memories, voice-to-voice translation systems, automatic dubbing...
  • Content: systems that can enhance the content sector, such as information search (monolingual, multilingual, semantic, multimedia...), information extraction, writing assistance systems such as text correctors, knowledge management, question answering systems...
  • Teaching: systems that can be applied in teaching, such as personal tutors, e-learning systems, pronunciation correctors, automatic generation of exercises and examples...

General resources

This line of research aims to compile and develop the basic resources needed to carry out the other tasks in the project.


Translation

The line of research on translation aims to improve existing machine translation systems and extend them to new language pairs.


Content

The line of research on content concerns the management of information, covering both its processing and its retrieval.


Teaching

The line of research on teaching focuses on developing resources and tools that facilitate the teaching process.