Description of EFNILEX
The objective of the EFNILEX project is to develop a modern, cost-effective method for producing bi- and multilingual dictionaries, making as much use as possible of modern language technologies. The inventory component of the project assesses the availability of lexical resources on which the dictionary development could be based. The survey of existing high-quality dictionaries will show for which language combinations such dictionaries are unavailable, so that an appropriate development plan can be established.
A.1. Political relevance of the project
The European Union wishes to contribute to policies aimed at the preservation and strengthening of the multilingualism of Europe and the plurilingualism of its citizens. This goal implies that as many languages as possible should be:
· used in as many domains, functions and situations as possible;
· involved in cross-border European and global communication and information exchange, e.g. through the internet;
· learned and used by as many users as possible, both native and non-native speakers.
The political goals mentioned above can only be realised if there is a language infrastructure that allows for cross-language information exchange and processing and that supports the learning and use of languages. Among the crucial tools belonging to this infrastructure are multi- and bilingual dictionaries, both in traditional (printed) and digital format, both for human use (e.g. learners) and for integration in technological environments (e.g. as parts of an MT system).
For the sake of equal opportunities for all European citizens and communities our policy efforts should address all languages without distinction and should give priority to the development of bi- and multilingual tools which are currently lacking. It goes without saying that such tools can only be developed if the linguistic data of the languages involved are available.
A.2. Relevance of EFNIL and EFNIL members for this project
Several member organisations of EFNIL are directly involved in the development and/or maintenance of lexical resources (databases, corpora, tools such as lemmatisers and dictionary editors) for their national language or have direct access to such resources.
Such EFNIL partners have both the know-how and the resources and components to carry out the objectives envisaged by this project. They are eager to use their resources and experience on a larger, European, multilingual scale.
A survey of all institutions involved in EFNIL is added as an attachment to this project description.
B. General description of the project
The data and the expertise can be brought together on a European scale in order to set up:
(a) an inventory of existing dictionaries, lexical databases, corpora, lexical tools, etc.;
(b) an inventory of missing data and lacking language combinations;
(c) the development of a modern, cost-effective methodology for the production of high quality bi- and multilingual dictionaries and other lexicographical tools for end users, which may be particularly relevant for the so-called less favoured (small) languages;
(d) the development of generic lexical tools needed for the implementation of this model, preferably by adaptation and/or enhancement of existing tools;
(e) testing and adjusting the methodology in a pilot project aimed at the development of a high-quality dictionary for a ‘missing’ language pair (e.g. Lithuanian – Czech);
(f) the development of bi- and multilingual (digital) dictionaries for missing language combinations;
(g) the development of ‘dedicated’ dictionaries for the language industry (e.g. general and domain dictionaries for MT systems).
Steps (f) and (g) are not part of this project. However, the end result of the project will provide the necessary technical model and much of the expertise needed to draw up an explicit plan for the development of such dictionaries.
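As an illustration of how inventories (a) and (b) relate, the following minimal sketch derives the missing language combinations from a list of existing dictionaries. It is not part of the project plan: the function name `missing_pairs`, the language codes and the sample inventory are all hypothetical, and the sketch assumes that a dictionary for a pair covers both directions, which a real survey might not.

```python
from itertools import combinations


def missing_pairs(languages, inventory):
    """Return the unordered language pairs that have no dictionary in the inventory.

    Assumption: a dictionary listed for (a, b) also counts for (b, a).
    """
    covered = {frozenset(pair) for pair in inventory}
    return [
        (a, b)
        for a, b in combinations(sorted(languages), 2)
        if frozenset((a, b)) not in covered
    ]


# Hypothetical sample data, for illustration only (not survey results).
languages = ["cs", "hu", "lt", "nl"]
inventory = [("hu", "nl"), ("cs", "nl"), ("nl", "lt")]

gaps = missing_pairs(languages, inventory)
print(gaps)
```

A list of this kind would be one possible input for choosing the pilot language pair in step (e).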
The first two steps in the process (the inventories) will be carried out by EFNIL and can be financed from its own budget. For the third step (the production model) a functional specification will be prepared. The functional specification would be the basis for a project proposal submitted for co-financing to the European Commission. It is clear that the implementation of the methodology / model, the development of generic tools with which to implement the model, the testing of it in a real-life production project and the production of more dictionaries and similar lexicographical tools for missing language combinations extend beyond the financial resources of the Federation.
The total cost of the project [steps (a) to (e)] will depend, among other things, on the languages involved in the pilot dictionary project (e), the country or countries in which the lexicographical work will actually be done, and the size of the dictionary. Moreover, only the inventories mentioned in steps (a) and (b) will give us a concrete idea of which lexicographical tools and instruments already exist. If such tools can be adapted, this could lower the cost of the project compared with developing comparable tools from scratch.
Once the model has been set up and tested in at least one real dictionary project, it will form the basis for a European bi- and multilingual dictionary planning programme, aimed at the production of dictionaries for language combinations for which there are no good translation devices. The partners in this project could be the core of a future collaborative network, which would be open to owners of lexical resources of all languages of Europe.
The political relevance of such a programme, which should be co-financed by the European Union, can hardly be overestimated, especially from the point of view of the less-favoured languages, which include almost all languages of the relatively new Member States. Without such dictionaries and other translation devices, speakers of these languages have to use an intermediary language such as English for international communication. This contradicts the principle of equal opportunities for all citizens, since it impedes free communication and free exchange of information between all languages and cultures of Europe.
Moreover, this project and the dictionary planning programme that should follow seem a sine qua non for the language learning model envisaged by the advisory group chaired by Amin Maalouf. The free choice of a personal adoptive language, to be learned and used more or less as a second mother tongue, implies that individuals can rely on lexical tools linking their mother tongue and their adoptive language.
For the publication of newly developed dictionaries the project would collaborate with the existing publishing houses for dictionaries in the various countries, preferably according to standardised conditions regarding IPR, ownership of the lexical databases, re-use of lexical components, etc. A dictionary development plan would thus have positive effects on that particular market segment.
Since the model envisages modern multifunctional electronic databases, these would not only lead to bi- and multilingual dictionaries in the traditional sense of the word (on paper as well as in digital format) but could also be used as the lexical basis for the development of dedicated dictionaries in modern translation devices such as translation memories and machine translation systems. Thus the project results can be a boost to the language industry.
The first contacts with publishing houses and technology providers will already be established during the first steps of the project (inventory). In this way information about the feasibility of the envisaged collaboration with such market parties will be available from the very beginning.
Last but not least, we wish to stress the importance of the underlying lexical databases on which the products are based. These would be set up in a structured, electronic format, thus guaranteeing re-usability in future developments. This means that once the investment in translation instruments between two languages has been made, these can be re-used in new settings, new devices, and new generations of computer environments. The investment will thus have a lasting effect.
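The re-usability argument can be made concrete with a small sketch. The record layout below is purely hypothetical (the project has not specified a database schema, and the names `Entry`, `to_print` and `to_mt_rows` are invented for this example); it only shows how one structured entry might feed both a human-readable dictionary line and a machine-readable lexicon row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One structured lexical record (hypothetical layout, for illustration)."""
    source_lang: str
    target_lang: str
    lemma: str
    pos: str
    translations: tuple


def to_print(entry):
    """Render the entry as a line for a printed or on-screen dictionary."""
    return f"{entry.lemma} ({entry.pos}): {', '.join(entry.translations)}"


def to_mt_rows(entry):
    """Render the same entry as tab-separated rows for a machine lexicon."""
    return [f"{entry.lemma}\t{entry.pos}\t{t}" for t in entry.translations]


# Illustrative Dutch-Hungarian record.
entry = Entry("nl", "hu", "huis", "noun", ("ház",))
print(to_print(entry))
print(to_mt_rows(entry))
```

Because both outputs are derived from the same record, a correction made once in the database would carry over to every product generated from it.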
The model which EFNIL envisages has already proven its feasibility on a smaller national and bilateral scale. Information about the results and experiences of one such project can be found in a special issue of the highly regarded International Journal of Lexicography, volume 20, number 3, September 2007, published by Oxford University Press (see: www.ijl.oxfordjournals.org).
Coordinator: Tamás Váradi (firstname.lastname@example.org)