The texts that we need to translate are confidential. What happens to the texts that were submitted for translation? Do you keep them or are they deleted afterwards?
All texts are deleted within 24 hours, unless you request that they be deleted immediately after the request has been processed. You can also choose to delete the texts immediately after download (“delete after download”), or to retrieve the output directly from the interface (tab “my translation requests”) instead of receiving it via email (“e-mail me my translation”).
Privacy Statement. By registering to use this application, you are consenting to eTranslation’s use of personal data as described below. eTranslation records the login, time of access, languages requested, size of the document submitted for translation and the domain of your email address (…@ec.europa.eu) to enable access to the service and processing of requests, as well as for statistical purposes. These records are kept for 18 months and then archived. If you choose to have your document returned by e-mail, your e-mail address will be kept until the document is sent. It will subsequently be reduced to its domain only. Users should exercise their judgement when submitting potentially sensitive documents to any online service, including eTranslation. Documents submitted remain available for 24 hours, after which they are deleted. There is a “delete after delivery” option which, if ticked, results in the text being deleted immediately after it is delivered. Data will not be shared with third parties. Should you wish to raise any concerns about eTranslation’s use of personal data, please write to DGT-MT@ec.europa.eu.
What is going to happen to the data we provide?
The data will go to the EC (DG Translation) to support the improvement of the machine translation system eTranslation. Data identified as Open Data will be made available through the EU Open Data Portal (https://open-data.europa.eu).
Why should we (public institutions) actually provide data?
Supporting your own language is supporting Europe and vice versa. Only with your help and with the provision of your language resources can CEF eTranslation be tailored to your needs. Within the CEF programme, CEF eTranslation is available for free to public administrations in all EU member states and CEF affiliated countries (Iceland and Norway). In return for your data, you receive a better service.
We (public institutions) don’t have any data for you! We work only paper-based. We outsource our translations.
If translations are outsourced, you should ask for the translated data to be delivered together with the translation memories. Make sure to negotiate the delivery of translation memories with the language service provider in advance.
We cannot just share our data with you – they are confidential!
Most data held by the public sector is public data. Administrations provide various types of information online to the citizens (e.g. news, legal texts, official communications, interviews, brochures, background information, etc.). This information can also be available in a foreign language. In Germany, for instance, on the website of the national government, all information is provided in German, English, and French.
How can I upload my data to the repository?
You can upload data to the ELRC repository in three simple steps:
- Register (new user) or login (returning user)
- Provide a basic description for the language resource (title, short description, language(s))
- Upload the .zip file
For further instructions, please read the Walkthrough for Contributors and/or contact the helpdesk.
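The packaging step before the upload can be scripted. The following is a minimal sketch, assuming the language resource files sit in one local folder; the function name `package_resource` and any folder or file names are hypothetical, and the upload itself still takes place through the repository's web interface.

```python
import zipfile
from pathlib import Path


def package_resource(source_dir: str, archive_path: str) -> list[str]:
    """Zip every file under source_dir and return the archived names."""
    source = Path(source_dir)
    archive = Path(archive_path).resolve()
    names = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            # Skip directories and the archive itself if it lives in source_dir.
            if path.is_file() and path.resolve() != archive:
                # Store paths relative to the folder, not absolute paths.
                arcname = path.relative_to(source).as_posix()
                zf.write(path, arcname)
                names.append(arcname)
    return names
```

Calling `package_resource("my_resource", "my_resource.zip")` returns the list of files placed in the archive, which can be checked against the description entered in the second step.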
What is CEF eTranslation?
eTranslation is the European Commission's machine translation system. It is an online service with a web user interface in 24 languages for human use. It guarantees confidentiality of data. Any Member State administration, small and medium-sized enterprise (SME) and university language faculty in an EU country, Iceland or Norway can use the web user interface free of charge at least until 2020. eTranslation can also be used as a web service in a machine-to-machine scenario. This web service is restricted to Member State administrations and SMEs participating in CEF-funded projects. As part of CEF Digital, eTranslation provides automatic translation services with the goal of making any digital service accessible to any EU citizen in his/her own language. European public online services such as Europeana, the Open Data Portal, the Online Dispute Resolution Platform, etc. should benefit from CEF eTranslation.
How can we access eTranslation?
eTranslation can be used by any Member State administration. It can be accessed as follows:
Staff working for EU institutions or agencies can directly access eTranslation with their EU Login (formerly ECAS) credentials and therefore do not need to register.
Staff working for a public administration, small and medium-sized enterprises and university language faculties in an EU country, Iceland or Norway can self-register here.
Individual access rights are automatically deactivated if they are not used for 12 months.
Further information available here.
Why would we need eTranslation? We have human translators!
eTranslation can substantially help make the translation process more productive and more efficient. EC translators are responsible for translating content into all official EU languages. In total, more than 7,000 translators working for DG Translation and EU institutions translated more than 2.3 million pages in 2014.
eTranslation is used daily for French, Spanish, Portuguese and Italian to produce initial translations that are then post-edited very efficiently. For other languages (e.g. German), the quality of the output is still too low.
In the last year, however, significant progress has been achieved through domain-specific engines. For domain-specific reports and texts, the quality of the translated output by eTranslation is acceptable. In other cases, the tool can rapidly scan long texts in a foreign language and point out passages to be translated by humans.
Overall, the translation quality is directly related to the availability of good quality data in the language: if the data for MT is good, then the MT system will be good.
eTranslation can provide SMEs with a cost-effective solution to translate, for instance, daily conversations with foreign clients, while they continue to use human translators for complicated texts that need perfect interpretation and understanding.
Why should we support eTranslation – we can have our own national solution?
Typically, national or proprietary solutions target a particular range of topics. The scope of eTranslation is therefore broader and more comprehensive. By supporting eTranslation, participants can expect to have access to a broader service. For SMEs that do not possess proprietary translation solutions, eTranslation can be a cost-effective translation tool able to provide first impressions of a wide variety of texts.
Machine translation is directly opposed to our national policy that young people should learn foreign languages
Not necessarily. Machine translation can actually provide a good basis for learning languages. Initially, it can be used to bridge the gap for people who cannot speak a particular language until they acquire initial language skills. For instance, at university level, machine translation can be used to provide automatic and simultaneous translations of lectures for foreign students who do not master the language.
Machine translation will never work for our languages (e.g. Estonian, Finnish, Hungarian and other morphologically rich languages).
Processing certain languages with the current MT technologies is more difficult because of, for example, their rich morphology or their free constituent order. MT experts are working on new MT solutions based on neural networks that are better adapted to these languages. Moreover, the European Commission funds several actions (e.g. www.qt21.eu) to investigate MT solutions for languages which currently receive only sub-optimal MT support. However, regardless of the methodology, huge amounts of parallel resources are needed to implement the systems, since these systems rely on machine learning. By opening up access to SMEs, we will be able to collect data for “under-resourced” or morphologically rich European languages, improve the quality of translations and extend the domains that eTranslation covers.
Are translations protected by copyright? If so, who holds the copyright? How about machine translation?
a) Are translations protected by copyright? According to internationally binding regulations (e.g. the Berne Convention for the Protection of Literary and Artistic Works), translations, among other alterations, are protected as original works without prejudice to the copyright in the original work.
b) If so, who holds the copyright? Since the copyright in the translation is a separate right equivalent to that in the original work, the translator is subject to the same legal regime as the author of the original work. Although it is a separate right, the copyright in the translation depends on the copyright in the original work, because the translator can only use the translation if the author of the original work has given consent. For employees who perform translations or post-edit machine-translated documents, legislation varies from country to country: some countries provide for a direct transfer of copyright from the employee to the employer, while others may not. Employment contracts should therefore contain a clause providing for the transfer to the employer of copyright in works created during working hours. Consult a lawyer for help in drafting and applying such a clause.
c) How about machine translation? A translation produced by a machine is, in general, not a work capable of copyright protection. Only the code of the translation program is protectable. Nonetheless, if an author uses machine translation as a supporting tool that provides suggestions, but the translation is still the result of the author's own intellectual act of creation, copyright will still apply.
Are translated texts in eTranslation equivalent to certified translations?
Usually, it is up to each Member State to certify that a translation is faithful to the original text in the source language, whether the translation is provided by court-registered translators or has been commissioned by the national administration itself.
For the time being, no Member State has considered that machine translation output, whether from eTranslation or any other machine translation system, is equivalent to a human translation.
eTranslation cannot and will not replace a human translator. The platform is only intended to provide a first glimpse of the meaning of a text.
How can I retrieve performance and/or usage indicators linked to the usage of eTranslation?
Some usage indicators linked with eTranslation are made available on the following page:
https://ec.europa.eu/cefdigital/wiki/display/CEFDIGITAL/eTranslation+dashboard.
Why should I care about translations and get hold of/keep corresponding language data?
Whether you translate your material internally or outsource it, your process can benefit from the re-use of language data from previous translations in a cost-effective way while improving the quality of the output.
How should I manage my data and why? We don’t have any infrastructure or resources (especially small translation services)!
In the public sector there is great diversity in translation management, from paper-based workflows to digitized workflows with term lists and translation memory storage. From an organizational point of view, much benefit can arise even from small changes in dealing with language data. The following suggested actions can be taken without major effort:
- Analysis of all phases of data development.
- Based on this, creation of a “data management plan” (DMP), even a very basic one: [1] Which data is important? [2] Where is it stored? [3] Can it be further processed?
- Documentation of all relevant data.
- If possible, use of the web as an additional publication channel to reap the benefits of linked data (see https://www.w3.org/DesignIssues/LinkedData.html).
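Even a very basic data management plan can be kept as a small machine-readable inventory. The sketch below shows one possible shape, not a prescribed format: the `DataAsset` class, its field names, the helper `candidates_for_reuse` and the sample entries are all hypothetical and simply mirror the three questions above.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DataAsset:
    name: str
    important: bool    # [1] Which data is important?
    location: str      # [2] Where is it stored?
    processable: bool  # [3] Can it be further processed?


def candidates_for_reuse(assets):
    """Return the names of assets that are both important and processable."""
    return [a.name for a in assets if a.important and a.processable]


# Hypothetical sample entries for illustration only.
plan = [
    DataAsset("press-releases", True, "shared drive", True),
    DataAsset("scanned-letters", False, "paper archive", False),
]

# Write the plan out as JSON so it can be stored next to the data.
print(json.dumps([asdict(a) for a in plan], indent=2))
print(candidates_for_reuse(plan))
```

Keeping the plan in a plain, structured file makes it easy to update as new translations arrive and to see at a glance which material is ready for further processing.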
What is Open Data?
Open data is data that can be freely accessed, used, re-used, modified and disseminated by anyone for any purpose, subject at most to requirements that preserve provenance and openness. The most important characteristics of open data are:
1. Availability and access: The work shall be available as a whole, at a cost no higher than the cost of reproduction, preferably for free download on the Internet. The work should also be available in an appropriate and modifiable form.
2. Reuse and subsequent use: The data must be made available under conditions that permit reuse, subsequent use and linking with other data sets. The data shall be machine-readable.
3. Universal participation: Everyone must be able to use, reuse and subsequently use the data. There must be no discrimination against specific fields of action, persons or groups. The subsequent use may not be limited to individual areas (e.g. only in education), nor may certain types of use (e.g. for commercial purposes) be excluded.
What are Open Licences?
In general, an open license is a license that grants permission to access, reuse, and redistribute a work with few or no restrictions. The exact permissions granted depend on the full text of the open license used. Different projects can require different permissions or restrictions, and there are a number of different licenses to accommodate these different uses. A list of the most common open licenses can be found on the Open Knowledge Licenses. Creative Commons licenses have evolved into an international standard for open licensing.
Does the share-alike requirement in CC licenses apply to translations?
Copyleft is a clause in copyright licenses that obliges the licensee to license any modification of the work under the license of the original work. The copyleft clause is intended to prevent modified versions of the work from being distributed with restrictions on use that the original does not have. Since a translation is an adaptation, a translation must also be licensed under the license of the original work.
If I have access to a multilingual website, can I lawfully create a multilingual language resource out of it? Can I then share it under an Open License?
The mere fact that a website is online does not provide any information about the copyright status of its content. For the content of a multilingual website, it is therefore also necessary to check how the content is licensed. The legal boundaries of text and data mining (TDM) in the EU must, however, be considered separately. In some countries, including the US, Israel, Singapore, Taiwan and the Republic of Korea, “fair use” can be invoked against claims of copyright infringement arising from the use of TDM techniques; in others, such as Japan, TDM activities are generally permitted and prohibited only in exceptional cases. In the EU, by contrast, the use of such techniques currently requires, as a rule, the consent of the right holder where it involves long-term intermediate storage. While the UK introduced a special copyright exception for text and data mining early on, allowing those with lawful access to perform text and data analysis for non-commercial research, the legal handling of TDM is still unclear in the rest of the EU. With the forthcoming copyright reform, the legal framework is expected to become more concrete, and TDM is expected to be classified as a separate form of use.
In some countries (e.g. in Germany), official documents are expressly excluded from copyright. Does this mean that they can be considered public domain also in countries that do not have such a limitation (e.g. in the UK)?
Within the EU, foreign and supranational official works – following the internationally applicable principle that the law of the country for which territorial protection is sought is applicable, as well as the territoriality principle – are treated according to the respective national domestic law. Also in the U.S., according to Section 105 of the Copyright Act, works of the U.S. Government are not entitled to domestic copyright protection and are therefore considered public domain in the U.S. These official documents are freely used in international practice.
How is legal information relating to a Language Resource verified in the ELRC-SHARE repository?
When a Language Resource provider submits a contribution to the ELRC-SHARE repository, one of the requirements is that he/she certifies that he/she possesses legal documentation or title to deposit the Language Resource. In case of doubt regarding the legal status of a particular resource, a human investigation is conducted to search for any proof or documentation that ascertains the provenance and reusability of the Language Resource. A specific section dedicated to the legal information assessment of Language Resources is provided in the ELRC Validation Guidelines:
http://lr-coordination.eu/sites/default/files/common/Validation_guidelines_CEF-AT_v6.2_20180720.pdf
How do you ensure the quality of data made available on ELRC-SHARE repository? Is there a validation process?
ELRC has implemented an extensive validation process to ensure that the data created by the ELRC consortium in the course of the project comply with a high standard of quality and accuracy.
An extensive document detailing the validation process that should be followed by any contributor of Language Resources wishing to upload their data to the ELRC-SHARE repository can be found at the following address:
http://lr-coordination.eu/sites/default/files/common/Validation_guidelines_CEF-AT_v6.2_20180720.pdf
How can users identify that a Language Resource was processed by the ELRC Consortium in the ELRC-SHARE repository?
In order to create a consistent catalogue, the ELRC Consortium performed extensive technical work on several resources available on the ELRC-SHARE repository as they were donated by third parties. Those resources can be easily identified: the tag (Processed) is appended to the title of the Language Resource, and additional metadata are available in the "Created using ELRC Services" or "Resource creation" sections. Moreover, the "Relations" section shows the links between Original data and Processed data (for instance, “is aligned version of”).
Is the metadata used to describe Language Resources in the ELRC-SHARE repository available?
All registered users of the ELRC-SHARE repository have the option to download the metadata of published Language Resources in JSON format. When logged in, you can click the 'Export JSON metadata' button as shown here: https://www.elrc-share.eu/repository/browse/acm-news-items-2017-and-2018/93debb64175111ea913100155d026706c1d000ad1e8a43c183c6d0f6a9506013/.
In compliance with GDPR requirements, personal information such as the contact details of metadata creators or contact persons is excluded from the JSON file.
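Once downloaded, the export can be inspected with any JSON tooling. The following is a minimal sketch, assuming the export was saved locally under a name such as `metadata.json` (a hypothetical file name). The structure of the file is not described here, so the hypothetical helper `top_level_keys` only lists the top-level keys and does not rely on any specific field names.

```python
import json
from pathlib import Path


def top_level_keys(path):
    """Load an exported metadata file and return its top-level keys, sorted."""
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    # The export is assumed to be a JSON object; a list or scalar has no keys.
    return sorted(data) if isinstance(data, dict) else []
```

For example, `print(top_level_keys("metadata.json"))` gives a quick overview of what the export contains before any further processing.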