NEC TM Data: Latest insights into pan-European language data sharing

2020-04-07

The NEC TM Data project, which ran from September 2018 to February 2020, aimed to increase the amount of parallel language data within the EU, to promote the flow of translation data, and to lower translation costs. On 28 February 2020, the final NEC TM Dissemination Day took place in Madrid, featuring the latest insights from NEC TM’s collective market research efforts and the publication of the Country Reports and White Paper.

The overall findings and conclusions drawn from the project are pathbreaking. The country found to spend the most on translation services, and which would therefore benefit the most from centralising parallel data, is Spain. One contributing factor is that Spain has several co-official languages: Catalan-Valencian-Majorcan, Basque and Galician. Many, if not all, official documents must be translated into all co-official languages in Spain. Together with the Scandinavian countries, the European institutions and Portugal, the Spanish administration has also proven very transparent in reporting expenditure.

 

NEC TM member presenting at NEC TM Dissemination Day in Zagreb, September 2019

It is also important to point out that, throughout the consortium’s research, it was not possible to access data equally in all participating Member States. There are several reasons for this. Some countries, such as Belgium, closely adhere to the public procurement publication threshold (set at €10,000 for most EU countries) and do not publish many contracts under this amount. Portugal, on the other hand, published contracts falling well below the national publication threshold of €10,000, and a robust corpus was built from the data available.

Moreover, some Member States publish or issue very few contracts at regional or local level. Germany, a highly decentralised country, has proven very difficult to mine for data, as each Land publishes in a different way, and no central repository was found for translation or similar contracts. Similarly, some Member States, such as Italy, Greece and Romania, have very unreliable data sources. For Cyprus, no information was found at all.

As such, Member States’ diverse national legal and procedural orders influence the estimation of public procurement spending. The national translation expenditure figures therefore also reflect the state of data availability in each country. Finally, it is worth noting that European public institutions are extremely transparent, and all Requests for Proposals (RFPs) and contracts issued are centrally reported on the official TED website.

The detailed Country Reports can be downloaded in full from the NEC TM website. They summarise each country’s national expenditure on translation between 2015 and 2019. They also analyse each country’s translation market, providing data on the companies awarded the most translation contracts as well as a breakdown of national expenditure on translation by sector in each country studied.