On Monday 6 June, the FutureTDM project organised a Knowledge Café on the first evening of Berlin Buzzwords, a conference on storing, processing and searching large amounts of digital data.
The audience consisted predominantly of people working in tech companies, along with researchers from universities and research institutions. A large table was set up to discuss experiences with text and data mining (TDM) research and the legal, economic and technical barriers that currently hinder its wider uptake.
“Help fostering the future through TDM”
Participants saw the legal barrier as one of the most prominent: they noted that the exact legal situation around TDM is often unclear or confusing to scientists, and that it differs from country to country. Another hindrance is that lawyers sometimes lack specific IT knowledge. Bridging such gaps and broadening the understanding between the different fields would greatly contribute to creating a proper knowledge society.
Another factor standing in the way of the wider use of TDM tools is that specific tools and parsers are needed for the different European languages. Two ideas came up: creating a publicly available list of such tools, and stimulating their development by organising open competitions in which companies could see how different tools and parsers perform on the same data.
As in the previous Knowledge Cafés, the need for more standards was also expressed. Finally, participants mentioned that companies often have trouble accessing state-of-the-art scientific knowledge, because not all academic papers are available as open access.
The Knowledge Café ended with a round of wishes for the future, which ranged from publishers allowing unrestricted content mining to having both more open data and more user-friendly tools for extracting value from available content.
We want to thank all participants for their valuable input. A short video impression from two of the participants is provided below.