
European TDM Projects: OpenMinTeD, an Overview

July 1, 2016


What is OpenMinTeD?

OpenMinTeD stands for Open Mining INfrastructure for TExt and Data. It is an open, service-oriented e-Infrastructure for Text and Data Mining (TDM) of scientific and scholarly content.

Why is such an infrastructure necessary?

Digital research data is being produced in massive amounts, and it cannot be harnessed, let alone put to use, without appropriate technology. Infrastructures fill this gap by serving as data repositories and offering the tools needed to process their contents. OpenMinTeD specifically provides TDM technologies for the world of scientific publications. By analysing data at multiple levels, TDM turns unstructured data into structured data, which is then processed further to detect relations and patterns, discover hidden and new knowledge, and reveal meaning. OpenMinTeD makes use of pre-existing TDM tools and platforms, which are integrated on the basis of interoperability.

When did the project start?

It started on June 1st, 2015 as a Horizon 2020 project funded by the European Commission, and will continue until May 31st, 2018.

Which are the domains of interest for OpenMinTeD?

OpenMinTeD covers a wide range of scientific fields, from broad categories such as social sciences, humanities, and food and agriculture to areas of special scientific interest such as neuroinformatics.

Whom does the project address?

OpenMinTeD addresses stakeholders from both the public and private sectors: researchers, universities, research institutions, libraries, publishers, OA repositories, scholarly or learned societies and content providers, data and computing centres, legal experts, infrastructure developers, industrial players, and SMEs.

Who is involved in the project?

Athena Research and Innovation Centre (ARC), a Greek scientific research and technological organization with great expertise in the design, development and operation of data infrastructures across a broad range of subject areas, is the project coordinator, while the consortium consists of:

the National Centre for Text Mining (NaCTeM), based in the School of Computer Science at the University of Manchester, the first publicly-funded text mining centre in the world,
the Ubiquitous Knowledge Processing (UKP) Lab in the Department of Computer Science at the Technische Universität Darmstadt, which has been carrying out cutting-edge research in natural language processing (NLP) with a strong emphasis on lexical-semantic resources and algorithms, and on innovative applications of NLP to novel problems in social media, social sciences, and humanities,
Institut National de la Recherche Agronomique (INRA), ranked the number one agricultural institute in Europe and number two in the world, which carries out mission-oriented research for high-quality and healthy foods, competitive and sustainable agriculture, and a preserved and valorised environment,
the European Bioinformatics Institute (EMBL-EBI), a non-profit academic organisation that forms part of the European Molecular Biology Laboratory (EMBL), which has developed Europe PubMed Central (Europe PMC), a database for the life science research literature, with over 30 million abstracts and 3 million full text articles, about 800,000 of which are “gold” open access, i.e. free to read and reuse,
Agroknow (AK), a fast-growing SME specialized in agro-biodiversity knowledge management, with a clear research focus on knowledge-intensive technology innovation for agriculture, food and biodiversity and extensive experience in building services for the agricultural community,
LIBER (Ligue des Bibliothèques Européennes de Recherche), the main network for research libraries in Europe, including more than 400 national, university and other libraries from over 40 countries,
the Institute for Information Law (IViR) of the University of Amsterdam (Universiteit van Amsterdam), which engages in cutting-edge research into fundamental and topical aspects of information law, and provides a forum for critical debate about the social, cultural and political aspects of regulating information markets,
the Knowledge Media Institute (KMi) of the Open University, serving the university’s need to be at the forefront of research and development in the areas of Cognitive and Learning Sciences, Artificial Intelligence and Semantic Technologies, and Multimedia,
École Polytechnique Fédérale de Lausanne (EPFL) and the associated Blue Brain Project (BBP), the first comprehensive attempt to use detailed modelling and simulation as tools to systematically integrate data about the brain,
the CNIO (Centro Nacional de Investigaciones Oncológicas) team, provider of the BioCreative challenges and associated infrastructure containing TDM and other NLP and non-biology-driven components,
the Natural Language Processing (NLP) group at the University of Sheffield, one of the largest and most successful research groups in language and information in the EU, which has a world-leading research record in the fields of NLP infrastructures (GATE), information extraction, standardisation, machine learning methods for NLP, dialogue systems, question answering, terminology extraction, and NLP methods for Knowledge Management and the Semantic Web,
GESIS - Leibniz Institute for the Social Sciences, the largest infrastructure institution for the social sciences in Germany, which provides research-based infrastructure services such as expertise and tools for planning studies and for collecting, analyzing and archiving research data,
the Greek Research and Technology Network (GRNET) S.A., provider of high-quality Internet services to the Greek research and academic community, which leads the infrastructure provisioning activity and offers cloud computing resources to the project,
Frontiers, one of the largest and fastest-growing open-access publishers, receiving millions of monthly page views, with publishing agreements with many universities, and collaborations with Nature Publishing Group, Scientific American, Digital Science, OpenAIRE, CrossRef, OASPA, COPE, Jacobs Foundation, and others to advance Open Science worldwide, and
the University of Stirling, one of the UK’s leading research universities in the fields of health and wellbeing, the environment and people, culture and society, law, enterprise and the economy, and sport, which provides support (guidance and training) on legal issues arising across the project.

A Project Synopsis
