Method Profile

Text Mining

ABOUT

Text mining is a sub-discipline of data mining that analyses unstructured natural language in order to detect patterns and derive information. Whereas structured data sets can be analysed easily with the right computer tools, natural language lacks the structure that computer processing requires. Text mining tools therefore need to convert natural-language text into a computer-readable representation so that it can be analysed with traditional data mining techniques.
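As a rough illustration of what such a conversion can look like, the sketch below builds a bag-of-words term-document matrix using only the Python standard library. This is one common representation and not the only one; the function names `tokenize` and `term_document_matrix` and the two sample sentences are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every distinct term seen in any document, sorted.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical sample documents, used only for illustration.
documents = [
    "Text mining detects patterns in text.",
    "Data mining analyses structured data.",
]
vocabulary, matrix = term_document_matrix(documents)

print(vocabulary)
for row in matrix:
    print(row)
```

Each row is now a fixed-length numeric vector, which is the kind of structured input that traditional data mining techniques expect. Word order is discarded in this representation, which is a deliberate simplification.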

Natural language processing, statistical modelling and machine learning are techniques used in text mining. Text mining processes also draw on various methods for extracting and analysing information, e.g. information retrieval, computational linguistics, classification and clustering. Text mining methods can be applied in various disciplines to analyse web documents as well as company-internal or external sources.
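To make one of these methods concrete, here is a minimal sketch of information retrieval: documents are weighted with TF-IDF and ranked against a query by cosine similarity. It assumes a plain `tf * log(N / df)` weighting and uses only the standard library; the helper names, the sample documents and the query are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(token_lists):
    """Inverse document frequency log(N / df) for every term in the corpus."""
    n_docs = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def vectorize(tokens, idf):
    """Sparse TF-IDF vector; terms unknown to the corpus are ignored."""
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (score, document index) pairs, best match first."""
    token_lists = [tokenize(doc) for doc in documents]
    idf = build_idf(token_lists)
    doc_vectors = [vectorize(tokens, idf) for tokens in token_lists]
    query_vector = vectorize(tokenize(query), idf)
    scores = [(cosine(query_vector, v), i) for i, v in enumerate(doc_vectors)]
    return sorted(scores, key=lambda pair: pair[0], reverse=True)


# Hypothetical mix of external and company-internal sources.
documents = [
    "Patent claims describe a new battery electrode.",
    "Customer reviews praise the battery life of the phone.",
    "The annual report discusses revenue and customer growth.",
]
ranking = rank_documents("battery electrode patent", documents)
for score, index in ranking:
    print(round(score, 3), documents[index])
```

The same vectors could also feed a classifier or a clustering algorithm; retrieval is shown here only because it needs the least additional machinery.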

RESOURCES

Weiss, S. M., Indurkhya, N., Zhang, T., & Damerau, F. J. (2005). Text Mining. Springer New York.
Berry, M. W., & Kogan, J. (2010). Text Mining: Applications and Theory. John Wiley & Sons, Ltd.
Liu, Y., Navathe, S. B., Civera, J., Dasigi, V., Ram, A., Ciliax, B. J., & Dingledine, R. (2005). Text Mining: Wissensgewinnung aus natürlichsprachigen Dokumenten. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2(1), 62–76.

The Open Knowledge Library of this platform contains the main outcomes of the FutureTDM project, licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). They may be adapted with appropriate credit.