Over the last few decades, both science and business have faced a growing need to handle large volumes of data efficiently and intelligently. Moreover, the integration of fact-based decision making and business analytics has become a key strategic factor for companies. To this end, classical statistical methods have been enhanced and new techniques developed to explore, extract, discover, or condense the valuable information contained in large datasets.

The course introduces the most common statistical concepts applied in this context, from exploratory data analysis to data mining. Topics covered include classification and regression trees, cluster analysis, principal component analysis for dimensionality reduction, and multidimensional scaling. Other topics, such as correspondence analysis, are treated as time permits. A series of exercises accompanies the lecture and gives students the opportunity to gain practical experience. The examples and concepts will be illustrated with R, the free software environment for statistical computing.