Prof. Ansgar Steland – Institut für Statistik und Wirtschaftsmathematik

Title: Large Sample Approximation for High-Dimensional Covariance Matrices and Applications


Dr. Hanno Scharr, Quantitative Image Processing, Forschungszentrum Jülich GmbH

Title: Imaging and Image Processing for Plant Sciences



Prof. Holger Rauhut – Lehrstuhl C für Mathematik (Analysis)

Title: Signal recovery from incomplete data


In many applications it is expensive, time-consuming or otherwise difficult to collect a sufficiently high number of measurements (data) for signal reconstruction via traditional methods. Compressive sensing predicts that efficient recovery is possible from far fewer data if the signal to be recovered is (approximately) sparse in some sense. In this context, care has to be taken when designing both the recovery method and the measurement process, and this is where a lot of mathematics enters. Somewhat surprisingly, for certain random measurement constructions it can be shown that a comparably small (and provably minimal) number of measurements suffices for recovery. I will give a short introduction to and overview of this field and its potential applications to data (signal, image) processing tasks.
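As a toy illustration of the kind of recovery result described above (my own sketch, not material from the talk): the following reconstructs a sparse vector from a small number of random Gaussian measurements by solving the basis pursuit linear program min ||x||_1 subject to Ax = b. All parameter choices (signal length, number of measurements, sparsity) are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, s = 64, 24, 3                  # signal length, measurements, sparsity
x_true = np.zeros(n)
support = rng.choice(n, s, replace=False)
x_true[support] = rng.standard_normal(s)

A = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian measurement matrix
b = A @ x_true                                 # m << n measurements

# Basis pursuit: min ||x||_1 s.t. Ax = b, as an LP via the split x = u - v, u, v >= 0
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]

print("max recovery error:", np.max(np.abs(x_hat - x_true)))
```

With these parameters the sparse signal is recovered essentially exactly from 24 measurements instead of 64, which is the phenomenon the abstract refers to.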

Dr. Andreas Nowack / Dr. Thomas Kress, Physics

Title: How to analyze hundreds of petabytes of data from the Large Hadron Collider at CERN?


For several years now, the Large Hadron Collider at CERN has been producing a huge amount of data, which is intensively analyzed by thousands of elementary particle physicists all over the world. For this purpose, distributed grid computing is used, connecting two hundred data centers with a combined half a million processor cores and 500 petabytes of stored data. In this presentation we introduce the principles of high-energy physics data analyses and explain the computing model and the role of the data center at RWTH Aachen.

Prof. Hermann Ney – Lehrstuhl für Informatik 6 / Chair of Computer Science 6 – Human Language Technology and Pattern Recognition

Title: The Statistical Approach to Speech Recognition and Natural Language Processing


The last 25 years have seen dramatic progress in statistical methods for recognizing speech signals and for translating spoken and written language. In particular, this talk will focus on the remarkable fact that for these tasks the statistical approach makes use of the same four principles: 1) the Bayes decision rule for minimum error rate; 2) probabilistic models, e.g. hidden Markov models or artificial neural networks; 3) training criteria and algorithms for estimating the free model parameters from large amounts of data; 4) the generation or search process that produces the recognition or translation result.
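To make principle 1) concrete, here is a minimal toy sketch of the Bayes decision rule ŵ = argmax_w p(w)·p(x|w) — my own illustration, not material from the talk. The two hypothetical "words", their priors, and the Gaussian acoustic likelihoods are all made-up assumptions.

```python
import numpy as np

# Toy setup (illustrative): two "words" emitting a 1-D acoustic feature x,
# each modeled by a Gaussian class-conditional likelihood p(x | w).
priors = {"yes": 0.6, "no": 0.4}     # language-model-like prior p(w)
means = {"yes": 0.0, "no": 2.0}      # acoustic-model means
sigma = 1.0

def likelihood(x, w):
    # Gaussian density p(x | w) with mean means[w] and std sigma
    return np.exp(-0.5 * ((x - means[w]) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def decide(x):
    # Bayes decision rule: pick the class maximizing prior * likelihood,
    # which minimizes the probability of a classification error.
    return max(priors, key=lambda w: priors[w] * likelihood(x, w))

print(decide(0.3))   # feature near the "yes" mean → "yes"
print(decide(1.9))   # feature near the "no" mean → "no"
```

Real recognizers apply the same rule over sequences (word strings scored by acoustic and language models), which is what turns the argmax into the search problem of principle 4).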

Dr. Daniel Ewert - Institute of Information Management in Mechanical Engineering

Title: Data Science at the Institute Cluster IMA/ZLW & IfU


In this short presentation, we give an overview of our data science projects in research and industrial applications. We cover different areas and approaches, such as the detection of emotions, the prediction of product quality during production, and the comparison of huge list structures. Furthermore, we present research results on the analysis of time-interval datasets and social media data.

Dr.-Ing. Matthias Meinke, Lehrstuhl für Strömungslehre und Aerodynamisches Institut

Title: Analysis of large-scale flow data by dynamic mode decomposition


Large-scale simulations of turbulent flows produce a huge amount of data, on the order of O(100) terabytes for a single solution, if the most energetic turbulent flow structures are resolved. This talk discusses how such data are analyzed to extract the most relevant dynamic modes, which provide valuable information, for example for developing active flow control techniques.
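For readers unfamiliar with the method, the following is a minimal sketch of exact dynamic mode decomposition on synthetic snapshot data, standing in for the large flow-field snapshots mentioned above. The grid size, time step, and the planted eigenvalue are all illustrative assumptions, not values from the talk.

```python
import numpy as np

# Synthetic snapshots: one decaying oscillatory structure, i.e. a real signal
# generated by the complex eigenvalue lam = -0.3 + 2.0j (growth rate -0.3,
# angular frequency 2.0 rad/s). DMD should recover lam and its conjugate.
n_space, n_time, dt = 50, 40, 0.1
x = np.linspace(0, 1, n_space)
t = np.arange(n_time) * dt
lam = -0.3 + 2.0j
phi = np.sin(np.pi * x) + 1j * np.sin(2 * np.pi * x)   # spatial mode shape
data = np.real(np.outer(phi, np.exp(lam * t)))         # snapshot matrix (space x time)

# Exact DMD: pair time-shifted snapshots, project onto the leading SVD
# subspace, and take the eigenvalues of the reduced linear operator.
X, Y = data[:, :-1], data[:, 1:]
U, s, Vh = np.linalg.svd(X, full_matrices=False)
r = 2                                  # truncation rank: one complex-conjugate pair
Ur, sr, Vr = U[:, :r], s[:r], Vh[:r].conj().T
Atilde = Ur.conj().T @ Y @ Vr / sr     # reduced operator U^H A U
mu = np.linalg.eigvals(Atilde)         # discrete-time DMD eigenvalues
omega = np.log(mu) / dt                # continuous-time growth rates / frequencies

print(np.sort_complex(omega))          # ≈ [-0.3 - 2j, -0.3 + 2j]
```

The same projection-based recipe scales to flow data because only the snapshot matrices and a truncated SVD are needed, not the full linear operator.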

Daniel Mallmann, Jülich Supercomputing Centre (JSC), Institute for Advanced Simulation (IAS)

Title: JSC Data Activities


The presentation gives an overview of the JSC data infrastructure, the EUDAT project, and the developments for data federations and data analytics at JSC.

Prof. Marco Lübbecke – Lehrstuhl für Operations Research / Chair of Operations Research

Title: Optimization goes Data Science goes Optimization?


Our group's research interests are in computational mathematical optimization, in particular in models and algorithms for solving large-scale integer programs. This is not "data science" (whatever the precise definition is) per se, but of course depends heavily on data when it comes to solving practical optimization problems. We believe that both fields can benefit from each other. This talk introduces our work, gives a few examples where we hope for collaboration, and foremost: invites discussions.

Prof. Bastian Leibe – Lehr- und Forschungsgebiet Informatik 8 (Computer Vision)

Title: Computer Vision for Dynamic Scene Understanding and Large-Scale Visual Search



Prof. Martin Grohe – Lehrstuhl für Informatik 7 – Logik und Theorie diskreter Systeme / Logic and Theory of Discrete Systems

Title: Big Data from a Database Theory Perspective


I will highlight a few topics in data science and the management of big data that arise when looking at the field from a theoretical computer science perspective, or more specifically, a database theory perspective. This will include algorithmic and complexity-theoretic aspects as well as the design of suitable query languages and issues of uncertainty.

Prof. Wolfgang Dahmen - Institut für Geometrie und Praktische Mathematik

Title: Data and Models


We briefly touch on several application scenarios where new mathematical concepts are needed to efficiently extract quantifiable information from possibly large data sets. We present a "sample" theoretical performance result stressing the role of adaptive techniques. If time permits, we also briefly address the "small data" problem arising in the context of data assimilation and inversion tasks.

Gregor Fuhs, Institute for Industrial Management

Title: Proactive Fault Management through the Use of Big Data


This presentation covers the use of Big Data in the research project BigPro and the benefits expected from it. For this purpose, the FIR institute's definition of Big Data will be presented and distinguished from the Smart Data concept. The BigPro project, funded by the Federal Ministry of Education and Research within the program „IKT 2020 – Forschung für Innovationen", develops a proactive fault management system based on Big Data methods in the field of production, in order to reduce disruptions and the amount of rework. This is done by analyzing real-time data and predicting potentially occurring faults in the production processes. The advantage is that errors in components are not only found in subsequent analysis; corrective action can be taken early, so that process steps following a faulty one are no longer performed on defective product parts. The outlook will show how new business models can be developed in the age of Industry 4.0, based on intelligent networking and the use of Big Data and Smart Data.

Prof. Dr. med. Danilo Bzdok, Klinik für Psychiatrie, Psychotherapie und Psychosomatik

Title: Learning generative statistical models from brain imaging repositories


Neuroimaging datasets are constantly increasing in resolution, sample size, multi-modality, and meta-information complexity. This opens the brain imaging field to a more data-driven machine-learning regime (e.g., minibatch optimization, structured sparsity, deep learning), while analysis methods from the domain of classical statistics remain dominant (e.g., ANOVA, Pearson correlation, Student's t-test). Special interest may lie in the statistical learning of scalable generative models that explain brain function and structure. Instead of merely solving classification and regression tasks, they could explicitly capture properties of the data-generating neurobiological mechanisms. Python-implemented examples for such supervised and semisupervised machine-learning techniques will be provided as applications to the currently largest neuroimaging dataset, from the Human Connectome Project (HCP) data-collection initiative. The emphasis will be put on the feasibility of deep neural networks and semisupervised architectures in imaging neuroscience. The successful extraction of structured knowledge from current and future large-scale neuroimaging datasets will be a critical prerequisite for our understanding of human brain organization in healthy populations and psychiatric/neurological disease.

Dr. Arash Behboodi, Institute of Theoretical Information Technology

Title: Heterogeneous Networks: A Big Data Perspective


In this talk, we present an overview of how the big datasets provided by operators can be utilized to address various challenges of future networks.