
Michael Krause

PhD in machine learning for music and audio processing. My vision is to create algorithms that can listen to and understand music. To this end, I build deep learning models and analyze their behavior in depth. Research interests include: music information retrieval, deep learning, audio signal processing, self-supervised learning, musical sound event detection, differentiable digital signal processing, soft dynamic time warping.

Publications

Johannes Zeitler, Michael Krause, and Meinard Müller
Soft Dynamic Time Warping with Variable Step Weights
Published in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024.
Links: PDF
Simon Schwär, Michael Krause, Michael Fast, Sebastian Rosenzweig, Frank Scherbaum, and Meinard Müller
A Dataset of Larynx Microphone Recordings for Singing Voice Reconstruction
Published in: Transactions of the International Society for Music Information Retrieval (TISMIR), 7(1): 30–43, 2024.
Links: PDF, Project page (incl. data)
Michael Krause, Sebastian Strahl, and Meinard Müller
Weakly Supervised Multi-Pitch Estimation Using Cross-Version Alignment
Published in: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2023.
Links: PDF, Project page (incl. code)
Michael Krause, Christof Weiss, and Meinard Müller
A Cross-Version Approach to Audio Representation Learning for Orchestral Music
Published in: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2023.
Links: PDF, Project page (incl. code, data)
Johannes Zeitler, Simon Deniffel, Michael Krause, and Meinard Müller
Stabilizing Training with Soft Dynamic Time Warping: A Case Study for Pitch Class Estimation with Weakly Aligned Targets
Published in: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2023.
Links: PDF
Michael Krause and Meinard Müller
Hierarchical Classification for Instrument Activity Detection in Orchestral Music Recordings
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2023.
Links: PDF, Project page (incl. data)
Christof Weiß, Vlora Arifi-Müller, Michael Krause, Frank Zalkow, Stephanie Klauk, Rainer Kleinertz, and Meinard Müller
Wagner Ring Dataset: A Complex Opera Scenario for Music Processing and Computational Musicology
Published in: Transactions of the International Society for Music Information Retrieval (TISMIR), 6(1): 135–149, 2023.
Links: PDF, Data
Michael Krause, Christof Weiss, and Meinard Müller
Soft Dynamic Time Warping for Multi-Pitch Estimation and Beyond
Published in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023.
Links: PDF, Code
Michael Krause and Meinard Müller
Hierarchical Classification of Singing Activity, Gender, and Type in Complex Music Recordings
Published in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022.
Links: PDF
Michael Krause, Meinard Müller, and Christof Weiss
Towards Leitmotif Activity Detection in Opera Recordings
Published in: Transactions of the International Society for Music Information Retrieval (TISMIR), 4(1): 127–140, 2021.
Links: PDF, Project page (incl. data)
Meinard Müller, Yigitcan Özer, Michael Krause, Thomas Prätzlich, and Jonathan Driedger
Sync Toolbox: A Python Package for Efficient, Robust, and Accurate Music Synchronization
Published in: Journal of Open Source Software (JOSS), 6(64): 1–4, 2021.
Links: Code, PDF
Michael Krause, Meinard Müller, and Christof Weiss
Singing Voice Detection in Opera Recordings: A Case Study on Robustness and Generalization
Published in: Electronics, 10(10): 1–14, 2021.
Links: PDF
Michael Krause, Frank Zalkow, Julia Zalkow, Christof Weiss, and Meinard Müller
Classifying Leitmotifs in Recordings of Operas by Richard Wagner
Published in: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2020.
Links: PDF, Poster, Project page (incl. data)
Kristian Kersting, Christoph Lampert, and Constantin Rothkopf (eds.)
Wie Maschinen lernen - Künstliche Intelligenz verständlich erklärt
ISBN: 978-3-658-26762-9
Chapters: Maschinelles Lernen (Machine Learning), Bayesregel (Bayes Rule)
Springer, 2019.
Paul Voigtlaender, Michael Krause, Aljoša Ošep, Jonathon Luiten, Berin Balachandar Gnana Sekar, Andreas Geiger, and Bastian Leibe
MOTS: Multi-Object Tracking and Segmentation
Published in: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Links: PDF, Project page (incl. data), Code

CV

2019–2023: PhD student at the International AudioLabs Erlangen
- Music Information Retrieval
- Supervisor: Prof. Meinard Müller
- PhD thesis: Activity Detection for Sound Events in Orchestral Music Recordings
2017–2019: Postgraduate studies of Computer Science at RWTH Aachen University
- Including machine learning, computer vision, speech recognition, and logic
- Philosophy and ethics of technology
- Master's thesis: Multi-Object Tracking and Segmentation using Instance Embeddings
2013–2017: Undergraduate studies of Computer Science at RWTH Aachen University
- Foundational coursework, algorithmic learning theory, SMT solving
- Ethics and philosophy of science
- Bachelor's thesis: Sensitivity Analysis over Polymatroids with Applications to Game Theory
2015–2016: ERASMUS+ exchange to the University of Edinburgh
- Including quantum computing, algorithmic game theory, machine learning, music informatics, and secure programming

Works

PhD thesis: Activity Detection for Sound Events in Orchestral Music Recordings
Composers of music can express emotions and communicate with their audience in a multitude of ways. They decide on which voices or instruments to use, arrange notes into melodies, and develop recurring musical patterns. When a composition is performed and turned into sound, their decisions are realized acoustically as sound events. Despite being easily understood by human listeners, teaching a machine to perceive and process such musical sound events can be a challenging task. This thesis studies computational techniques for detecting the activity of sound events in a music recording, i.e., identifying the exact moments in time when a certain event occurs. We focus on orchestral and opera music, which are rarely considered in music processing research and particularly complex due to their high degree of polyphony. In this context, we cover four different types of musical sound events, namely singing, instrumental sounds, different pitches, and leitmotifs (special kinds of musical patterns used for storytelling in opera). To detect the activity of these events within a recording, we design, implement, and evaluate deep learning systems. In addition, we explore a range of techniques including hierarchical classification, differentiable sequence alignments, and representation learning. Beyond evaluating the accuracy of our detection systems, we aim at a deeper understanding of our models with regard to their robustness and sensitivity to confounding effects.
Master's thesis: Multi-Object Tracking and Segmentation using Instance Embeddings
Simply put, the goal of this thesis is to build and evaluate algorithms that can identify objects in a video by assigning each pixel in the video to the object instance it belongs to, or by marking it as background. This goal draws upon several different research areas of computer vision, the study of algorithms that process visual data like images and videos. Research questions that have previously been considered separately - such as “where in this video is the machine operator’s right hand?” and “which pixels correspond to a hand and which pixels are background?” - now need to be considered jointly.
Bachelor's thesis: Sensitivity Analysis over Polymatroids with Applications to Game Theory
This thesis deals with certain combinatorial structures and their applications to game theoretic problems. As part of this, we will review several classical results and recent publications on these topics and give illustrating examples. We will give an introduction to matroid and polymatroid theory as well as basic concepts from non-cooperative and cooperative games. Our focus will be on applications of sensitivity results for polymatroid optimization to matroid congestion games and related concepts. In addition, we will demonstrate how the polymatroid structure of certain convex cooperative games can be exploited to recompute core allocations efficiently.
Music Generation Using Machine Learning
This report deals with computer-generated music. In particular, it discusses two recent approaches for automated melody composition using machine learning methods. Apart from melody, music needs accompaniment (e.g., chords played by different instruments) and must be performed, possibly by a synthesizer. Computer-generated music can be used where music is required on demand (e.g., in video games) or as an aid for human composers.
Quantum Simulation
This essay deals with a technique called Quantum Simulation, which means simulating one quantum system through another. It begins by motivating the benefits one might obtain from simulating quantum systems and by introducing the main challenges one faces in the process. In section 2, an algorithm that employs quantum simulation to solve linear equations is reviewed and possible applications are discussed.
Algorithmic Learning Theory - Pattern Languages
In this report, we look at pattern languages. These were originally introduced as a model for inductive inference of formal languages, the question being whether and how a pattern descriptive of a set of strings can be found algorithmically. We studied the results of Dana Angluin, which state that finding descriptive patterns of maximum length is an NP-complete problem. We then described her algorithm for finding descriptive patterns with only one variable. Furthermore, we put the problem into the context of Gold's model of learning in the limit and discussed the algorithm by Lange and Wiehagen. Finally, we described some further results of research into pattern languages.
Download the slides of the accompanying talk