CMC24 - Generative AI Music

The first part of the Computational Music Creativity course focuses on exploring, implementing, and critically evaluating symbolic music generation techniques, including both traditional approaches and deep learning-based methods.

The second part of the course focuses on raw audio generation techniques using deep learning-based methods.

Learning Objectives

Part One

Understand the theoretical foundations and principles of symbolic generative music techniques, including Markov chains, cellular automata, genetic algorithms, RNNs/LSTMs, and transformers.
Analyze the strengths and limitations of both traditional and machine learning-based symbolic generative music techniques in terms of creativity, coherence, and practical application.
Code generative music systems from scratch, demonstrating the ability to translate theoretical concepts into functional implementations.
Connect theory to practice by exploring how generative music techniques are applied in research, industry, and artistic projects, and discussing real-world challenges and hurdles in the field.
Combine, adapt, and implement generative techniques to design symbolic music systems tailored to specific creative tasks and musical objectives.
Recreate and implement generative systems described in research papers.
Run inference and fine-tune pre-trained symbolic models by leveraging the Hugging Face Transformers library for generative music tasks.
Leverage cloud platforms like AWS, GCP, or Azure to run generative music models on remote servers with GPU acceleration for scalable and efficient computation.
Discuss the skills, knowledge, and steps required to become a generative music AI engineer.
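
As an illustration of what "coding a generative music system from scratch" can look like for the simplest technique listed above, here is a minimal sketch of a first-order Markov chain melody generator. It is not taken from the course materials: the function names (train_markov, generate), the note-name strings, and the toy training melody are all assumptions chosen for the example.

```python
import random
from collections import defaultdict


def train_markov(melody):
    """Collect first-order transitions: for each note, the notes that followed it."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return dict(transitions)


def generate(transitions, start, length, seed=None):
    """Sample a melody by repeatedly picking a successor of the current note."""
    rng = random.Random(seed)  # local RNG so results are reproducible per seed
    note = start
    output = [note]
    for _ in range(length - 1):
        choices = transitions.get(note)
        if not choices:  # dead end: this note was never followed by anything
            break
        # Repeated successors appear several times in the list, so uniform
        # sampling from it reproduces the observed transition frequencies.
        note = rng.choice(choices)
        output.append(note)
    return output


# Hypothetical toy training melody (note names are plain strings here).
training_melody = ["C4", "D4", "E4", "C4", "E4", "G4", "E4", "D4", "C4"]
model = train_markov(training_melody)
melody = generate(model, start="C4", length=8, seed=0)
print(melody)
```

Storing successors in a list rather than a probability table keeps the sketch short; a fuller system would also model rhythm and would typically use a higher-order chain.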

Part Two

Develop an understanding of the deep learning techniques used for generative audio.
Develop/deepen your ability to read and understand research papers in the field.
Understand not just how to use networks, but how to build or modify them to achieve your goals.

Concepts

Other “subsymbolic” audio representations, such as frame-based codecs
Conditioning
Managing long time dependencies
Latent spaces for creative interaction
Managing computational overhead
Playability
Offline vs online / Composition vs performance / Text vs real-time (RT) interaction

Architectures

Convolutional (e.g. WaveNet)
GAN (e.g. Sound Model Factory)
VAE (e.g. RAVE)
Differentiable digital signal processing (e.g. DDSP)
Transformers (e.g. VampNet)
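
To make the "managing long time dependencies" concept concrete for the convolutional case, the sketch below computes the receptive field of a stack of dilated causal convolutions, which is the mechanism WaveNet-style models use to see further back in time. The kernel size, dilation schedule, and 16 kHz sample rate are illustrative assumptions, not settings prescribed by the course.

```python
def receptive_field(kernel_size, dilations):
    """Number of past-and-present samples visible to one output sample.

    Each layer with dilation d extends the receptive field by (kernel_size - 1) * d.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed schedule: dilation doubles at every layer (1, 2, 4, ..., 512).
dilations = [2 ** i for i in range(10)]
samples = receptive_field(kernel_size=2, dilations=dilations)

# Convert to milliseconds at an assumed 16 kHz sample rate.
ms_at_16khz = 1000 * samples / 16000
print(samples, ms_at_16khz)
```

Under these assumptions, ten layers cover 1024 samples, or 64 ms of audio. This is why such models stack several dilation blocks, and why long-range musical structure remains hard to capture at the raw-audio level.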

Pre-requisites

Students are expected to have intermediate proficiency in Python programming.
Basic understanding of linear algebra (e.g., matrices, vectors, and matrix operations).
Basic knowledge of TensorFlow/Keras and PyTorch will be helpful for deep learning techniques.

Teaching Approach

This course will cover both the theoretical and implementation aspects of symbolic generative music systems.

The main teaching principles are:

learning by doing,
fostering proactivity and independent learning.

Students are expected to study the theory and basic implementation of the techniques independently by watching The Sound of AI’s Generative Music AI Course and its corresponding implementations.

Theory classes will focus on advanced aspects of the techniques, building on the video lectures. These sessions will emphasize real-world applications, presenting one industry or academic system that leverages the technique being studied. The classes will be interactive, with activities designed to encourage problem-solving and critical thinking.

Practical classes will center on code implementation of the assignments. Students’ solutions will be reviewed and discussed in detail. Additionally, students will be tasked with reverse-engineering a generative music system, applying the knowledge and skills acquired throughout the course.