Generative AI Music
CMC24 • Music Technology Group
The first part of the Computational Music Creativity course focuses on exploring, implementing, and critically evaluating symbolic music generation techniques, including both traditional approaches and deep learning-based methods.
The second part of the course focuses on raw audio generation techniques using deep learning-based methods.
Learning Objectives
Part One
- Understand the theoretical foundations and principles of symbolic generative music techniques, including Markov chains, cellular automata, genetic algorithms, RNNs/LSTMs, and transformers.
- Analyze the strengths and limitations of both traditional and machine-learning-based symbolic generative music techniques in terms of creativity, coherence, and practical application.
- Code generative music systems from scratch, demonstrating the ability to translate theoretical concepts into functional implementations.
- Connect theory to practice by exploring how generative music techniques are applied in research, industry, and artistic projects, and discussing real-world challenges and hurdles in the field.
- Combine, adapt, and implement generative techniques to design symbolic music systems tailored to specific creative tasks and musical objectives.
- Recreate and implement generative systems described in research papers.
- Run inference and fine-tune pre-trained symbolic models by leveraging the Hugging Face Transformers library for generative music tasks.
- Leverage cloud platforms like AWS, GCP, or Azure to run generative music models on remote servers with GPU acceleration for scalable and efficient computation.
- Discuss the skills, knowledge, and steps required to become a generative music AI engineer.
Part Two
- Develop an understanding of the deep learning techniques used for generative audio.
- Develop/deepen your ability to read and understand research papers in the field.
- Understand not just how to use networks, but how to build or modify them to achieve your goals.
Concepts
- Other “subsymbolic” audio representations (e.g. frame-based codecs)
- Conditioning
- Managing long time dependencies
- Latent spaces for creative interaction
- Managing computational overhead
- Playability
- Offline vs online / Composition vs performance / Text vs real-time (RT) interaction
Architectures
- Convolutional (e.g. WaveNet)
- GAN (e.g. Sound Model Factory)
- VAE (e.g. RAVE)
- DDSP (e.g. DDSP)
- Transformers (e.g. VampNet)
Pre-requisites
- Students are expected to have intermediate proficiency in Python programming.
- Basic understanding of linear algebra (e.g., matrices, vectors, and matrix operations).
- Basic knowledge of TensorFlow/Keras and PyTorch will be helpful for deep learning techniques.
Teaching Approach
This course will cover both the theoretical and implementation aspects of symbolic generative music systems.
The main teaching principles are:
- learning by doing,
- fostering proactivity and independent learning.
Students are expected to study the theory and basic implementation of the techniques independently by watching The Sound of AI’s Generative Music AI Course and its corresponding implementations.
Theory classes will focus on advanced aspects of the techniques, building on the video lectures. These sessions will emphasize real-world applications, presenting one industry or academic system that leverages the technique being studied. The classes will be interactive, with activities designed to encourage problem-solving and critical thinking.
Practical classes will center on code implementation of the assignments. Students’ solutions will be reviewed and discussed in detail. Additionally, students will be tasked with reverse-engineering a generative music system, applying the knowledge and skills acquired throughout the course.

- Instructor: Valerio Velardo
- Email: valerio@thesoundofai.com
- Office hours: Check Logistics Tab

- Instructor: Lonce Wyse
- Email: lonce.wyse@upf.edu
- Office hours: Check Logistics Tab

- TA: Anmol Mishra
- Email: anmol.mishra01@estudiant.upf.edu
- Office hours: Check Logistics Tab
- Location: Tanger Building 55.310