Generative Music Without LLMs

Owner: Justin Nearing

Disclaimer: I don’t have strong opinions on LLMs. I don’t think they’re going to kill us. I don’t think they’re going to steal my job. I don’t think they’re useless, nor do I think they have few real-world applications (like, say, a public distributed ledger). My biggest criticism is that they’ve attracted way too much capital, but other than that they’re just another possible tool in the toolbox.

With that exceedingly medium take out of the way, having managed to piss off everyone (because everyone has strong opinions on LLMs), we can get to the point:

Generating Music Without LLMs

Why?

🎶
This is part of an ongoing series called Building A Music Engine.

It documents the process of me smashing my head against the keyboard to build a game called LETSGO.

It’s gotten long enough to break into several sections.

So here's what I want to do. I want to build Stockfish, but for music.

Mate in Five: How Chess Engines Work

There is a helpful chess engine wiki as a starting point.

Modern chess engines follow a few phases to choose the next best move:

  1. Board Representation - Have the current state of the game, and all possible rules
  2. Search - Algorithm to search possible moves from current state
  3. Evaluate - Algorithm to evaluate each possible move

There’s some other stuff, like opening/endgame databases, but for our purposes this is a good starting point. Translated to music, those phases become:

  1. Composition Representation - Have the current state of the musical composition, and knowledge of music theory, etc.
  2. Search - Algorithm to search possible Notes from current state
  3. Evaluate - Algorithm to evaluate each searched note value
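As a rough sketch, the core loop of such an engine could look like this. Everything here is hypothetical: FCompositionState, FNote, SearchCandidateNotes, and EvaluateNote are placeholder names, not anything that exists in LETSGO yet.

// Hypothetical "Stockfish for music" loop
FNote ChooseNextNote(const FCompositionState& State)
{
	// Search: enumerate notes that are legal given the current scale/context
	TArray<FNote> Candidates = SearchCandidateNotes(State); // assumes at least one candidate

	// Evaluate: score each candidate against the "cultural rules" of the genre
	FNote Best = Candidates[0];
	float BestScore = TNumericLimits<float>::Lowest();
	for (const FNote& Candidate : Candidates)
	{
		const float Score = EvaluateNote(State, Candidate);
		if (Score > BestScore)
		{
			BestScore = Score;
			Best = Candidate;
		}
	}
	return Best; // the composition representation lives in FCompositionState
}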

Obviously the big difference between chess and music is that chess is a zero-sum game. There is a winner and a loser, and each possible move either improves or worsens your chances of winning the game.

In music, it’s all subjective. It’s hard to say choosing a note is bad in absolute terms.

Thankfully, there are cultural rules to music. If a song is following the musical tradition of samba, for example, it’s absolutely possible to choose the “wrong” note in that context.

If we treat these cultural rules as real rules, we can in theory create reasonable sounding music.

💡

I won’t go so far as to say “good sounding music”, as perfect adherence to music theory will get you at best stunningly average music. But if we can create a system to create inoffensive music, in theory it can have an infinite playtime. Things get more interesting if we can also hook in gameplay events to change the music, say a modal shift to a Minor scale when an enemy approaches, or brightening to a Major scale when you get the big reward.

Possible Approaches

My first instinct for programming this note composition was to use the Strategy pattern to define different methods of constructing notes.

It’s a fairly straightforward approach, using a well-known design pattern for its intended purpose.

It would allow me to define things like Set Pedal Point on a bass instrument- Just start playing a low Tonic note every beat or whatever.

Then I can write a self-contained strategy for Develop Motif, a one- or two-bar melodic structure.

The thing I find potentially interesting about this is strategies that extend/consume other strategies:

Establish Chords could consume the Motif created and build chords using the notes chosen by the motif.

But the ordering could also be reversed: we establish chords first, then generate a motif consuming the chords.

What I like about this approach in general is that it is additive in nature. Repetition legitimizes, so a motif of ii-V-I repeated in the chord progression creates a sense of cohesion in the piece.

Music is not a zero-sum game. It’s positive, fractal in nature. It’s about consuming the other musician’s contributions and iterating.

It might not be Stockfish for music theory, but it is more in line with the fundamental nature of music.

Musical Strategies

So I opened a new PR to encapsulate this new feature:

I created a simple MusicalStrategy interface and built out a single concrete strategy of setting a pedal point.
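For reference, a stripped-down version of that could look something like the sketch below. The actual interface in the PR almost certainly differs; AddSoundAtBeat and the Tonic field are assumptions for illustration.

// Sketch of a composition strategy interface
class IMusicalStrategy
{
public:
	virtual ~IMusicalStrategy() = default;

	// Produce a schedule of sounds for one instrument from the current scale
	virtual FInstrumentSchedule Apply(const FLetsGoGeneratedScale& Scale, int NumBars) = 0;
};

// Concrete strategy: play the tonic on every beat
class FSetPedalPoint : public IMusicalStrategy
{
public:
	virtual FInstrumentSchedule Apply(const FLetsGoGeneratedScale& Scale, int NumBars) override
	{
		FInstrumentSchedule Schedule;
		for (int Bar = 0; Bar < NumBars; ++Bar)
		{
			for (int Beat = 1; Beat <= 4; ++Beat) // assuming 4/4
			{
				Schedule.AddSoundAtBeat(Scale.Tonic, Bar, Beat);
			}
		}
		return Schedule;
	}
};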

This was all accomplished quickly.

Things immediately slowed down once I started asking hard questions like “How do strategies get chosen?”

So I need to figure that out.

Right off the rip, we can rule out any strategy that is not valid. For instance, if a “Create Motif” strategy needs a full scale, don’t use it if you only have a tonic note.

Let’s widen our lens though, and think about the entire lifetime of a musical composition.

We did this as part of the design exercise in Designing The Core Gameplay Loop:

Intro
Verse
Bridge
Chorus
Verse
Chorus
Outro

As part of that we also defined gameplay phases as being assigned to a musical phase:

Action Name | Action State     | Eligible | Repeatable? | Eligible Phase
Set Tonic   | Complete         | False    | False       | Intro
Set Third   | Currently Active | True     | False       | Intro
Set Mode    | Pending          | False    | False       | Intro
Bass Drop   | Pending          | False    | True        | Bridge
BPM Switch  | Pending          | True     | True        | Bridge, Chorus, Outro

I never actually implemented this; gameplay phases are currently ordered by hand. But the entire point of generative music is to have dynamic ordering of phases at runtime, reducing the chances of “predictable” music being generated.

In a similar vein, here’s some concrete strategies and what they would need:

Strategy                 | Description             | Requirements
Pedal Point              | Tonic per beat          | Requires tonic
Create Motif             | Create 1-2 bar melody   | Requires scale*
Motif Variation          | Change notes of melody  | Requires motif
Motif Augmentation       | Extend/contract melody  | Requires motif
Create Chord Progression | Establish ii-V-I in key |
Modulate Chord Prog      | ex. backdoor ii-V-I     | Requires chords

It’s the Data, Stupid, the DATA

Ok, so after free jazz coding a bit, I finally think I have the piece of the puzzle that sticks it all together.

It’s the data.

Consider the following object:

// There's an expectation all values are nullable 
USTRUCT()
struct FComposerData
{
	GENERATED_BODY()

	int NumBarsToCompose;
	FLetsGoGeneratedScale Scale;
	int OctaveMin = 1;
	int OctaveMax = 5;
	FInstrumentSchedule InstrumentSchedule;
	IMusicCompositionStrategy* CompositionStrategy;
	int ComposerDataObjectIndex;
};

Here I have built a struct representing the data a Composer needs to do its job. It’s incomplete, but it’s the glue that will be passed around to every other object that needs to consume what the Composer is creating.

The Composer, broadly, needs to create InstrumentSchedules to send to some Instruments.

Instrument Schedules contain the Sound that will be played, and the beat it will be played on.
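As a minimal sketch (the struct and field names here are guesses, not the actual types), an InstrumentSchedule is essentially bars of beat-to-sound pairs:

USTRUCT()
struct FSoundAtBeat
{
	GENERATED_BODY()

	// Beat within the bar this sound triggers on
	UPROPERTY()
	float Beat = 1.0f;

	// The sampled .wav to play, e.g. Bb_SynthThing_Octave1
	UPROPERTY()
	TObjectPtr<USoundWave> Sound = nullptr;
};

USTRUCT()
struct FBarOfSounds
{
	GENERATED_BODY()

	UPROPERTY()
	TArray<FSoundAtBeat> Sounds;
};

USTRUCT()
struct FInstrumentScheduleSketch
{
	GENERATED_BODY()

	// One entry per bar; the instrument walks the bars in order
	UPROPERTY()
	TArray<FBarOfSounds> Bars;
};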

The Composer will hold a set of ComposerData.

In this set, some of the InstrumentSchedules will be currently playing, others will be pending.

For instance, imagine a Bass instrument playing a pedal point of [ Bb, Bb, Bb, Bb ]

The composer created a ComposerData object that contains the schedule for the Bass.

However, it only wants the Bass to play this for say, 4 bars.

After that, it will want to essentially replace the InstrumentSchedule with a motif:

[ Bb, Db, F, Ab ] - a walk up a Bb minor 7th arpeggio.

So, given that scenario, what do we know we need?

  • FMusicalScale containing the Tonic and Notes of the Bb minor scale
  • An FInstrumentSchedule containing the Sound wave of each note and the beat to play on
    • This schedule has a set of bars to play- so it would have an array of 4 [ Bb, Bb, Bb, Bb ]
    • The schedule also contains the Sound to play
      • It doesn’t actually send Bb, it sends {Note}_{Instrument}_{OctaveNumber}.wav
      • This means ComposerData needs to know the Instruments and the mapping to the actual .wav file (sketched just after this list).
  • We mentioned this is a Bass instrument that will be playing.
    • This implies a type of instrument, and/or a range of Octaves that an instrument will play
  • A NumBars for the number of bars to play this instrument schedule
    • Actually, InstrumentSchedule is structured to have a set of bars to play already
    • What is needed is which bar to start playing this instrument.
    • With these two data points, the bar this will finish on can be derived.
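The note-to-asset mapping mentioned above is basically string assembly. A hedged sketch (MakeSoundAssetName is a hypothetical helper, not something in the codebase):

// Builds an asset name following the {Note}_{Instrument}_{OctaveNumber} convention
FString MakeSoundAssetName(const FString& Note, const FString& Instrument, const int Octave)
{
	return FString::Printf(TEXT("%s_%s_Octave%d"), *Note, *Instrument, Octave);
}

// MakeSoundAssetName(TEXT("Bb"), TEXT("SynthThing"), 1) -> "Bb_SynthThing_Octave1"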

So what do we do in terms of the evolution from pedal point to motif?

Somewhere the composer needs to determine: Bar 1-4 do a pedal point, then 5-8 “evolve” to motif.

Which means the Composer needs to hold a set of instruments and their evolutions:

// Bass 
{
	Instrument: SynthThing
	OctaveRange: [1..2]
	Schedules: [
		{ 
			schedule: [ [ Bb, Bb, Bb, Bb ] ], 
			bars: 4, 
			startAtBar: 1
		},
		{
			// 2 bar schedule
			schedule: [ [ Bb, Db, F, Ab ], [ Ab, F, Db, Bb ] ],
			bars: 2, // would play 4 total bars
			startAtBar: 5
		} 
	]
}

Potentially here, the schedule described above doesn’t necessarily need to be an InstrumentSchedule.

It could be a related data object specifically for the Composer, the intent being a valid InstrumentSchedule can be derived from this object, when it is being sent to an Instrument to play.

OTOH, that might be an unnecessary abstraction. The InstrumentSchedule is structured in a way that works with the rest of the program. So like, get with the program amirite?

The real reason, though, is that if I want to do something like chords, that’s already easy to do in InstrumentSchedule: Ab at beat 1, F at beat 1, Db at beat 1, etc.

There’s also something to be said for the fact that I’m calling this a bass. I have a drum instrument and a single sampled synth instrument configured in the game. Technically the bass could be my “simplesynth” at Octaves 1-2 and lead at octaves 3-5.

Which leads to the question of which instruments are serving what purpose. Do I want to structure the rules of, say, 4-part harmony?

Eesh. This is a lot to think about. But I gotta keep going, if only because I think I’m on the right track.

So.

ComposerData has Instrument and Octave range. Easy enough.

Instrument + Octave + Note gives you SoundData (how the .wav file is represented in my code).

Note is derived from Scale, Scale is updated dynamically based on gameplay events.

Notes are selected via a MusicCompositionStrategy - PedalPoint, create Motif, modify Motif, etc.

A composition strategy takes a ComposerData and returns an InstrumentSchedule?

Yes, because ComposerData has the instrument, the octave range, etc.

ComposerData also has a set of InstrumentSchedules, so Strategies have access to the existing InstrumentSchedules, useful if you want to “evolve” an existing schedule.

So the Composer loops through its ComposerDatas, checking if there are enough schedules for the next, say, 2 bars. If not, it creates some bars for that part.
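In sketch form, that loop could look like this. UComposer, ComposerDatas, GetLastScheduledBar, ChooseStrategy, and SendToInstrument are all hypothetical names for the pieces described above; note that by this point Apply takes the whole ComposerData rather than just a scale.

void UComposer::ComposeAhead(const int CurrentBar)
{
	const int LookaheadBars = 2; // every part should have at least this much scheduled

	for (FComposerData& Part : ComposerDatas)
	{
		// Hypothetical: the last bar this part already has sounds scheduled for
		const int LastScheduledBar = GetLastScheduledBar(Part);

		if (LastScheduledBar < CurrentBar + LookaheadBars)
		{
			// Pick a strategy and let it extend this part's schedule
			if (IMusicCompositionStrategy* Strategy = ChooseStrategy(Part))
			{
				const FInstrumentSchedule NewSchedule = Strategy->Apply(Part);
				SendToInstrument(Part, NewSchedule);
			}
		}
	}
}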

💡

There might be another meta layer on top of this responsible for turning on or off parts, setting up Bridge vs. Chorus vs. Verse, etc.

Actually let’s talk about that meta layer for a minute.

Essentially, there is another process in the Composer that is creating the structure of the song. It’s mapping song sections into the instruction to create/modify ComposerData’s. It will be the thing that says “The Intro will be 8 bars, then a Verse for 8, then a Chorus for 2, then a Bridge for 2, etc.”

Then a looping function creates the intro’s 8 bars, then the verse, etc.

So, I create this structure:

UENUM()
enum ESongSection
{
	None,
	Intro,
	Chorus,
	Bridge,
	VerseSection, // Verse has a conflict with some CoreEngine namespace
};

USTRUCT()
struct FSongSections
{
	GENERATED_BODY()

	TEnumAsByte<ESongSection> SongSection;
	int SectionLengthInBars;

	FSongSections(): SongSection(None), SectionLengthInBars(0) {}
	FSongSections(const ESongSection InSection, const int InLength): SongSection(InSection), SectionLengthInBars(InLength) {}
};

This establishes that song sections exist, and wraps each one in a struct that records how many bars the section lasts.
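With that in place, the song structure could be declared as simply as this (a sketch; the Composer doesn’t actually consume anything like it yet):

TArray<FSongSections> SongStructure = {
	FSongSections(Intro, 8),
	FSongSections(VerseSection, 8),
	FSongSections(Chorus, 2),
	FSongSections(Bridge, 2)
	// ... and so on for the rest of the song
};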

Now I’m thinking of adding a set of valid Music Composition Strategies per section: Intro has “SetPedalPoint” and maybe “CreateMotif”, where Chorus would contain… other strategies…

I dunno. This all seems kinda sus. Does endlessly generative music have set song sections like this? This feels like a random abstraction.

Off the rails and in the Data

Ok, let’s bring it back. The thing we know is important is NumBars. We also know which bar we’re currently on in time.

We know that we need some kind of function that creates FComposerData.

We know we need to pass ComposerData to the Strategy.

We don’t know if the ComposerData is valid for the Strategy.

  • If the Scale doesn’t have a Tonic, PedalPoint fails.

So, what if we added a function to the IMusicCompositionStrategy to assert it is valid?

It makes sense that Composer would own a set of CompositionStrategies.

  • Create and hold references to all

When it’s ready, it will choose a Strategy and call the Apply function to get an instrument schedule from the strategy.

As part of that “readiness, choose strategy” logic, it can query each Strategy it has, with the data it currently has, and get a subset of valid strategies.

Then just choose at random.

Eventually I want some kind of weighting system that will provide a better solution than at random.
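Putting IsValid and the (eventual) weighting together, strategy selection could look roughly like this. AllStrategies and the IsValid signature are assumptions; only the shape of the logic matters here.

IMusicCompositionStrategy* UComposer::ChooseStrategy(const FComposerData& Data)
{
	// Keep only strategies that can actually run with the data we currently have
	TArray<IMusicCompositionStrategy*> ValidStrategies;
	for (IMusicCompositionStrategy* Strategy : AllStrategies)
	{
		if (Strategy->IsValid(Data))
		{
			ValidStrategies.Add(Strategy);
		}
	}

	if (ValidStrategies.Num() == 0)
	{
		return nullptr;
	}

	// For now, choose at random; appropriateness weights can slot in here later
	return ValidStrategies[FMath::RandRange(0, ValidStrategies.Num() - 1)];
}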

But that’s not really an IsValid… that’s more of an IsAppropriate. And I still need some sort of appropriateness algo, because setting a pedal point, and then setting a pedal point, and then setting a pedal point… is easy to end up with if you’re just choosing at random. Not appropriate.

So hol up. There may be a strategy here to Create a Bass. Or a lead.

Imagine we have Validity and Appropriateness all figured out. At some point the Composer chooses “Hey I need a bass.”

That’s a strategy.

Not a complete strategy though.

If this is the content of our ComposerData:

	Scale Scale; 

	FInstrumentData InstrumentData;
	int OctaveMin = 1;
	int OctaveMax = 5;

	TArray<FInstrumentSchedule> Schedules;

Then “Give me a bass” essentially returns InstrumentData + OctaveRange.

What is interesting about this is a pedal point for a bass is a lot more appropriate than for say a lead/soprano type instrument/part.

{
 	FInstrumentData = BassInstrument;
	int OctaveMin = 1;
	int OctaveMax = 2;
	
	StrategyWeights = {
		PedalPoint = .8,
		Chords     = .4,
		RespondToCall = 1.2
	}
}

So here we have a mapping of strategies to some arbitrary appropriateness weight.

If we loop through our ComposerDatas, we have the data to determine the appropriateness of each strategy for that ComposerData.

We still need something to represent NumBars in here though.

Now InstrumentSchedule itself defines this like:

// Two bars long
[ Bb, Bb, Bb, Bb ],
[ Bb, Ab, F,  Bb ]

But when we’re creating the InstrumentSchedule, we need to pass in the number of bars to create.

{	
	BarsPerStrategy = {
		PedalPoint = 4,
		Chords     = 2,
		RespondToCall = 1
	}
}

So if we choose a pedalpoint, the Strategy knows how many bars of InstrumentSchedule to create.

{
	// What part does this play in the composition? 
 	FInstrumentData = BassInstrument;
	int OctaveMin   = 1;
	int OctaveMax   = 2;
	
	// What strategies are appropriate for this part? 
	StrategyWeights = {
		PedalPoint    = .8,
		Chords        = .4,
		RespondToCall = 1.2
	}
	
	// How many bars per part?
	BarsPerStrategy = {
		PedalPoint    = 4,
		Chords        = 2,
		RespondToCall = 1
	}
	
	// On which bar do the instruments start each schedule? 
	WhenPlaySchedules = {
		PedalPoint = 4, // start on bar 4
		Chords1    = 8, // "replaces" pedalpoint
		Chords2    = 10 // "replaces" Chords1
	}
}

If the composer is on bar 9, it can determine that this Data only has bars scheduled up to 12, and decide whether it needs to create more schedules.
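That check is just arithmetic over the pseudo-data above. A hedged helper (FScheduledStrategy and the ChosenStrategies field are stand-ins for the BarsPerStrategy/WhenPlaySchedules maps):

bool NeedsMoreSchedules(const FComposerData& Data, const int CurrentBar, const int LookaheadBars)
{
	int LastCoveredBar = 0;
	for (const FScheduledStrategy& Entry : Data.ChosenStrategies)
	{
		// e.g. Chords2 starts at bar 10 and plays 2 bars -> covered up to bar 12
		LastCoveredBar = FMath::Max(LastCoveredBar, Entry.StartAtBar + Entry.BarsToPlay);
	}
	return LastCoveredBar < CurrentBar + LookaheadBars;
}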

The “create a bass” strategy though would look something like:

 	FInstrumentData = BassInstrument;
	int OctaveMin   = 1;
	int OctaveMax   = 2;
	
	// Default strategy weighting
	StrategyWeights = {
		PedalPoint    = 0.8,
		Chords        = 0.4,
		RespondToCall = 0.0
	}
	
	BarsPerStrategy = {}
	WhenPlaySchedules = {}

The composer can easily determine that this bass needs new schedules, use the weights, and (0.8 out of a total weight of 1.2) have a ~66% chance of selecting PedalPoint.

 	FInstrumentData = BassInstrument;
	int OctaveMin   = 1;
	int OctaveMax   = 2;
	
	// Weights update on pedal point selection
	StrategyWeights = {
		PedalPoint    = 0.2, // Appropriateness reduced
		Chords        = 0.8, // appropriateness increased
		RespondToCall = 0.0  // Special case... needs external schedule
	}
	
	// By selecting PedalPoint, we define NumBars to create
	BarsPerStrategy = {
		PedalPoint = 4 
	}
	
	// On select pedal point, we also define when to start 
	WhenPlaySchedules = {
		PedalPoint = 2 
	}

There is a catch, caught in RespondToCall.

A big part of music is the interaction with the other musicians/parts/instruments within a composition.

In terms of a bass part, I can’t imagine a bass that doesn’t take input from, and provide input to, the drums. As a default case, these two should be locked in, synced up, and grooving in the pocket.

I’m going to take a break here I think, and let this angle cook in my subconscious.

It’s called Drum & Bass for a reason

Ok so specifically for the bass, it generally wants to follow the kick of the drum. It doesn’t have to, it can do whatever it damn pleases, but those raised in the western musical tradition are going to appreciate a drum and bass working together.

Now we can assume a kick drum playing on the 1 and the 3.

And we can assume the kick drum is getting its instructions from the Composer.

Right now it’s not: the StartDrums phase selects a predefined drum pattern at random.

What’s interesting here is the fact that there are predefined drum patterns.

The code has knowledge of a Basic 1-3 rock beat. Or samba, or jazz swing. These things are codified as InstrumentSchedules already.
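For example, that basic rock beat is already sitting in the code as something shaped like this (the sample names are made up; the real drum schedules will differ):

// One bar of a basic rock beat in 4/4: kick on 1 and 3, snare on 2 and 4
[ 1.0, { Sound = Kick_DrumKit,  Volume = 1.0 } ],
[ 2.0, { Sound = Snare_DrumKit, Volume = 1.0 } ],
[ 3.0, { Sound = Kick_DrumKit,  Volume = 1.0 } ],
[ 4.0, { Sound = Snare_DrumKit, Volume = 1.0 } ]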

And I’m saying that the bass needs access to a selected InstrumentSchedule.

 	FInstrumentData = BassInstrument;
	int OctaveMin   = 1;
	int OctaveMax   = 2;
	
	AllOtherInstrumentSchedule = {
		DrumKick*,
		DrumSnare*,
		LeadGuitar*
	}

That would be a naïve implementation.

I think what would be better would be RelevantInstrumentSchedules for… relevant… instrument… schedules.

The reasoning being the bassist doesn’t necessarily care about the Snare.

That being said, it might be better to also assign a weighting of how much it cares:

	RelevantInstrumentSchedules = {
		DrumKick = 0.8,
		DrumSnare = 0.2,
		LeadGuitar* = 0.5
	}

Ahhh but here’s the thing. It’s something I had forgotten to flesh out. There is a difference between which note you play and when you play it.

	PercussionInstrumentSchedules = {
		DrumKick = 0.8,
		DrumSnare = 0.2
	}
	
	MelodicInstrumentSchedules = {
		LeadGuitar = 8.0
	}

Now the lead guitar can give the bassist important percussive information. Remember this:

 	FInstrumentData = BassInstrument;
	int OctaveMin   = 1;
	int OctaveMax   = 2;
	
	Strategies = {
		RespondToCall = {
			InstrumentSchedules = {
				LeadGuitar = 1.0, 
			},
			Weight = 8.0,
			StartAtBar = 4, // In theory the ~beat after the lead guitar makes a "Call" 
			BarsToPlay = 1, 
		},
		PedalPoint = {
			InstrumentSchedules = {}, // Not listening for input from anyone 
			Weight = 2.0,
			StartAtBar = 4,
			BarsToPlay = 4
		}
	}

So here I’ve done a couple of things. I’ve collapsed the WhenPlaySchedules and BarsPerStrategy sections from the pseudo-data above into a single strategy object.

I’ve also added weighted InstrumentSchedules as a member of that object.

Now, Call and Response might actually run a bit differently. It might make more sense for this kind of strategy to take 2 instruments, Caller = guitar, Responder = Bass, and inject the appropriate instrument schedules.

But I think this makes sense in a general format.

Strategies of Pure Music

There is a slight issue.

InstrumentSchedules are essentially

[ 1.0, { Sound = Db_SynthThing_Octave1, Volume = 1.0 } ],
[ 3.0, { Sound = Bb_SynthThing_Octave1, Volume = 1.0 } ]

Where it’s playing Db and Bb on beats 1 and 3.

This is perfect for instruments, providing all the information they need to play the sounds at the correct time without having to derive any nonsense.

Not great if you want to create a Motif, and have the guitar and bass play the motif at the same time.

For something like “Create Motif”, the perfect data object would be something like

[ Db, Bb, F, Db ] // III, I, V, III in Bb minor

So how do I represent this data and feed it into the ComposerData object?

Well what is interesting about this data is the entire suite of logic it infers exists.

Something needs to create this data, something that knows about music theory in pure music theory terms.

And I already have large sections of that figured out - I have arrays of Notes built into scales of many different modes.

So creating this is actually fairly simple; hooking it into ComposerData is what’s needed.

Scale Scale = [ C, D, E, F, G, A, B ]
Scale[] Motifs = [
//  I  I  V  III
	[ C, C, G, E ], 
	[ C, B, A, G ],
	[ C, F, A, G ] 
]
Chords = [
//       I            I            V           III
	[ [ C, E, G ], [ C, E, G ], [ G, B, D ], [ E, G#, B ] ]
]

So things start off from the scale.

From the scale we can have motifs.

From motifs we can have chords.

Or have that reversed; it doesn’t matter what comes first, as long as what comes next considers what came before it.

A chord itself is a collection of notes to be played at the same time.

An InstrumentSchedule will flatten those chords out into three simultaneous sounds on beat 1, another three on beat 2, and so on.
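So a single C major chord on beat 1 flattens into three simultaneous entries (same shape as the schedule above; the instrument and octave names are illustrative):

[ 1.0, { Sound = C_SynthThing_Octave2, Volume = 1.0 } ],
[ 1.0, { Sound = E_SynthThing_Octave2, Volume = 1.0 } ],
[ 1.0, { Sound = G_SynthThing_Octave2, Volume = 1.0 } ]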

InstrumentSchedules have no concept of musicality; they have no knowledge of notes.

Nor should they: drums convert into InstrumentSchedules and they have no notes (unless I did some gangster shit and mapped A = Snare, Ab = Kick, B = HiHatClosed).

Actually, now that I think about it, that’s essentially what Ableton does for MIDI drums. But not really; that’s mapping a MIDI controller piano roll to a drum rack. There’s no reason to assume that internally it treats the kick as some arbitrary note.

 	FInstrumentData = BassInstrument;
	int OctaveMin   = 1;
	int OctaveMax   = 2;
	
	Scale = [ C, D, E, F, G, A, B ]
	
	Strategies = {
		RespondToCall = {
			InstrumentScheduleData = {
				{
					Instrument = LeadGuitar,
					AppropriatenessForStrat = 1.0,
					MusicalStructures = { [ C, C, G, E ] }
				}
			},
			AppropriatenessToChooseStrat = 8.0,
			StartAtBar = 4,
			BarsToPlay = 1,
		},
		DoubleRootOfChord = {
			InstrumentScheduleData = {
				{
					Instrument = ChordPiano,
					AppropriatenessForStrat = 1.0,
					MusicalStructures = {
						[
							[ C, E, G ],
							[ C, E, G ],
							[ G, B, D ],
							[ E, G#, B ]
						]
					}
				}
			},
			AppropriatenessToChooseStrat = 2.0,
			StartAtBar = 4,
			BarsToPlay = 4
		}
	}

So, we have some sort of MusicalStructure associated with an InstrumentSchedule, coupling the sounds an instrument needs with the musical data required by the composer into an InstrumentScheduleData object. A StrategyData object contains a set of ISDs along with the data for how this strategy would fit in the structure of the song.

It strikes me that this just defines the data set for choosing what to play next. The above object is missing what has already been chosen to play.

I’m wondering if this is simply adding the InstrumentSchedules to a set at the root of the data object. More so, I’m wondering if there’s metadata we’d need for this set of data. Like, would I want to record that the InstrumentSchedule [Db, Db, Db, Db] was created by the pedal point strategy?

It strikes me as probably definitely. We don’t necessarily need to store the entirety of a PossibleStrategy:

		ChosenStrategies = [
			{
				InstrumentSchedule = { [ C, C, C, C ] },
				StartAtBar = 4,
				BarsToPlay = 2,
				Strategy = PedalPoint
			},
			{
				InstrumentSchedule = { [ C, D, G, E ] },
				StartAtBar = 6,
				BarsToPlay = 4,
				Strategy = PlayMotif
			},
		]
		
		PossibleStrategies = [
			PlayMotif = {
				AppropriatenessToChooseStrat = 8.0,
				InstrumentInput = {
					{
						Instrument = LeadGuitar,
						AppropriatenessForStrat = 1.0,
						ChosenStrategies = {
							InstrumentSchedule = { [ C, D, G, E ] },
							StartAtBar = 6,
							BarsToPlay = 4,
						}
					}
				},
			},
		]

But we would need a separation of PossibleStrategies and ActualStrategies: