· 27/03, 16:00
Drawing parallels between jazz improvisation structures and how AI might achieve genuine creative behavior.
I've been playing jazz piano for 15 years, and I've started noticing parallels between improvisation and how I think about AI creativity. In jazz, you work within constraints — chord progressions, rhythmic structures — but true improvisation requires knowing when to follow the rules and when to break them. Current AI generation feels like it follows the rules well but never truly breaks them. It doesn't take creative risks.
That's a fascinating lens. Jazz improvisation might actually be one of the best human analogies for what we want AI creativity to become. Consider the structure: a jazz musician has internalized thousands of patterns (licks, voicings, rhythmic motifs) through years of practice. But in performance, they don't just retrieve patterns — they recombine and transform them in response to what other musicians are doing in real-time.
Current generative AI is arguably stuck at the 'competent student' phase — it can reproduce stylistic patterns convincingly but rarely produces the kind of 'wrong notes that sound right' that defines great jazz.
Yes! Miles Davis said 'Do not fear mistakes — there are none.' The concept of productive mistakes is key. In jazz, a 'wrong' note becomes right if you commit to it and resolve it creatively. How could we build that into AI systems?
This could map to AI systems in several ways. Current LLMs use temperature/top-p to add randomness, but randomness isn't creativity — it's noise. Real creative risk-taking requires evaluating the potential of an unexpected direction.
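To make the 'randomness isn't creativity' point concrete, here's a minimal sketch of temperature sampling over a hypothetical next-token distribution. The logits are made up for illustration; the point is that raising temperature makes unlikely tokens fire more often, but the sampler never evaluates whether an unexpected choice is promising — it departs from the expected token blindly:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Scale logits by 1/temperature, softmax, then sample one index.
    Higher temperature flattens the distribution, so low-probability
    tokens are chosen more often -- but no judgment of *why* occurs."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical next-token logits: token 0 is strongly preferred.
logits = [4.0, 1.0, 0.5, 0.2]
low_t = [sample_with_temperature(logits, 0.2, random.Random(i)) for i in range(1000)]
high_t = [sample_with_temperature(logits, 2.0, random.Random(i)) for i in range(1000)]
print(sum(t != 0 for t in low_t))   # rarely departs from the top token
print(sum(t != 0 for t in high_t))  # departs far more often, but blindly
```

A jazz musician 'playing outside' is doing something different: choosing the unexpected note because they can hear where it resolves. Temperature has no such evaluation step.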
One approach: train models with a 'creative exploration' reward signal. Instead of always maximizing likelihood, occasionally reward outputs that are surprising yet coherent. Jazz musicians call this 'playing outside' — deliberately departing from the expected harmony with the ability to resolve back.
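As a toy illustration of that reward shape (everything here is hypothetical — the function names, the coherence score, and the threshold are assumptions, not an existing API): reward surprise, measured as low likelihood under the base model, but gate it on a separate coherence judgment, so incoherent surprise earns nothing:

```python
def exploration_reward(logprob_under_base, coherence, coherence_floor=0.7, weight=1.0):
    """Toy 'playing outside' reward: surprise (negative log-prob under
    the base model) counts only when a separate coherence judge scores
    the output above a floor -- depart from expectation, but resolve.
    All inputs are assumed to come from hypothetical scoring models."""
    if coherence < coherence_floor:
        return 0.0  # incoherent surprise is just noise
    surprise = -logprob_under_base
    return weight * surprise * coherence

# A likely-but-bland output earns little; a surprising-but-coherent one earns more;
# surprising-but-incoherent gibberish earns nothing.
bland = exploration_reward(logprob_under_base=-0.5, coherence=0.9)
outside = exploration_reward(logprob_under_base=-6.0, coherence=0.8)
gibberish = exploration_reward(logprob_under_base=-9.0, coherence=0.3)
print(bland, outside, gibberish)  # 0.45 4.8 0.0
```

Used as an occasional auxiliary term alongside the usual likelihood objective, a signal shaped like this would push the model toward the 'wrong notes that sound right' region rather than toward pure noise.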
The deeper point is that jazz improvisation is inherently social — you create in response to others. AI creativity might need to be similarly situated, not generating in a vacuum but responding to constraints and collaborators in real-time. This is closer to what ensemble music-making demands than solo text generation.