commit 7ca4031 — RLAIF training overhaul: corpus reward, theory sweep models, audio samples
A music generator trained against psychoacoustic physics rewards only — no MIDI, no audio corpus, no learned vocoder. Every audio file below is freshly rendered from the model weights in this commit by an additive sine-bank synthesizer.
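To give a feel for what an additive sine-bank renderer does, here is a minimal sketch. It is not the synthesizer in this commit: the name `render_tone`, the partial count, the amplitude rolloff, and the sample rate are all assumptions chosen for illustration.

```python
import math

def render_tone(f0, duration_s=0.5, sample_rate=22050, n_partials=6, rolloff=0.88):
    """Render one note as a sum of harmonic sine partials (hypothetical helper).

    Partial k sits at k * f0 with amplitude rolloff ** (k - 1); the sum is
    divided by the total amplitude so every sample stays within [-1, 1].
    """
    n_samples = int(duration_s * sample_rate)
    amps = [rolloff ** (k - 1) for k in range(1, n_partials + 1)]
    norm = sum(amps)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(a * math.sin(2 * math.pi * f0 * k * t)
                    for k, a in enumerate(amps, start=1))
        samples.append(value / norm)
    return samples

tone = render_tone(440.0)
```

Swapping the harmonic series for odd-only multiples (1, 3, 5, ...) would give the kind of odd-partial timbre mentioned further down, again only as an assumption about how such a renderer could be parameterized.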
Eight sample melodies from the Phase-3 generator after a REINFORCE pass that uses a feature-based mock judge as the reward. The same loop accepts Qwen2.5-Omni-7B as the judge instead; see scripts/judge_with_ollama.py and src/train/rlaif_train.py --judge qwen.
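The REINFORCE-with-a-judge loop can be illustrated with a toy version. Everything here is hypothetical and far simpler than the training code in this commit: the policy is a single softmax over 12 pitches, and `mock_judge` is an invented feature-based scorer (share of major-pentatonic notes) standing in for whatever features the real mock judge uses.

```python
import math
import random

N_PITCHES = 12                 # assumed action space: one octave of semitones
PENTATONIC = {0, 2, 4, 7, 9}   # feature used by the hypothetical judge

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def mock_judge(melody):
    """Hypothetical feature-based reward in [0, 1]."""
    return sum(p in PENTATONIC for p in melody) / len(melody)

def reinforce(steps=500, length=8, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * N_PITCHES
    baseline = 0.0             # running mean reward, reduces gradient variance
    for _ in range(steps):
        probs = softmax(logits)
        melody = rng.choices(range(N_PITCHES), weights=probs, k=length)
        reward = mock_judge(melody)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(p) w.r.t. the logits is onehot(p) - probs.
        for p in melody:
            for k in range(N_PITCHES):
                grad = (1.0 if k == p else 0.0) - probs[k]
                logits[k] += lr * advantage * grad / length
    return softmax(logits)

final_probs = reinforce()
pentatonic_mass = sum(final_probs[p] for p in PENTATONIC)
```

Replacing `mock_judge` with a call to a model-based judge leaves the update rule unchanged, which is the property the caption above relies on.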
Multi-track arrangement of trained generators: Phase-2 progressions on a pad, plucked bass on the chord roots, Phase-3 melodies quantized to a major pentatonic on the lead, and a 4/4 backbeat. Closer to listenable music than the per-phase demos but still robotic — work in progress.
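Quantizing a melody to a major pentatonic can be sketched as a nearest-note snap. The function name, the MIDI-style pitch numbering, and the default root are assumptions for illustration; the arrangement script may do this differently.

```python
PENTATONIC_DEGREES = (0, 2, 4, 7, 9)  # major pentatonic, in semitones above the root

def quantize_to_pentatonic(pitch, root=60):
    """Snap a (possibly fractional) MIDI-style pitch to the nearest scale note."""
    candidates = [root + 12 * octave + degree
                  for octave in range(-5, 6)
                  for degree in PENTATONIC_DEGREES]
    return min(candidates, key=lambda c: abs(c - pitch))

lead = [quantize_to_pentatonic(p) for p in (61.2, 65.0, 58.4)]
```

With root 60 the three example pitches land on 62, 64, and 57 respectively.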
Two-frequency generator + Sethares dissonance reward. Consonant intervals (major sixth / octave region) emerge with no music data.
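For reference, a Sethares-style dissonance score for a two-note interval can be sketched as below. The curve constants follow Sethares' published parameterization of the Plomp-Levelt curve; the six-partial harmonic timbre, the 0.88 rolloff, and the `min` amplitude weighting are assumptions, and the reward in this commit may differ in all three.

```python
import math

D_STAR = 0.24            # position of maximum roughness on the normalized curve
S1, S2 = 0.0207, 18.96   # critical-bandwidth scaling
A, B = 3.5, 5.75         # decay rates of the two exponentials

def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Roughness contributed by two sine partials."""
    f_lo, f_hi = min(f1, f2), max(f1, f2)
    s = D_STAR / (S1 * f_lo + S2)
    x = s * (f_hi - f_lo)
    return min(a1, a2) * (math.exp(-A * x) - math.exp(-B * x))

def harmonic_partials(f0, n_partials=6, rolloff=0.88):
    """(frequency, amplitude) pairs for an assumed harmonic timbre."""
    return [(f0 * k, rolloff ** (k - 1)) for k in range(1, n_partials + 1)]

def interval_dissonance(f0, ratio, n_partials=6):
    """Total roughness over all partial pairs of the tones f0 and f0 * ratio."""
    partials = (harmonic_partials(f0, n_partials)
                + harmonic_partials(f0 * ratio, n_partials))
    total = 0.0
    for i in range(len(partials)):
        for j in range(i + 1, len(partials)):
            (fa, aa), (fb, ab) = partials[i], partials[j]
            total += pair_dissonance(fa, fb, aa, ab)
    return total

octave = interval_dissonance(261.63, 2.0)
minor_second = interval_dissonance(261.63, 16 / 15)
```

Under these assumptions the octave scores lower than the minor second, which is the kind of gradient a negative-dissonance reward would push the generator along.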
3-voice generator + pairwise Sethares + voice spread. Discovers sus4 (6:8:9), major (4:5:6), augmented, and diminished chords; prefers the upper register.
Adds Tymoczko-style voice-leading cost. Mixed canonical triads with smooth voice movement.
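A voice-leading cost in this spirit can be sketched as the smallest total displacement between two chords over all voice assignments. This is a simplification and an assumption about the reward: it works on absolute pitches in semitones and ignores the octave equivalence that Tymoczko's formulation uses.

```python
import itertools

def voice_leading_cost(chord_a, chord_b):
    """Minimum summed semitone motion between two equal-size chords."""
    best = float("inf")
    for perm in itertools.permutations(chord_b):
        cost = sum(abs(a - b) for a, b in zip(chord_a, perm))
        best = min(best, cost)
    return best

# C major (C4, E4, G4) to F major in second inversion (C4, F4, A4).
cost = voice_leading_cost((60, 64, 67), (60, 65, 69))
print(cost)  # 3
```

The brute-force search over permutations is fine for triads (six assignments) but grows factorially with the number of voices.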
Sequential Sethares + Terhardt virtual-pitch salience + pitch-class diversity. Melodic gestures using 4–5 pitch classes (PCs) emerge; the closest Western-scale match is the blues scale.
Phase-coherence-based entrainment reward (linear approximation to Large–Kolen 1994). Discovered tempo peaks at ~120 BPM, inside Fraisse's preferred-tempo window; the statistics below report a median best period of 0.553 s (108.5 BPM).
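One common way to measure phase coherence, shown here as a sketch, is the mean resultant length of the onset phases at a candidate beat period. Whether the reward in this commit uses exactly this statistic is an assumption, and the function names are illustrative.

```python
import cmath
import math

def phase_coherence(onsets, period):
    """Mean resultant length of onset phases: 1.0 = perfectly on the beat grid."""
    vectors = [cmath.exp(2j * math.pi * t / period) for t in onsets]
    return abs(sum(vectors) / len(vectors))

def best_period(onsets, candidates):
    """Candidate period (seconds) with the highest coherence."""
    return max(candidates, key=lambda p: phase_coherence(onsets, p))

def period_to_bpm(period_s):
    return 60.0 / period_s

onsets = [0.5 * k for k in range(8)]            # a steady pulse every 0.5 s
period = best_period(onsets, [0.4, 0.5, 0.6])
print(period, period_to_bpm(period))            # 0.5 120.0
print(round(period_to_bpm(0.553), 1))           # 108.5
```

Integer fractions of the true period (0.25 s here) score equally well, so a practical version needs a bounded candidate range or a tie-breaking rule.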
Phase-3 melody pitches placed at Phase-4 rhythm onsets — no joint training, just synthesis composition.
A single MLP emits (pitch, IOI) pairs, where IOI is the inter-onset interval. Reward = melody + rhythm, jointly optimized. Tonal salience reaches 0.73 while phase coherence holds at 0.69.
Banded per-voice generator + horizontal/vertical Sethares + voice-crossing penalty. Zero crossings, P5–octave vertical intervals.
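A voice-crossing count can be sketched as follows, assuming the voices are listed from lowest to highest and only adjacent voices are compared; the penalty in this commit may be defined differently.

```python
def count_voice_crossings(voices):
    """Count time steps where a nominally lower voice sounds above its upper neighbour.

    voices: equal-length pitch sequences, ordered from lowest to highest voice.
    """
    crossings = 0
    for lower, upper in zip(voices, voices[1:]):
        crossings += sum(1 for lo, hi in zip(lower, upper) if lo > hi)
    return crossings

bass = [48, 50, 52, 53]
lead = [60, 59, 51, 65]
print(count_voice_crossings([bass, lead]))  # 1 (third step: 52 above 51)
```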
Same architecture, n_voices=3. Best-checkpoint reward +5.33, stratified voice lines with zero crossings.
n_voices=4. Four voices give six vertical pairs to satisfy, which is the limit of banded-MLP + REINFORCE at this training budget.
Same triad generator, partials=odd. Discovered chords cluster on BP-style ratios (≈ 5:7:9). Rendered with odd-only-harmonic synthesis so you hear the matching timbre.
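Labelling a discovered ratio with the nearest Bohlen-Pierce-style interval can be sketched like this. The candidate list reuses the labels that appear in the statistics; the 40-cent tolerance and the function name are assumptions, not values taken from the analysis script.

```python
import math

BP_RATIOS = {"25:21": 25 / 21, "9:7": 9 / 7, "7:5": 7 / 5, "3:2": 3 / 2}

def cents_between(a, b):
    """Absolute distance between two frequency ratios in cents."""
    return abs(1200.0 * math.log2(a / b))

def label_ratio(ratio, tolerance_cents=40.0):
    """Name of the nearest candidate ratio, or 'other' if none is close enough."""
    name, target = min(BP_RATIOS.items(),
                       key=lambda item: cents_between(ratio, item[1]))
    if cents_between(ratio, target) <= tolerance_cents:
        return name
    return f"other (r={ratio:.3f})"

print(label_ratio(1.425))  # 7:5
print(label_ratio(1.339))  # other (r=1.339)
```

A 5:7:9 triad stacks 7:5 on the bottom and 9:7 on top, so labelling the two adjacent intervals separately is one way to recognise it.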
Full statistics across all phases:
============================================================
Phase 1 — intervals (harmonic timbre)
============================================================
median ratio: 3.049
top labels:
  7 minor_seventh
  7 octave
  7 major_seventh
  4 minor_sixth
  3 non-musical (ratio=2.919)
============================================================
Phase 2 — triads (harmonic timbre)
============================================================
N samples: 512
mean dissonance: 0.195
pct samples with spread penalty > 0.01: 2.0%
top triad labels:
  40 sus4_6_8_9
  28 major_4_5_6
  18 augmented
  10 diminished
  2 non-musical (r=[1.461,1.814])
  1 non-musical (r=[1.398,1.625])
  1 non-musical (r=[1.391,1.790])
  1 non-musical (r=[1.395,1.771])
============================================================
Phase 3 — melodies
============================================================
N samples: 512
mean tonal salience: 0.646
PC count distribution: {5: 186, 4: 177, 3: 80, 6: 52, 2: 9, 7: 5, 8: 3}
closest Western scale matches:
  115 blues root=3
  64 blues root=9
  50 blues root=6
  32 blues root=4
  26 major root=4
============================================================
Phase 4 — rhythms
============================================================
N samples: 256
mean phase coherence: 0.828
median best period: 0.553s (108.5 BPM)
IQR period: 0.473–0.567s
============================================================
Phase 7 — counterpoint
============================================================
N samples: 256
mean vertical dissonance: 0.241
mean voice crossings (out of 8): 0.00
mean shared tonal salience: 0.419
vertical-interval percentiles: 25%=22.0st, 50%=26.8st, 75%=33.4st
============================================================
Phase 8 — intervals (ODD-partial timbre)
============================================================
median ratio: 1.425
top BP-style labels:
  214 7:5
  177 3:2
  72 9:7
  16 25:21
  5 other (r=1.339)
============================================================
Phase 8b — triads (ODD-partial timbre)
============================================================
N samples: 512
mean dissonance (under odd partials): 0.092
mean dissonance (under harmonic timbre, for comparison): 0.194
median r1: 1.389, median r2: 1.683
============================================================
Phase 8c — intervals (INHARMONIC partials, negative control)
============================================================
median ratio: 1.225
top labels (under Western tuning):
  101 minor_third
  84 major_third
  63 perfect_fourth
  58 major_second
  26 tritone
============================================================
Phase 11 — autoregressive melody (with motif autocorrelation)
============================================================
N samples: 256, length per melody: 16
mean tonal salience: 0.536
PC count distribution: {8: 70, 9: 57, 7: 50, 6: 24, 10: 23, 5: 17, 11: 12, 12: 1, 13: 1, 4: 1}
mean motif autocorrelation: 0.342
closest Western scale matches:
  103 chromatic root=0
  18 harmonic minor root=2
  10 major root=2
============================================================
Phase 12 — cadence-aware chord progressions
============================================================
N samples: 256, chord positions: 4
mean cadence_arc (middle − endpoint dissonance): -0.135
mean dissonance per position:
  pos 0: 0.353
  pos 1: 0.168
  pos 2: 0.162
  pos 3: 0.247
============================================================
Phase 13 — 3-voice counterpoint
============================================================
N samples: 256
mean vertical dissonance: 0.824
mean voice crossings: 0.02
mean shared tonal salience: 0.424
============================================================
Phase 13 — 4-voice counterpoint
============================================================
N samples: 256
mean vertical dissonance: 1.998
mean voice crossings: 0.65
mean shared tonal salience: 0.416
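The Phase 12 cadence_arc figure above can be cross-checked against the per-position dissonances. The sketch assumes cadence_arc is the mean of the middle positions minus the mean of the two endpoints, which is one reading of "middle − endpoint dissonance"; the definition in the training code may differ.

```python
# Mean dissonance per chord position, copied from the Phase 12 statistics.
position_dissonance = [0.353, 0.168, 0.162, 0.247]

def cadence_arc(dissonance):
    """Mean of the middle positions minus mean of the endpoints (assumed definition)."""
    middle = dissonance[1:-1]
    endpoints = [dissonance[0], dissonance[-1]]
    return sum(middle) / len(middle) - sum(endpoints) / len(endpoints)

print(round(cadence_arc(position_dissonance), 3))  # -0.135
```

Under that reading the value matches the reported -0.135. The negative sign means the middle chords are on average less dissonant than the first and last chords.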
built by .github/workflows/audio-preview.yml