squirrelon
7 hours ago
Feel free to comment with any suggestions for improvement or other points you have.
Rochus
5 hours ago
Cool. Generation of symbolic music using transformers is indeed a pretty neglected field. I assume you have to encode more "musical knowledge" into the embeddings than when you "just" compress waveforms. Can you provide information on your embeddings?
squirrelon
3 hours ago
Hi, it is actually not using transformers; those would be too slow. It uses a combination of CNNs and linear layers. Correct, it uses embeddings, not waveforms or spectrograms. The inputs are MIDI files, some of which I made myself in FL Studio. The model creates a "latent representation" from each MIDI file, and I can then sample randomly from this latent space to get an original piece. In my opinion, the most important part is the preprocessing.
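To make the "sample from the latent space, then decode" idea concrete, here is a rough sketch in plain Python. It is not the actual model: the decoder is an untrained random linear map standing in for the CNN and linear layers, and the latent size, the piano-roll shape, and the standard normal sampling are all assumptions made for illustration.

```python
import random

# All sizes below are hypothetical, chosen only to keep the example small.
LATENT_DIM = 8    # length of the latent vector
N_STEPS = 16      # time steps in the generated piano roll
N_PITCHES = 12    # pitches per time step (one octave)


def make_toy_decoder(seed=0):
    """Build an untrained linear 'decoder': one weight row per piano-roll cell.

    A trained model would learn these weights from MIDI data instead.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
            for _ in range(N_STEPS * N_PITCHES)]


def sample_latent(rng):
    """Draw a random point from the latent space (assumed standard normal)."""
    return [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]


def decode(weights, z, threshold=0.0):
    """Map a latent vector to a binary piano roll (1 = note on, 0 = note off)."""
    flat = [1 if sum(w * x for w, x in zip(row, z)) > threshold else 0
            for row in weights]
    # Reshape the flat list into N_STEPS rows of N_PITCHES cells.
    return [flat[t * N_PITCHES:(t + 1) * N_PITCHES] for t in range(N_STEPS)]


rng = random.Random(42)
weights = make_toy_decoder()
z = sample_latent(rng)
roll = decode(weights, z)

# Print the roll as text: '#' for a sounding note, '.' for silence.
for step in roll:
    print("".join("#" if cell else "." for cell in step))
```

Each new latent sample yields a different piano roll. With an untrained decoder the output is noise; the point is only the shape of the pipeline from a random latent vector to a note grid.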
Rochus
2 hours ago
That's fascinating. This sounds like a variational autoencoder. From my humble point of view (as a trained musician), the embeddings are a largely unexplored field not really supported by existing theory, yet at the same time they are decisive for the result. Have you found a good solution for this?