"AI" or Machine Learning for humanized tempo mapping?

DSP, Plugin and Host development discussion.

Post

Hi,
I love writing music with MIDI on a grid, yet I miss the ebb and flow of a real-time performance, and I have never been successful at hand-editing timing nuances after writing the MIDI.

I have found the quasi-randomized humanization commands sound worse than grid-perfect playback, so I rely on velocity choices to convey a modicum of expressiveness.

Lately, I've been wondering when someone will apply large-model machine-learning techniques to the task of re-timing MIDI files so that written music conveys emotion convincingly.

Is it coming anytime soon? :-)

Thank you.

Post

Well, if you want a neural net to "learn" how to do this, you have to feed it a large enough training set. Each training example needs to be two performances: first, some MIDI locked to the grid (like what you might eventually input into it), and second, a human-performed version of the same material. Now rinse and repeat many, many times until you have enough pairs to get a valid outcome.

The problem is that gathering these data sets is complex: you need humans to play your scores, all of which takes time and effort, and what you end up with is probably not a mainstream product that will pay you back for all that effort.
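To make that concrete, here is a rough sketch in Python (PyTorch) of how those pairs might feed a model. The per-note features, the offset-in-seconds target, and the tiny regressor are all assumptions for illustration, not a recipe:

Code: Select all

import torch
from torch import nn
from torch.utils.data import Dataset

class TimingPairDataset(Dataset):
    """One example = (features of a grid-locked note, the timing
    offset the human player applied to that note)."""
    def __init__(self, pairs):
        self.pairs = pairs  # list of (features, offset) tensor pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        return self.pairs[idx]

# Tiny per-note regressor: [pitch, beat position, duration] -> offset.
model = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(features, offsets):
    opt.zero_grad()
    pred = model(features)         # predicted timing deviation
    loss = loss_fn(pred, offsets)  # distance from the human timing
    loss.backward()
    opt.step()
    return loss.item()

A real system would model context across the whole sequence (an RNN or transformer rather than a per-note regressor), but the data-collection problem stays the same.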

Post

Why do you need AI for this? You can already do audio-to-MIDI, and there are plenty of functions in different DAWs, Max for Live plug-ins, and VST/AU MIDI sequencers and generators that will apply all kinds of timing manipulation to audio and MIDI. Just find one that has the features you want and demo it. Try a few things.

Some people are already doing this for themselves, just not sharing it. Various people who use Max/MSP have trained it to write beats in their style, using Magenta and also by building their own neural networks. It's sort of "dumb AI", though, because they're still telling it what to do and programming it to behave a certain way.

So, it's already possible and being done.

Edit: things like Ableton's Groove Pool are also worth mentioning. You can extract and save grooves and apply them to other material.
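For illustration, here is the groove-extract idea in plain Python. The 16-slot grid and beat units are assumptions for the sketch, not how Ableton implements it:

Code: Select all

STEPS_PER_BAR = 16  # groove resolution: one slot per 16th note

def extract_groove(performed_times, step=0.25):
    """Average timing offset (in beats) for each 16th-note slot."""
    slots = {}
    for t in performed_times:
        nearest = round(t / step) * step          # closest grid line
        slot = int(round(nearest / step)) % STEPS_PER_BAR
        slots.setdefault(slot, []).append(t - nearest)
    return {s: sum(v) / len(v) for s, v in slots.items()}

def apply_groove(quantized_times, groove, step=0.25):
    """Shift each grid-locked note by its slot's average offset."""
    return [t + groove.get(int(round(t / step)) % STEPS_PER_BAR, 0.0)
            for t in quantized_times]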

Post

Lind0n wrote: Tue Jan 30, 2024 2:26 pm Well, if you want a neural net to "learn" how to do this...
Hi,
Thank you for contributing these insights. I had not considered the idea of side-by-side comparison. I agree that the cost would seem prohibitive relative to the ROI, but so do many of the AI projects I see advertised by the venture-IPO clubs. Perhaps an academic will find the subject interesting.

I had imagined an analysis of human-performed songs in which associations were identified within the music: qualities like tension and resolution, melodic movement up and down the register, and patterns of note-length usage, each linked to changes in the timing of the performance.

Maybe I am thinking of an algorithm rather than AI. I have been producing music for about 45 years, and while I am more or less pleased with my acoustic, real-time human performances, I have never been good at quantifying what happens with the muse when you play in real time, so I have never figured out how to program emotive time variances effectively on a workstation timeline.
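Something along these lines is what I imagine, with every rule and magnitude invented purely for the example:

Code: Select all

def expressive_offset(pitch, prev_pitch, beat_in_bar, duration):
    """Nudge one note's timing (in beats) from simple score features."""
    offset = 0.0
    if prev_pitch is not None and abs(pitch - prev_pitch) >= 7:
        offset += 0.02   # lean back into a large melodic leap
    if beat_in_bar == 0:
        offset -= 0.01   # press slightly ahead on the downbeat
    if duration >= 1.0:
        offset += 0.015  # relax into long, resolving notes
    return offset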

I enjoy the idea that computing can augment human endeavor, so I am hoping that someday a composition tool becomes available that improves the presentation of a piece by responding to its content rather than simply injecting an arbitrary groove.

I am just throwing out an idea with the hope that some part of it might be met with common interest.

Thank you!

Post

Lind0n wrote: Tue Jan 30, 2024 2:26 pm Well, if you want a neural net to "learn" how to do this, you have to feed it a large enough training set. Each training example needs to be two performances: first, some MIDI locked to the grid (like what you might eventually input into it), and second, a human-performed version of the same material. Now rinse and repeat many, many times until you have enough pairs to get a valid outcome.

The problem is that gathering these data sets is complex: you need humans to play your scores, all of which takes time and effort, and what you end up with is probably not a mainstream product that will pay you back for all that effort.
You could record MIDI from human performances and then auto-quantize it to get the two data sets.

Another option would be to extract MIDI from existing recordings to build a large training set.
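As a sketch of the first idea, something like this with the pretty_midi library would turn one performance into an (input, target) pair. The fixed-tempo 16th-note grid is an assumption; real files would need tempo-aware snapping:

Code: Select all

import copy
import pretty_midi

def make_pair(path, bpm=120.0, division=4):
    """Return (quantized, performed) note lists from one MIDI file."""
    step = 60.0 / bpm / division              # grid spacing in seconds
    pm = pretty_midi.PrettyMIDI(path)
    performed = [n for inst in pm.instruments for n in inst.notes]
    quantized = []
    for n in performed:
        q = copy.copy(n)
        q.start = round(n.start / step) * step  # snap onset to grid
        q.end = q.start + (n.end - n.start)     # preserve duration
        quantized.append(q)
    return quantized, performed  # model input / training target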

One complication is that different instruments and styles may tend towards different timing variations.

Post

D2sX9ek8w3 wrote: Mon Jan 29, 2024 11:54 pm I have found the quasi-randomized humanization commands sound worse than grid-perfect playback, so I rely on velocity choices to convey a modicum of expressiveness.
Yes, well, I believe the "humanization" aspect is non-random. While the musician may not be conscious of it, I think it is entirely deliberate.
D2sX9ek8w3 wrote: Mon Jan 29, 2024 11:54 pm Lately, I've been wondering when someone will apply large-model machine-learning techniques to the task of re-timing MIDI files so that written music conveys emotion convincingly.

Is it coming anytime soon? :-)
I don't know. I wonder whether the AI would need to know what thoughts you're having inside your head in order to add the desired expressiveness. I really enjoy using a MIDI controller with a low-latency DAW setup, though.

Post

Within the context of MIDI, I use it because I cannot play, or lack proficiency in, the instrument whose voice I hope to emulate, so I have found it better to think in terms of "composing" to a quantized grid than to make a recording of my inability to play well.
downtempo wrote: Sun Mar 31, 2024 12:10 am While the musician may not be conscious of it, I think it is entirely deliberate.
I wholeheartedly agree with this, but I do not think the acts of deliberation are necessarily unique. I suspect there are patterns, related to movements in melody, scale structure, and so on, that are shared throughout a culture, which might be discernible to a large-model analysis and made available to composers who write to a grid.

I feel that many of the deliberate choices performing artists make happen in the subconscious. I have struggled, unsuccessfully, to replicate them while studying a score as a conscious exercise.

Thank you.
