Current state of Linux singing synths?

Posted: Mon Oct 07, 2019 7:06 am
by rhydermike
Hiya all

I was looking into the current state of singing synthesizer software on Linux recently. I want to turn some of my songs into finished demos, but due to my living arrangements, I have to keep the noise levels down, and I can't easily record my singing.

The only thing I've had up and running so far is the native Linux version of SynthV (https://synthesizerv.com/en/). SynthV seems like a good piece of software, and I might buy it at the end of the year (80 USD). One limitation is that there are no English male voice packs for it (as far as I can see).

I was wondering if there are any currently maintained open source alternatives?

I keep seeing references to something called "A Neural Parametric Singing Synthesizer" (https://mtg.github.io/singing-synthesis-demos/). The demos of this sound amazing, but I keep going around in circles trying to find out if this is a runnable application or just a research project. (Demos of it in action - https://www.youtube.com/watch?v=HANeLG0l2GA)

There seem to be various other bits of software on Linux, but most of it doesn't seem to be currently maintained. I didn't want to invest too much time trying to build older projects that might never work, unless anyone else can tell me that they are viable.

Any heads-ups are appreciated, even if it means adapting something like CSound to do the job. I'd even consider running something under WINE if there are any intriguing applications that don't cost too much.

Thanks in advance for any tips.

Re: Current state of Linux singing synths?

Posted: Mon Oct 07, 2019 8:05 am
by stanlea
Maybe for the second one you can ask the devs, as their email address is provided on the site.

Re: Current state of Linux singing synths?

Posted: Mon Oct 07, 2019 8:39 am
by tavasti
rhydermike wrote:I keep seeing references to something called "A Neural Parametric Singing Synthesizer" (https://mtg.github.io/singing-synthesis-demos/). The demos of this sound amazing, but I keep going around in circles trying to find out if this is a runnable application or just a research project. (Demos of it in action - https://www.youtube.com/watch?v=HANeLG0l2GA)
Sounds good, but have you seen that software anywhere, or just papers and examples?

Re: Current state of Linux singing synths?

Posted: Mon Oct 07, 2019 10:20 am
by tramp
should be this one:
https://github.com/MTG/WGANSing

Re: Current state of Linux singing synths?

Posted: Mon Oct 07, 2019 11:46 am
by tavasti
tavasti wrote:
rhydermike wrote:I keep seeing references to something called "A Neural Parametric Singing Synthesizer" (https://mtg.github.io/singing-synthesis-demos/). The demos of this sound amazing, but I keep going around in circles trying to find out if this is a runnable application or just a research project. (Demos of it in action - https://www.youtube.com/watch?v=HANeLG0l2GA)
Sounds good, but have you seen that software anywhere, or just papers and examples?
The page says 'In the following examples only timbre is generated by the model. Pitch and phonetic timings are extracted from a recording (in most cases of a different singer).' So does that mean that someone actually sang those, and only the voice has been changed?

Re: Current state of Linux singing synths?

Posted: Wed Oct 09, 2019 2:13 am
by rhydermike
tavasti wrote:
rhydermike wrote:I keep seeing references to something called "A Neural Parametric Singing Synthesizer" (https://mtg.github.io/singing-synthesis-demos/). The demos of this sound amazing, but I keep going around in circles trying to find out if this is a runnable application or just a research project. (Demos of it in action - https://www.youtube.com/watch?v=HANeLG0l2GA)
Sounds good, but have you seen that software anywhere, or just papers and examples?
That's what I'm trying to get to the bottom of.

Re: Current state of Linux singing synths?

Posted: Wed Oct 09, 2019 2:25 am
by rhydermike
tramp wrote:should be this one:
https://github.com/MTG/WGANSing
I've had a go with this, and I have it installed, but I'm still struggling to get it working. I'll report back on the thread if I get anywhere with it.

Re: Current state of Linux singing synths?

Posted: Wed Oct 09, 2019 7:42 am
by tavasti
rhydermike wrote:
tramp wrote:should be this one:
https://github.com/MTG/WGANSing
I've had a go with this, and I have it installed, but I'm still struggling to get it working. I'll report back on the thread if I get anywhere with it.
For me, even the first step, 'pip install', does not work: utf8-interpy does not exist, and the vamp install dies with a compiler error.

Reading what main.py does, it looks like it wants an hdf5 file as input. That looks to be a generic file format, which might contain any data? https://www.hdfgroup.org/

Re: Current state of Linux singing synths?

Posted: Fri Oct 25, 2019 11:28 am
by pink
That thing got me curious, so I wasted^H^H^H^H^H^Hspent some time on this.

That WGANSing thing is not an implementation of the "A Neural Parametric Singing Synthesizer" paper, but a somewhat different approach to voice synthesis. It is described here: https://arxiv.org/pdf/1903.10729.pdf and here are some demos: https://pc2752.github.io/sing_synth_examples/

I also struggled to get the code from github to work. The requirements.txt file looks like junk to me, so I ended up just installing packages as needed, until I got the script running. I also failed to get 'tensorflow-gpu' working, but it turns out that the non-gpu version works just as well (presumably just slower).

I eventually succeeded in getting some wav file out of it, but the result does not replicate the quality from that demo website at all:
https://drive.google.com/file/d/1rnWWJH ... sp=sharing
No idea what I did wrong...

I also found this one: https://github.com/seaniezhao/torch_npss
Doesn't seem to be the implementation by the authors of the paper, but just some random guy creating his own implementation based on the paper.
It also didn't work with the requirements.txt file from the repo. It doesn't specify a version for tensorflow, so the latest 2.0 got installed, but it only works with 1.x.
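
A minimal sketch of the pinning problem described above: it scans requirements-style lines and flags any package listed without a version specifier. The sample text and the function name find_unpinned are made up for illustration and are not taken from either repository. A constraint such as tensorflow<2 in the requirements file is one way to keep pip on the 1.x series.

```python
import re

# Version specifier operators used in requirements files (PEP 440).
SPECIFIER = re.compile(r"(===|==|~=|!=|<=|>=|<|>)")


def find_unpinned(requirements_text):
    """Return the names of packages listed without any version specifier."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        if not SPECIFIER.search(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements file, for illustration only.
sample = """
numpy==1.16.4
tensorflow
librosa>=0.6
matplotlib  # plotting
"""

print(find_unpinned(sample))
```

Anything this reports will be resolved to whatever version is newest at install time, which is how a 2.0 release can end up installed for code written against 1.x.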

And then I actually managed to get some proper sounding audio out of it.

But both implementations are just in a "proof-of-concept" state and can only be used to re-synthesize the song from the training data. To do anything more meaningful with it, I would expect to be able to feed arbitrary sequences of (pitch, phoneme) data into it, and let the model "sing" it. I don't understand the tensorflow API enough (i.e. not at all) to be able to do that.
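
To make the "(pitch, phoneme) sequence" idea concrete, here is a rough sketch of what such an input could look like before it reaches a model. Everything in it is an assumption for illustration: the Note class, the phoneme labels, and the 5 ms frame length are hypothetical and do not come from WGANSing, torch_npss, or the paper. It only covers the input side and does not touch the tensorflow API.

```python
from dataclasses import dataclass


@dataclass
class Note:
    midi_pitch: int     # MIDI note number, e.g. 69 = A4
    phoneme: str        # phoneme label; the label set here is an assumption
    duration_s: float   # note length in seconds


def midi_to_hz(midi_pitch):
    """Equal-temperament conversion with A4 (MIDI 69) tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


def to_frames(notes, frame_s=0.005):
    """Expand notes into one (f0_hz, phoneme) pair per fixed-length frame.

    The 5 ms default frame length is an assumed value, not a setting
    taken from either repository.
    """
    frames = []
    for note in notes:
        n_frames = round(note.duration_s / frame_s)
        frames.extend([(midi_to_hz(note.midi_pitch), note.phoneme)] * n_frames)
    return frames


# A made-up two-note "la, a" phrase.
melody = [Note(69, "l", 0.05), Note(69, "a", 0.45), Note(71, "a", 0.5)]
frames = to_frames(melody)
print(len(frames), frames[0], frames[-1])
```

A real front end would also need the pitch contour (vibrato, transitions) and phoneme timings that the demos reportedly take from a recording, which is the part neither implementation seems to generate on its own.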

Re: Current state of Linux singing synths?

Posted: Sat Oct 26, 2019 7:42 am
by stanlea
You are very brave!

Re: Current state of Linux singing synths?

Posted: Sat Oct 26, 2019 2:27 pm
by tavasti
Thanks for digging up and reporting!

Re: Current state of Linux singing synths?

Posted: Sun Oct 27, 2019 2:12 pm
by rhydermike
pink wrote:
I also struggled to get the code from github to work. The requirements.txt file looks like junk to me, so I ended up just installing packages as needed, until I got the script running. I also failed to get 'tensorflow-gpu' working, but it turns out that the non-gpu version works just as well (presumably just slower).
I had a similar experience with it, but I didn't get as far as getting any music out of it, because the tensorflow-gpu module started to make the machine freeze. I thought it was something to do with my machine, so I gave up at that point.
pink wrote:
I eventually succeeded in getting some wav file out of it, but the result does not replicate the quality from that demo website at all:
https://drive.google.com/file/d/1rnWWJH ... sp=sharing
No idea what I did wrong...
It does have an interesting, natural sound to it, even though there is something wrong with it.
pink wrote:
But both implementations are just in a "proof-of-concept" state and can only be used to re-synthesize the song from the training data. To do anything more meaningful with it, I would expect to be able to feed arbitrary sequences of (pitch, phoneme) data into it, and let the model "sing" it. I don't understand the tensorflow API enough (i.e. not at all) to be able to do that.
So, it looks like SynthV is the most complete singing synth (GUI editor and actual synth) with a native Linux binary at the moment, then.