RFC on architecture for MIDI piano practice tool for blind/low vision/elderly users.

danjcla
Posts: 2
Joined: Sat Dec 05, 2020 7:46 pm

RFC on architecture for MIDI piano practice tool for blind/low vision/elderly users.

Post by danjcla »

I'm working on a recording solution for my technophobic father for X-Mas, and just wanted to see if this seems like a reasonable high-level architecture to more experienced people:

(a) midish or arecordmidi (which would be better? Or something else?) continuously records the MIDI output of a digital piano to a compressing file system - I'm thinking BTRFS - say one file per day.
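Concretely, I was imagining something like this minimal sketch of step (a); the `rec/` directory and the ALSA port `20:0` are placeholders I made up, so the real client:port would come from `arecordmidi -l` on the actual machine:

```python
# Sketch of step (a): one MIDI file per day via arecordmidi.
# The "rec/" directory and ALSA port "20:0" are placeholders --
# run `arecordmidi -l` to find the piano's real client:port.
import datetime
import subprocess

def daily_filename(day=None):
    """Return a path like 'rec/2020-12-05.mid' for the given day."""
    day = day or datetime.date.today()
    return f"rec/{day.isoformat()}.mid"

def record_today(port="20:0"):
    # arecordmidi runs until interrupted; something external (a cron
    # job or systemd timer at midnight) would restart it so a fresh
    # daily file gets opened.
    subprocess.run(["arecordmidi", "-p", port, daily_filename()])
```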

(b) Configure Rhasspy to feed MIDI back to the device via midish for playback, initially in the simplest way possible, e.g. by speaking things like "Play 2020 12 5 at 1 15 pm" and "stop". (My dad doesn't have or want internet access, so I need to avoid any speech-to-text that sends voice offsite for processing.)
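For (b), the parsing step might look roughly like this; the command grammar is my own assumption, and Rhasspy can also deliver structured intents directly, which would make the regex unnecessary:

```python
# Hypothetical parser for the text a recognizer would hand over after
# hearing "Play 2020 12 5 at 1 15 pm". The grammar is an assumption.
import datetime
import re

_PLAY = re.compile(
    r"play (\d{4}) (\d{1,2}) (\d{1,2}) at (\d{1,2}) (\d{1,2}) (am|pm)")

def parse_play_command(text):
    """Return the requested start as a datetime, or None if no match."""
    m = _PLAY.match(text.strip().lower())
    if not m:
        return None
    year, month, day, hour, minute = map(int, m.groups()[:5])
    if m.group(6) == "pm" and hour != 12:
        hour += 12
    elif m.group(6) == "am" and hour == 12:
        hour = 0
    return datetime.datetime(year, month, day, hour, minute)
```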

(c) Over time, improve the interface, for instance, to allow the user to give arbitrary names to start points. Perhaps also add the feature of being able to choose between sending the MIDI back to the device and playing via some nicer-in-some-ways instrument via linuxsampler and the appropriate .gig files.

Any thoughts?

Thanks,
-Danny
Basslint
Established Member
Posts: 1511
Joined: Sun Jan 27, 2019 2:25 pm
Location: Italy
Has thanked: 382 times
Been thanked: 298 times

Re: RFC on architecture for MIDI piano practice tool for blind/low vision/elderly users.

Post by Basslint »

Hello Danny, and welcome!

I think it's a great idea. I like the fact that you are following the *NIX way and working in a modular client-server fashion.

One helpful feature would be an "ls" command to list the recordings on a given day. I know they are all contained in the same file, but without it you would have to remember the exact time you recorded something. Perhaps you could use text-to-speech to answer the question "What did I record on 2020 12 5?". Or you could open/close a file manually, by saying "start recording" and "stop recording".
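The spoken "ls" could be as simple as turning a day's session start times into one sentence for the text-to-speech engine; a sketch, assuming the start times are collected elsewhere as `datetime.time` values:

```python
# Sketch of a spoken "ls": format a day's session start times as a
# sentence for text-to-speech. Input times are assumed to be
# datetime.time values gathered by whatever does the recording.
import datetime

def spoken_summary(times):
    if not times:
        return "No recordings."
    parts = []
    for t in times:
        hour = t.hour % 12 or 12          # 13:15 -> "1:15 pm"
        ampm = "pm" if t.hour >= 12 else "am"
        parts.append(f"{hour}:{t.minute:02d} {ampm}")
    return "You recorded at " + ", ".join(parts) + "."
```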

Otherwise, you could trigger recordings on input. You would record to a temporary file continuously, and when the first note of a session is played, everything from that note onward would be copied into a new file for that specific session. The file closes automatically after some idle time (let's say if no note is played for 10 minutes).
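The idle-timeout segmentation could be sketched like this, with note events reduced to timestamps in seconds and a 10-minute cutoff:

```python
# Sketch of idle-timeout segmentation: split timestamped note events
# (seconds, sorted) into sessions wherever the gap between consecutive
# notes exceeds the idle cutoff (600 s = 10 minutes).
def split_sessions(note_times, idle=600):
    sessions = []
    for t in note_times:
        if sessions and t - sessions[-1][-1] <= idle:
            sessions[-1].append(t)   # still within the same session
        else:
            sessions.append([t])     # idle gap exceeded: new session
    return sessions
```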
The community of believers was of one heart and mind, and no one claimed that any of his possessions was his own, but they had everything in common. [Acts 4:32]

Please donate time (even bug reports) or money to libre software 🎁

Jam on openSUSE + GeekosDAW!
folderol
Established Member
Posts: 2072
Joined: Mon Sep 28, 2015 8:06 pm
Location: Here, of course!
Has thanked: 224 times
Been thanked: 400 times
Contact:

Re: RFC on architecture for MIDI piano practice tool for blind/low vision/elderly users.

Post by folderol »

Interesting project.
You will need to pay a lot of attention to the parser. Decoding typed input is hard enough (I know from experience!), so spoken words will have even more potential variations.
The Yoshimi guy {apparently now an 'elderly'}
jeanette_c
Established Member
Posts: 728
Joined: Tue May 12, 2020 5:53 pm
Has thanked: 347 times
Been thanked: 268 times

Re: RFC on architecture for MIDI piano practice tool for blind/low vision/elderly users.

Post by jeanette_c »

Hi Danny, I think you could lure a few people with a project like that. The way you proposed it is especially nice because it relies only on open source software, so you could in the end create a full install with all configurations.
If you don't want to go as far as that, I think Pianoteq includes a recording feature with an idle timeout and would, naturally, also include an in-the-box sound engine. I'm mostly mentioning it since I suppose you will have enough to do with the interface to your system, and you have less than three weeks now.
I have seen only one offline Linux speech recognition system which works as a complete standalone. Maybe there are more such systems in connection with some desktop environment. I think this system, written in Java, was called Sirius.
Midish as a recording tool is quite useful, since it can be scripted or operated by piped commands. The direct shell is just a convenience interface, originally written to demonstrate how other UIs could be built on top of Midish. If you'd like to work with idle times to turn off and trim recordings, maybe an additional piece of software attached to the MIDI input, perhaps using RtMidi, could tell you when notes are being played. Or you could try to parse the output of aseqdump for incoming events.
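For the aseqdump route, a rough activity filter on its text output might look like this; the exact column layout is an assumption, so compare against real `aseqdump` output on the target system first:

```python
# Rough activity filter for aseqdump's text output; the exact column
# layout is an assumption -- verify against real `aseqdump -p <port>`
# output. A "Note on" with velocity 0 counts as a note-off in MIDI,
# so it is excluded.
def is_note_on(line):
    if "Note on" not in line:
        return False
    return not line.rstrip().endswith("velocity 0")
```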
There are a few text-to-speech options on Linux: espeak, which is very synthetic but has a few voice parameters. I think Mary TTS has a better voice or two. Mbrola might work on your architecture; its voices aren't great by today's standards, but good enough for short feedback. Then there is Festival, which has a few OK voices. There is one commercial voice provider, Voxin. It sort of works with speech-dispatcher, the tool used to pass text to a speech engine from desktop environments, and it has a small commandline utility. There was an announcement on the Orca mailing list about a new alternative to speech-dispatcher that is tailored to work well with Voxin.
I hope some of that is helpful.
Best wishes, Jeanette
--
distro: ArchLinux, DAW: Nama, MIDI sequencer: Midish
All my latest music on https://www.youtube.com/channel/UCMS4rf ... 7jhC1Jnv7g
Albums, patches and Csound on http://juliencoder.de