Thursday, May 12, 2011

Tux4Kids on MacPorts; Google's text-to-speech

TuxMath-2.0.3 on MacPorts

As of last week, users of MacPorts have access to the latest TuxMath (version 2.0.3) with a simple "sudo port install tuxmath".  For those unfamiliar with MacPorts, it is a project reminiscent of the FreeBSD Ports Collection that aims to bring thousands of Free/Open Source applications to the Mac OS X desktop.  TuxMath and Tux Typing have been in MacPorts since early 2008.  I have long been a big fan of package management as epitomized by Debian's Apt system, and have felt that Windows and OS X users never knew what they were missing.  It seems sad and ironic that, among the general public, Apple is widely credited with the idea of having a platform provide a centralized source for its users' software.  I'd love to see Debian come up with the guts to call its repositories "The Apt Store".  Anyway, Mac users with MacPorts can now test out the latest features of TuxMath on their favorite platform.

There are two known bugs with TuxMath-2.0.3 on the Mac, AFAIK.  First, there is a display problem when the program is run in windowed mode ("tuxmath -w" on the command line) that causes scaled SVG images to be translucent and purple-tinged, very likely because of some sort of transposition of the RGBA values of the pixels.  The relevant code lives in src/t4k_loaders.c in t4k_common.  Second, the network game has one serious glitch - if the Mac client is connected to a tuxmath server on another machine, it does not remove the comets from the screen when another player answers the question correctly.  The causes of these bugs aren't yet clear - hopefully someone will have time to get to the bottom of them before too long.
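To show why a channel transposition would produce exactly this kind of symptom, here is a minimal sketch in C.  It is not the code from src/t4k_loaders.c and it is not a confirmed diagnosis; the function names pack_rgba and unpack_as_abgr are made up for illustration, and the byte-reversed layout is only one of several possible mix-ups.  It packs a pixel as R,G,B,A and then reads the same 32 bits back as if they were laid out A,B,G,R.

```c
#include <stdint.h>

/* Pack four 8-bit channels into one 32-bit pixel, with R in the top byte. */
static uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Read the same 32 bits as if the layout were A,B,G,R (byte-reversed).
 * This is the hypothetical mistake, not a known fact about the bug. */
static void unpack_as_abgr(uint32_t px, uint8_t *r, uint8_t *g,
                           uint8_t *b, uint8_t *a)
{
    *a = (uint8_t)((px >> 24) & 0xFF);  /* original R lands in alpha */
    *b = (uint8_t)((px >> 16) & 0xFF);  /* original G lands in blue  */
    *g = (uint8_t)((px >> 8)  & 0xFF);  /* original B lands in green */
    *r = (uint8_t)( px        & 0xFF);  /* original A lands in red   */
}
```

With this particular mix-up, an opaque, mostly green pixel (R=0x20, G=0xC0, B=0x40, A=0xFF) comes back as R=0xFF, G=0x40, B=0xC0, A=0x20: strong red and blue with little green, and a very low alpha.  In other words, purple-tinged and translucent.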

Text-To-Speech in Tux Typing?


For some time, I've wanted to add TTS support to Tux Typing, having the program "say" each word when it is correctly typed.  This would be great for younger kids who are just learning to read, making Tux Typing into a "learn to read" program as well as a "learn to type" program.  In the past, we considered trying to do this using the FOSS Festival package, but decided it would add too much complexity and "bloat" to the program, and also that too many languages in Tux Typing lacked Festival voices.

I recently came across the TTS aspect of Google Translate, which provides not only machine translation but decent speech synthesis for the large majority of languages supported by Tux Typing.  It turns out that it is quite a simple matter to send the text to Google Translate using wget as the "browser", and save the returned mp3 file for local playback.  Importantly, the current locale selection can be followed to provide pronunciation in an appropriate voice for most of tuxtype's language-specific "themes".  It took me less than an hour of Googling and hacking to put together a "proof-of-concept" modification to Tux Typing that works exactly as intended.

Unfortunately, my proposed use of the TTS feature of Google Translate isn't actually allowed under Google's terms of service.  I don't think Google would really care, but it isn't a supported API, and thus could break at any time, which might make our users mad at us and at Google.  So, I was strongly discouraged from putting this into a public release of Tux Typing as an official feature.  I asked if Google would consider making TTS an official API, and was told that my suggestion would be passed along to the relevant folks.  I'm not holding my breath, but I would not be shocked if Google TTS eventually becomes officially usable by apps such as ours.

Of course, there are drawbacks - this would make our Free program dependent for this one feature on a cloud service that is merely gratis, rather than libre.  Depending on how we implemented it, the feature also might require an internet connection to work, which would be less than ideal.  However, it was a good illustration of the power of the cloud model - I was able to add TTS support to tuxtype with one line of C code and a tiny Bash script, plus the freely available wget and mplayer.
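For readers curious what the C side of a hack like this might look like, here is a rough sketch.  It is not the code in the branch: the script name say_word.sh, the functions is_safe_token and build_tts_command, and the argument order are all assumptions made for illustration, and the script itself (the wget and mplayer part) is deliberately left out.  The sketch only builds the command string and refuses anything that could be misread by the shell.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper script name; the real script may be named differently. */
#define TTS_SCRIPT "./say_word.sh"

/* Return 1 if s is non-empty and every character is alphanumeric
 * (as classified by isalnum in the current locale), '_' or '-'. */
static int is_safe_token(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isalnum((unsigned char)*s) && *s != '_' && *s != '-')
            return 0;
    }
    return 1;
}

/* Build "<script> <lang> <word>" into buf.
 * Returns 0 on success, -1 on unsafe input or if buf is too small. */
static int build_tts_command(char *buf, size_t size,
                             const char *lang, const char *word)
{
    int n;

    if (!is_safe_token(lang) || !is_safe_token(word))
        return -1;
    n = snprintf(buf, size, "%s %s %s", TTS_SCRIPT, lang, word);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The "one line" would then be a call such as system(buf) after build_tts_command succeeds, made at the point where a word has been typed correctly.  The character whitelist is a limitation of this sketch: words with non-ASCII characters, which many of the language themes contain, would be rejected, so a real version would need proper quoting or a fork/exec call that bypasses the shell.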

For anyone who is interested, this hack is in the "origin/origin/google_tts" branch of our git repository at Alioth.  If Google makes TTS officially available as an API, I will put some variant of this feature into future public releases of Tux Typing.