
Voice Compression


This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have prepared to help you is right here.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, where all the facts you need are included, nothing extraneous is mentioned, and the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems, where it is up to you to fully define the end point, or a series of ever more difficult versions of this project, and to research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it or to give you any additional materials. I have too many other projects of my own.

Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

The intent of this project is to come up with a very low-cost, low-bandwidth Internet radio. There is another project with a similar goal, Internet Radio, that describes a BitTorrent-like protocol for low-bandwidth broadcasting. This project focuses on how to compress the broadcast itself by creating a sort of MIDI for voice.

It works like this. Low-budget Internet radio content is mostly one person talking or a small group of people talking to each other. Each gets his own noise-cancelling microphone. You analyse the speech and convert it into phonemes. What you broadcast is then a sort of phonetic text transcript of what they said, augmented by volume and emphasis. Each speaker has a profile, much like the one Dragon NaturallySpeaking compiles of how the speaker pronounces various phonemes. The phonemes don’t have to be a standard universal set, just a set of typical noises a given speaker makes in speaking. At the receiving end, the speech is reconstructed by gluing together the phonemes with smoothing.
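To make the idea of a phonetic transcript augmented by volume and emphasis more concrete, here is a minimal sketch of what one broadcast packet might look like. Nothing here comes from an existing program: the class name PhonemeStreamSketch, the four-byte layout, the 10 ms duration step and the 0 to 15 ranges for volume and emphasis are all assumptions chosen only for illustration. The hard parts, recognising phonemes and resynthesising speech, are not attempted.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wire format: one phoneme event packed into four bytes. */
final class PhonemeStreamSketch {
    static final int BYTES_PER_EVENT = 4; // assumed layout, not a standard

    /** One spoken phoneme. All field ranges are assumptions for this sketch. */
    static final class Event {
        final int speaker;    // 0..255, selects the speaker profile
        final int phoneme;    // 0..255, index into that speaker's own phoneme set
        final int durationMs; // 0..2550, sent in 10 ms steps (remainder is truncated)
        final int volume;     // 0..15
        final int emphasis;   // 0..15

        Event(int speaker, int phoneme, int durationMs, int volume, int emphasis) {
            this.speaker = check("speaker", speaker, 255);
            this.phoneme = check("phoneme", phoneme, 255);
            this.durationMs = check("durationMs", durationMs, 2550);
            this.volume = check("volume", volume, 15);
            this.emphasis = check("emphasis", emphasis, 15);
        }
    }

    private static int check(String name, int value, int max) {
        if (value < 0 || value > max) {
            throw new IllegalArgumentException(name + " out of range: " + value);
        }
        return value;
    }

    /** Sender side: pack the events into a byte array ready to broadcast. */
    static byte[] encode(List<Event> events) {
        ByteBuffer buf = ByteBuffer.allocate(events.size() * BYTES_PER_EVENT);
        for (Event e : events) {
            buf.put((byte) e.speaker);
            buf.put((byte) e.phoneme);
            buf.put((byte) (e.durationMs / 10));
            buf.put((byte) ((e.volume << 4) | e.emphasis)); // two 4-bit fields in one byte
        }
        return buf.array();
    }

    /** Receiver side: unpack the bytes back into events. */
    static List<Event> decode(byte[] packet) {
        List<Event> events = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(packet);
        while (buf.remaining() >= BYTES_PER_EVENT) {
            int speaker = buf.get() & 0xFF;
            int phoneme = buf.get() & 0xFF;
            int durationMs = (buf.get() & 0xFF) * 10;
            int packed = buf.get() & 0xFF;
            events.add(new Event(speaker, phoneme, durationMs, packed >>> 4, packed & 0x0F));
        }
        return events;
    }

    /** Bit rate of the event stream for a given speaking rate. */
    static int bitsPerSecond(int eventsPerSecond) {
        return eventsPerSecond * BYTES_PER_EVENT * 8;
    }

    public static void main(String[] args) {
        List<Event> spoken = Arrays.asList(
                new Event(1, 42, 80, 9, 3),
                new Event(1, 7, 120, 12, 0));
        byte[] packet = encode(spoken);
        System.out.println("packet bytes: " + packet.length);
        // 12 phonemes per second is an assumed speaking rate, not a measured one.
        System.out.println("bits per second at 12 phonemes/s: " + bitsPerSecond(12));
    }
}
```

For scale, uncompressed 16-bit audio sampled at 8000 Hz needs 128,000 bits per second, so even this naive fixed-width layout would be a few hundred times smaller at the assumed speaking rate. The speaker profiles themselves would still have to be delivered to each listener once, which this sketch leaves out.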

A primitive version of this would sample each word used in the transcript and create a model where each word was treated as a phoneme. Even something that primitive would still result in huge compression. The problem is achieving voice quality good enough that people would put up with it to listen to tiny political radio stations with no budget.
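The primitive word-as-phoneme version could be sketched roughly as follows. The sender replaces each word with its index in a table of recorded clips, and the receiver glues the matching clips back together, overlapping neighbours with a short linear crossfade as a crude stand-in for smoothing. The class name WordCodebookSketch, its methods and the crossfade itself are hypothetical choices for illustration. The sketch assumes both ends already hold the same clip table and that every word has been recorded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical codebook that treats each recorded word as one "phoneme". */
final class WordCodebookSketch {
    private final Map<String, Integer> indexOf = new LinkedHashMap<>();
    private final List<short[]> clips = new ArrayList<>();

    /** Stores one recorded clip per word and returns the word's index. */
    int learn(String word, short[] samples) {
        String key = word.toLowerCase(Locale.ROOT);
        Integer existing = indexOf.get(key);
        if (existing != null) {
            return existing; // keep the first recording of a word
        }
        indexOf.put(key, clips.size());
        clips.add(samples.clone());
        return clips.size() - 1;
    }

    /** Sender side: turn a transcript into a list of clip indexes. */
    int[] encode(String transcript) {
        String text = transcript.trim().toLowerCase(Locale.ROOT);
        if (text.isEmpty()) {
            return new int[0];
        }
        String[] words = text.split("\\s+");
        int[] codes = new int[words.length];
        for (int i = 0; i < words.length; i++) {
            Integer idx = indexOf.get(words[i]);
            if (idx == null) {
                throw new IllegalArgumentException("No clip recorded for: " + words[i]);
            }
            codes[i] = idx;
        }
        return codes;
    }

    /** Receiver side: glue the clips together, blending over `overlap` samples. */
    short[] decode(int[] codes, int overlap) {
        short[] out = new short[0];
        for (int code : codes) {
            out = crossfade(out, clips.get(code), overlap);
        }
        return out;
    }

    /** Joins a and b so the tail of a fades out while the head of b fades in. */
    static short[] crossfade(short[] a, short[] b, int overlap) {
        int n = Math.max(0, Math.min(overlap, Math.min(a.length, b.length)));
        short[] out = new short[a.length + b.length - n];
        System.arraycopy(a, 0, out, 0, a.length - n);
        for (int i = 0; i < n; i++) {
            double w = (i + 1) / (double) (n + 1); // weight of b rises across the overlap
            double mixed = a[a.length - n + i] * (1 - w) + b[i] * w;
            out[a.length - n + i] = (short) Math.round(mixed);
        }
        System.arraycopy(b, n, out, a.length, b.length - n);
        return out;
    }

    public static void main(String[] args) {
        WordCodebookSketch book = new WordCodebookSketch();
        book.learn("hello", new short[] {100, 100, 100, 100}); // toy clips, not real audio
        book.learn("world", new short[] {200, 200, 200, 200});
        int[] codes = book.encode("Hello world hello");
        short[] audio = book.decode(codes, 2);
        System.out.println("indexes sent: " + codes.length + ", samples rebuilt: " + audio.length);
    }
}
```

What travels over the wire is only the array of indexes, so the cost per word is a couple of bytes regardless of how long the word takes to say. A linear crossfade will not hide the seams in real speech; it only marks the place where better smoothing would go.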

You have already seen technology similar to this in political parody.

super compressor
The George Bush speech impersonator


Canadian Mind Products
Contact Roedy. Please feel free to link to this page without explicit permission.
