The Pebble and the Avalanche

Current Revolutions in Business and Technology

by Dr. Moshe Yudkowsky,

author of The Pebble and The Avalanche: How Taking Things Apart Creates Revolutions

 

Tue, 2006-Jan-31, 07:17

Free Speech: New Speech Technology Component Will Encourage Wild Applications

One of the barriers to low-cost, clever speech technology applications is about to be swept away by an avalanche of change. Last night at SpeechTek West, Voxeo started handing out "sneak previews" of their Prophecy 2006 platform, with full public release slated for today (Tuesday). P2006 provides, at no cost, speech recognition; text-to-speech; VoiceXML, CCXML, and CallXML interpreters; and a SIP interface to Voice over IP, including a way to connect Free World Dialup, Vonage, and other VoIP service providers to the P2006 platform. To make certain that setup is painless and the XML interpreters work, Voxeo bundles a web server with PHP into the package. In other words, P2006 is a miniature version of their hosting service: it's free, and it's easy to integrate into all sorts of projects.

The free speech technology is particularly important: until now, the cost of speech technology has been a barrier to anyone outside the business (if you're in the business, you've had no choice but to pay). Voxeo created their own automatic speech recognition (ASR) and text-to-speech (TTS) technology in order to be able to give it away, which is also big news in an industry with very few basic speech "engines." The baseline P2006 platform includes two ports (instances) of speech technology.

Besides being a big step for Voxeo, this is a big step for the industry as a whole. The W3C's VoiceXML and CCXML made it far easier to create speech applications, but these applications need interpreters and speech technology to run: they need a platform. Voxeo and a few others offer free hosting to developers, but hosted platforms don't let you build (for example) a speech-enabled system to handle calls to your home ("did you want to speak to Moshe, his wife, or his daughter?"). P2006 eliminates this barrier.
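
To make the home-call example concrete, here is a minimal sketch, in Python, of generating the kind of VoiceXML menu such a system might serve from a web server. The function names (build_home_menu, spoken_list) and the target documents (moshe.vxml and so on) are hypothetical, and nothing here is specific to P2006; the sketch uses only the standard VoiceXML 2.0 elements vxml, menu, prompt, and choice.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C VoiceXML 2.0 specification.
VXML_NS = "http://www.w3.org/2001/vxml"


def spoken_list(labels):
    """Join labels the way a person would say them: 'A, B, or C'."""
    if len(labels) == 1:
        return labels[0]
    if len(labels) == 2:
        return labels[0] + " or " + labels[1]
    return ", ".join(labels[:-1]) + ", or " + labels[-1]


def build_home_menu(people):
    """Build a VoiceXML menu with one spoken choice per person.

    `people` is a list of (label, target) pairs. The targets are
    hypothetical documents that would handle the call for that person.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    menu = ET.SubElement(vxml, "menu", {"id": "who"})

    # The prompt is what the caller hears before choosing.
    prompt = ET.SubElement(menu, "prompt")
    labels = [label for label, _ in people]
    prompt.text = "Did you want to speak to " + spoken_list(labels) + "?"

    # Each choice's text is what the caller may say; `next` is where to go.
    for label, target in people:
        choice = ET.SubElement(menu, "choice", {"next": target})
        choice.text = label

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_home_menu([
        ("Moshe", "moshe.vxml"),          # hypothetical target documents
        ("his wife", "wife.vxml"),
        ("his daughter", "daughter.vxml"),
    ]))
```

A page on a web server could return this string to the VoiceXML interpreter; what each target document does (transfer the call, take a message) is left open here.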

I've had P2006 running for a week, and I'm still experimenting with it. Voxeo let me know in advance because they will likely release my open-source Voice Conference Manager, which is currently optimized to run on Voxeo's hosted platform, in a version that's optimized for P2006. If you're looking for a demo of how to use VoiceXML and CCXML, check out Voice Conference Manager. (And yes, I know that I need to release some updates to VCM very soon.)

The Prophecy 2006 platform will do for speech technology what digital cameras did for photography. Digital cameras made it possible for cameras to appear as components inside PDAs and cell phones. The Prophecy 2006 platform allows people to experiment with speech technology as a component; and with thousands of new developers, we'll likely see a few "killer apps" emerge, to the benefit of us all.
