Behind the Phone
October 25th, 2010 by Jose de Castro
If you’re reading this, you’ve probably heard the buzz around the new product from Voxeo Labs. We launched the Phono SDK at the jQuery Conference last week. Thanks to everyone for the support leading up to the launch; it was a great success!
If you haven’t heard of Phono, here’s a brief description off the site:
The Phono SDK provides an object-oriented JavaScript API for embedding two-way audio and chat onto any web page. The Phono SDK is a pure client-side solution and requires zero server-side logic on your part; all communication is handled transparently by the Voxeo Cloud.
This post is the first in a series. I’ll attempt to capture the rationale behind Phono and give a basic overview of the architecture.
Why Phono?
A growing number of people now leverage standard Internet protocols for making and receiving phone calls. Some networks are more advanced than others but they all share the same basic idea: integrating voice and data equals big savings and better features. Yet from a user interface perspective the average in-call experience is still very limited. You get a few preset buttons and a dialpad. This got us thinking: what better place to build the next generation of phone than the web browser?
Enter Phono…
Phono aims to unlock the latent potential of today’s IP communication networks.
Today’s converged voice/data networks represent an enormous opportunity for innovation. When combined with real-time services like presence, video, virtual whiteboards and screen sharing the calling experience becomes an incredibly powerful collaborative tool. Phono will unlock these services from the shackles of proprietary applications and put them where they belong: the web.
Where are we today?
Phono 0.1 was released last week and people are freaking out! The response has been incredible and we’re already being approached by call centers, dating services and mobile game developers wanting to get Phono integrated into their products. Needless to say, we’re very excited.
The Phono stack consists of two main parts: a JavaScript library and a server component we call the Phono Gateway. Last week, we open sourced the JavaScript library, and we plan to release the source for the Phono Gateway later this year.
Currently, the Phono Gateways are deployed on the Voxeo Cloud, giving us access to a global telephony network, an extremely mature SIP infrastructure and support from some of the smartest people in the industry. As an added bonus, Phono developers get access to phone numbers in over 50 countries!
The core Phono SDK is an XMPP client which provides basic services like presence and connection management. On top of this core we’ve layered three plugins: Messaging, Phone and Audio. Together these plugins provide a simple JavaScript API for things like chat clients, making calls, sending keypad input and adjusting media levels like volume and gain.
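To make the layering concrete, here is a minimal sketch of a core client with plugins registered on top of it. Every name in it (`CoreClient`, `use`, `dial`, `digit`, `volume`) is an assumption made up for illustration and is not taken from the Phono SDK; the connection is stubbed, so no network traffic is involved.

```javascript
// Hypothetical sketch: none of these names come from the Phono SDK.
class CoreClient {
  constructor() {
    this.plugins = {};
    this.connected = false;
    this.log = []; // records what would be sent, for demonstration only
  }

  // Core service: connection management (stubbed, nothing goes on the wire).
  connect() {
    this.connected = true;
    this.log.push("connected");
  }

  // The single path plugins use to reach the core.
  send(kind, payload) {
    if (!this.connected) throw new Error("not connected");
    this.log.push(kind + ":" + JSON.stringify(payload));
  }

  // Register a plugin; the factory receives the core so it can call send().
  use(name, factory) {
    this.plugins[name] = factory(this);
    return this;
  }
}

// Messaging plugin: chat-style messages.
const messaging = (core) => ({
  send: (to, body) => core.send("message", { to, body }),
});

// Phone plugin: placing calls and sending keypad input.
const phone = (core) => ({
  dial: (to) => core.send("call", { to }),
  digit: (d) => core.send("dtmf", { digit: d }),
});

// Audio plugin: local media levels, clamped to the range 0..100.
const audio = (_core) => {
  let level = 50;
  return {
    volume: (v) => {
      if (v !== undefined) level = Math.max(0, Math.min(100, v));
      return level;
    },
  };
};

const client = new CoreClient()
  .use("messaging", messaging)
  .use("phone", phone)
  .use("audio", audio);

client.connect();
client.plugins.phone.dial("sip:someone@example.com");
client.plugins.phone.digit("5");
client.plugins.audio.volume(120); // out of range, stored as 100
```

The point of the shape is that each plugin only knows about the core's `send` path, so a plugin can be added or replaced without touching the others.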
On the signaling side we chose XMPP because it’s a low latency protocol, is firewall friendly and provides rich support for domain federation. Internally, we use the popular Strophe XMPP client written by Jack Moffitt to do some of the heavy lifting.
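For readers who have not seen XMPP on the wire, the sketch below builds the two most common stanzas, a chat message and a presence update, as plain XML strings. The stanza shapes follow the general XMPP specification; the helper names (`escapeXml`, `chatStanza`, `presenceStanza`) are hypothetical, and this is not necessarily what Phono itself sends. Strophe offers its own builder helpers for this; they are left out here so the example runs standalone.

```javascript
// Escape the five characters that XML treats specially.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a one-to-one chat message stanza.
function chatStanza(to, body) {
  return '<message to="' + escapeXml(to) + '" type="chat">' +
    "<body>" + escapeXml(body) + "</body></message>";
}

// Build a presence stanza; `show` is optional (for example "away" or "dnd").
function presenceStanza(show) {
  return show
    ? "<presence><show>" + escapeXml(show) + "</show></presence>"
    : "<presence/>";
}

console.log(chatStanza("bob@example.com", "1 < 2 & 3"));
console.log(presenceStanza("away"));
```

Each stanza is a small, self-contained XML element, so a call setup or a chat line reaches the other side as soon as it is written to the open stream.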
For media, Phono uses Adobe Flash to access the microphone and speaker. We mostly chose Flash because we already had experience with it, it’s installed on most desktop browsers and it does a decent job of adapting to low bandwidth conditions. However, there are significant drawbacks to using Flash: sparse integration with mobile devices, lack of echo cancellation and the use of TCP as opposed to its low latency sibling: UDP. Flash 10.1 made headway in these areas but in true Adobe style, the new protocols are not open or available for general use. We’re certain Adobe will do the right thing and open RTMFP but let’s not hold our breath.
Not to worry though, our good friends at Google, Skype and a range of other companies have started a working group to bring native microphone and camera access to the browser! The RTC-Web group (now open to the public) is focused on bringing real-time media capture/control to the browser. Voxeo Labs will be monitoring this group and assisting wherever possible.
New developments like Flash 10.1 and the work happening in the RTCWG are precisely why Phono’s audio layer was designed as its own plugin. Our goal is to allow developers to switch between media implementations with just a few lines of configuration.
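As a rough illustration of what switching media implementations through configuration could look like, the sketch below picks an audio backend from a config object. The backend names (`flash`, `native`), the `audio.type` key and the `createAudio` function are all assumptions for the sake of the example, not Phono’s actual configuration format.

```javascript
// Hypothetical backends: each factory returns an object with the same interface.
const audioBackends = {
  flash: () => ({
    name: "flash",
    play: (url) => "flash playing " + url,
  }),
  native: () => ({
    name: "native",
    play: (url) => "native playing " + url,
  }),
};

// Choose a backend from configuration, defaulting to "flash" when none is given.
function createAudio(config) {
  const type = (config && config.audio && config.audio.type) || "flash";
  if (!Object.prototype.hasOwnProperty.call(audioBackends, type)) {
    throw new Error("unknown audio backend: " + type);
  }
  return audioBackends[type]();
}

// Default configuration versus a one-line switch to another backend.
const defaultAudio = createAudio({});
const nativeAudio = createAudio({ audio: { type: "native" } });

console.log(defaultAudio.play("ring.mp3"));
console.log(nativeAudio.play("ring.mp3"));
```

Because every backend exposes the same methods, the calling code does not change when the backend does; only the configuration line differs.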
Stay tuned…
That’s about all the time I have for writing today. Make sure to follow us on Twitter or Facebook and thanks again for all the support. Keep an eye out for the next part of this series where I’ll dive deeper into XMPP, media and other Phono goodies.
October 27th, 2010 at 8:40 am
This sounds like a great product and we would like to incorporate something like this in our online products, but I cannot see the costs anywhere.
October 27th, 2010 at 10:48 am
Awesome! We’re always on the lookout for emerging technologies and telephony adoption rates have exploded across the USA & Canada. This further confirms that things will continue to get better!
October 27th, 2010 at 10:50 am
Just found out about this on HN. Wow, highly impressed! I was previously going to do a somewhat-related start-up, using Flash as the transport layer for various voice recognition-related services.
It’s amazing what you can build on top of your existing platform, which must have been an incredible amount of hard work.
Hats off to the programmers! May you enjoy much success!
October 27th, 2010 at 6:24 pm
Hi everyone,
Thanks for all the kind words. We’re working hard on a new release that will blow this one out of the water so stay tuned.
Steve: regarding pricing, Phono to Phono and Phono to SIP calls are completely free. Phono to PSTN calls (regular numbers) are also free with a short “nag” message. However, you can always create a Tropo app that receives an incoming SIP call and transfers it to whatever number you’d like.
Tropo pricing can be found here: http://tropo.com/pricing
This is only the beginning. We plan to have greatly improved audio quality in the next release, as well as screen sharing and other API improvements.
Thanks again for the support!
October 28th, 2010 at 12:36 am
[...] quote them: "Phono is a simple jQuery plugin and JavaScript library that turns any web browser into a phone; capable of making phone calls and sending instant [...]