WebRTC: Realtime Communications and HTML5

Session Notes

Session Type: Working group

Session Category: Open Media Developers

Session Leader: Tim Terriberry (Mozilla), Serge Lachapelle (Google)

Day: Saturday
Room: Faculty Commons
Time: 12pm – 1:30pm

Description

HTML5 <audio> and <video> have become first-class citizens of the web, but they are designed for pre-recorded content or relatively high-latency streaming from server to client. Real-time, low-latency communication directly between peers enables a whole host of new applications, but is currently only available in custom applications or via sparsely deployed plugins, using incompatible signaling and protocols.

Last year the WebRTC project began as a joint effort between the W3C and the IETF to standardize the APIs and protocols for real-time communications in browsers. Compared to the traditional <video> tag, there are a lot of moving parts: cross-platform camera and microphone access, signal processing to improve audio quality, encoders in the browser, with severe latency constraints, firewall penetration, codec and media negotiation, unreliable transports like RTP, possibly multiplexed with a generic peer to peer data channel, with encryption and a whole host of security considerations, and a huge collection of legacy equipment to interoperate with.

In this session, we will discuss the current progress of both the standardization efforts and the implementation efforts, and we will look for solutions in areas that currently lack consensus. This could include the APIs that are under design, the back-end architectures that these imply, and the underlying protocols that are needed to connect them, and the codec technology that makes them work.

Of particular interest is feedback from non-browser vendors who will want to interact with this system, and from developers who want to build innovative new things with it. It is not too late to influence use cases and requirements.

Outcome

This session will communicate the current technical proposals and make progress in defining the standards for real-time communication on the web.

One possible outcome could be the proposal of new APIs for managing devices, error reporting, negotiation, etc., as input for the W3C working group.

Another possible outcome could be a list of new use cases for applications that go beyond “video chat in a browser”.

Other recommendations for the protocol pieces or scoping of codec or other work could result from this session.


Notes

Jan Linden, Google
Tim Terriberry, Mozilla
Ethan Hugg, CISCO

Introduction - what is WebRTC

Slides: https://docs.google.com/present/edit?id=0AbRZ9Ue-MGwtZGhyNDc1cTdfMWZkNmc3dGQ0&hl=en_US

  • real-time communication for the Web: need for open and free solutions
  • make it easy for any Web developer to easily deploy RTC apps
  • Requirements:
    • open
    • high quality
    • low latency
    • secure
    • interoperable
    • ease of use
  • What will it be used for?
    • phone app
    • video conferencing
    • do what flash is capable of and more

How did we get where we are

  • started a year ago over lunch, IETF meeting July 2010
  • requirements have been there for many years
  • this is the right time to address it
  • IETF RTCWEB WB formed in April 2011: focus on network protocols
  • W3C WebRT WG created May 2011: JS API focus
  • focus on WHATWG specification

Implementation

  • standards and implementations need to go at the same pace
  • open source project developed that Chrome, Firefox & Opera etc can pick up
  • PeerConnection in Webkit and Chrome Sept 2011 (demo provided)
  • PeerConnection API in Firefox Q1 2012 (ikran) (demo provided)
  • CISCO open sourced SIP stack (SIP proxy server to signal to real phone)
  • CISCO demos shows a Firefox plugin that implements the SIP stack
  • Ericsson has a Webkit implementation that uses WebRTC

What does the platform provide?

  • media capture
  • real-time media encoding
  • signal processing (noise filter, echo cancellation, gain control etc)
  • real-time secure media transmission (RTP distribution, low latency, jitter buffer, encrypotion etc)
  • P2P data channel should be included
  • how to do signalling / negotiation

DISCUSSION

  • SIP - implementation in the browser necessary? is it possible to do peer2peer SIP without a SIP server? SIP requires a registrar
  • call recording (real-time processing required, remote machine recording, where to compress, saving to local file)
  • p2p distribution of content?
  • DDOS? protocol to provide a shard secret and IP address is used called ICE; rate-limit the ICE protocol (there will be a native implementation of ICE)
  • what protocols are supported? SRTP/RTP, DTLS (key exchange), RTCP, TURN server for two NAT-ed peers, SDP-based codec negotiation will exist
  • codecs? trying to mandate vp8 & opus (2.5-20ms latency codec); real-time transcoding isn't going to happen so how do we get
  • what apis to expose for statistics? round-trip time, packet loss, latency, delay
  • retain connection when switching networks/ IP addresses? not sure it works
  • SIP signalling gateways exist in telephone providers
  • codec transcoding gateways are not something that people want to implement
  • applications: screen sharing, e.g. a live video editing session, real-time file sharing (e.g. experience video/audio together)
  • cross-origin permission restrictions of what one is able to do with the data? ICE channel gives you the trust and your peer can do what they want; depending on the key exchange mechanism, you can ascertain that they are still who they claim they were at the beginning of the connection
  • are schemes like ZRTP supported? not right now; don't have secure conversation through an untrusted website!
  • have all the browsers implement Kademlia?
  • isn't the WebRTC stack over-engineered? is there a slim version of WebRTC? e.g. for Web server content that connects to non-browser UAs like vlc; you should support codecs and probably SIP if you don't want to support JavaScript
  • WebRTC and mobile? can VP8 run on phones? wait for HW implementations.
  • chrome on mobile and WebRTC? right now phone users use apps more than the browser; in future it will be both and increasingly browsers
  • what's still missing? requirements document - spec is still unstable; API is very basic only and goes barely beyond establishing a connection; network side of things is quite complete though what parts we want to support and which don't is still under discussion
    • mediaStream API needs more
    • device handling e.g. how do you enumerate cameras
    • error reporting e.g. if TURN server doesn't exist
    • metrics
    • signalling
    • use cases
  • is there a p2p websocket API? media goes over RTP and thus over UDP and is unreliable; do we build a reliable channel over this or do we leave it to the Web developer; is it possible to do this via Web sockets?
  • are there going to be implemeter prescriptions for the protocols?

Outcome:

Create a break-out group to discuss use cases, signalling needs, metrics needs, error reporting needs for the mediaStream API

Please review mediaStream API and send feedback to Mozilla: http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html