Standards for Browser Video Playback Metrics

Session notes

Session Type: Working group

Session Category: Open Media Developers

Session Leader: Zachary Ozer (Longtail Video)

Day: Sunday
Time: 2:00—3:30pm
Room: Faculty Commons

Description

With the proliferation of video playback on mobile devices, video metrics have become more important than ever. Beyond simple engagement data, metrics around data transferred, bandwidth, dropped frames, and playback rate are more important than ever to creating great user experiences.

In spite of this, there is not currently any standardization around measuring the playback performance of HTML5 video in browsers. Flash supports a large collection of network and player measurements and provides excellent insight into the performance of video in the Flash player. In browser, Firefox and Chrome have implemented some custom metrics into their HTML5 video elements.

A proposal of a set of metrics ready for introduction into HTML5 has been prepared, but has not been introduced into HTML5 yet.

Outcome

This session analyses the proposed set of metrics for HTML5 video playback and networking performance. The discussion will likely involve:

Enumerating the relevant use cases Reviewing the current metrics proposal Revision of the current proposal to ensure that it satisfies the use cases, or developing one or more new proposals Identifying technical concerns surrounding metric equivalence across user agents. One likely outcome of the session is an agreement on the necessary set of metrics that should be exposed by browsers, potentially based off of the current proposal.

A second outcome could be an implementation of the metrics in one or more user agents.

A third outcome might include a request to include the metrics in the HTML5 specification at the WHATWG or W3C.


Notes

Zachary Ozer (JWPlayer)
Pablo Schklowsky (JWPlayer)

  • start off with demo of Flash with adaptive bitrate switching and metrics
    • shows bandwidth
    • shows size of video (width x height)
    • shows dropped frames
    • shows switching between different levels
  • there is no way to do this in HTML5 right now

Use cases:

  • gathering engagement data for analytics
  • adaptive bitrate switching
  • playback quality monitoring & dropped frames for measuring the user experience
  • network congestion monitoring & devices in specific areas / system performance monitoring -> can the server adapt the media rather than the client
  • page visibility / tab visibility management
  • electricity consumption / CPU utilization / GPU utilization
  • you can e.g. report on the quality of CDNs

What metrics/statistics have the frameworks implemented:

  • Zach shows a table with the existing proprties as fimplemented by Firefox and Chrome, as well as the existing proposal and what other frameworks implement (Silverlight, Flash etc)
  • current proposal:
    • bytesReceived
    • bytesDecoded: audioBytesDecoded, videoBytesDecoded
    • decodedFrames
    • droppedFrames
    • presentedFrames
    • playbackJitter
  • are these good metrics? what else do we need?

Discussion:

  • frames per second is a bad metric - what does it give you? maybe something like percentage of frames not displayed would be better? maybe measuring the accumulated time that a clock has been stopped/delayed?
  • there is an optimal playback quality; a counter that accumulates the lateness of frames can help, but then you cannot catch up with changes; you either need to define a window or be able to reset the counter to do useful measurements
  • you want to be able to separate out whether a frame latency comes from a network or a CPU issue
  • there are two main use cases: measure whether a user's experience is good after-the-fact and collect data to adapt the delivery to improve the user's experience in real-time
  • you cannot know ahead of time, even when knowing what device you are playing on and what bandwidth is available, what quality the video will playback in because there are other things happening on the machine
  • for privacy reasons we cannot expose the details of the hardware
  • what happens when the data comes from the cache rather than from the network?
  • all metrics in the browser are right now based on time, not on bytes - you need bytes when you want to do rate switching; you need to know if you are not playing back video in realtime (i.e. sec of video played per second)
  • we care about three levels of measurement:
    • measuring the network performance (bandwidth)
    • measuring the decoding pipeline performance
    • measuring the display quality

What can be derived from the current proposal?

  • all min, max, average
  • downloadBytesPerSecond
  • playbackRate
    • audioPlaybackBitrate
    • videoPlaybackBitrate

What is missing from the current proposal?

  • it doesn't tell you whether data comes from the cache or the network
  • we need a bytes to time mapping if any of the byte information will be useful
  • on the network-level: no differentiation between a/v -> if we drop audio, that's a disaster, so it would be good to know this
  • on the decoding level: no differentiation between parsing and decoding -> if decoded is less than parsed, then our decoder is too slow --> what is the script going to do about it? aks for a lower resolution / bitrate video
  • on the display level: if played is less than decoded, then our GPU is too slow
  • maybe the WebGL metrics should apply to video, too
  • what we don't deal with yet is the situation where metrics need to be reset, e.g. during seeking, pause or stop
  • are the rates more useful than the actual numbers of frames? it's easier to be flexible in the time interval
  • what do we do with variable framerate videos? most videos on the Internet actually don't provide fixed framerate and browsers won't expose this information.
  • how consistent are currentTime and buffered ranges at any point in time? possibly very inaccurate in the browser

What else is not in the current proposal?

  • buffer
    • audioBufferByteLength
    • audioBufferLength (sec)
    • videoBufferByteLength
    • videoBufferLength (sec)
  • startupTime
  • requestResponseTime
    • audoRequestResponseTime
    • videoRequestResponseTime
  • requestError
    • videoRequestError
    • audioRequestError