FOMS 2013 Main/Web VTT Non-captions

WebVTT beyond captions

Examples and Needs

http://demo.jwplayer.com/text-tracks/captions.html

Use the timeline search - search for "chicken" - success!
Thumbnails
chapters
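
The timeline search in the demo amounts to a text match over caption cues. A minimal sketch in Python, assuming the cues have already been parsed into (start, end, text) records; the `Cue` type, the `search_cues` name and the sample cues are hypothetical and not taken from the JW Player demo:

```python
from typing import NamedTuple


class Cue(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


def search_cues(cues, query):
    """Return the cues whose text contains `query`, ignoring case."""
    needle = query.casefold()
    return [cue for cue in cues if needle in cue.text.casefold()]


# Hypothetical sample cues, for illustration only.
cues = [
    Cue(0.0, 4.0, "Welcome to the show"),
    Cue(4.0, 9.5, "Today we cook a Chicken curry"),
    Cue(9.5, 14.0, "First, chop the onions"),
]

hits = search_cues(cues, "chicken")
seek_targets = [cue.start for cue in hits]  # times a player could seek to
```

A player could mark `seek_targets` on the timeline and seek to the chosen one when clicked.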

Jeroen: three things I am interested in:
1. preview thumbnails - make it a "kind"
2. chapters - can we put more than just text into chapters
3. how to put track inband into live streams

Preview Thumbnails

Proposals:

we're providing urls for images
or sprite them with media fragment URIs (YouTube, Hulu, Brightcove, JWPlayer)
base64 encoded images as cue content
Apple trickplay: provides offsets for I-frames

JW: Native implementation in players in browsers?

JW: lots of players are doing this nowadays

PJ: having the browser extract I-frames is not very useful

might get black frames at beginning
need to buffer all the video
prefer linking to images

SP: is the proposal to introduce a new @kind="thumbnails" track?

JW: possibly

PS: anything that goes into a img @src attribute

SP: do we need to do responsive images?

PJ: possibly - eventually

JN: we can easily extract the images from inband because we have offsets

I would like to introduce an inband version of a @kind="thumbnails" track

JN: let's make sure we introduce a kind="thumbnails" track asap because too many people need/use it

SP: are chapters with thumbnails the same as a thumbnail track?

JW: a thumbnail track will have a higher frequency of images

MD: Should the server just define an API for how to load the images?

JN: you want the images served ahead of time

MD: do ppl use <track> and <video>?

JW: smaller companies are starting to really get on the bandwagon

Summary:

we need a new @kind="thumbnails" track
should contain a URL that would be interpreted by an <img> element (or a <div> with the bg set?)
if you want a binary blob, just use a data URI
if you want to use a sprite, use media fragment URIs (good first use of spatial media fragment URIs; actually - not implemented yet)
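
The three payload forms in this summary (plain image URL, data URI, sprite with a spatial media fragment) could be told apart roughly as below. This is a sketch under assumptions: a "thumbnails" kind is only a proposal here, the function name `resolve_thumbnail` and the example URLs are hypothetical, and only the pixel form of `#xywh=` is handled.

```python
import re
from urllib.parse import urldefrag, urljoin

# Spatial media fragment in pixel units, e.g. "xywh=160,0,160,90".
XYWH = re.compile(r"^xywh=(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$")


def resolve_thumbnail(payload, vtt_url):
    """Turn a cue payload into (image_url, region).

    region is (x, y, w, h) for a sprite, or None for a whole image.
    Relative URLs are resolved against the URL of the VTT file.
    """
    payload = payload.strip()
    if payload.startswith("data:"):
        return payload, None  # inline binary blob, nothing to resolve
    url, fragment = urldefrag(urljoin(vtt_url, payload))
    match = XYWH.match(fragment)
    # Any fragment that is not a pixel xywh is dropped and treated as
    # "whole image" in this sketch.
    region = tuple(int(g) for g in match.groups()) if match else None
    return url, region


# Hypothetical location of the thumbnails file, for illustration only.
base = "http://example.com/video/thumbs.vtt"
sprite = resolve_thumbnail("sprite.jpg#xywh=160,0,160,90", base)
single = resolve_thumbnail("thumb1.jpg", base)
inline = resolve_thumbnail("data:image/png;base64,AAAA", base)
```

Since spatial media fragments were noted as not implemented yet, a script-based player would have to crop the sprite itself, e.g. with a <div> background offset by the region.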

Chapter Markers

JW: there is interest for more rich chapter content; in particular images

JW: we support this functionality where we have small text markers in the timeline

JW: Playlist with a title, text and image

PJ: could we just use a thumbnail track with a chapter track?

PJ: chapters have images that represent them - thumbnail tracks have images that are at that point in time

JN: if you click on the seek bar at a representative image, ideally you want to display the video from the point in time where the image is taken

PJ: I don't know how we deal with this in the browser in a native way

SP: I think we need a single "thumbnails" track that is always active

but the example that JW showed includes both a thumbnail and some descriptive text on top of the chapter title text

JW: I'd like to see a thumbnail be part of chapters; descriptive text not so much

Silvia: I think the display that JW has with images, title & descriptive text is a good example for a metadata track - it's displayed outside the video frame anyway

but if all browsers say that chapters always contain images, then we've obviously underspecified chapter tracks

JN: we always display thumbnails with chapter titles

PJ: if it's a real use case, maybe we should extend the spec

SP: should we include thumbnails into chapters?

PJ: let's experiment with thumbnail tracks first

general agreement that that type of track is more important
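
PJ's question of combining a thumbnail track with a chapter track could be prototyped as below: for each chapter, pick the thumbnail cue that is active at the chapter start. A rough sketch only; the tuple layouts, function names and sample data are hypothetical. Note it yields the image at that point in time, which is not necessarily a representative image for the chapter (the distinction PJ raised).

```python
from bisect import bisect_right


def thumbnail_at(thumbnails, t):
    """Return the image of the thumbnail cue active at time t, or None.

    `thumbnails` is a list of (start, end, image_url) sorted by start,
    with non-overlapping cues; a cue is active while start <= t < end.
    """
    starts = [start for start, _, _ in thumbnails]
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < thumbnails[i][1]:
        return thumbnails[i][2]
    return None


def chapters_with_images(chapters, thumbnails):
    """Pair each (start, end, title) chapter with the thumbnail at its start."""
    return [
        (start, title, thumbnail_at(thumbnails, start))
        for start, _, title in chapters
    ]


# Hypothetical sample data, for illustration only.
thumbnails = [(0, 10, "t0.jpg"), (10, 20, "t1.jpg"), (20, 30, "t2.jpg")]
chapters = [(0, 15, "Intro"), (15, 30, "Main part")]
menu = chapters_with_images(chapters, thumbnails)
```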

Inband tracks for live video

JW: how can we transport text tracks in live video?

in particular we have cue start times only with no end time - end times are determined by the start of the next cue; mostly for captions

SP: at one stage we had a proposal to introduce a special "NEXT" keyword for end times; then we can stream e.g. in-band in WebM

PS: and use blank cues to get breaks

JW: it's a good proposal

PJ: how do we deal with chapters in live streaming?

SP: do they even make sense?

PS: for DVR, yes

JW: workflow - ppl want their content available as fast as possible

PJ: what to do for thumbnails for DVR? What if there are already containers that have in-band thumbnails? We should support that.

JN: I don't know if there is a standardised thumbnail track in MP4

PJ: it seems weird to put a condensed version of the movie into the movie

MD: is there an action plan for live VTT?

Summary:

NEXT keyword instead of end time would enable support of live captioning
need to support live VTT - needs a spec change so that loading no longer blocks until all cues have been loaded
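
How a "NEXT" end time might be resolved on the receiving side can be sketched as follows. This assumes the semantics discussed above (a cue ends when the next cue starts, blank cues create breaks); "NEXT" is a proposal, not part of WebVTT, and the `resolve_next` name and sample cues are hypothetical.

```python
def resolve_next(cues):
    """Replace "NEXT" end times with the start time of the following cue.

    `cues` is a list of (start, end, text) in start order, where `end` is
    either a number of seconds or the string "NEXT". The last cue stays
    open-ended (end is None) until another cue arrives on the live stream.
    """
    resolved = []
    for i, (start, end, text) in enumerate(cues):
        if end == "NEXT":
            end = cues[i + 1][0] if i + 1 < len(cues) else None
        resolved.append((start, end, text))
    return resolved


# Hypothetical live caption cues, for illustration only.
live = [
    (0.0, "NEXT", "Good evening."),
    (2.5, "NEXT", "Here is the news."),
    (6.0, "NEXT", ""),  # blank cue: a break in the captions
    (9.0, "NEXT", "First, the weather."),
]
timeline = resolve_next(live)
```

Re-running `resolve_next` each time a new cue arrives closes the previously open-ended cue, so nothing has to wait for the end of the stream.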

Cross-domain VTT

VC: it seems like cross-domain loading of VTT is blocked in the spec, but no browser implements it
PJ: same-origin, cross-origin sandboxed, cross-origin DOM-exposed are the three available options
VC: yes, but nobody has implemented it

Summary:

need to discuss at W3C ?

Others

JN: on iOS, in-band metadata cues are not yet exposed to the browser
Sam: 2nd screen use cases; metadata in cues for DOM interaction
MD: live MPEG-DASH support ? We need data events to JS from text tracks from HLS.
Steve: Re-using Track / Cue infrastructure for advertising (VMAP).