├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.md
├── Makefile
├── README.md
├── dash-events-explainer.md
├── emsg-processing-model-figure1.png
├── emsg-processing-model-figure2.png
├── emsg-processing-model-figure3.png
├── emsg-processing-model.md
├── explainer.md
├── inband-events-using-datacue.png
├── inband-events-using-vttcue.png
├── index.bs
├── index.html
├── requirements.md
├── text-track-cue-constructor.md
└── w3c.json
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # Code of Conduct
2 |
3 | All documentation, code and communication under this repository are covered by the [W3C Code of Ethics and Professional Conduct](https://www.w3.org/Consortium/cepc/).
4 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Web Platform Incubator Community Group
2 |
3 | This repository is being used for work in the W3C Web Platform Incubator Community Group, governed by the [W3C Community License
4 | Agreement (CLA)](http://www.w3.org/community/about/agreements/cla/). To make substantive contributions,
5 | you must join the CG.
6 |
7 | If you are not the sole contributor to a contribution (pull request), please identify all
8 | contributors in the pull request comment.
9 |
10 | To add a contributor (other than yourself, that's automatic), mark them one per line as follows:
11 |
12 | ```
13 | +@github_username
14 | ```
15 |
16 | If you added a contributor by mistake, you can remove them in a comment with:
17 |
18 | ```
19 | -@github_username
20 | ```
21 |
22 | If you are making a pull request on behalf of someone else but you had no part in designing the
23 | feature, you can remove yourself with the above syntax.
24 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | All Reports in this Repository are licensed by Contributors
2 | under the
3 | [W3C Software and Document License](http://www.w3.org/Consortium/Legal/2015/copyright-software-and-document).
4 |
5 | Contributions to Specifications are made under the
6 | [W3C CLA](https://www.w3.org/community/about/agreements/cla/).
7 |
8 | Contributions to Test Suites are made under the
9 | [W3C 3-clause BSD License](https://www.w3.org/Consortium/Legal/2008/03-bsd-license.html)
10 |
11 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | SOURCEFILE=index.bs
2 | OUTPUTFILE=index.html
3 | PREPROCESSOR=bikeshed
4 | REMOTE_PREPROCESSOR_URL=https://api.csswg.org/bikeshed/
5 |
6 | all: $(OUTPUTFILE)
7 |
8 | $(OUTPUTFILE): $(SOURCEFILE)
9 | ifneq (,$(REMOTE))
10 | curl $(REMOTE_PREPROCESSOR_URL) -F file=@$(SOURCEFILE) > "$@"
11 | else
12 | $(PREPROCESSOR) -f spec "$<" "$@"
13 | endif
14 |
15 | clean:
16 | rm -f index.html
17 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Web Incubator CG For DataCue
2 |
3 | This is the repository for the DataCue proposal, a collaborative project hosted by the W3C Web Platform Incubator Community Group (WICG).
4 |
5 | The repo is used for developing documentation and code resources identified by the group.
6 |
7 | ## Proposals
8 |
9 | ### DataCue API
10 |
11 | * [Explainer](explainer.md)
12 |
13 | ### TextTrackCue enhancements for programmatic subtitle and caption presentation
14 |
15 | * [Explainer](https://github.com/WebKit/explainers/blob/main/texttracks/README.md)
16 |
17 | ### Expose TextTrackCue constructor
18 |
19 | * [Explainer](text-track-cue-constructor.md)
20 |
21 | ## References
22 |
23 | * [Draft Spec](https://wicg.github.io/datacue/)
24 | * [DataCue Explainer](explainer.md)
25 | * [Requirements for Media Timed Events](https://w3c.github.io/me-media-timed-events/) W3C Media & Entertainment Interest Group Note
26 | * [WICG Discourse Thread](https://discourse.wicg.io/t/media-timed-events-api-for-mpeg-dash-mpd-and-emsg-events/3096)
27 | * [Video Metadata Cues TPAC 2018 breakout summary](https://github.com/w3c/strategy/issues/113#issuecomment-432971265)
28 | * [Video Search with Location OGC TC Leuven 2019 breakout summary](https://github.com/w3c/sdw/issues/1130#issuecomment-508531749)
29 | * [Video Metadata for Moving Objects & Sensors TPAC 2020 breakout summary](https://github.com/w3c/sdw/issues/1194#issuecomment-718702993)
30 |
--------------------------------------------------------------------------------
/dash-events-explainer.md:
--------------------------------------------------------------------------------
1 | # Browser Handling of DASH Event Messages
2 |
3 | The aim of this proposal is to define browser-native handling of MPEG-DASH timed metadata events, which today are handled at the web application layer. MPEG-DASH `emsg` events are included in MPEG CMAF, which has emerged as the common media delivery format for both HLS and MPEG-DASH.
4 |
5 | The current approach for handling in-band event information, implemented by libraries such as [dash.js](https://github.com/Dash-Industry-Forum/dash.js/wiki) and [hls.js](https://github.com/video-dev/hls.js), is to parse the media segments in JavaScript to extract the events and construct `VTTCue` objects.
6 |
7 | On resource constrained devices such as smart TVs and streaming sticks, this leads to a significant performance penalty, which can have an impact on UI rendering updates if this is done on the UI thread (although we note the [proposal](https://github.com/wicg/media-source/blob/mse-in-workers-using-handle/mse-in-workers-using-handle-explainer.md) to make Media Source Extensions available to Worker contexts). There can also be an impact on the battery life of mobile devices. Given that the media segments will be parsed anyway by the user agent, parsing in JavaScript is an expensive overhead that could be avoided.
8 |
9 | Avoiding parsing in JavaScript is also important for low latency video streaming applications, where minimizing the time taken to pass media content through to the media element's playback buffer is essential.
10 |
11 | Instead of using `VTTCue`, a separate proposal introduces `DataCue` as a more appropriate cue API for timed metadata. See the [DataCue explainer](explainer.md) for details.
12 |
13 | ## Use cases
14 |
15 | Many of the use cases are described in the [DataCue explainer](explainer.md).
16 |
17 | ### Dynamic content insertion
18 |
19 | A media content provider wants to allow insertion of content, such as personalised video, local news, or advertisements, into a video media stream that contains the main program content. To achieve this, timed metadata is used to describe the points on the media timeline, known as splice points, where switching playback to inserted content is possible.
20 |
21 | [SCTE 35](https://scte-cms-resource-storage.s3.amazonaws.com/ANSI_SCTE-35-2019a-1582645390859.pdf) defines a data cue format for describing such insertion points. Use of these cues in MPEG-DASH streams is described in [SCTE 214-1](https://scte-cms-resource-storage.s3.amazonaws.com/Standards/ANSI_SCTE%20214-1%202016.pdf), [SCTE 214-2](https://scte-cms-resource-storage.s3.amazonaws.com/Standards/ANSI_SCTE%20214-2%202016.pdf), and [SCTE 214-3](https://scte-cms-resource-storage.s3.amazonaws.com/Standards/ANSI_SCTE%20214-3%202015.pdf). Use in HLS streams is described in SCTE-35 section 12.2.
22 |
23 | ### Media player control messages
24 |
25 | MPEG-DASH defines several control messages for media streaming clients (e.g., libraries such as [dash.js](https://github.com/Dash-Industry-Forum/dash.js/wiki)). Control messages exist for several scenarios, such as:
26 |
27 | * The media player should refresh or update its copy of the manifest document (MPD)
28 | * The media player should make an HTTP request to a given URL for analytics purposes
29 | * The media presentation will end at a time earlier than expected
30 |
31 | These messages may be carried as in-band `emsg` events in the media container files.
32 |
33 | ## Proposed API
34 |
35 | The proposed API is based on the existing [text track support](https://html.spec.whatwg.org/multipage/media.html#timed-text-tracks) in HTML and the [proposed `DataCue` API](explainer.md).
36 |
37 | > TODO: Add API summary
38 |
39 | As new `emsg` event types can be introduced from time to time, we propose to expose the raw binary `emsg` data for applications to parse. This avoids the need for browsers to natively understand the structure of the event messages.
40 |
41 | We will need to specify how to extract in-band timed metadata from the media container, and the structure in which the data is exposed via the `DataCue` interface. There are a couple of options for how to do this:
42 |
43 | 1. We could update the existing [Sourcing In-band Media Resource Tracks from Media Containers into HTML](https://dev.w3.org/html5/html-sourcing-inband-tracks/) spec.
44 |
45 | 2. We could produce a new set of specifications, following a registry approach with one specification per media format that describes the timed metadata details for that format, similar to the [Media Source Extensions Byte Stream Format Registry](https://www.w3.org/TR/mse-byte-stream-format-registry/). This could be based on [Sourcing In-band Media Resource Tracks from Media Containers into HTML](https://dev.w3.org/html5/html-sourcing-inband-tracks/).
46 |
47 | ## Code examples
48 |
49 | > TODO: Needs updating: show how to subscribe to specific event streams, show how to set the dispatch mode.
50 |
51 | ### Subscribing to receive in-band timed metadata cues
52 |
53 | This example shows how to add a `cuechange` handler that can be used to receive media-timed data and event cues.
54 |
55 | ```javascript
56 | const video = document.getElementById('video');
57 |
58 | video.textTracks.addEventListener('addtrack', (event) => {
59 | const textTrack = event.track;
60 |
61 | if (textTrack.kind === 'metadata') {
62 | textTrack.mode = 'hidden';
63 |
64 | // See cueChangeHandler examples below
65 | textTrack.addEventListener('cuechange', cueChangeHandler);
66 | }
67 | });
68 | ```
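
If a subscription API along the lines of the `subscribe()` method sketched in [emsg-processing-model.md](emsg-processing-model.md) is adopted, the registration above could also opt in to specific `emsg` schemes and choose a dispatch mode per scheme. The following is a sketch only: the `subscribe()` method, the dispatch mode strings, and the exact event type strings are assumptions from that sketch, not settled API.

```javascript
const video = document.getElementById('video');

video.textTracks.addEventListener('addtrack', (event) => {
  const textTrack = event.track;

  if (textTrack.kind === 'metadata') {
    textTrack.mode = 'hidden';

    // Hypothetical subscribe() API: opt in to DASH callback events,
    // delivered when the playhead reaches each event's start time,
    // and to SCTE-35 splice messages as soon as they are parsed.
    textTrack.subscribe('urn:mpeg:dash:event:callback:2015', 'onstart');
    textTrack.subscribe('urn:scte:scte35:2013:bin', 'onreceive');

    textTrack.addEventListener('cuechange', cueChangeHandler);
  }
});
```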
69 |
70 | ### MPEG-DASH callback event handler
71 |
72 | ```javascript
73 | const cueChangeHandler = (event) => {
74 | const metadataTrack = event.target;
75 | const activeCues = metadataTrack.activeCues;
76 |
77 | for (let i = 0; i < activeCues.length; i++) {
78 | const cue = activeCues[i];
79 |
80 | // The UA delivers parsed message data for this message type
81 | if (cue.type === 'urn:mpeg:dash:event:callback:2015' &&
82 | cue.value.emsgValue === '1') {
83 | const url = cue.value.data;
84 | fetch(url).then(() => { console.log('Callback completed'); });
85 | }
86 | }
87 | };
88 | ```
89 |
90 | ### SCTE-35 dynamic content insertion cue handler
91 |
92 | This example shows how a web application can handle [SCTE 35](https://scte-cms-resource-storage.s3.amazonaws.com/ANSI_SCTE-35-2019a-1582645390859.pdf) cues, both when the cues are parsed by the browser implementation and when they are parsed by the web application.
93 |
94 | ```javascript
95 | const cueChangeHandler = (event) => {
96 | const metadataTrack = event.target;
97 | const activeCues = metadataTrack.activeCues;
98 |
99 | for (let i = 0; i < activeCues.length; i++) {
100 | const cue = activeCues[i];
101 |
102 | if (cue.type === 'urn:scte:scte35:2013:bin') {
103 | // Parse the SCTE-35 message payload.
104 | // parseSCTE35Data() is similar to Comcast's scte35.js library,
105 | // adapted to take an ArrayBuffer as input.
106 | // https://github.com/Comcast/scte35-js/blob/master/src/scte35.ts
107 | const scte35Message = parseSCTE35Data(cue.value.data);
108 |
109 | console.log(cue.startTime, cue.endTime, scte35Message.tableId, scte35Message.spliceCommandType);
110 | }
111 | }
112 | };
113 | ```
114 |
115 | ### Cue enter/exit handlers
116 |
117 | This example shows how a web application can use the proposed new `addcue` event to attach `enter` and `exit` handlers to each cue on the metadata track.
118 |
119 | ```javascript
120 | // video.currentTime has reached the cue start time
121 | // through normal playback progression
122 | const cueEnterHandler = (event) => {
123 | const cue = event.target;
124 | console.log('cueEnter', cue.startTime, cue.endTime);
125 | };
126 |
127 | // video.currentTime has reached the cue end time
128 | // through normal playback progression
129 | const cueExitHandler = (event) => {
130 | const cue = event.target;
131 | console.log('cueExit', cue.startTime, cue.endTime);
132 | };
133 |
134 | // A cue has been parsed from the media container
135 | const addCueHandler = (event) => {
136 | const cue = event.cue;
137 |
138 | // Attach enter/exit event handlers
139 |   cue.onenter = cueEnterHandler;
140 | cue.onexit = cueExitHandler;
141 | };
142 |
143 | const video = document.getElementById('video');
144 |
145 | video.textTracks.addEventListener('addtrack', (event) => {
146 | const textTrack = event.track;
147 |
148 | if (textTrack.kind === 'metadata') {
149 | textTrack.mode = 'hidden';
150 |
151 | textTrack.addEventListener('addcue', addCueHandler);
152 | }
153 | });
154 | ```
155 |
156 | ## Considered alternatives
157 |
158 | > TODO
159 |
160 | ## References
161 |
162 | This explainer is based on content from a [Note](https://w3c.github.io/me-media-timed-events/) written by the W3C Media and Entertainment Interest Group, and from a number of associated discussions, including the [TPAC breakout session on video metadata cues](https://github.com/w3c/strategy/issues/113#issuecomment-432971265). It is also closely related to the DASH-IF [DASH Player's Application Events and Timed Metadata Processing Models and APIs](https://dashif-documents.azurewebsites.net/Events/master/event.html) document.
163 |
--------------------------------------------------------------------------------
/emsg-processing-model-figure1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WICG/datacue/dac8f4ece6f84aaf61f2dc6257d105319066a324/emsg-processing-model-figure1.png
--------------------------------------------------------------------------------
/emsg-processing-model-figure2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WICG/datacue/dac8f4ece6f84aaf61f2dc6257d105319066a324/emsg-processing-model-figure2.png
--------------------------------------------------------------------------------
/emsg-processing-model-figure3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WICG/datacue/dac8f4ece6f84aaf61f2dc6257d105319066a324/emsg-processing-model-figure3.png
--------------------------------------------------------------------------------
/emsg-processing-model.md:
--------------------------------------------------------------------------------
1 | # DASH inband event processing using MSE data model
2 |
3 | Iraj Sodagar, irajs@live.com
4 |
5 | Tencent America
6 |
7 | 2021-05-17
8 |
9 | ## Introduction
10 |
11 | This document provides an extended W3C [Media Source Extensions](https://w3c.github.io/media-source/) (MSE) model for the processing of DASH inband events.
12 |
13 | Note: The current MSE specification does not support the processing of inband events; this document presents just one possible illustrative design for how MSE could be extended to support such functionality.
14 |
15 | ## Dispatch modes
16 |
17 | See https://www.w3.org/TR/media-timed-events/#event-triggering
18 |
19 | For those events that the application has subscribed to receive, the API should:
20 |
21 | * Generate a DOM event when an in-band media timed event cue is parsed from the media container or media stream (DASH-IF **on-receive** mode).
22 | * Generate DOM events when the current media playback position reaches the start time and the end time of a media timed event cue during playback (DASH-IF **on-start** mode). This applies equally to cues generated by the user agent when parsed from the media container and cues added by the web application.
23 |
24 | In general, it is not possible for the UA to know which dispatch mode to use for any given event type, so we introduce the following API to let the web application tell the UA which mode to use:
25 |
26 | ```javascript
27 | enum DispatchMode {
28 | "onstart",
29 | "onreceive"
30 | };
31 |
32 | interface InbandEventTrack extends TextTrack {
33 | undefined subscribe(DOMString eventType, DispatchMode dispatchMode);
34 | undefined unsubscribe(DOMString eventType);
35 | };
36 | ```
37 |
38 | > TODO: How to construct `eventType` from emsg `scheme_id_uri` and `value`?
39 |
40 | > TODO: How to construct an `InbandEventTrack`? Some UAs already process inband events using `TextTrack` and create the `TextTrack` automatically.
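
Setting those open questions aside, the following sketch shows how a web application might use this interface; the event type strings are illustrative only, pending the `eventType` construction rule above.

```javascript
const video = document.getElementById('video');

video.textTracks.addEventListener('addtrack', (event) => {
  const track = event.track;
  if (track.kind !== 'metadata') return;

  // Opt in to SCTE-35 splice events, delivered as soon as they are parsed.
  track.subscribe('urn:scte:scte35:2013:bin', 'onreceive');

  // Opt in to DASH callback events, delivered at their start time.
  track.subscribe('urn:mpeg:dash:event:callback:2015', 'onstart');

  // Later, if the application no longer needs callback events:
  // track.unsubscribe('urn:mpeg:dash:event:callback:2015');
});
```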
41 |
42 | ## Process@append rule
43 |
44 | The process@append rule defines how the inband events of a segment are processed, i.e. parsed, and dispatched or scheduled for dispatch, at the time the segment is appended to the MSE [`SourceBuffer`](https://w3c.github.io/media-source/#sourcebuffer) using [`appendBuffer()`](https://www.w3.org/TR/media-source/#dom-sourcebuffer-appendbuffer).
45 |
46 | In the case of an inband event whose `eventType` has not been subscribed to by the web application:
47 |
48 | 1. Discard the event
49 |
50 | In the case of an inband event whose `eventType` has been subscribed to by the web application with the **onreceive** dispatch mode:
51 |
52 | 1. If the event end time is not smaller than the current playback position, and
53 | 2. If this event or an equivalent has not been dispatched before,
54 |
55 | Then the dispatcher dispatches the event immediately.
56 |
57 | In the case of an inband event whose `eventType` has been subscribed to by the web application with the **onstart** dispatch mode:
58 |
59 | 1. If the current playback position is not smaller than the event start time, and
60 | 2. If the current playback position is smaller than the event end time, and
61 | 3. If this event or an equivalent has not been dispatched before,
62 |
63 | Then the event is stored in a dispatch buffer for dispatching at the event start time.
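
The rule can be illustrated with the following sketch, written as UA-internal pseudocode in JavaScript. The dispatch buffer, the subscription map, the already-dispatched table, and the `dispatchNow()` helper are illustrative data structures, not part of any proposed API; the sketch treats an event as schedulable while the playback position has not yet passed its end time.

```javascript
// Illustrative only: apply the process@append rule to one parsed inband event.
function processAppendedEvent(event, subscriptions, dispatchBuffer,
                              alreadyDispatched, currentTime) {
  const subscription = subscriptions.get(event.eventType);
  if (!subscription) return;                    // Not subscribed: discard.
  if (alreadyDispatched.has(event.id)) return;  // Equivalent event already handled.

  if (subscription.dispatchMode === 'onreceive') {
    if (event.endTime >= currentTime) {
      dispatchNow(event);                       // Dispatch immediately.
      alreadyDispatched.add(event.id);
    }
  } else if (currentTime < event.endTime) {     // 'onstart'
    dispatchBuffer.add(event);                  // Schedule for dispatch at event.startTime.
  }
}
```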
64 |
65 | ## Dispatch buffer timing model
66 |
67 | Figure 1 demonstrates an inband event with the **onstart** dispatch mode relative to the MSE timing model.
68 |
69 |
70 | ![Media source and in-band event dispatch buffers](emsg-processing-model-figure1.png)
71 |
72 |
73 | Figure 1. Media source and in-band event dispatch buffers
74 |
75 | ## Implementation
76 |
77 | Figure 2 demonstrates an example of overlapping events with the **onstart** dispatch mode.
78 |
79 |
80 | ![Event buffer model example for onstart events](emsg-processing-model-figure2.png)
81 |
82 |
83 | Figure 2. Event buffer model example for **onstart** events
84 |
85 | As shown above, emsgs E0, E1, and E2 are mapped to the dispatch buffer. With the initial appending of the S1 media segment to the media buffer, the ranges between each event's start and end times are marked in the dispatch buffer for E0 and E1.
86 |
87 | When S2 is appended to the media buffer, since E2 overlaps with E1, the corresponding range in the dispatch buffer is divided into 3 subranges, as shown in the figure.
88 |
89 | Figure 3 demonstrates an example of an overwrite, in which the segment S2 is overwritten by a new segment S2' (that does not contain any emsgs) and has a duration that only covers a portion of S2 in the media buffer.
90 |
91 |
92 | ![Overwrite of a part of a segment with events having onstart dispatch mode](emsg-processing-model-figure3.png)
93 |
94 |
95 | Figure 3. Overwrite of a part of a segment with events having **onstart** dispatch mode
96 |
97 | As shown, since the event E2 has the **onstart** dispatch mode, its range in the dispatch buffer is unchanged.
98 |
99 | ## Algorithms
100 |
101 | ### Initialization
102 |
103 | 1. Application inputs to the DASH client:
104 |     1. Subscribe to event `scheme_id_uri`/`value` pairs
105 |     2. Provide the dispatch mode for each subscription
106 | 2. Event buffer initialization:
107 |     1. Event dispatch (the range of event purge may go beyond the media buffer)
108 | 3. Set the Presentation Time Offset (PTO)
109 |
110 | ### Append
111 |
112 | Add the following steps to the MSE [Segment Parser Loop](https://www.w3.org/TR/media-source/#sourcebuffer-segment-parser-loop), which is called from the [Buffer Append Algorithm](https://www.w3.org/TR/media-source/#sourcebuffer-buffer-append). After step 6.2., which describes handling of coded frames, add:
113 |
114 | 1. If the input buffer contains one or more __inband event messages__, then run the __inband event processing algorithm__.
115 |
116 | ### Inband Event Processing
117 |
118 | The __inband event processing algorithm__ is a new algorithm which we propose to add to MSE.
119 |
120 | When __inband event messages__ have been parsed by the segment parser loop then the following steps are run:
121 |
122 | 1. For each inband event message in the media segment run the following steps:
123 |     1. Parse the emsg
124 |     2. Generate the `eventType` from the emsg.scheme_id_uri and emsg.value
125 |     3. Look up the `eventType` in the `InbandEventTrack`'s list of subscribed event types
126 |         1. If not present, discard the emsg and abort these steps
127 |     4. Calculate the `startTime` and `endTime` values for the `DataCue`:
128 |         1. For emsg v0 (emsg.version = 0): startTime = segment_start + emsg.presentation_time_delta / emsg.timescale
129 |         2. For emsg v1 (emsg.version = 1): startTime = emsg.presentation_time / emsg.timescale
130 |         3. If emsg.duration is 0xFFFFFFFF then endTime = +Infinity, else endTime = startTime + emsg.duration / emsg.timescale
131 |     5. If there is an equivalent event message in the `InbandEventTrack`'s [list of text track cues](https://html.spec.whatwg.org/multipage/media.html#text-track-list-of-cues), discard the event message and abort these steps. An event message is equivalent if its `id`, `scheme_id_uri`, and `value` values are the same as those of an existing cue
132 |     6. Construct a `DataCue` instance with the following attributes:
133 |         1. startTime (as calculated above)
134 |         2. endTime (as calculated above)
135 |         3. id = emsg.id
136 |         4. pauseOnExit = false
137 |         5. type = emsg.scheme_id_uri
138 |         6. value = { data: emsg.message_data, emsgValue: emsg.value }
139 |     7. If the subscription's dispatch mode is **onreceive**, queue a task to fire an event named `addcue` at the `InbandEventTrack` object with the `cue` attribute initialized to the new `DataCue` object
140 |     8. If the subscription's dispatch mode is **onstart**, run the HTML [`addCue()` steps](https://html.spec.whatwg.org/multipage/media.html#dom-texttrack-addcue) with the new `DataCue` object
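
The timing calculation in step 4 can be written as follows. This is an illustrative sketch: field names follow the `emsg` naming used above, and `segmentStart` stands for the presentation start time of the containing media segment.

```javascript
// Compute DataCue startTime/endTime from a parsed emsg box.
function emsgCueTimes(emsg, segmentStart) {
  const startTime = emsg.version === 0
    ? segmentStart + emsg.presentation_time_delta / emsg.timescale
    : emsg.presentation_time / emsg.timescale;

  const endTime = emsg.duration === 0xFFFFFFFF
    ? Infinity
    : startTime + emsg.duration / emsg.timescale;

  return { startTime, endTime };
}
```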
141 |
142 | ### Dispatch
143 |
144 | > TODO: Firing of cue `enter` and `exit` for event messages in **onstart** dispatch mode is handled by the HTML [time marches on steps](https://html.spec.whatwg.org/multipage/media.html#time-marches-on)
145 |
146 | 1. Find the events occurring in the dispatch buffer at the playback position.
147 | 2. For each event
148 | 1. If its emsg.id is not in the "already-dispatched" table,
149 | 1. Dispatch the event
150 | 2. Add its emsg.id to the "already-dispatched" table
151 | 3. Remove the event from the dispatch buffer
152 | 2. Otherwise, remove the event from the dispatch buffer
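
These steps can be sketched as follows, reusing the illustrative `dispatchBuffer`, `alreadyDispatched`, and `dispatchNow()` structures from the process@append sketch above; `eventsAt()` is likewise a hypothetical helper.

```javascript
// Illustrative only: dispatch events whose range covers the playback position.
function dispatchAt(position, dispatchBuffer, alreadyDispatched) {
  for (const event of dispatchBuffer.eventsAt(position)) {
    if (!alreadyDispatched.has(event.id)) {
      dispatchNow(event);                // Fire the event to the application.
      alreadyDispatched.add(event.id);
    }
    dispatchBuffer.remove(event);        // Dispatched or duplicate: drop it.
  }
}
```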
153 |
154 | ### Purge
155 |
156 | > TODO: Purging is controlled by the web application, by calling [SourceBuffer.remove(startTime, endTime)](https://www.w3.org/TR/media-source/#dom-sourcebuffer-remove). The MSE [Range Removal](https://www.w3.org/TR/media-source/#sourcebuffer-range-removal) algorithm applies. Should this algorithm also remove cues that lie in the removed time range?
157 |
158 | In a purge operation, either a range from the start or a range from the end of the media buffer is purged. This range is referred to as the "purged-range" in this subclause.
159 |
160 | 1. If any event in the dispatch buffer overlaps with the purged-range
161 | 1. Split the event into two events around the purge-range boundary
162 | 2. Remove the purged-range from the dispatch buffer
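
A sketch of the purge handling, under the same illustrative structures; `eventsOverlapping()` is a hypothetical helper, and events are represented as plain objects so that the split produces two trimmed copies.

```javascript
// Illustrative only: trim dispatch-buffer events that overlap a purged range.
function purgeRange(purgeStart, purgeEnd, dispatchBuffer) {
  for (const event of dispatchBuffer.eventsOverlapping(purgeStart, purgeEnd)) {
    dispatchBuffer.remove(event);
    // Keep the parts of the event that lie outside the purged range.
    if (event.startTime < purgeStart) {
      dispatchBuffer.add({ ...event, endTime: purgeStart });
    }
    if (event.endTime > purgeEnd) {
      dispatchBuffer.add({ ...event, startTime: purgeEnd });
    }
  }
}
```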
163 |
--------------------------------------------------------------------------------
/explainer.md:
--------------------------------------------------------------------------------
1 | # DataCue
2 |
3 | DataCue is a proposed web API to allow support for timed metadata, i.e., metadata information that is synchronized to audio or video media.
4 |
5 | Timed metadata can be used to support use cases such as dynamic content replacement, ad insertion, or presentation of supplemental content alongside the audio or video, or more generally, making changes to a web page, or executing application code triggered from JavaScript events, at specific points on the media timeline of an audio or video media stream.
6 |
7 | ## Use cases
8 |
9 | The following sections describe some specific use cases in more detail.
10 |
11 | ### Lecture recording with slideshow
12 |
13 | An HTML page contains title and information about the course or lecture, and two frames: a video of the lecturer in one and their slides in the other. Each timed metadata cue contains the URL of the slide to be presented, and the cue is active for the time range over which the slide should be visible.
14 |
15 | ### Media stream with video and synchronized graphics
16 |
17 | A website wants to provide synchronized graphical elements that may be rendered next to or on top of a video.
18 |
19 | For example, in a talk show this could be a banner, shown in the lower third of the video, that displays the name of the guest. In a sports event, the graphics could show the latest lap times or current score, or highlight the location of the current active player. It could even be a full-screen overlay, to blend from one part of the program to another.
20 |
21 | The graphical elements are described in a stream or file containing cues that describe the start and end time of each graphical element, similar to a subtitle stream or file. The web application takes this data as input and renders it on top of the video image according to the cues.
22 |
23 | The purpose of rendering the graphical elements on the client device, rather than rendering them directly into the video image, is to allow the graphics to be optimized for the device's display parameters, such as aspect ratio and orientation. Another use case is adapting to user preferences, for localization or to improve accessibility.
24 |
25 | This use case requires frame accurate synchronization of the content being rendered over the video.
26 |
27 | ### Synchronized map animations
28 |
29 | A user records footage with metadata, including geolocation, on a mobile video device such as a drone or dashcam, to share on the web alongside a map, e.g., OpenStreetMap.
30 |
31 | WebVMT is an open format for metadata cues, synchronized with audio or video media, that can be used to drive an online map rendered in a separate HTML element alongside the media element on the web page. The media playhead position controls presentation and animation of the map, e.g., pan and zoom, and allows annotations to be added and removed, e.g., markers, at specified times during media playback. Control can also be overridden by the user with the usual interactive features of the map at any time, e.g., zoom. Concrete examples are provided by the [tech demos at the WebVMT website](http://webvmt.org/demos).
32 |
33 | ### Media metadata search results
34 |
35 | A user searches for online media matching certain metadata conditions, for example within a given distance of a geographic location or an acceleration profile corresponding to a traffic accident. Results are returned from a remote server using a RESTful API as a list in JSON format.
36 |
37 | It should be possible for search results to be represented as media in the user agent, with linked metadata presented as `DataCue` objects programmatically to provide a common interface within the client web browser. Further details are given in the video metadata search experiments, proposed in the [OGC](http://www.opengeospatial.org) Ideas GitHub, to return [frames](https://github.com/opengeospatial/ideas/issues/91) and [clips](https://github.com/opengeospatial/ideas/issues/92).
38 |
39 | > NOTE: Whether this use case requires any changes to the user agent or not is unclear without further investigation. If no changes are required, this capability should be demonstrated and the use case listed as a non-goal.
40 |
41 | ## Event delivery
42 |
43 | HTTP Live Streaming (HLS) and MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) are the two main adaptive streaming formats in use on the web today. The media industry is converging on the use of [MPEG Common Media Application Format (CMAF)](https://www.iso.org/standard/71975.html) as the common media delivery format. HLS, MPEG-DASH, and MPEG CMAF all support delivery of timed metadata, i.e., metadata information that is synchronized to the audio or video media.
44 |
45 | Both HLS and MPEG-DASH use a combination of encoded media files and manifest files that identify the available streams and their respective URLs.
46 |
47 | Some user agents (notably Safari and HbbTV devices) support adaptive streaming playback natively, rather than through use of [Media Source Extensions](https://www.w3.org/TR/media-source/). In these cases, we need the user agent to expose to web applications any timed metadata cues that are carried either in-band with the media (i.e., delivered within the audio or video media container or multiplexed with the media stream), or out-of-band via the manifest document.
48 |
49 | ## Proposed API
50 |
51 | The proposed API is based on the existing [text track support](https://html.spec.whatwg.org/multipage/media.html#timed-text-tracks) in HTML and WebKit's `DataCue`. It extends the [HTML5 `DataCue` API](https://www.w3.org/TR/2018/WD-html53-20181018/semantics-embedded-content.html#text-tracks-exposing-inband-metadata) with two attributes, `type` and `value`, which support non-text metadata and replace the existing `data` attribute. We also add a constructor that allows these fields to be initialized by web applications.
52 |
53 | ```webidl
54 | interface DataCue : TextTrackCue {
55 | constructor(double startTime, unrestricted double endTime, any value, optional DOMString type);
56 |
57 | // Propose to deprecate / remove this attribute.
58 | attribute ArrayBuffer data;
59 |
60 | // Proposed extensions.
61 | attribute any value;
62 | readonly attribute DOMString type;
63 | };
64 | ```
65 |
66 | `value`: Contains the message data, which may be in any arbitrary data structure.
67 |
68 | `type`: A string that can be used to identify the structure and content of the cue's `value`.
69 |
70 | ## User agent-generated DataCue instances
71 |
72 | Some user agents may automatically generate `DataCue` timed metadata cues while playing media. For example, WebKit supports several kinds of timed metadata in HLS streams, using the following `type` values:
73 |
74 | | Type | Purpose |
75 | | -------------------------- | ------------------- |
76 | | `com.apple.quicktime.udta` | QuickTime User Data |
77 | | `com.apple.quicktime.mdta` | QuickTime Metadata |
78 | | `com.apple.itunes` | iTunes metadata |
79 | | `org.mp4ra` | MPEG-4 metadata |
80 | | `org.id3` | ID3 metadata |
81 |
82 | Additional information about existing support in WebKit can be found in [the IDL](https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/html/track/DataCue.idl) and in [this layout test](https://trac.webkit.org/browser/webkit/trunk/LayoutTests/http/tests/media/track-in-band-hls-metadata.html), which loads various types of ID3 metadata from an HLS stream.
83 |
84 | This proposal does not seek to standardize UA-generated `DataCue` schemas, but the proposed API is intended to support this usage.
85 |
86 | Other proposals may be developed for this purpose, e.g., for the above or MPEG-DASH timed metadata events.
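
As an illustration (not part of this proposal), a page might observe such UA-generated cues as follows, assuming the UA surfaces them on a `metadata` text track; the `type` value shown is taken from the WebKit table above.

```javascript
const video = document.getElementById('video');

video.textTracks.addEventListener('addtrack', (event) => {
  const track = event.track;
  if (track.kind !== 'metadata') return;

  track.mode = 'hidden';
  track.addEventListener('cuechange', () => {
    for (let i = 0; i < track.activeCues.length; i++) {
      const cue = track.activeCues[i];
      // e.g., 'org.id3' cues carry ID3 frames from an HLS stream.
      if (cue.type === 'org.id3') {
        console.log('ID3 timed metadata', cue.value);
      }
    }
  });
});
```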
87 |
88 | ## Code examples
89 |
90 | ### Create an unbounded DataCue with geolocation data
91 |
92 | ```javascript
93 | const video = document.getElementById('video');
94 | const textTrack = video.addTextTrack('metadata');
95 | // Create a cue from 5 secs to end of media
96 | const data = { "moveto": { "lat": 51.504362, "lng": -0.076153 } };
97 | const cue = new DataCue(5.0, Infinity, data, 'org.webvmt');
98 | textTrack.addCue(cue);
99 | ```
100 |
101 | ### Create a DataCue from an in-band DASH 'emsg' box
102 |
103 | ```javascript
104 | // Parse the media segment to extract timed metadata cues
105 | // contained in DASH 'emsg' boxes
106 | function extractEmsgBoxes(mediaSegment) {
107 | // etc.
108 | }
109 |
110 | // video.currentTime has reached the cue start time
111 | // through normal playback progression
112 | const cueEnterHandler = (event) => {
113 | const cue = event.target;
114 | console.log('cueEnter', cue.startTime, cue.endTime);
115 | };
116 |
117 | // video.currentTime has reached the cue end time
118 | // through normal playback progression
119 | const cueExitHandler = (event) => {
120 | const cue = event.target;
121 | console.log('cueExit', cue.startTime, cue.endTime);
122 | };
123 |
124 | function createDataCues(events, textTrack) {
125 | events.forEach(event => {
126 | const cue = new DataCue(
127 | event.startTime,
128 | event.endTime,
129 | event.payload,
130 | event.schemeIdUri
131 | );
132 |
133 | // Attach enter/exit event handlers
134 |     cue.onenter = cueEnterHandler;
135 | cue.onexit = cueExitHandler;
136 |
137 | textTrack.addCue(cue);
138 | });
139 | }
140 |
141 | // Append the segment to the MSE SourceBuffer
142 | function appendSegment(segment) {
143 | // etc.
144 | sourceBuffer.appendBuffer(segment);
145 | }
146 |
147 | const video = document.getElementById('video');
148 | const textTrack = video.addTextTrack('metadata');
149 |
150 | // Fetch a media segment, parse and create DataCue instances,
151 | // and append the segment for playback using Media Source Extensions.
152 | fetch('/media-segments/12345.m4s')
153 | .then(response => response.arrayBuffer())
154 | .then(buffer => {
155 | const events = extractEmsgBoxes(buffer);
156 | createDataCues(events, textTrack)
157 |
158 | appendSegment(buffer);
159 | });
160 | ```
161 |
162 | ### Create a DataCue from a DASH MPD event
163 |
164 | > TODO: Add example code showing how a web application can construct `DataCue` objects with start and end times, event type, and data payload from a DASH MPD event, where the MPD is parsed by the web application
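
Pending that, the sketch below shows one possible shape. The MPD parsing itself is out of scope here: `mpdEvents` is assumed to be a list of plain objects produced by the application from an `EventStream` element's `Event` children, and the attribute names (`presentationTime`, `duration`, `messageData`, with `schemeIdUri` and `timescale` taken from the enclosing `EventStream`) follow the MPD event model.

```javascript
// Sketch: create DataCue objects from DASH MPD (out-of-band) events.
// periodStart is the containing Period's start time on the media timeline, in seconds.
function addMpdEventCues(mpdEvents, schemeIdUri, timescale, periodStart, textTrack) {
  mpdEvents.forEach((event) => {
    const startTime = periodStart + (event.presentationTime || 0) / timescale;
    const endTime = event.duration !== undefined
      ? startTime + event.duration / timescale
      : Infinity;

    const cue = new DataCue(startTime, endTime, event.messageData, schemeIdUri);
    textTrack.addCue(cue);
  });
}
```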
165 |
166 | ## Considered alternatives
167 |
168 | ### WebVTT metadata cues
169 |
170 | Web applications today can use WebVTT metadata cues (the [`VTTCue`](https://www.w3.org/TR/webvtt1/#vttcue) API) to schedule timed metadata events by serializing the data to a string format (JSON, for example) when creating the cue, and deserializing the data when the cue's `onenter` event is fired. Although this works in practice, `DataCue` avoids the need for the serialization/deserialization steps.
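
For comparison, a sketch of the two approaches for the same payload; the `com.example.splice` type string is hypothetical.

```javascript
const payload = { action: 'splice', id: 42 };

// Today: serialize the payload into a VTTCue's text field...
const vttCue = new VTTCue(10.0, 15.0, JSON.stringify(payload));
vttCue.onenter = () => {
  const data = JSON.parse(vttCue.text); // ...and deserialize it on entry.
  console.log(data.action, data.id);
};

// With DataCue: store the object directly, identified by a type string.
const dataCue = new DataCue(10.0, 15.0, payload, 'com.example.splice');
dataCue.onenter = () => {
  console.log(dataCue.value.action, dataCue.value.id);
};
```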
171 |
172 | `DataCue` is also semantically consistent with timed metadata use cases, whereas `VTTCue` is designed for subtitles and captions. `VTTCue` carries a lot of API surface related to caption layout and presentation that is not relevant to timed metadata cues.
173 |
174 | ## Event synchronization
175 |
176 | The Media Timed Events Task Force of the Media and Entertainment Interest Group has also [identified requirements for synchronization accuracy of event triggering](https://w3c.github.io/me-media-timed-events/#synchronization), which suggest changes to the [time marches on](https://html.spec.whatwg.org/multipage/media.html#time-marches-on) steps in HTML. These will be followed up separately to this `DataCue` proposal.
177 |
178 | ## References
179 |
180 | This explainer is based on content from a [Note](https://w3c.github.io/me-media-timed-events/) written by the W3C Media and Entertainment Interest Group, and from a number of associated discussions, including the [TPAC breakout session on video metadata cues](https://github.com/w3c/strategy/issues/113#issuecomment-432971265). It is also closely related to the DASH-IF [DASH Player's Application Events and Timed Metadata Processing Models and APIs](https://dashif-documents.azurewebsites.net/Events/master/event.html) document.
181 |
182 | ## Acknowledgements
183 |
184 | Thanks to Eric Carlson, François Daoust, Charles Lo, Nigel Megitt, Jon Piesing, Rob Smith, and Mark Vickers for their contribution and input to this document.
185 |
--------------------------------------------------------------------------------
/inband-events-using-datacue.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WICG/datacue/dac8f4ece6f84aaf61f2dc6257d105319066a324/inband-events-using-datacue.png
--------------------------------------------------------------------------------
/inband-events-using-vttcue.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WICG/datacue/dac8f4ece6f84aaf61f2dc6257d105319066a324/inband-events-using-vttcue.png
--------------------------------------------------------------------------------
/index.bs:
--------------------------------------------------------------------------------
1 |
2 | Title: DataCue API
3 | Shortname: datacue
4 | Level: 1
5 | Status: CG-DRAFT
6 | ED: https://wicg.github.io/datacue/
7 | Group: WICG
8 | Repository: WICG/datacue
9 | Editor: Chris Needham, BBC https://www.bbc.co.uk, chris.needham@bbc.co.uk
10 | Abstract: This document describes an API that allows web pages to associate
11 | arbitrary timed data with audio or video media resources, and for exposing
12 | timed data from media resources to web pages.
13 | !Participate: Git Repository.
14 | !Participate: File an issue.
15 | !Version History: https://github.com/WICG/datacue/commits
16 |
17 |
18 | # Introduction # {#introduction}
19 |
20 | *This section is non-normative*
21 |
22 | Media resources often contain one or more media-resource-specific tracks
23 | containing data that browsers don't render, but that should be exposed to
24 | script so that web applications can handle it.
25 |
26 | TODO: ...
27 |
28 | # Security and privacy considerations # {#security-and-privacy}
29 |
30 | *This section is non-normative.*
31 |
32 | TODO: ...
33 |
34 | # API # {#api}
35 |
36 | ## The DataCue interface ## {#datacue-interface}
37 |
38 |
39 | [Exposed=Window]
40 | interface DataCue : TextTrackCue {
41 | constructor(double startTime, unrestricted double endTime,
42 | any value, optional DOMString type);
43 | attribute any value;
44 | readonly attribute DOMString type;
45 | };
46 |
47 |
48 | # In-band event mappings # {#in-band-event-mappings}
49 |
50 | The following sections describe how various in-band message formats are mapped to the {{DataCue}} API.
51 |
52 | ## MPEG-DASH emsg ## {#mpeg-dash-emsg}
53 |
54 | The emsg data structure is defined in section 5.10.3.3 of [[!MPEGDASH]]. Use of emsg within CMAF media is defined in section 7.4.5 of [[!MPEGCMAF]].
55 |
56 | There are two versions in use, version 0 and 1:
57 |
58 |
--------------------------------------------------------------------------------
/requirements.md:
--------------------------------------------------------------------------------
1 | # DataCue API Requirements
2 |
3 | ## Introduction
4 |
5 | There is a need in the media industry for an API to support arbitrary data associated with points in time or periods of time in a continuous media (audio or video) presentation. This data may include:
6 |
7 | * Metadata that describes the content in some way, such as program or chapter titles or geolocation information; this is often referred to as timed metadata and is used to drive interactive media experiences
8 | * Control messages for the media player that are expected to take effect at specific times during media playback, such as ad insertion cues
9 |
10 | This document presents the use cases and technical requirements for an API that supports media-timed metadata and event cues.
11 |
12 | ## Use cases
13 |
14 | ### Dynamic content insertion
15 |
16 | A media content provider wants to allow insertion of content, such as personalised video, local news, or advertisements, into a video media stream that contains the main program content. To achieve this, timed metadata is used to describe the points on the media timeline, known as splice points, where switching playback to inserted content is possible.
17 |
18 | [SCTE 35](https://scte-cms-resource-storage.s3.amazonaws.com/ANSI_SCTE-35-2019a-1582645390859.pdf) defines a data cue format for describing such insertion points. Use of these cues in MPEG-DASH streams is described in [SCTE 214-1](https://scte-cms-resource-storage.s3.amazonaws.com/Standards/ANSI_SCTE%20214-1%202016.pdf), [SCTE 214-2](https://scte-cms-resource-storage.s3.amazonaws.com/Standards/ANSI_SCTE%20214-2%202016.pdf), and [SCTE 214-3](https://scte-cms-resource-storage.s3.amazonaws.com/Standards/ANSI_SCTE%20214-3%202015.pdf). Use in HLS streams is described in SCTE-35 section 12.2.
19 |
20 | ### Media player control messages
21 |
22 | MPEG-DASH defines several control messages for media streaming clients (e.g., libraries such as [dash.js](https://github.com/Dash-Industry-Forum/dash.js/wiki)). Control messages exist for several scenarios, such as:
23 |
24 | * The media player should refresh or update its copy of the manifest document (MPD)
25 | * The media player should make an HTTP request to a given URL for analytics purposes
26 | * The media presentation will end at a time earlier than expected
27 |
28 | These messages may be carried as in-band `emsg` events in the media container files.
29 |
30 | ### Media stream with video and synchronized graphics
31 |
32 | A content provider wants to provide synchronized graphical elements that may be rendered next to or on top of a video.
33 |
34 | For example, in a talk show this could be a banner, shown in the lower third of the video, that displays the name of the guest. In a sports event, the graphics could show the latest lap times or current score, or highlight the location of the current active player. It could even be a full-screen overlay, to blend from one part of the program to another.
35 |
36 | The graphical elements are described in a stream or file containing cues that describe the start and end time of each graphical element, similar to a subtitle stream or file. The web application takes this data as input and renders it on top of the video image according to the cues.
37 |
38 | The purpose of rendering the graphical elements on the client device, rather than rendering them directly into the video image, is to allow the graphics to be optimized for the device's display parameters, such as aspect ratio and orientation. Another use case is adapting to user preferences, for localization or to improve accessibility.
39 |
40 | This use case requires frame accurate synchronization of the content being rendered over the video.
41 |
42 | ## Limitations of existing solutions
43 |
44 | Today, most media player libraries include support for timed metadata. Support varies between players, with some supporting only HLS timed metadata, e.g., [JWPlayer](https://www.jwplayer.com/html5-video-player/), others having support for DASH `emsg` boxes, such as [DASH.js](https://github.com/Dash-Industry-Forum/dash.js) and some that support both, e.g., [Shaka Player](https://github.com/google/shaka-player/).
45 | [Video.js](https://github.com/videojs/video.js) can be used with [mux.js](https://github.com/videojs/mux.js#metadata) to parse in-band timed metadata and captions.
46 |
47 | ### Processing efficiency
48 |
49 | On resource constrained devices such as smart TVs and streaming sticks, parsing media segments in JavaScript to extract timed metadata or event information leads to a significant performance penalty, which can have an impact on UI rendering updates if this is done on the UI thread. There can also be an impact on the battery life of mobile devices. Given that the media segments will be parsed anyway by the user agent, parsing in JavaScript is an expensive overhead that could be avoided.
50 |
51 | ### Low latency streaming
52 |
53 | Avoiding parsing in JavaScript is important for low latency video streaming applications, where it's important to minimize the time taken to pass media content through to the media element's playback buffer.
54 |
55 | If the proposed Media Source Extensions `appendStream` method (see [GitHub issue](https://github.com/w3c/media-source/issues/14)) is used to deliver media content directly from a Fetch API response to the playback buffer, application level parsing of the timed metadata or `emsg` boxes adds unnecessary delay.
56 |
57 | ## Requirements
58 |
59 | ### Subscribing to receive media timed event cues
60 |
61 | The API should allow web applications to subscribe to receive specific types of media timed event cue. For example, to support MPEG-DASH emsg and MPD events, the cue type is identified by a combination of the `scheme_id_uri` and (optional) `value`. The purpose of this is to make receiving cues of each type opt-in from the application's point of view. The user agent should deliver only those cues to a web application for which the application has subscribed. The API should also allow web applications to unsubscribe from specific cue types.
62 |
63 | ### Out-of-band events
64 |
65 | To be able to handle out-of-band media timed event cues, including MPEG-DASH MPD events, the API should allow web applications to create and add timed data cues to the media timeline, to be triggered by the user agent. The API should allow the web application to provide all necessary parameters to define the cue, including start and end times, cue type identifier, and data payload. The payload should be any data type.
66 |
67 | ### Event triggering
68 |
69 | For those events that the application has subscribed to receive, the API should:
70 |
71 | * Generate a DOM event when an in-band media timed event cue is parsed from the media container or media stream (DASH-IF _on-receive_ mode).
72 | * Generate DOM events when the current media playback position reaches the start time and the end time of a media timed event cue during playback (DASH-IF _on-start_ mode). This applies equally to cues generated by the user agent when parsed from the media container and cues added by the web application.
73 |
74 | The API should provide guarantees that no media timed event cues can be missed during linear playback of the media.
75 |
76 | ### MPEG-DASH events
77 |
78 | Implementations should support MPEG-DASH `emsg` in-band events and MPD out-of-band events, as part of their support for the MPEG Common Media Application Format (CMAF).
79 |
80 | ### Cues with unbounded duration
81 |
82 | Implementations should support media timed event cues with unknown end time, where the cue is active from its start time to the end of the media stream.
83 |
84 | ### Updating media timed event cues
85 |
86 | The API should allow media timed event cue information to be updated, such as an event's position on the media timeline, and its data payload. Where the media timed event is updated by the user agent, such as for in-band events, we recommend that the API allows the web application to be notified of any changes.
87 |
88 | ### Synchronization
89 |
90 | In order to achieve synchronization accuracy between media playback and web content rendered by a web application, media timed event cue `enter` and `exit` events should be delivered to the web application within 20 milliseconds of their positions on the media timeline.
91 |
92 | Additionally, to allow such synchronization to happen at frame boundaries, we recommend introducing a mechanism that would allow a web application to accurately predict, using the user's wall clock, when the next frame will be rendered (e.g., as done in the Web Audio API).
93 |
--------------------------------------------------------------------------------
/text-track-cue-constructor.md:
--------------------------------------------------------------------------------
1 | ## Explainer
2 |
3 | This page explains the motivation for the [proposal to expose TextTrackCue constructor in the web interface](https://github.com/WICG/datacue/issues/35).
4 |
5 | ### TextTrackCue History
6 |
7 | [VTTCue](https://www.w3.org/TR/webvtt1/#the-vttcue-interface) provides timed text support for video files on the web. This API is extended from [TextTrackCue](https://html.spec.whatwg.org/multipage/media.html#texttrackcue) which is widely supported in modern browsers.
8 |
9 | 
10 | [Screenshot from caniuse.com/texttrackcue](https://caniuse.com/texttrackcue)
11 |
12 | [DataCue](https://wicg.github.io/datacue/#datacue-interface) was proposed to provide equivalent support for timed metadata and is also extended from TextTrackCue. DataCue was implemented and matured in Apple's WebKit, though the feature was subsequently dropped, in accordance with W3C rules, because only a single browser had implemented the API.
13 |
14 | ### DataCue Design
15 |
16 | [DataCue](https://wicg.github.io/datacue/#datacue-interface) implements a simple interface with `type` and `value` attributes which represent the cue type and cue content respectively. Any form of timed metadata can be stored in `value` and this is identified using `type` so that relevant cue content can be accessed quickly and easily.
17 |
18 | ### TextTrackCue Proposal
19 |
20 | [TextTrackCue is an abstract base class](https://developer.mozilla.org/en-US/docs/Web/API/TextTrackCue) for all types of cue, which means that it is designed to be extended. Hence it does not directly specify any attributes related to cue content and expects these attributes to be defined by the programmer in the extended cue class. VTTCue and DataCue are both examples of extended cue classes which inherit the properties of TextTrackCue.
21 |
22 | However, a user-defined cue extension is not currently possible in JavaScript: the extended cue's constructor cannot call the TextTrackCue constructor, because that constructor is not exposed by the web interface.
23 |
24 | #### Extended Cue Example
25 | ````
26 | // define extended cue class
27 | class MyExtendedCue extends TextTrackCue {
28 | myCueContent; // cue content
29 |
30 | // extend constructor
31 | constructor(startTime, endTime, cueContent) {
32 | super(startTime, endTime); // inherit properties from TextTrackCue
33 | console.log('Cue start at ' + this.startTime + ', end at ' + this.endTime);
34 |
35 | this.myCueContent = cueContent; // set cue content
36 | }
37 | }
38 |
39 | // create an extended cue
40 | const cue = new MyExtendedCue(0, 1, {hello: 'extended-cue'});
41 | ````
42 | Permitting this `super` call in the web interface would enable custom cue extensions to be written in JavaScript and make this widely-implemented feature accessible to the web community.
43 |
44 | ### Comparison With DataCue
45 |
46 | Custom cue extensions are functionally equivalent to the DataCue API design:
47 | * The extended cue class name is equivalent to `DataCue.type`.
48 | * The cue content defined by the extended cue class is equivalent to `DataCue.value`.
49 |
50 | In addition, an extended cue can define class functions which are not explicitly included in the DataCue API design.
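
For example, the `MyExtendedCue` class above could be given a method alongside its content; the `describe()` helper here is purely illustrative.

````
// Extending the example above with a class function
class MyExtendedCue extends TextTrackCue {
  constructor(startTime, endTime, cueContent) {
    super(startTime, endTime); // requires the proposed TextTrackCue constructor
    this.myCueContent = cueContent;
  }

  // A class function with no equivalent in the DataCue API design
  describe() {
    return 'Cue from ' + this.startTime + ' to ' + this.endTime + ': ' +
      JSON.stringify(this.myCueContent);
  }
}

const cue = new MyExtendedCue(0, 1, {hello: 'extended-cue'});
console.log(cue.describe());
````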
51 |
52 | The change required to enable this functionality is very simple and the potential benefit to the web community is significant. As a result, browser implementers are more likely to adopt the proposed change.
53 |
54 | ### Summary
55 |
56 | This proposal yields equivalent functionality to the DataCue API and addresses the challenge that caused the previous DataCue feature to be dropped.
57 |
58 | ## Demos
59 |
60 | Example code has been created to test and demonstrate how custom cue extensions can be supported in web browsers if [this proposal](https://github.com/WICG/datacue/issues/35) is accepted.
61 |
62 | ### Custom Cues Demo
63 |
64 | In this demo:
65 |
66 | 1. Two user-defined custom cues are extended from TextTrackCue:
67 | * Countdown cue contains a number;
68 | * Colour cue contains an object with `foreground` and `background` attributes.
69 | 1. A mixture of `CountdownCue` and `ColourCue` cues are created.
70 | 1. Event listeners are added to `enter` and `exit` events for each cue.
71 | 1. A TextTrack of `kind='metadata'` is attached to the `