├── .gitattributes ├── .gitignore ├── CONTRIBUTING.md ├── LICENSE.md ├── README.md ├── docs-src ├── .eleventy.js ├── _layout │ ├── htmleton.njk │ ├── page.njk │ └── spec.njk ├── convert-smil │ ├── index.html │ └── parse-smil.js ├── css │ ├── page.css │ └── spec-extra.css ├── demos │ └── raven │ │ ├── .gitattributes │ │ ├── index.html │ │ ├── line.vtt │ │ ├── raven.m4a │ │ ├── readme.md │ │ ├── stanza.vtt │ │ ├── style.css │ │ ├── sync.js │ │ └── word.vtt ├── markdown.js ├── package-lock.json ├── package.json └── pages │ ├── caveats.md │ ├── explainer.md │ ├── index.md │ ├── pages.json │ ├── sync-media-lite.md │ └── use-cases.md ├── docs ├── caveats.html ├── convert-smil │ ├── index.html │ └── parse-smil.js ├── css │ ├── page.css │ └── spec-extra.css ├── explainer.html ├── index.html ├── sync-media-lite.html └── use-cases.html ├── drafts ├── addl-reqs.md ├── functional-requirements.md ├── manifest-exts-multi-granular.md ├── manifest-exts.md ├── readium2.md ├── schema │ ├── README.md │ ├── sync-media-narration.sample.json │ └── sync-media-narration.schema.json ├── sync-narr-ideas.md ├── technologies.md ├── technology-candidates.md ├── technology-selection.md ├── use-cases.md ├── web-proposal.md └── xml-json.html ├── older-experiments ├── player-bkup │ ├── audio.js │ ├── controls.js │ ├── css │ │ ├── base.css │ │ ├── controls.css │ │ ├── player.css │ │ └── pub-default.css │ ├── events.js │ ├── iframe.js │ ├── index.html │ ├── index.js │ ├── narrator.js │ └── utils.js ├── synclib │ ├── build.sh │ ├── build │ │ └── synclib.js │ ├── rollup.config.js │ ├── src │ │ ├── index.js │ │ ├── ingestXml.js │ │ ├── notes.txt │ │ ├── process.js │ │ ├── syncMedia.js │ │ ├── timegraph.js │ │ └── utils.js │ └── tests │ │ ├── files │ │ └── standalone │ │ │ ├── complex.xml │ │ │ ├── longer-video-clips.xml │ │ │ ├── no-media.xml │ │ │ ├── partial-no-media.xml │ │ │ ├── roles.xml │ │ │ ├── simple-plus-tracks.xml │ │ │ └── simple.xml │ │ ├── run-tests.html │ │ └── test │ │ └── test.syncmedia.parsing.js └── visualizer │ ├── index.html │ └── runner.js ├── other-work ├── json │ ├── explainer.html │ ├── incorporating-synchronized-narration.html │ ├── index.html │ ├── synchronized-narration.html │ └── usecases.html ├── smil-sources │ ├── design-principles.md │ ├── examples.md │ ├── explainer.md │ ├── including-in-html.md │ ├── incorporating-into-pubmanifest.md │ ├── index.md │ ├── smil.json │ ├── standalone-packaging.md │ ├── sync-media-with-json.md │ ├── sync-media.md │ └── use-cases.md └── smil │ ├── design-principles.html │ ├── examples.html │ ├── explainer.html │ ├── including-in-html.html │ ├── incorporating-into-pubmanifest.html │ ├── index.html │ ├── standalone-packaging.html │ ├── sync-media-with-json.html │ ├── sync-media.html │ └── use-cases.html └── w3c.json /.gitattributes: -------------------------------------------------------------------------------- 1 | docs-src/demos/raven/raven.mp3 filter=lfs diff=lfs merge=lfs -text 2 | examples/emily-dickinson-poems-series-1/poems_series1_0_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 3 | examples/emily-dickinson-poems-series-1/poems_series1_2_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 4 | older-experiments/player/public/fairytale/out.mp3 filter=lfs diff=lfs merge=lfs -text 5 | docs/demos/raven/raven.mp3 filter=lfs diff=lfs merge=lfs -text 6 | examples/emily-dickinson-poems-series-1/poems_series1_1_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 7 | examples/emily-dickinson-poems-series-1/poems_series1_3_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 
8 | examples/emily-dickinson-poems-series-1/poems_series1_4_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 9 | older-experiments/player/public/fairytale/Harp.mp3 filter=lfs diff=lfs merge=lfs -text 10 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | node_modules 3 | demo-src/player-bkup 4 | notes 5 | .DS_Store 6 | examples 7 | .vscode/launch.json 8 | player 9 | localhost* 10 | notes-other.md 11 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # W3C Synchronized Multimedia for Publications Community Group 2 | 3 | This repository is being used for work in the W3C Synchronized Multimedia for Publications Community Group, governed by the [W3C Community License Agreement (CLA)](https://www.w3.org/community/about/agreements/cla/). To make substantive contributions, you must join the CG. 4 | 5 | If you are not the sole contributor to a contribution (pull request), please identify all 6 | contributors in the pull request comment. 7 | 8 | To add a contributor (other than yourself, that's automatic), mark them one per line as follows: 9 | 10 | ``` 11 | +@github_username 12 | ``` 13 | 14 | If you added a contributor by mistake, you can remove them in a comment with: 15 | 16 | ``` 17 | -@github_username 18 | ``` 19 | 20 | If you are making a pull request on behalf of someone else but you had no part in designing the 21 | feature, you can remove yourself with the above syntax. -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | All Reports in this Repository are licensed by Contributors 2 | under the 3 | [W3C Software and Document License](https://www.w3.org/Consortium/Legal/2015/copyright-software-and-document). 4 | 5 | Contributions to Specifications are made under the 6 | [W3C CLA](https://www.w3.org/community/about/agreements/cla/). 7 | 8 | Contributions to Test Suites are made under the 9 | [W3C 3-clause BSD License](https://www.w3.org/Consortium/Legal/2008/03-bsd-license.html) -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Synchronized Multimedia for Publications 2 | 3 | This is the repository for ongoing work initiated by the [W3C Synchronized Multimedia for Publications Community Group](https://www.w3.org/community/sync-media-pub/), and 4 | then transferred to the [Publishing Maintenance Working Group](https://www.w3.org/groups/wg/pm/) for further 5 | incubation. 6 | 7 | The goal is to explore and develop media synchronization techniques compatible with publishing formats on the web, 8 | including [Audiobooks](https://www.w3.org/TR/audiobooks/), [EPUB](https://www.w3.org/publishing/groups/epub-wg/), and standalone [HTML](https://www.w3.org/html/). 9 | 10 | ## What's currently available 11 | 12 | * [Documents homepage](https://w3c.github.io/sync-media-pub) 13 | 14 | Updated December 2024. 
-------------------------------------------------------------------------------- /docs-src/.eleventy.js: -------------------------------------------------------------------------------- 1 | import { AllHtmlEntities } from 'html-entities'; 2 | import prettier from "prettier"; 3 | import path from "path"; 4 | import { default as markdown } from './markdown.js'; 5 | 6 | function init(eleventyConfig) { 7 | eleventyConfig.setLibrary("md", markdown()); 8 | eleventyConfig.addPassthroughCopy({"css": "css"}); 9 | eleventyConfig.addPassthroughCopy({"convert-smil": "convert-smil"}); 10 | 11 | eleventyConfig.addFilter('dump', obj => { 12 | return util.inspect(obj) 13 | }); 14 | eleventyConfig.addFilter('json', value => { 15 | const jsonString = JSON.stringify(value, null, 4).replace(/ { 20 | const entities = new AllHtmlEntities(); 21 | return `
${entities.encode(content)}
`; 22 | }); 23 | 24 | eleventyConfig.addTransform("prettier", function (content, outputPath) { 25 | const extname = path.extname(outputPath); 26 | switch (extname) { 27 | case ".html": 28 | case ".json": 29 | // Strip leading period from extension and use as the Prettier parser. 30 | const parser = extname.replace(/^./, ""); 31 | return prettier.format(content, { parser, tabWidth: 4 }); 32 | 33 | default: 34 | return content; 35 | } 36 | }); 37 | if (process.env.HTTPS) { 38 | eleventyConfig.setServerOptions({ 39 | https: { 40 | key: "./localhost.key", 41 | cert: "./localhost.cert", 42 | }, 43 | showVersion: true, 44 | }); 45 | } 46 | 47 | 48 | return { 49 | dir: { 50 | input: "pages", 51 | output: "../docs", 52 | includes: "../_layout" 53 | } 54 | }; 55 | }; 56 | 57 | export default init; -------------------------------------------------------------------------------- /docs-src/_layout/htmleton.njk: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Synchronized Media for Publications CG: {{title}} 5 | 6 | 7 | {%- block head -%} 8 | {%- endblock -%} 9 | 10 | 11 | {%- block body -%} 12 | {%- endblock -%} 13 | 14 | -------------------------------------------------------------------------------- /docs-src/_layout/page.njk: -------------------------------------------------------------------------------- 1 | {%- extends 'htmleton.njk' -%} 2 | {%- block head -%} 3 | 4 | {%- endblock -%} 5 | {%- block body -%} 6 |
7 |

{{title}}

8 |

Last updated {{ page.date.toDateString() }}

9 | {{ content | safe }} 10 |
11 | {%- endblock -%} 12 | -------------------------------------------------------------------------------- /docs-src/_layout/spec.njk: -------------------------------------------------------------------------------- 1 | {%- extends 'htmleton.njk' -%} 2 | {%- block head -%} 3 | 4 | 5 | 26 | {%- endblock -%} 27 | {%- block body -%} 28 | {{ content | safe }} 29 | {%- endblock -%} 30 | -------------------------------------------------------------------------------- /docs-src/convert-smil/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Convert Media Overlays to SyncMediaLite 5 | 6 | 7 | 47 | 48 | 49 |

Convert Media Overlays to SyncMediaLite

50 | 51 |

Paste SMIL below and press

52 | 53 | 54 | 55 | 56 |
57 |
58 | 59 | 60 |
61 | 62 |
63 | 64 | 65 |
66 |
67 | 68 | 150 | 151 | -------------------------------------------------------------------------------- /docs-src/convert-smil/parse-smil.js: -------------------------------------------------------------------------------- 1 | // smil is an xml string 2 | export function parseSmil(smil) { 3 | if (!smil || smil.trim() == '') { 4 | throw new Error("Bad input"); 5 | } 6 | let smilModel = parse(smil); 7 | let smilPars = visit(smilModel.body, accumulatePars, []); 8 | smilPars = smilPars.filter(item => item != null); 9 | return smilPars; 10 | } 11 | // convert to a list of TextTrackCues 12 | export function convertToTextTrackCues(smilPars) { 13 | let audioUrl = ''; 14 | let startOffset = 0; 15 | let endOffset = 0; 16 | if (smilPars.length > 0) { 17 | let firstAudio = smilPars[0].media.find(item => item.type == 'audio'); 18 | if (firstAudio) { 19 | startOffset = firstAudio.clipBegin; 20 | } 21 | let lastAudio = smilPars.reverse()[0].media.find(item => item.type == 'audio'); 22 | if (lastAudio) { 23 | endOffset = lastAudio.clipEnd; 24 | } 25 | else { 26 | console.error("Could not process SMIL"); 27 | return null; 28 | } 29 | smilPars.reverse(); // unreverse them 30 | } 31 | else { 32 | console.error("Could not process SMIL"); 33 | return null; 34 | } 35 | 36 | let cues = smilPars.map(item => { 37 | let audio = item.media.find(media => media.type == 'audio'); 38 | let text = item.media.find(media => media.type == 'text'); 39 | return new VTTCue( 40 | parseFloat(audio.clipBegin), 41 | parseFloat(audio.clipEnd), 42 | JSON.stringify({selector: {type:"FragmentSelector",value: text.src.split('#')[1]}}) 43 | ); 44 | }); 45 | 46 | return cues; 47 | 48 | } 49 | function accumulatePars(node) { 50 | if (node.type == 'par') { 51 | return node; 52 | } 53 | else { 54 | return null; 55 | } 56 | } 57 | // Visit a tree of objects with media children 58 | function visit(node, fn, collectedData) { 59 | let retval = fn(node); 60 | if (node?.media) { 61 | return [retval, ...node.media.map(n => visit(n, fn, collectedData)).flat()]; 62 | } 63 | else { 64 | return retval; 65 | } 66 | } 67 | 68 | let isMedia = name => name == "text" || name == "audio" 69 | || name == "ref" || name == "video" 70 | || name == "img"; 71 | 72 | 73 | function parse(xml) { 74 | let model = {}; 75 | let domparser = new DOMParser(); 76 | let doc = domparser.parseFromString(xml, "application/xml"); 77 | let bodyElm = doc.documentElement.getElementsByTagName("body"); 78 | if (bodyElm.length > 0) { 79 | model.body = parseNode(bodyElm[0]); 80 | } 81 | return model; 82 | } 83 | 84 | function parseNode(node) { 85 | if (node.nodeName == "body" || node.nodeName == "seq" || node.nodeName == "par") { 86 | // body has type "seq" 87 | let type = node.nodeName == "body" || node.nodeName == "seq" ? 
"seq" : "par"; 88 | let obj = { 89 | type 90 | }; 91 | if (node.id) { 92 | obj.id = node.getAttribute("id"); 93 | } 94 | if (node.hasAttribute('epub:type')) { 95 | obj.epubType = node.getAttribute('epub:type').split(' '); 96 | } 97 | obj.media = Array.from(node.children).map(n => parseNode(n)); 98 | return obj; 99 | } 100 | else if (isMedia(node.nodeName)) { 101 | let obj = { 102 | type: node.nodeName, 103 | src: node.getAttribute("src"), 104 | }; 105 | if (node.id) { 106 | obj.id = node.getAttribute("id"); 107 | } 108 | if (node.nodeName == "audio") { 109 | obj.clipBegin = parseClockValue(node.getAttribute("clipBegin")); 110 | obj.clipEnd = parseClockValue(node.getAttribute("clipEnd")); 111 | } 112 | obj.xmlString = node.outerHTML.replace('xmlns="http://www.w3.org/ns/SMIL"', ''); 113 | return obj; 114 | } 115 | } 116 | 117 | // parse the timestamp and return the value in seconds 118 | // supports this syntax: https://www.w3.org/publishing/epub/epub-mediaoverlays.html#app-clock-examples 119 | function parseClockValue(value) { 120 | if (!value) { 121 | return null; 122 | } 123 | let hours = 0; 124 | let mins = 0; 125 | let secs = 0; 126 | 127 | if (value.indexOf("min") != -1) { 128 | mins = parseFloat(value.substr(0, value.indexOf("min"))); 129 | } 130 | else if (value.indexOf("ms") != -1) { 131 | var ms = parseFloat(value.substr(0, value.indexOf("ms"))); 132 | secs = ms/1000; 133 | } 134 | else if (value.indexOf("s") != -1) { 135 | secs = parseFloat(value.substr(0, value.indexOf("s"))); 136 | } 137 | else if (value.indexOf("h") != -1) { 138 | hours = parseFloat(value.substr(0, value.indexOf("h"))); 139 | } 140 | else { 141 | // parse as hh:mm:ss.fraction 142 | // this also works for seconds-only, e.g. 12.345 143 | let arr = value.split(":"); 144 | secs = parseFloat(arr.pop()); 145 | if (arr.length > 0) { 146 | mins = parseFloat(arr.pop()); 147 | if (arr.length > 0) { 148 | hours = parseFloat(arr.pop()); 149 | } 150 | } 151 | } 152 | let total = hours * 3600 + mins * 60 + secs; 153 | return total; 154 | } 155 | 156 | export function secondsToHMSMS(seconds) { 157 | // Calculate hours, minutes, seconds, and milliseconds 158 | let hours = Math.floor(seconds / 3600); 159 | let minutes = Math.floor((seconds % 3600) / 60); 160 | let sec = seconds % 60; 161 | let milliseconds = Math.round((sec - Math.floor(sec)) * 1000); 162 | 163 | // Extract whole seconds 164 | let secondsInt = Math.floor(sec); 165 | 166 | // Format the output as hh:mm:ss.ttt 167 | return `${padZero(hours)}:${padZero(minutes)}:${padZero(secondsInt)}.${padZero(milliseconds, 3)}`; 168 | } 169 | 170 | // Helper function to pad single digits with leading zeroes 171 | function padZero(num, length = 2) { 172 | return num.toString().padStart(length, '0'); 173 | } -------------------------------------------------------------------------------- /docs-src/css/page.css: -------------------------------------------------------------------------------- 1 | body { 2 | font-family: sans-serif; 3 | width: 80%; 4 | margin: auto; 5 | line-height: 1.5; 6 | } 7 | h1 { 8 | font-size: xx-large; 9 | } 10 | ul { 11 | line-height: 2; 12 | } 13 | 14 | 15 | .wip { 16 | background-color: antiquewhite; 17 | border: darkorange thick solid; 18 | padding: 1rem; 19 | } 20 | 21 | table { 22 | line-height: 2; 23 | border-collapse: collapse; 24 | } 25 | td { 26 | border: thin black solid; 27 | padding: 5px; 28 | } 29 | 30 | .lookhere { 31 | font-size: larger; 32 | } 33 | .lookhere::after { 34 | content: "NEW 🧨"; 35 | font-variant: small-caps; 36 | font-size: 
x-small; 37 | padding-left: 8px; 38 | background-color: yellow; 39 | } 40 | .note { 41 | font-style: italic; 42 | border: thin black solid; 43 | padding: .5rem; 44 | border-radius: 5px; 45 | } 46 | .note::before { 47 | content: "Note: "; 48 | font-style: normal; 49 | } 50 | .stale::after { 51 | background-color: gray; 52 | } 53 | .c2021::after { 54 | content: "last updated 2021 🏴‍☠️"; 55 | } 56 | .c2020::after { 57 | content: "last updated 2020 🏴‍☠️"; 58 | } -------------------------------------------------------------------------------- /docs-src/css/spec-extra.css: -------------------------------------------------------------------------------- 1 | .TODO { 2 | background-color: antiquewhite; 3 | 4 | } 5 | .deemph { 6 | font-style: italic; 7 | font-weight: lighter; 8 | font-size: smaller; 9 | } 10 | div.TODO { 11 | border: darkorange thick solid; 12 | padding: 1rem; 13 | } 14 | 15 | table > thead { 16 | background-color: rgba(0,0,0,0.7); 17 | color: white; 18 | } 19 | 20 | table td { 21 | padding: 1rem; 22 | } 23 | 24 | table th { 25 | padding: 0.5rem 1rem 0.5rem 1rem; 26 | } 27 | 28 | table th:first-child { 29 | width: 10rem; 30 | } 31 | 32 | table td:last-child { 33 | width: 30rem; 34 | } 35 | 36 | table td:not(:first-child):not(:last-child) { 37 | width: 20rem; 38 | } 39 | 40 | table td { 41 | vertical-align: top; 42 | } 43 | 44 | table td ul { 45 | margin-top: 0; 46 | padding-left: 0; 47 | } 48 | table td ul li:first-child{ 49 | margin-top: 0; 50 | } 51 | 52 | #syncmedia-presentation { 53 | width: 100%; 54 | display: grid; 55 | grid-template-rows: 10% auto; 56 | grid-template-columns: 50% 50%; 57 | gap: 1rem; 58 | } 59 | #syncmedia-presentation > h4 { 60 | grid-row: 1; 61 | grid-column: 1/3 62 | } 63 | #syncmedia-presentation > section:first-child { 64 | grid-column: 1; 65 | } 66 | 67 | #syncmedia-presentation > section:last-child { 68 | grid-column: 2; 69 | } -------------------------------------------------------------------------------- /docs-src/demos/raven/.gitattributes: -------------------------------------------------------------------------------- 1 | *.m4a filter=lfs diff=lfs merge=lfs -text 2 | -------------------------------------------------------------------------------- /docs-src/demos/raven/raven.m4a: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:c1a25857dd5a36094d6021612dc88aa487a9897184c33bfe1bec76883a1373cd 3 | size 3581970 4 | -------------------------------------------------------------------------------- /docs-src/demos/raven/readme.md: -------------------------------------------------------------------------------- 1 | # About this demo 2 | 3 | ## What it is 4 | 5 | * "The Raven" by Edgar Allan Poe, narrated by FergusRossFerrier 6 | * This self-contained synchronized presentation runs in a standard browser 7 | * It was made with very little JavaScript and is not a heavy custom application; it's a declarative multimedia document 8 | * Features include word-level audio synchronization, and next/previous navigation of words, lines, and stanzas 9 | 10 | ## What it's made out of 11 | 12 | * HTML 13 | * CSS 14 | * WebVTT (of track kind ["metadata"](https://www.w3.org/TR/webvtt1/#introduction-metadata)) 15 | * [CSS Custom Highlight API](https://www.w3.org/TR/css-highlight-api-1/) 16 | * JavaScript 17 | 18 | ## Wishlist (tech does not exist yet) 19 | 20 | * Animated highlights (no animation properties are available for ::highlight pseudo-elements though) 21 | * Screen-reader 
synchronicity (switch from narration to screenreader and don't lose your place) 22 | 23 | ## Wishlist of work for me to do to this demo 24 | 25 | * keyboard commands 26 | * user control over highlight options 27 | -------------------------------------------------------------------------------- /docs-src/demos/raven/stanza.vtt: -------------------------------------------------------------------------------- 1 | WEBVTT 2 | 3 | 1095 4 | 00:00:01.200 --> 00:00:04.800 5 | {"selector":{"type":"CssSelector","value":".poem-info"}} 6 | 7 | 1096 8 | 00:00:07.800 --> 00:00:31.100 9 | {"selector":{"type":"CssSelector","value":":nth-child(1 of .stanza)"}} 10 | 11 | 1097 12 | 00:00:32.300 --> 00:00:55.800 13 | {"selector":{"type":"CssSelector","value":":nth-child(2 of .stanza)"}} 14 | 15 | 1098 16 | 00:00:57.117 --> 00:01:15.017 17 | {"selector":{"type":"CssSelector","value":":nth-child(3 of .stanza)"}} 18 | 19 | 1099 20 | 00:01:16.117 --> 00:01:35.117 21 | {"selector":{"type":"CssSelector","value":":nth-child(4 of .stanza)"}} 22 | 23 | 1100 24 | 00:01:36.287 --> 00:01:57.987 25 | {"selector":{"type":"CssSelector","value":":nth-child(5 of .stanza)"}} 26 | 27 | 1101 28 | 00:01:57.987 --> 00:02:16.587 29 | {"selector":{"type":"CssSelector","value":":nth-child(6 of .stanza)"}} 30 | 31 | 1102 32 | 00:02:17.780 --> 00:02:39.780 33 | {"selector":{"type":"CssSelector","value":":nth-child(7 of .stanza)"}} 34 | 35 | 1103 36 | 00:02:40.580 --> 00:03:01.580 37 | {"selector":{"type":"CssSelector","value":":nth-child(8 of .stanza)"}} 38 | 39 | 1104 40 | 00:03:02.734 --> 00:03:23.934 41 | {"selector":{"type":"CssSelector","value":":nth-child(9 of .stanza)"}} 42 | 43 | 1105 44 | 00:03:24.934 --> 00:03:47.434 45 | {"selector":{"type":"CssSelector","value":":nth-child(10 of .stanza)"}} 46 | 47 | 1106 48 | 00:03:49.482 --> 00:04:11.682 49 | {"selector":{"type":"CssSelector","value":":nth-child(11 of .stanza)"}} 50 | 51 | 1107 52 | 00:04:11.682 --> 00:04:33.782 53 | {"selector":{"type":"CssSelector","value":":nth-child(12 of .stanza)"}} 54 | 55 | 1108 56 | 00:04:34.382 --> 00:04:58.582 57 | {"selector":{"type":"CssSelector","value":":nth-child(13 of .stanza)"}} 58 | 59 | 1109 60 | 00:05:01.169 --> 00:05:23.469 61 | {"selector":{"type":"CssSelector","value":":nth-child(14 of .stanza)"}} 62 | 63 | 1110 64 | 00:05:24.769 --> 00:05:45.469 65 | {"selector":{"type":"CssSelector","value":":nth-child(15 of .stanza)"}} 66 | 67 | 1111 68 | 00:05:46.840 --> 00:06:09.740 69 | {"selector":{"type":"CssSelector","value":":nth-child(16 of .stanza)"}} 70 | 71 | 1112 72 | 00:06:11.340 --> 00:06:36.240 73 | {"selector":{"type":"CssSelector","value":":nth-child(17 of .stanza)"}} 74 | 75 | 1113 76 | 00:06:38.446 --> 00:07:09.946 77 | {"selector":{"type":"CssSelector","value":":nth-child(18 of .stanza)"}} 78 | -------------------------------------------------------------------------------- /docs-src/demos/raven/style.css: -------------------------------------------------------------------------------- 1 | :root { 2 | --aside-color: rgb(230, 230, 220); 3 | --highlight-stanza: lightyellow; 4 | --highlight-line: yellow; 5 | --checkbox-color: green; 6 | } 7 | body { 8 | display: grid; 9 | align-items: center; 10 | } 11 | main { 12 | display: flex; 13 | flex-direction: column; 14 | gap: 2rem; 15 | font-family:'Courier New', Courier, monospace; 16 | line-height: 1.3; 17 | } 18 | summary { 19 | cursor: pointer; 20 | } 21 | 22 | summary h2 { 23 | display: inline-block; 24 | } 25 | 26 | .stanza { 27 | display: flex; 28 | flex-direction: column; 29 | 
gap: 1rem; 30 | } 31 | #about { 32 | font-size: smaller; 33 | font-family:'Gill Sans', 'Gill Sans MT', Calibri, 'Trebuchet MS', sans-serif; 34 | } 35 | #narration { 36 | font-family:'Segoe UI', Tahoma, Geneva, Verdana, sans-serif; 37 | background-color: var(--aside-color); 38 | 39 | display: flex; 40 | flex-direction: column; 41 | gap: 1rem; 42 | align-items: center; 43 | 44 | position: sticky; 45 | bottom: 0; 46 | width: 100%; 47 | 48 | padding: 1rem; 49 | } 50 | 51 | #narration-highlight { 52 | display: flex; 53 | flex-direction: row; 54 | gap: 1rem; 55 | align-items: center; 56 | } 57 | #narration h2 { 58 | font-size: medium; 59 | margin: 0; 60 | } 61 | #narration h3 { 62 | margin: 0; 63 | font-size: 90%; 64 | display: inline-block; 65 | } 66 | #narration label, input[type=checkbox] { 67 | font-size: 80%; 68 | } 69 | #narration input{ 70 | accent-color: var(--checkbox-color); 71 | } 72 | 73 | ::highlight(stanzas) { 74 | background-color: var(--highlight-stanza); 75 | } 76 | 77 | ::highlight(lines) { 78 | text-decoration: none; 79 | background-color: var(--highlight-line); 80 | } 81 | ::highlight(words) { 82 | text-decoration: underline; 83 | text-decoration-style:dotted; 84 | text-decoration-thickness: 2px; 85 | background-color: inherit; 86 | } 87 | @supports(-moz-appearance:none) { 88 | ::highlight(words) { 89 | color: darkgreen; 90 | } 91 | } -------------------------------------------------------------------------------- /docs-src/demos/raven/sync.js: -------------------------------------------------------------------------------- 1 | let trackHighlights = {}; 2 | let audioElm; 3 | 4 | export function setupTracks(audio) { 5 | audioElm = audio; 6 | let tracks = Array.from(audio.textTracks); 7 | tracks.map(track => { 8 | let cues = Array.from(track.cues); 9 | cues.map(cue => cue.onenter = e => enterCue(e)); 10 | }); 11 | } 12 | export function applyHighlights() { 13 | let trackIds = Object.keys(trackHighlights); 14 | for (let trackId of trackIds) { 15 | if (document.querySelector(`#highlight-option-${trackId}`).checked) { 16 | CSS.highlights.set(trackId, trackHighlights[trackId]); 17 | } 18 | } 19 | } 20 | export function removeHighlight(trackId) { 21 | if (CSS.highlights.has(trackId)) { 22 | CSS.highlights.delete(trackId); 23 | } 24 | } 25 | export function nextCue(trackId) { 26 | navigateCues(trackId, 1); 27 | } 28 | 29 | export function prevCue(trackId) { 30 | navigateCues(trackId, -1); 31 | } 32 | 33 | function enterCue(event) { 34 | let cue = event.target; 35 | try { 36 | let cueMeta = JSON.parse(cue.text); 37 | let elmRange = createRange(cueMeta.selector) 38 | let newHighlight = new Highlight(elmRange); 39 | trackHighlights[cue.track.id] = newHighlight; 40 | applyHighlights(); 41 | let node = elmRange.startContainer.nodeType == 1 ? 
elmRange.startContainer : elmRange.startContainer.parentNode; 42 | if (!isInViewport(node)) { 43 | node.scrollIntoView(); 44 | } 45 | } 46 | catch(err) { 47 | console.debug("ERROR ", cue.text); 48 | } 49 | } 50 | // dir = 1: next, -1: prev 51 | function navigateCues(trackId, dir) { 52 | let track = Array.from(audioElm.textTracks).find(track => track.id == trackId); 53 | if (!track.activeCues.length) { 54 | return; 55 | } 56 | let activeCue = track.activeCues[0]; 57 | console.log("Current #", activeCue.id, `(time ${activeCue.startTime})`); 58 | let sortedCues = sortCuesByTime(track); 59 | if (dir < 0) { 60 | sortedCues.reverse(); 61 | } 62 | let idx = sortedCues.findIndex(cue => cue.id == activeCue.id); 63 | if (idx < sortedCues.length - 1) { 64 | let targetCue = sortedCues[idx+1]; 65 | console.debug("Skipping to ", targetCue.startTime, " #", targetCue.id); 66 | document.querySelector(`#${dir > 0 ? 'next' : 'prev'}`).disabled = true; 67 | audioElm.addEventListener("timeupdate", e => { 68 | document.querySelector(`#${dir > 0 ? 'next' : 'prev'}`).disabled = false; 69 | audioElm.play(); 70 | }, { once: true }); 71 | audioElm.currentTime = targetCue.startTime; 72 | } 73 | } 74 | 75 | // to know what 'next' and 'prev' mean 76 | function sortCuesByTime(track) { 77 | let sortedCues = Array.from(track.cues).sort((a,b) => { 78 | return a.startTime < b.startTime ? -1 : a.startTime > b.startTime ? 1 : 0; 79 | }); 80 | return sortedCues; 81 | } 82 | 83 | 84 | function isInViewport(elm) { 85 | let bounding = elm.getBoundingClientRect(); 86 | let doc = elm.ownerDocument; 87 | return ( 88 | bounding.top >= 0 && 89 | bounding.left >= 0 && 90 | bounding.bottom <= (doc.defaultView.innerHeight || doc.documentElement.clientHeight) && 91 | bounding.right <= (doc.defaultView.innerWidth || doc.documentElement.clientWidth) 92 | ); 93 | } 94 | 95 | // for CssSelector optionally with TextPositionSelector as its refinedBy 96 | function createRange(rangeSelector) { 97 | let node = document.querySelector(rangeSelector.value); 98 | let startOffset = 0; 99 | let endOffset = 0; 100 | if (rangeSelector.hasOwnProperty('refinedBy')) { 101 | startOffset = rangeSelector.refinedBy.start; 102 | endOffset = rangeSelector.refinedBy.end; 103 | 104 | return new StaticRange({ 105 | startContainer: node.firstChild, 106 | startOffset, 107 | endContainer: node.firstChild, 108 | endOffset: endOffset + 1 109 | }); 110 | } 111 | 112 | return new StaticRange({ 113 | startContainer: node, 114 | startOffset: 0, 115 | endContainer: node.nextSibling, 116 | endOffset: 0 117 | }); 118 | } 119 | -------------------------------------------------------------------------------- /docs-src/markdown.js: -------------------------------------------------------------------------------- 1 | import markdownit from 'markdown-it'; 2 | import markdownitanchor from 'markdown-it-anchor'; 3 | import markdownitattrs from 'markdown-it-attrs'; 4 | import markdownItHeaderSections from 'markdown-it-header-sections'; 5 | import markdownItTableOfContents from 'markdown-it-table-of-contents'; 6 | import markdownItDeflist from 'markdown-it-deflist'; 7 | import markdownItDiv from 'markdown-it-div'; 8 | 9 | function markdown() { 10 | let markdownLib = markdownit({ html: true }) 11 | .use(markdownitanchor) 12 | .use(markdownitattrs) 13 | .use(markdownItHeaderSections) 14 | .use(markdownItTableOfContents, { 15 | "includeLevel": [2], 16 | "containerHeaderHtml": `

Contents:

` 17 | }) 18 | .use(markdownItDeflist) 19 | .use(markdownItDiv); 20 | return markdownLib; 21 | }; 22 | 23 | export default markdown; -------------------------------------------------------------------------------- /docs-src/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "drafts-src", 3 | "version": "1.0.0", 4 | "description": "", 5 | "main": ".eleventy.js", 6 | "type": "module", 7 | "scripts": { 8 | "serve": "cross-env DEBUG=Eleventy* eleventy --serve", 9 | "serve-https": "cross-env HTTPS=true DEBUG=Eleventy* eleventy --serve", 10 | "build": "npm run clean && eleventy", 11 | "buildwatch": "eleventy --watch", 12 | "clean": "rimraf docs", 13 | "http-serve": "cd ../docs && http-serve", 14 | "dev": "npm run clean && npm run serve" 15 | }, 16 | "keywords": [], 17 | "author": "", 18 | "license": "ISC", 19 | "devDependencies": { 20 | "@11ty/eleventy": "^3.0.0-beta.1", 21 | "cross-env": "^7.0.3", 22 | "fs-extra": "^11.2.0", 23 | "html-entities": "^1.3.1", 24 | "markdown-it": "^14.1.0", 25 | "markdown-it-anchor": "^9.0.1", 26 | "markdown-it-attrs": "^4.1.6", 27 | "markdown-it-deflist": "^2.0.3", 28 | "markdown-it-div": "^1.1.0", 29 | "markdown-it-header-sections": "^1.0.0", 30 | "markdown-it-table-of-contents": "^0.4.4", 31 | "npm-run-all": "^4.1.5", 32 | "nunjucks": "^3.2.2", 33 | "prettier": "^3.3.2", 34 | "rimraf": "^5.0.7", 35 | "sharp": "^0.33.4", 36 | "slugify": "^1.4.5" 37 | } 38 | } 39 | -------------------------------------------------------------------------------- /docs-src/pages/caveats.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: SyncMediaLite caveats 3 | --- 4 | 5 | # Caveats when going from EPUB Media Overlays to SyncMediaLite 6 | 7 | When adopting a more modern synchronization strategy, as described in [SyncMediaLite](sync-media-lite), some adaptation is required. It may be that existing `.smil` files have to be transformed into `.vtt` files before being distributed to a system that expects `.vtt`. Or the user agent may be loading `.smil` files and internally transforming them to `TextTrackCues` so as to avoid writing a SMIL engine. 8 | 9 | In any case where EPUB Media Overlays need to be transformed to work in a WebVTT-based playback scenario, there are some differences to be aware of. 10 | 11 | ## Multiple audio files. 12 | 13 | It's theoretically permitted in EPUB Media Overlays to have sync points in the same SMIL file referencing different audio files (in practice this isn't common). 14 | 15 | ## Non-contiguous audio segments. 16 | 17 | Say we have an audio file of someone saying "Three one two". Our HTML text, though, says "1 2 3". Theoretically, SMIL can handle this, though it's worth mentioning that this type of content is not commonly found: 18 | 19 | {% example "SMIL markup of non-contiguous audio segments" %} 20 | 21 | 24 | 25 | 28 | 29 | 32 | {% endexample %} 33 | 34 | You would see the highlight and audio start with "1" and proceed to "2", then "3", since each `` indicates what portion of audio to render. 
35 | 36 | Now if you try to represent this in WebVTT, you would get: 37 | 38 | {% example "WebVTT version" %} 39 | 10 40 | 00:00:01.000 --> 00:00:02.000 41 | {"selector":{"type": "FragmentSelector", "value": "one"}} 42 | 43 | 20 44 | 00:00:02.000 --> 00:00:03.000 45 | {"selector":{"type": "FragmentSelector", "value": "two"}} 46 | 47 | 30 48 | 00:00:00.000 --> 00:00:01.000 49 | {"selector":{"type": "FragmentSelector", "value": "three"}} 50 | 51 | {% endexample %} 52 | 53 | But you would hear and see highlighted "3", followed by "1", then "2", since the audio playback is only based on the audio file, from start to end. 54 | 55 | ## Solutions 56 | 57 | In both cases, resolving the difference requires either additional special handling by the user agent, or audio file reformulation by the producer. 58 | -------------------------------------------------------------------------------- /docs-src/pages/explainer.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: SyncMediaLite explainer 3 | --- 4 | ## History: EPUB and DAISY 5 | 6 | The use case of reading with narration and synchronized highlight has long been a part of electronic publishing, and is already supported by existing standards ([DAISY](https://daisy.org/activities/standards/daisy/), [EPUB Media Overlays](https://www.w3.org/TR/epub/#sec-media-overlays)). Under the hood, these standards use [SMIL](https://www.w3.org/TR/SMIL3/) to synchronize an audio file with an [HTML](https://html.spec.whatwg.org/multipage/) file, by pairing timestamps with phrase IDs. 7 | 8 | ## Issues 9 | 10 | * SMIL is seen as complicated and outdated. 11 | 12 | Its roots are from the web's early days, before HTML supported native audio and video; and the full SMIL language is indeed quite complex. However, the usage of SMIL in EPUB Media Overlays is minimal and, with a few more restrictions, could be translated into a more modern format and be more easily implemented. 13 | 14 | * Synchronized text and audio is expensive to produce. 15 | 16 | Production of audio narrated text is a lot of work and hence not as common as standalone text or audio books. Now with more powerful speech and language processing tools, automated synchronization is becoming feasible. However, it's not fast enough to do on the client side (yet), so book producers are still going to have to create pre-synchronized contents. But advances in their own tools are going to make it easier for them to do this. 17 | 18 | ## Synchronization on the modern web 19 | 20 | The same user experience is achieved with a more modern approach that is easier to implement. This is what is described in [SyncMediaLite](sync-media-lite). 21 | 22 | ### Media playback 23 | 24 | Today, the HTMLMediaElement has built-in cue synchronization. When loaded with a series of TextTrackCues, the MediaElement will automatically fire off cue events at the right times, so unlike SMIL, it does not require hand-coding a timing engine. 25 | 26 | ### Highlighting 27 | 28 | The CSS Highlight API makes it easy to register highlights, which are then available for styling as pseudo-elements. There is then no need to add and remove class attributes throughout the DOM. 29 | 30 | 31 | ### Referencing text 32 | 33 | In EPUB Media Overlays, this is done with fragment identifiers. 
By expanding this to include the use of [selectors](https://www.w3.org/TR/selectors-states/#selectors), we have a more flexible way to reference text, without requiring IDs on all the text, and can even go to the character level. 34 | 35 | 36 | ## An upgrade path 37 | 38 | EPUB Media Overlays could be replaced with SyncMediaLite, with the following modifications: 39 | 40 | * Restrict: there can be one audio file per HTML document. 41 | * Restrict: the audio file must play in the correct order by default. 42 | * Expand: allow additional selectors, not just fragment IDs. 43 | 44 | See [caveats](caveats) related to going from EPUB Media Overlays to SyncMediaLite. 45 | -------------------------------------------------------------------------------- /docs-src/pages/index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: SyncMedia Community Group Overview 3 | --- 4 | This is an overview of work being done by the [Synchronized Media for Publications Community Group](#about-this-group). 5 | 6 | * [Github repository](https://github.com/w3c/sync-media-pub) 7 | * [Mailing list archives](https://lists.w3.org/Archives/Public/public-sync-media-pub/) 8 | 9 | 10 | ## Latest work: SyncMediaLite 11 | 12 | * Uses the browser's inbuilt cue synchronization capabilities via WebVTT 13 | * Easy to implement a synchronized text highlight 14 | * Works well for common use cases of audio narrated HTML documents 15 | 16 | ### Documents 17 | 18 | * [Explainer](explainer) 19 | * [Use cases](use-cases) 20 | * [Draft spec](sync-media-lite) 21 | * Demos: 22 | * [Accessible Books in Browsers](https://daisy.github.io/accessible-books-in-browsers/#demos): 23 | _Self-playing text and audio books. The books here are converted automatically from DAISY 2.02/EPUB into multi-chapter sets of HTML files with built-in playback for SyncMediaLite_ 24 | * [The Raven](https://raven-highlight-demo.netlify.app/) 25 | _Poem with multi-level highlighting. This is a single-page document with built-in playback for SyncMediaLite and advanced highlighting features_ ([see sources](https://github.com/w3c/sync-media-pub/tree/main/docs-src/demos/raven)) 26 | * [Using TextTrackCues to play EPUB Media Overlays](https://marisademeglio.github.io/mo-player/) 27 | _Using the same techniques as SyncMediaLite playback, this Media Overlays `.smil` file can be played_ 28 | 31 | * Try this experimental tool to [convert Media Overlays to SyncMediaLite](convert-smil) 32 | 33 | See [other work](https://github.com/w3c/sync-media-pub/tree/main/other-work) for more ideas this group has had over the years, including experiments with SMIL, and a syntactically-light JSON format. 34 | 35 | ## About this group 36 | 37 | ### History 38 | The [Synchronized Media for Publications Community Group](https://www.w3.org/community/sync-media-pub/) was formed to recommend the best way to synchronize media with document formats that were being developed by the 39 | Publishing Working Group, in order to make publications accessible to people with different types of reading requirements. 40 | 41 | ### Present 42 | The [Publishing Maintenance Working Group](https://www.w3.org/groups/wg/pm/) is now primarily focused on spec maintenance; however, the work here continues to be to explore and develop synchronization techniques compatible with publishing formats on the web, including [Audiobooks](https://www.w3.org/TR/audiobooks/), [EPUB](https://www.w3.org/publishing/groups/epub-wg/), and standalone [HTML](https://www.w3.org/html/). 
43 | 44 | 45 | 46 | -------------------------------------------------------------------------------- /docs-src/pages/pages.json: -------------------------------------------------------------------------------- 1 | { 2 | "layout": "page.njk", 3 | "permalink": "{{ page.filePathStem }}.html", 4 | "date": "Last Modified" 5 | } -------------------------------------------------------------------------------- /docs-src/pages/use-cases.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Use cases 3 | --- 4 | ## Convert existing content 5 | 6 | A library has a lot of narrated talking book content. They want to convert it from DAISY and/or EPUB with Media Overlays to something more modern like WebVTT. They need to port over the SMIL-based audio clip timing information plus HTML document selectors. 7 | 8 | 9 | ## Produce content without IDs on every text element 10 | 11 | A content producer does not want to destructively mark up their beautiful HTML document by putting ID values on every element that is to be synchronized. They wonder why they can't use CSS selectors instead. 12 | 13 | ## Play directly in a browser 14 | 15 | Sam has a web browser and wants to use it to listen to a narrated document that is located at a URL. Sam does not want to install an app. 16 | 17 | ## Have multiple levels of highlighting 18 | 19 | The book publisher wants a paragraph to have a pink background while it's playing, and each word in it should be green as it plays. 20 | 21 | ## Navigate by item 22 | 23 | Sunny wants to listen to a poem and navigate by word, line, or stanza. As they navigate by any of those types of things, the audio and text highlight follow. 24 | 25 | ## Control audio playback rate 26 | 27 | Bertha is listening to important information in a language not native to her; she wants to slow the rate to improve her comprehension. 28 | 29 | ## Adjust visual properties 30 | 31 | Gregor is near-sighted and needs to enlarge text in order to read it. He prefers that the audio playback and text highlight work seamlessly after he uses the browser text enlargement feature. 32 | -------------------------------------------------------------------------------- /docs/caveats.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Synchronized Media for Publications CG: SyncMediaLite caveats 6 | 7 | 8 | 9 | 10 | 11 | 12 |
13 |

SyncMediaLite caveats

14 |

Last updated Tue Oct 01 2024

15 |
19 |

20 | Caveats when going from EPUB Media Overlays to SyncMediaLite 21 |

22 |

23 | When adopting a more modern synchronization strategy, as 24 | described in SyncMediaLite, 25 | some adaptation is required. It may be that existing 26 | .smil files have to be transformed into 27 | .vtt files before being distributed to a system 28 | that expects .vtt. Or the user agent may be 29 | loading .smil files and internally transforming 30 | them to TextTrackCues so as to avoid writing a 31 | SMIL engine. 32 |
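
The core of that transformation is small. Here is a minimal sketch, mirroring the convert-smil tool elsewhere in this repository; the par object is a simplified, hypothetical parse result:

```
// One SMIL <par> becomes one VTTCue: the audio clip times become the
// cue window, and the text reference becomes the cue payload.
// "par" is a simplified, hypothetical parse result with clip times
// already converted to seconds.
function parToCue(par) {
    const audio = par.media.find(m => m.type === "audio");
    const text = par.media.find(m => m.type === "text");
    return new VTTCue(audio.clipBegin, audio.clipEnd,
        JSON.stringify({ selector: {
            type: "FragmentSelector",
            value: text.src.split("#")[1]
        } }));
}
```
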

33 |

34 | In any case where EPUB Media Overlays need to be transformed 35 | to work in a WebVTT-based playback scenario, there are some 36 | differences to be aware of. 37 |

38 |
39 |

Multiple audio files.

40 |

41 | It's theoretically permitted in EPUB Media Overlays to 42 | have sync points in the same SMIL file referencing 43 | different audio files (in practice this isn't common). 44 |

45 |
46 |
47 |

Non-contiguous audio segments.

48 |

49 | Say we have an audio file of someone saying "Three 50 | one two". Our HTML text, though, says "1 2 51 | 3". Theoretically, SMIL can handle this, though 52 | it's worth mentioning that this type of content is not 53 | commonly found: 54 |

55 |
 59 | <par>
 60 |     <audio src="audio.mp3" clipBegin="1s" clipEnd="2s"/>
 61 |     <text src="file.html#one"/>
 62 | </par>
 63 | <par>
 64 |     <audio src="audio.mp3" clipBegin="2s" clipEnd="3s"/>
 65 |     <text src="file.html#two"/>
 66 | </par>
 67 | <par>
 68 |     <audio src="audio.mp3" clipBegin="0s" clipEnd="1s"/>
 69 |     <text src="file.html#three"/>
 70 | </par>
 71 | 
73 |

74 | You would see the highlight and audio start with 75 | "1" and proceed to "2", then 76 | "3", since each 77 | <par> indicates what portion of audio 78 | to render. 79 |

80 |

81 | Now if you try to represent this in WebVTT, you would 82 | get: 83 |

84 |
 85 | 10
 86 | 00:00:01.000 --> 00:00:02.000
 87 | {"selector":{"type": "FragmentSelector", "value": "one"}}
 88 | 
 89 | 20
 90 | 00:00:02.000 --> 00:00:03.000
 91 | {"selector":{"type": "FragmentSelector", "value": "two"}}
 92 | 
 93 | 30
 94 | 00:00:00.000 --> 00:00:01.000
 95 | {"selector":{"type": "FragmentSelector", "value": "three"}}
 96 | 
 97 | 
99 |

100 | But you would hear and see highlighted "3", 101 | followed by "1", then "2", since the 102 | audio playback is only based on the audio file, from 103 | start to end. 104 |

105 |
106 |
107 |

Solutions

108 |

109 | In both cases, resolving the difference requires either 110 | additional special handling by the user agent, or audio 111 | file reformulation by the producer. 112 |
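
As a purely hypothetical illustration of the first option, a user agent could seek at each cue boundary so that cues play back in authored order rather than timeline order; this sketch is not part of SyncMediaLite:

```
// Hypothetical special handling: play cues in authored (text) order by
// seeking at each boundary. Assumes "cues" is already in authored order.
// (timeupdate granularity makes the clip ends approximate.)
function playInAuthoredOrder(audio, cues) {
    let i = 0;
    const next = () => {
        if (i >= cues.length) { audio.pause(); return; }
        const cue = cues[i++];
        audio.currentTime = cue.startTime;
        const watch = () => {
            if (audio.currentTime >= cue.endTime) {
                audio.removeEventListener("timeupdate", watch);
                next(); // jump to the next authored clip
            }
        };
        audio.addEventListener("timeupdate", watch);
        audio.play();
    };
    next();
}
```
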

113 |
114 |
115 |
116 | 117 | 118 | -------------------------------------------------------------------------------- /docs/convert-smil/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Convert Media Overlays to SyncMediaLite 5 | 6 | 7 | 47 | 48 | 49 |

Convert Media Overlays to SyncMediaLite

50 | 51 |

Paste SMIL below and press

52 | 53 | 54 | 55 | 56 |
57 |
58 | 59 | 60 |
61 | 62 |
63 | 64 | 65 |
66 |
67 | 68 | 150 | 151 | -------------------------------------------------------------------------------- /docs/convert-smil/parse-smil.js: -------------------------------------------------------------------------------- 1 | // smil is an xml string 2 | export function parseSmil(smil) { 3 | if (!smil || smil.trim() == '') { 4 | throw new Error("Bad input"); 5 | } 6 | let smilModel = parse(smil); 7 | let smilPars = visit(smilModel.body, accumulatePars, []); 8 | smilPars = smilPars.filter(item => item != null); 9 | return smilPars; 10 | } 11 | // convert to a list of TextTrackCues 12 | export function convertToTextTrackCues(smilPars) { 13 | let audioUrl = ''; 14 | let startOffset = 0; 15 | let endOffset = 0; 16 | if (smilPars.length > 0) { 17 | let firstAudio = smilPars[0].media.find(item => item.type == 'audio'); 18 | if (firstAudio) { 19 | startOffset = firstAudio.clipBegin; 20 | } 21 | let lastAudio = smilPars.reverse()[0].media.find(item => item.type == 'audio'); 22 | if (lastAudio) { 23 | endOffset = lastAudio.clipEnd; 24 | } 25 | else { 26 | console.error("Could not process SMIL"); 27 | return null; 28 | } 29 | smilPars.reverse(); // unreverse them 30 | } 31 | else { 32 | console.error("Could not process SMIL"); 33 | return null; 34 | } 35 | 36 | let cues = smilPars.map(item => { 37 | let audio = item.media.find(media => media.type == 'audio'); 38 | let text = item.media.find(media => media.type == 'text'); 39 | return new VTTCue( 40 | parseFloat(audio.clipBegin), 41 | parseFloat(audio.clipEnd), 42 | JSON.stringify({selector: {type:"FragmentSelector",value: text.src.split('#')[1]}}) 43 | ); 44 | }); 45 | 46 | return cues; 47 | 48 | } 49 | function accumulatePars(node) { 50 | if (node.type == 'par') { 51 | return node; 52 | } 53 | else { 54 | return null; 55 | } 56 | } 57 | // Visit a tree of objects with media children 58 | function visit(node, fn, collectedData) { 59 | let retval = fn(node); 60 | if (node?.media) { 61 | return [retval, ...node.media.map(n => visit(n, fn, collectedData)).flat()]; 62 | } 63 | else { 64 | return retval; 65 | } 66 | } 67 | 68 | let isMedia = name => name == "text" || name == "audio" 69 | || name == "ref" || name == "video" 70 | || name == "img"; 71 | 72 | 73 | function parse(xml) { 74 | let model = {}; 75 | let domparser = new DOMParser(); 76 | let doc = domparser.parseFromString(xml, "application/xml"); 77 | let bodyElm = doc.documentElement.getElementsByTagName("body"); 78 | if (bodyElm.length > 0) { 79 | model.body = parseNode(bodyElm[0]); 80 | } 81 | return model; 82 | } 83 | 84 | function parseNode(node) { 85 | if (node.nodeName == "body" || node.nodeName == "seq" || node.nodeName == "par") { 86 | // body has type "seq" 87 | let type = node.nodeName == "body" || node.nodeName == "seq" ? 
"seq" : "par"; 88 | let obj = { 89 | type 90 | }; 91 | if (node.id) { 92 | obj.id = node.getAttribute("id"); 93 | } 94 | if (node.hasAttribute('epub:type')) { 95 | obj.epubType = node.getAttribute('epub:type').split(' '); 96 | } 97 | obj.media = Array.from(node.children).map(n => parseNode(n)); 98 | return obj; 99 | } 100 | else if (isMedia(node.nodeName)) { 101 | let obj = { 102 | type: node.nodeName, 103 | src: node.getAttribute("src"), 104 | }; 105 | if (node.id) { 106 | obj.id = node.getAttribute("id"); 107 | } 108 | if (node.nodeName == "audio") { 109 | obj.clipBegin = parseClockValue(node.getAttribute("clipBegin")); 110 | obj.clipEnd = parseClockValue(node.getAttribute("clipEnd")); 111 | } 112 | obj.xmlString = node.outerHTML.replace('xmlns="http://www.w3.org/ns/SMIL"', ''); 113 | return obj; 114 | } 115 | } 116 | 117 | // parse the timestamp and return the value in seconds 118 | // supports this syntax: https://www.w3.org/publishing/epub/epub-mediaoverlays.html#app-clock-examples 119 | function parseClockValue(value) { 120 | if (!value) { 121 | return null; 122 | } 123 | let hours = 0; 124 | let mins = 0; 125 | let secs = 0; 126 | 127 | if (value.indexOf("min") != -1) { 128 | mins = parseFloat(value.substr(0, value.indexOf("min"))); 129 | } 130 | else if (value.indexOf("ms") != -1) { 131 | var ms = parseFloat(value.substr(0, value.indexOf("ms"))); 132 | secs = ms/1000; 133 | } 134 | else if (value.indexOf("s") != -1) { 135 | secs = parseFloat(value.substr(0, value.indexOf("s"))); 136 | } 137 | else if (value.indexOf("h") != -1) { 138 | hours = parseFloat(value.substr(0, value.indexOf("h"))); 139 | } 140 | else { 141 | // parse as hh:mm:ss.fraction 142 | // this also works for seconds-only, e.g. 12.345 143 | let arr = value.split(":"); 144 | secs = parseFloat(arr.pop()); 145 | if (arr.length > 0) { 146 | mins = parseFloat(arr.pop()); 147 | if (arr.length > 0) { 148 | hours = parseFloat(arr.pop()); 149 | } 150 | } 151 | } 152 | let total = hours * 3600 + mins * 60 + secs; 153 | return total; 154 | } 155 | 156 | export function secondsToHMSMS(seconds) { 157 | // Calculate hours, minutes, seconds, and milliseconds 158 | let hours = Math.floor(seconds / 3600); 159 | let minutes = Math.floor((seconds % 3600) / 60); 160 | let sec = seconds % 60; 161 | let milliseconds = Math.round((sec - Math.floor(sec)) * 1000); 162 | 163 | // Extract whole seconds 164 | let secondsInt = Math.floor(sec); 165 | 166 | // Format the output as hh:mm:ss.ttt 167 | return `${padZero(hours)}:${padZero(minutes)}:${padZero(secondsInt)}.${padZero(milliseconds, 3)}`; 168 | } 169 | 170 | // Helper function to pad single digits with leading zeroes 171 | function padZero(num, length = 2) { 172 | return num.toString().padStart(length, '0'); 173 | } -------------------------------------------------------------------------------- /docs/css/page.css: -------------------------------------------------------------------------------- 1 | body { 2 | font-family: sans-serif; 3 | width: 80%; 4 | margin: auto; 5 | line-height: 1.5; 6 | } 7 | h1 { 8 | font-size: xx-large; 9 | } 10 | ul { 11 | line-height: 2; 12 | } 13 | 14 | 15 | .wip { 16 | background-color: antiquewhite; 17 | border: darkorange thick solid; 18 | padding: 1rem; 19 | } 20 | 21 | table { 22 | line-height: 2; 23 | border-collapse: collapse; 24 | } 25 | td { 26 | border: thin black solid; 27 | padding: 5px; 28 | } 29 | 30 | .lookhere { 31 | font-size: larger; 32 | } 33 | .lookhere::after { 34 | content: "NEW 🧨"; 35 | font-variant: small-caps; 36 | font-size: x-small; 
37 | padding-left: 8px; 38 | background-color: yellow; 39 | } 40 | .note { 41 | font-style: italic; 42 | border: thin black solid; 43 | padding: .5rem; 44 | border-radius: 5px; 45 | } 46 | .note::before { 47 | content: "Note: "; 48 | font-style: normal; 49 | } 50 | .stale::after { 51 | background-color: gray; 52 | } 53 | .c2021::after { 54 | content: "last updated 2021 🏴‍☠️"; 55 | } 56 | .c2020::after { 57 | content: "last updated 2020 🏴‍☠️"; 58 | } -------------------------------------------------------------------------------- /docs/css/spec-extra.css: -------------------------------------------------------------------------------- 1 | .TODO { 2 | background-color: antiquewhite; 3 | 4 | } 5 | .deemph { 6 | font-style: italic; 7 | font-weight: lighter; 8 | font-size: smaller; 9 | } 10 | div.TODO { 11 | border: darkorange thick solid; 12 | padding: 1rem; 13 | } 14 | 15 | table > thead { 16 | background-color: rgba(0,0,0,0.7); 17 | color: white; 18 | } 19 | 20 | table td { 21 | padding: 1rem; 22 | } 23 | 24 | table th { 25 | padding: 0.5rem 1rem 0.5rem 1rem; 26 | } 27 | 28 | table th:first-child { 29 | width: 10rem; 30 | } 31 | 32 | table td:last-child { 33 | width: 30rem; 34 | } 35 | 36 | table td:not(:first-child):not(:last-child) { 37 | width: 20rem; 38 | } 39 | 40 | table td { 41 | vertical-align: top; 42 | } 43 | 44 | table td ul { 45 | margin-top: 0; 46 | padding-left: 0; 47 | } 48 | table td ul li:first-child{ 49 | margin-top: 0; 50 | } 51 | 52 | #syncmedia-presentation { 53 | width: 100%; 54 | display: grid; 55 | grid-template-rows: 10% auto; 56 | grid-template-columns: 50% 50%; 57 | gap: 1rem; 58 | } 59 | #syncmedia-presentation > h4 { 60 | grid-row: 1; 61 | grid-column: 1/3 62 | } 63 | #syncmedia-presentation > section:first-child { 64 | grid-column: 1; 65 | } 66 | 67 | #syncmedia-presentation > section:last-child { 68 | grid-column: 2; 69 | } -------------------------------------------------------------------------------- /docs/explainer.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Synchronized Media for Publications CG: SyncMediaLite explainer 6 | 7 | 8 | 9 | 10 | 11 | 12 |
13 |

SyncMediaLite explainer

14 |

Last updated Sun Oct 06 2024

15 |
16 |

History: EPUB and DAISY

17 |

18 | The use case of reading with narration and synchronized 19 | highlight has long been a part of electronic publishing, and 20 | is already supported by existing standards (DAISY, 24 | EPUB Media Overlays). Under the hood, these standards use 27 | SMIL to 28 | synchronize an audio file with an 29 | HTML 30 | file, by pairing timestamps with phrase IDs. 31 |

32 |
33 |
34 |

Issues

35 |
    36 |
  • SMIL is seen as complicated and outdated.
  • 37 |
38 |

39 | Its roots are from the web's early days, before HTML 40 | supported native audio and video; and the full SMIL language 41 | is indeed quite complex. However, the usage of SMIL in EPUB 42 | Media Overlays is minimal and, with a few more restrictions, 43 | could be translated into a more modern format and be more 44 | easily implemented. 45 |

46 |
    47 |
  • 48 | Synchronized text and audio is expensive to produce. 49 |
  • 50 |
51 |

52 | Production of audio narrated text is a lot of work and hence 53 | not as common as standalone text or audio books. Now with 54 | more powerful speech and language processing tools, 55 | automated synchronization is becoming feasible. However, 56 | it's not fast enough to do on the client side (yet), so book 57 | producers are still going to have to create pre-synchronized 58 | contents. But advances in their own tools are going to make 59 | it easier for them to do this. 60 |

61 |
62 |
63 |

Synchronization on the modern web

64 |

65 | The same user experience is achieved with a more modern 66 | approach that is easier to implement. This is what is 67 | described in SyncMediaLite. 68 |

69 |
70 |

Media playback

71 |

72 | Today, the HTMLMediaElement has built-in cue 73 | synchronization. When loaded with a series of 74 | TextTrackCues, the MediaElement will automatically fire 75 | off cue events at the right times, so unlike SMIL, it 76 | does not require hand-coding a timing engine. 77 |
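
A minimal sketch of that built-in timing, with hypothetical timings and a cue payload in the JSON convention used by this repository's demos:

```
// Sketch: the browser's own timing engine fires the cue events.
// The timings and payload here are hypothetical.
const audio = document.querySelector("audio");
const track = audio.addTextTrack("metadata");
track.mode = "hidden"; // hidden tracks still fire cue events

const cue = new VTTCue(1.0, 2.0,
    JSON.stringify({ selector: { type: "FragmentSelector", value: "one" } }));
cue.onenter = () => console.log("highlight #one");  // fires at 1.0s
cue.onexit = () => console.log("clear highlight");  // fires at 2.0s
track.addCue(cue);
audio.play();
```
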

78 |
79 |
80 |

Highlighting

81 |

82 | The CSS Highlight API makes it easy to register 83 | highlights, which are then available for styling as 84 | pseudo-elements. There is then no need to add and remove 85 | class attributes throughout the DOM. 86 |
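
A sketch of the registration step; the highlight name "narration" and the target element are arbitrary:

```
// Register a highlight over an element's contents; no DOM mutation needed.
const range = new Range();
range.selectNodeContents(document.querySelector("#one")); // hypothetical target
CSS.highlights.set("narration", new Highlight(range));
// Styled from CSS with: ::highlight(narration) { background-color: yellow; }
```
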

87 |
88 |
89 |

Referencing text

90 |

91 | In EPUB Media Overlays, this is done with fragment 92 | identifiers. By expanding this to include the use of 93 | selectors, we have a more flexible way to reference text, 97 | without requiring IDs on all the text, and can even go 98 | to the character level. 99 |
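
For instance, a cue payload could carry a CssSelector, optionally refined to a character range, instead of a fragment ID. The selector values below are hypothetical; the JSON shape mirrors the one used by this repository's demos:

```
// A selector-based payload: no id attribute is required on the target,
// and refinedBy narrows the highlight to characters 0 through 5.
const payload = {
    selector: {
        type: "CssSelector",
        value: ":nth-child(2 of .stanza)",
        refinedBy: { type: "TextPositionSelector", start: 0, end: 5 }
    }
};
const cue = new VTTCue(3.5, 4.1, JSON.stringify(payload));
```
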

100 |
101 |
102 |
103 |

An upgrade path

104 |

105 | EPUB Media Overlays could be replaced with SyncMediaLite, 106 | with the following modifications: 107 |

108 |
    109 |
  • 110 | Restrict: there can be one audio file per HTML document. 111 |
  • 112 |
  • 113 | Restrict: the audio file must play in the correct order 114 | by default. 115 |
  • 116 |
  • 117 | Expand: allow additional selectors, not just fragment 118 | IDs. 119 |
  • 120 |
121 |
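
To make these concrete, here is a hedged sketch of the shape an upgraded overlay could take (the timings and selectors are invented; this is not a normative format):

```js
// One audio file per document, cues already in playback order, and
// text references that may be CSS selectors as well as fragment IDs.
const overlay = {
  audio: "chapter1.mp3", // restriction: a single audio file
  cues: [                // restriction: listed in playback order
    { start: 0.0, end: 1.2, text: "#id1" },                    // fragment ID
    { start: 1.2, end: 3.4, text: "p.stanza:nth-of-type(2)" }, // selector
  ],
};
```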

122 | See caveats related to going from EPUB 123 | Media Overlays to SyncMediaLite. 124 |

125 |
126 |
127 | 128 | 129 | -------------------------------------------------------------------------------- /docs/use-cases.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Synchronized Media for Publications CG: Use cases 5 | 6 | 7 | 8 | 9 | 10 |
11 |

Use cases

12 |

Last updated Tue Oct 01 2024

13 |
14 |

Convert existing content

15 |

16 | A library has a lot of narrated talking book content. They 17 | want to convert it from DAISY and/or EPUB with Media 18 | Overlays to something more modern like WebVTT. They need to 19 | port over the SMIL-based audio clip timing information plus 20 | HTML document selectors. 21 |
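
A hedged sketch of the heart of such a conversion (the clipBegin/clipEnd and text src attributes follow EPUB Media Overlays; the simplified clock-value handling and output shape are assumptions, not the repository's actual converter):

```js
// Turn one Media Overlays <par> into a WebVTT cue: copy the clip timing
// and carry the text reference across as the cue payload.
// Assumes clip values are plain seconds (full SMIL clock syntax omitted).
function parToVttCue(par) {
  const textSrc = par.querySelector("text").getAttribute("src"); // e.g. "ch1.xhtml#c01h01"
  const clip = par.querySelector("audio");
  const begin = parseFloat(clip.getAttribute("clipBegin"));
  const end = parseFloat(clip.getAttribute("clipEnd"));
  const fragment = "#" + textSrc.split("#")[1];
  return `${toTimestamp(begin)} --> ${toTimestamp(end)}\n${fragment}\n`;
}

function toTimestamp(s) {
  const hh = String(Math.floor(s / 3600)).padStart(2, "0");
  const mm = String(Math.floor((s % 3600) / 60)).padStart(2, "0");
  const ss = (s % 60).toFixed(3).padStart(6, "0");
  return `${hh}:${mm}:${ss}`;
}
```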

22 |
23 |
27 |

28 | Produce content without IDs on every text element 29 |

30 |

31 | A content producer does not want to destructively mark up 32 | their beautiful HTML document by putting ID values on every 33 | element that is to be synchronized. They wonder why they 34 | can't use CSS selectors instead. 35 |

36 |
37 |
38 |

Play directly in a browser

39 |

40 | Sam has a web browser and wants to use it to listen to a 41 | narrated document that is located at a URL. Sam does not 42 | want to install an app. 43 |

44 |
45 |
46 |

Have multiple levels of highlighting

47 |

48 | The book publisher wants a paragraph to have a pink 49 | background while it's playing, and each word in it should be 50 | green as it plays. 51 |
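
A hedged sketch with the CSS Highlight API, registering two independent highlights (the markup, ranges, and names are illustrative assumptions):

```js
// One highlight spans the playing paragraph, a second tracks the
// current word; each is styled separately via ::highlight().
const para = document.querySelector("p.playing"); // assumed markup
const paraRange = new Range();
paraRange.selectNodeContents(para);
CSS.highlights.set("playing-paragraph", new Highlight(paraRange));

const wordRange = new Range(); // e.g. the first four characters
wordRange.setStart(para.firstChild, 0); // assumes a leading text node
wordRange.setEnd(para.firstChild, 4);
CSS.highlights.set("playing-word", new Highlight(wordRange));

// Companion CSS:
//   ::highlight(playing-paragraph) { background-color: pink; }
//   ::highlight(playing-word)      { color: green; }
```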

52 |
53 | 61 |
62 |

Control audio playback rate

63 |

64 | Bertha is listening to important information in a language 65 | not native to her; she wants to slow the playback rate to improve her 66 | comprehension. 67 |
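
On the web this is a property of the media element; preservesPitch, which defaults to true in current browsers, keeps the slowed audio at its natural pitch:

```js
const audio = document.querySelector("audio");
audio.playbackRate = 0.75;   // slow to 75% of normal speed
audio.preservesPitch = true; // explicit here, though true is the default
```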

68 |
69 |
70 |

Adjust visual properties

71 |

72 | Gregor is near-sighted and needs to enlarge text in order to 73 | read it. He prefers that the audio playback and text 74 | highlighting continue to work seamlessly after he uses the browser's text 75 | enlargement feature. 76 |

77 |
78 |
79 | 80 | 81 | -------------------------------------------------------------------------------- /drafts/addl-reqs.md: -------------------------------------------------------------------------------- 1 | # Additional Requirements 2 | 3 | ## Technology selection 4 | In addition to meeting the requirements of the use cases, the technology selection must: 5 | 6 | * be able to be validated 7 | * integrate as seamlessly as possible into a WP/PWP 8 | * be web-friendly (includes both using existing technologies where possible and also being developer-friendly) 9 | * facilitate fast lookups by reading systems. E.g. 10 | * Given a content document, find its corresponding audio quickly. 11 | * Make reading options (e.g. don't read page numbers) easy to implement in audio playback 12 | 13 | ## Metadata 14 | 15 | Additional vocabulary to support synchronized media: 16 | 17 | * `duration` of both the publication and its components (e.g. chapters). 18 | * `narrator` 19 | * `(re)distribution rights` 20 | * `CSS "active" class` (for highlighting the currently playing HTML element) 21 | * `enumeration of available multiple synchronization granularities` (so that reading systems / user agents can allow users a list of choices) 22 | * TBD 23 | -------------------------------------------------------------------------------- /drafts/functional-requirements.md: -------------------------------------------------------------------------------- 1 | # Requirements and design options for synchronized multimedia 2 | 3 | ## Requirements 4 | 5 | The requirements here outline what is necessary to provide a comprehensive reading experience for users who interact with content in a non-visual way. We must look further than a simple audio stream version of a book to meet these needs, especially regarding navigation. And, while TTS has many advantages, there are some use cases which require pre-recorded audio. E.g. 6 | 7 | * Users with cognitive disabilities who may not process TTS as well 8 | * Not all languages have a corresponding TTS engine 9 | * Pre-recorded audio meets professional production standards (voice actors) 10 | 11 | 12 | ### Essential Requirements 13 | 14 | Organized by content type 15 | 16 | #### Text supplemented with pre-recorded audio narration 17 | * Use an authored navigation interface (e.g. TOC) to reach a point in a content document, and start audio playback from there 18 | * Move around the text content and have the audio playback keep up. E.g. 19 | * Previous/next paragraph 20 | * Previous/next page 21 | * Start playback by clicking a point mid-page 22 | * Speed up/slow down the audio while not affecting the pitch (timescale modification) 23 | * Navigate by time chunk (e.g. skip forward/back 10s) 24 | * Text chunks are highlighted in sync with the audio playback, at the authored granularity 25 | * Text display keeps up with audio. E.g. start audio playback and have the pages turn automatically. 26 | * Selectively filter out content based on semantics ("skippability" in the DAISY world). Note: might there be overlap here with WP personalization? 27 | * Jump out of complex structures and back into the reading flow ("escapability" in the DAISY world) 28 | 29 | #### Pre-recorded audio narration only (no text document) 30 | 31 | In addition to the above applicable requirements, 32 | 33 | * Use an authored navigation interface (e.g. 
TOC) to reach points in the audio stream and start playback 34 | * Navigate at the authored granularity, such as paragraph 35 | * Reading system functions such as bookmarking and annotation should continue to function 36 | 37 | #### Video supplemented with text transcript, standalone screenplay 38 | * Select a point in a text transcript to jump to that part of the video 39 | * Search the text of the transcript 40 | * Highlight text in transcript as video plays 41 | 42 | ### Advanced Requirements 43 | 44 | Organized by content type 45 | 46 | #### Any type of content 47 | * Support multiple granularities of navigation, switchable on the fly by the user. E.g. 48 | * Previous/next paragraph 49 | * Previous/next sentence 50 | * Previous/next word or phrase 51 | 52 | #### Text supplemented with sign-language video 53 | 54 | * Synchronize the video to the text in a way that permits user navigation (as above for audio) 55 | * Make the video full screen and overlay the text as a caption 56 | * Make the text the primary visual item and the video is reduced in size but maintains a constant position even as user scrolls/navigates 57 | * Highlight the text in sync with the video playback 58 | * Play/pause the video 59 | 60 | #### Video supplemented with descriptive audio 61 | 62 | * Users may navigate the content via an authored navigation interface (e.g. TOC) 63 | * Author may choose to have audio description and video played simultaneously for some parts, and for other parts, instruct the reading system to automatically pause the video and play audio description, and then resume the video playback 64 | * User should be able to independently control volume and speed of both the video and audio description 65 | 66 | ## Design 67 | 68 | ### Requirements 69 | 70 | The solution must 71 | 72 | * be able to be validated 73 | * integrate as seamlessly as possible into a WP/PWP 74 | * use existing web technologies as much as possible 75 | * facilitate fast lookups by reading systems whenever possible. E.g. 76 | * Given a content document, find its corresponding audio quickly. 77 | * Make reading options (e.g. don't read page numbers) easy to implement in audio playback 78 | 79 | ### Ideas 80 | 81 | * Structured sync points 82 | * Supplemental media 'overlay' has granularity and structure 83 | * What we already do in EPUB3 with SMIL 84 | * Flat list of sync points 85 | * Use the text content to inform reading system about granularity and structure 86 | * May not be the best for audio-only 87 | * Referring to clips in timed media 88 | * Start/end points (like in SMIL) 89 | * Just list the start points 90 | * One-liner reference with url + clip times all in one string 91 | * Referring to text 92 | * Url + ID 93 | * XPath 94 | * How could we reference content at the character level to support fine granularity 95 | 96 | ## Metadata 97 | 98 | In both the "audio only" (i.e. structured "talking book" without text transcript) and "full-text full-audio" synchronisation use cases, content creators / publishers must be able to specify additional metadata for the publication as a whole, as well as for individual resources located within the publication. For instance, such metadata may provide "duration" information about the audio stream corresponding to the entire default / linear reading order, as well as for individual logical content fragments such as "book chapters" (which may or may not be tied to physical artefacts like HTML files). 
Other examples of metadata include narrator names, synthetic speech voices, (re)distribution rights, etc. 99 | 100 | The ability to specify metadata for a publication whole or parts is not specific to the "synchronised multimedia" case. There should therefore not be any specific requirements defined here. It may however be necessary to define additional metadata vocabulary / naming schemes in order to support multimedia-related information (to be determined). 101 | -------------------------------------------------------------------------------- /drafts/manifest-exts-multi-granular.md: -------------------------------------------------------------------------------- 1 | # Proposed "sync media" solution for "multiple synchronization granularities" 2 | 3 | Note that this document is supplemental to [manifest-exts.md](./manifest-exts.md). 4 | 5 | The use-case "multiple synchronization granularities" is currently not supported by the EPUB3 Media Overlays specification. This proposal attempts to detail this new functionality, by formally describing extensions to the baseline "sync media" proposal, in terms of descriptive properties / metadata, and with respect to the "sync media" overlay JSON format. 6 | 7 | ##  -------------------------------------------------------------------------------- /drafts/readium2.md: -------------------------------------------------------------------------------- 1 | # Readium2 experimentations 2 | 3 | ## Why not just continue to use SMIL? 4 | 5 | 1) XML: implementation experience of (EPUB) reading systems tells us that XML parsing in modern and older web browsers is doable, but that there are many quirks (e.g. namespace handling) and processing overhead which may be avoided by using a format like JSON. 6 | 2) Complexity: both EPUB3 Media Overlays and DAISY Talking Books (the historical format that preceded EPUB MO) use only a small subset of the SMIL standard, but the cognitive overhead (e.g. time / clock syntax) and lack of adoption in the Open Web Platform indicate that other solutions should be considered, particularly as there is now a broader choice of technologies for handling media / timing in web browsers (which wasn't the case at the time SMIL was chosen to fulfil the timing and synchronization requirements of audio / talking books). 7 | 3) Architecture: The convergence of publishing standards with the broader, distributed Open Web Platform offers an opportunity to define a system better suited for the HTTP client-server request-response design model, and the notion of URL service endpoint. 8 | 9 | ## System overview 10 | 11 | In Readium2, EPUB3 publications are ingested and translated to an internal JSON format (structurally and semantically defined using JSON-LD) which carries information about the publication (i.e. metadata, collection of documents, resources, etc.). 12 | 13 | In addition to exposing links to static publication assets such as images, stylesheets, etc., this central publication "manifest" also advertises URLs via which additional services can be accessed (e.g. search, indexing, pagination, etc.). These would typically be HTTP links, decorated with well-defined metadata in order to enable discovery (`rel` and `type` properties). 14 | 15 | The Media Overlay data for the publication as a whole, or for specific documents within the publication, can be resolved using such links. 
The mechanism of URI Template is used to pass parameters such as the canonical path of a publication document, so that only the Media Overlay for that particular document is returned. 16 | 17 | In other words, a client that consumes a publication for processing or rendering purposes, can issue a series of (HTTP) requests in order to incrementally access the desired data. Obviously, behind the scenes the content server may also implement "lazy loading" strategies to avoid unnecessary overheads. 18 | 19 | Standard HTTP caching methods can be used to optimise performance, and Readium2 server implementations (aka "streamers") generate ETag headers. Side node: prefetch headers are emitted in HTTP responses so that web browsers can eagerly load commonly-used resources such as stylesheets and font faces. 20 | 21 | ## Timing and synchronisation 22 | 23 | In Readium2, the Media Overlays format is JSON-based, just like the central "webpub manifest" (as it's internally called). A simplification from SMIL is the use of a "clock" syntax based on a decimal number representing seconds (and fractions of a second). This time syntax is used in metadata (`duration` fields) and in URL Media Fragments (i.e. `audio.mp3#t=1.3,2.456` instead of the `clipbegin` and `clipEnd` XML attributes in SMIL files). The net benefit is that modern web browsers natively support audio and video playback using this syntax. 24 | 25 | As in SMIL, locations within HTML documents are referenced using fragments identifiers. No change here. In fact, the very notion of "hash" (URL `#` segment) is extensible, and future revisions could add support for other audio/video/text referencing mechanisms (for example, character-level text ranges). 26 | 27 | ## Example 28 | 29 | (truncated, for brevity) 30 | 31 | ``` 32 | { 33 | @context: "http://readium.org/webpub/default.jsonld", 34 | metadata: { 35 | @type: "http://schema.org/Book", 36 | title: "Moby-Dick", 37 | ... 38 | narrator: { 39 | name: "Stuart Wills" 40 | }, 41 | duration: 860.5, 42 | media-overlay: { 43 | active-class: "-epub-media-overlay-active" 44 | } 45 | }, 46 | links: [ 47 | ... 48 | { 49 | href: "media-overlay.json?resource={path}", 50 | type: "application/vnd.readium.mo+json", 51 | templated: true, 52 | rel: "media-overlay" 53 | } 54 | ], 55 | spine: [ 56 | ... 57 | { 58 | href: "OPS/chapter_001.xhtml", 59 | type: "application/xhtml+xml", 60 | properties: { 61 | media-overlay: "media-overlay.json?resource=OPS%2Fchapter_001.xhtml" 62 | }, 63 | duration: 860.5 64 | }, 65 | { 66 | href: "OPS/chapter_002.xhtml", 67 | type: "application/xhtml+xml", 68 | properties: { 69 | media-overlay: "media-overlay.json?resource=OPS%2Fchapter_002.xhtml" 70 | }, 71 | duration: 543 72 | }, 73 | ... 74 | ] 75 | } 76 | ``` 77 | 78 | ``` 79 | { 80 | media-overlay: [ 81 | { 82 | role: [ 83 | "section" 84 | ], 85 | children: [ 86 | { 87 | text: "OPS/chapter_001.xhtml", 88 | role: [ 89 | "section", 90 | "bodymatter", 91 | "chapter" 92 | ], 93 | children: [ 94 | { 95 | text: "OPS/chapter_001.xhtml#c01h01", 96 | audio: "OPS/audio/mobydick_001_002_melville.mp4#t=24.5,29.268" 97 | }, 98 | { 99 | text: "OPS/chapter_001.xhtml#c01w00001", 100 | audio: "OPS/audio/mobydick_001_002_melville.mp4#t=29.268,29.441" 101 | }, 102 | ... 
103 | ] 104 | } 105 | ] 106 | } 107 | ] 108 | } 109 | ``` 110 | 111 | ## Additional information 112 | 113 | https://github.com/readium/readium-2/tree/master/media-overlay 114 | 115 | https://github.com/readium/readium-2/blob/master/media-overlay/syntax.md 116 | -------------------------------------------------------------------------------- /drafts/schema/README.md: -------------------------------------------------------------------------------- 1 | # JSON Schema for the Sync Media Narration data format (DRAFT!) 2 | 3 | Folder contents: 4 | 5 | * `sync-media-narration.schema.json` is the JSON Schema, conforming to "draft-v7". 6 | * `sync-media-narration.sample.json` is a Sync Media Narration JSON sample, which validates against the JSON Schema. 7 | 8 | A selection of online tools to try the above JSON Schema and JSON sample: 9 | 10 | * https://jsonschemalint.com/#/version/draft-07/markup/json 11 | * https://www.jsonschemavalidator.net 12 | -------------------------------------------------------------------------------- /drafts/schema/sync-media-narration.sample.json: -------------------------------------------------------------------------------- 1 | { 2 | "narration": [ 3 | { 4 | "text": "chapter1.html#id1", 5 | "audio": "chapter1.mp3#t=0.0,1.2" 6 | }, 7 | { 8 | "text": "chapter1.html#id2", 9 | "audio": "chapter1.mp3#t=1.2,3.4" 10 | }, 11 | { 12 | "role": "footnote-ref", 13 | "text": "chapter1.html#id3", 14 | "audio": "chapter1.mp3#t=3.4,5.6" 15 | }, 16 | { 17 | "role": [ 18 | "aside", 19 | "some-other-type" 20 | ], 21 | "narration": [ 22 | { 23 | "text": "chapter1.html#id4", 24 | "audio": "chapter1.mp3#t=5.6,7.8" 25 | }, 26 | { 27 | "text": "chapter1.html#id5", 28 | "audio": "chapter1.mp3#t=7.8,9.1" 29 | }, 30 | { 31 | "text": "chapter1.html#id6", 32 | "audio": "chapter1.mp3#t=9.1,10.1" 33 | } 34 | ] 35 | }, 36 | { 37 | "text": "chapter1.html#id7", 38 | "audio": "chapter1.mp3#t=10.1,11.2" 39 | }, 40 | { 41 | "text": "chapter1.html#id8", 42 | "audio": "chapter1.mp3#t=11.2,13.3" 43 | }, 44 | { 45 | "role": "footnote", 46 | "narration": [ 47 | { 48 | "text": "chapter1.html#id9", 49 | "audio": "chapter1.mp3#t=13.3,14.4" 50 | }, 51 | { 52 | "text": "chapter1.html#id10", 53 | "audio": "chapter1.mp3#t=14.4,17.4" 54 | } 55 | ] 56 | } 57 | ] 58 | } -------------------------------------------------------------------------------- /drafts/schema/sync-media-narration.schema.json: -------------------------------------------------------------------------------- 1 | { 2 | "$schema": "http://json-schema.org/draft-07/schema#", 3 | "$id": "https://w3c.github.io/sync-media-pub/sync-media-narration.schema.json", 4 | "title": "JSON Schema for Narration Sync Media", 5 | "type": "object", 6 | "properties": { 7 | "role": { 8 | "description": "Type associated with this synchronized narration sequence, similar to EPUB3 'epub:type' attribute semantics, e.g. 'aside'", 9 | "type": [ 10 | "string", 11 | "array" 12 | ], 13 | "items": { 14 | "type": "string" 15 | } 16 | }, 17 | "narration": { 18 | "description": "Ordered list of children, similar to SMIL 'seq' element (recursive JSON Schema property)", 19 | "type": "array", 20 | "items": { 21 | "anyOf": [ 22 | { 23 | "$ref": "sync-media-narration.schema.json" 24 | }, 25 | { 26 | "type": "object", 27 | "properties": { 28 | "role": { 29 | "description": "Type associated with this synchronized text/audio pair, similar to EPUB3 'epub:type' attribute semantics, e.g. 
'footnote-ref'", 30 | "type": [ 31 | "string", 32 | "array" 33 | ], 34 | "items": { 35 | "type": "string" 36 | } 37 | }, 38 | "text": { 39 | "description": "Document reference, similar to SMIL 'text' element, e.g. 'chapter1.html#paragraph1'", 40 | "type": "string", 41 | "format": "uri-reference" 42 | }, 43 | "audio": { 44 | "description": "Audio reference, similar to SMIL 'audio' element, e.g. 'chapter1.mp3?t=0,123'", 45 | "type": "string", 46 | "format": "uri-reference" 47 | } 48 | }, 49 | "required": [ 50 | "text", 51 | "audio" 52 | ] 53 | } 54 | ] 55 | } 56 | } 57 | }, 58 | "required": [ 59 | "narration" 60 | ] 61 | } -------------------------------------------------------------------------------- /drafts/sync-narr-ideas.md: -------------------------------------------------------------------------------- 1 | # Synchronized Narration: New Ideas 2 | 3 | * [Concepts](#concepts) 4 | * [Example](#example) 5 | * [Notes about SMIL](#notes-about-smil) 6 | 7 | ## Concepts 8 | 9 | ### `syncNarration` 10 | Root-level container. Children are: 11 | * [`assets`](#assets) (one) 12 | * [`sequence`](#sequence) (one) 13 | 14 | ### `assets` 15 | Declaration of all media assets. Children are: 16 | * [`asset`](#asset) (one or more) 17 | 18 | ### `asset` 19 | References a single media file. 20 | 21 | Properties: 22 | * `id`: required 23 | * `src` 24 | * `mediaType` 25 | 26 | Children: 27 | * [`property`](#property) (zero or more) 28 | 29 | ### `property` 30 | 31 | Name-value pair specifying a parameter that affects media rendering, such as CSS class to apply or speed adjustment. 32 | 33 | Properties: 34 | * `name`: a single string 35 | * `value`: a list of values 36 | 37 | ### `sequence` 38 | 39 | An ordered list of playback steps. 40 | 41 | Children: 42 | * [`step`](#step) (one or more) 43 | 44 | ### `step` 45 | 46 | A collection of media nodes, to be rendered at the same time. The step is done rendering when all of its child media have finished rendering. 47 | 48 | Children: 49 | * [`media`](#media) (one or more) 50 | * [`sequence`](#sequence) (zero or more) 51 | 52 | ### `media` 53 | 54 | A segment of an asset. 55 | 56 | Properties: 57 | * `assetRef`: the ID of the asset 58 | * `selector`: segment to render 59 | 60 | _Idea_: Add a `selectorType` property to [`media`](#media), which would have as its value a string expressing the type of the selector, to align with [Selectors](https://www.w3.org/TR/selectors-states/#selectors). 61 | 62 | _Idea_: allow `selector` to be a list of the segments of the asset to render. If there is more than one segment, render them at the same time. 63 | 64 | Children: 65 | * [`property`](#property) (zero or more): overrides default `asset` properties. 66 | 67 | ## Playback rules 68 | 69 | ### Rendering a Sync Narration document 70 | * Starts at beginning of top-level `sequence` and renders it. 71 | 72 | ### Rendering a `sequence` 73 | * Render the first `step` 74 | * When that `step` is done, render the next `step` 75 | * Proceed as such until the end is reached. 
76 | 77 | ### Rendering a `step` 78 | * Each `media` and `sequence` node is rendered in parallel 79 | 80 | ### Rendering `media` 81 | * Render `media` by locating the `selector` of the referenced asset and displaying it or playing it 82 | * When `media` is rendered, default and node-specific `property` values are applied, according to the [property rules](#property-rules) 83 | 84 | ### Property rules 85 | * An `asset` may have one or more default properties 86 | * A `media` node may override or add properties 87 | * An overridden `property` replaces the default value for the `media` node on which it was declared 88 | * An added `property` applies only to the `media` node on which it was declared 89 | 90 | ## Example 91 | 92 | ``` 93 | 94 | 95 | 96 | 97 | 98 | 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | ``` 128 | 129 | If you're curious how this would look in JSON, here is a [comparison](https://raw.githack.com/w3c/sync-media-pub/master/drafts/xml-json.html). 130 | 131 | ## Notes about SMIL 132 | 133 | If you're familiar with [SMIL](https://www.w3.org/TR/SMIL3/), you will notice some similarities: 134 | * [`sequence`](#sequence) and [`SMIL seq`](https://www.w3.org/TR/SMIL3/smil-timing.html#edef-seq) 135 | * [`step`](#step) and [`SMIL par`](https://www.w3.org/TR/SMIL3/smil-timing.html#edef-par) 136 | * [`media`](#media) and [`SMIL ref`](https://www.w3.org/TR/SMIL3/smil-extended-media-object.html#edef-ref) 137 | * [`property`](#property) and [`SMIL param`](https://www.w3.org/TR/SMIL3/smil-extended-media-object.html#edef-param) 138 | 139 | It's true that SMIL thought of everything, far ahead of its time! The main arguments against using SMIL, however, are: 140 | 141 | * Its non-intuitive names for elements 142 | * Complicated timing models and events 143 | 144 | [EPUB3 Media Overlays](https://www.w3.org/publishing/epub/epub-mediaoverlays.html#sec-overlays-def) solved the second objection by using a very minimal subset of SMIL3; so minimal, in fact, that it fell below the threshold of SMIL's own "most basic" profile. However, as the SMIL working group had dissolved, this was deemed to be ok. 145 | 146 | Another way forward for us could be to build on the existing EPUB3 Media Overlays spec. Starting from that super minimal version of SMIL, we could build up to what we have here by way of the following modifications: 147 | 148 | * Introduce `assets` and give them `params` or `paramGroups`. 149 | * Allow generic `ref` as a media reference 150 | * Allow a `ref` to reference an `asset` instead of having its own `src` to a file. 151 | * Replace `epub:type` with `role` or some as-of-yet-not-thought-of not-loaded term. 152 | * Use [`selectors`](https://www.w3.org/TR/selectors-states/#selectors) 153 | 154 | Just for fun, see an [example](https://raw.githack.com/w3c/sync-media-pub/master/drafts/xml-json.html#smil) of what this could look like. 
155 | 156 | -------------------------------------------------------------------------------- /drafts/technologies.md: -------------------------------------------------------------------------------- 1 | # Related existing / emerging technologies 2 | 3 | ## Skippability 4 | 5 | DPUB a11y: 6 | https://www.w3.org/TR/dpub-accessibility/#skippability 7 | 8 | EPUB 3 Media Overlays: 9 | http://www.idpf.org/epub/31/spec/epub-mediaoverlays.html#sec-skippability 10 | 11 | ## Escapability 12 | 13 | DPUB a11y: 14 | https://www.w3.org/TR/dpub-accessibility/#escapability 15 | 16 | EPUB 3 Media Overlays: 17 | http://www.idpf.org/epub/31/spec/epub-mediaoverlays.html#sec-escabaility 18 | 19 | ## Time / clock values 20 | 21 | EPUB 3 Media Overlays: 22 | http://www.idpf.org/epub/31/spec/epub-mediaoverlays.html#app-clock-examples 23 | 24 | SMIL 3: 25 | https://www.w3.org/TR/SMIL3/smil-timing.html#q22 26 | 27 | TT: 28 | https://www.w3.org/AudioVideo/TT/ 29 | 30 | TTML 2: 31 | https://www.w3.org/TR/ttml2/#timing-value-time-expression 32 | 33 | VTT 1: 34 | https://www.w3.org/TR/webvtt1/#collect-a-webvtt-timestamp 35 | 36 | ## Other relevant work: 37 | 38 | SMIL Animation: 39 | https://www.w3.org/TR/smil-animation/ 40 | 41 | Web Animations: 42 | https://w3c.github.io/web-animations/ 43 | 44 | SMIL Timesheets: 45 | https://www.w3.org/TR/timesheets/ 46 | 47 | Media Fragments URI: 48 | https://www.w3.org/TR/media-frags/#naming-time 49 | 50 | Web Media API (CG): 51 | https://www.w3.org/community/webmediaapi/ 52 | 53 | Cloud Browser arch (note): 54 | https://w3c.github.io/Web-and-TV-IG/cloud-browser-tf/ 55 | 56 | Multi Device Timing: 57 | http://webtiming.github.io 58 | https://www.w3.org/community/webtiming/ 59 | See mailing-list discussion: 60 | http://lists.w3.org/Archives/Public/public-sync-media-pub/2018Mar/0003.html 61 | 62 | Also see: https://github.com/w3c/publ-a11y/wiki/Publishing-issues-for-Silver#7-media-overlays 63 | 64 | ## Examples of similar technologies 65 | 66 | * [National Geographic Archives](https://archive.org/details/nationalgeograph21890nati/page/108/mode/2up) 67 | * TTS 68 | * image of a page with a moving highlight 69 | * speed up/slow down audio 70 | * change page and move thru presentation via slider control 71 | -------------------------------------------------------------------------------- /drafts/technology-candidates.md: -------------------------------------------------------------------------------- 1 | --- 2 | Title: Technology Candidates 3 | --- 4 | # SyncMedia Technology Candidates 5 | 6 | The primary considerations when choosing a language to represent the concepts required for the [use cases](use-cases.html) were: 7 | * __Has declarative syntax__: As opposed to a purely scripted custom solution, a declarative syntax provides a more rigid framework for content that will be played on a variety of systems, and will persist in publisher and library collections for years to come. 8 | * __Supports nested structures__: Required for putting complex content (e.g. tables) in a subtree, out of the way of the main presentation, and offering users options for _escaping_. 9 | * __External media references__: The media objects in a SyncMedia presentation exist on their own and do not need to be duplicated in the presentation format. They just need to be referenced. 10 | 11 | That said, here are the candidates and how each fares regarding the requirements. 
12 | 13 | ## SMIL 14 | [SMIL3](https://www.w3.org/TR/SMIL3/) {.link} 15 | 16 | 17 | ### Pros 18 | * Successfully used in EPUB3 Media Overlays 19 | * Declarative syntax 20 | * Supports nesting 21 | 22 | ### Cons 23 | * Never was broadly adopted 24 | * WG is no longer active to propose changes to 25 | 26 | ## Timing Object 27 | [Timing Object](http://webtiming.github.io/timingobject/) 28 | 29 | ### Pros 30 | 31 | * Capable of complex media synchronization 32 | 33 | ### Cons 34 | * No declarative syntax 35 | * Spec is incomplete 36 | 37 | ## TTML2 38 | 39 | [TTML2](https://www.w3.org/TR/ttml2/) 40 | 41 | ### Pros 42 | 43 | Capable of complex media synchronization 44 | 45 | ### Cons 46 | 47 | Text lives in the same file as the timing information -- pointing to an external text document is not supported. 48 | 49 | 50 | ## WebVTT 51 | 52 | [WebVTT](https://www.w3.org/TR/webvtt1/) 53 | 54 | ### Pros 55 | 56 | Browser support 57 | 58 | ### Cons 59 | * No external text referencing 60 | * No nested structures 61 | 62 | 63 | ## WebAnimations 64 | [WebAnimations](https://www.w3.org/TR/web-animations-1/) 65 | 66 | ### Pros 67 | 68 | Enables timing and playback 69 | 70 | ### Cons 71 | 72 | No declarative syntax 73 | 74 | 75 | ## WebAnnotations 76 | 77 | [WebAnnotations](https://www.w3.org/annotation/) 78 | 79 | ### Pros 80 | 81 | Good range of selectors 82 | 83 | ### Cons 84 | * No nesting 85 | * No processing model for playback 86 | 87 | 88 | ## Custom 89 | 90 | ### Pros 91 | 92 | Complete control 93 | 94 | ### Cons 95 | 96 | Risk reinventing the wheel 97 | 98 | ## Existing language + custom extensions 99 | 100 | ### Pros 101 | * Take advantage of what exists 102 | * Add what's missing 103 | 104 | ### Cons 105 | * Inherit complexity of existing language 106 | * Risk of additions being short-sighted 107 | -------------------------------------------------------------------------------- /drafts/technology-selection.md: -------------------------------------------------------------------------------- 1 | # Selection Process 2 | 3 | This document provides an overview of the pros and cons of a range of web technologies that were considered for expressing synchronized media in [Web Publications](https://www.w3.org/TR/wpub/). Refer to the [Use Cases](use-cases.md) document to see the use cases under consideration. 4 | 5 | ## Features 6 | 7 | The following aspects were considered: 8 | * Browser support 9 | * Ability to point to text in an external HTML5 document 10 | * Developer-friendly syntax (XML is ok; JSON is better) 11 | * Nested structure support, used for skip/escape features and multiple granularities 12 | 13 | | Name | Browser support | External text | Syntax | Nesting | 14 | |:--------------------------------|:----------------|:--------------|:-------|:--------| 15 | | [SMIL](#smil) | No | Yes | XML | Yes | 16 | | [TTML2](#ttml2) | No | No* | XML | Yes | 17 | | [WebVTT](#webvtt) | Yes | No* | Text | No | 18 | | [WebAnimations](#webanimations) | Yes | n/a | n/a | n/a | 19 | | [WebAnnotations](#webannotations)| No | Yes | JSON | No | 20 | | [Custom](#custom) | No | Yes | JSON | Yes | 21 | 22 | 23 | ### SMIL 24 | [https://www.w3.org/AudioVideo/](https://www.w3.org/AudioVideo/) 25 | 26 | While SMIL was successfully used in EPUB3 Media Overlays, it has a verbose syntax and no specific advantages that would make us keep using it. 27 | 28 | ### TTML2 29 | [https://www.w3.org/TR/ttml2/](https://www.w3.org/TR/ttml2/) 30 | 31 | TTML2 is capable of complex media synchronization beyond text + video. 
However, the text lives in the same file as the timing information -- it does not support pointing to external text documents. It is possible to use custom metadata or hack ID values to insert a reference, but this still means no out of the box support for what we need. This makes it hard to integrate into the Web Publications environment. 32 | 33 | ### WebVTT 34 | [https://www.w3.org/TR/webvtt1/](https://www.w3.org/TR/webvtt1/) 35 | 36 | The only way to reference external text references would be via custom metadata that sits alongside audio sync points. However, we can't leverage existing browser support - our custom metadata would not be recognized by any implementations except ours. Nesting is also not supported, so we couldn't represent multiple granularities or skippable/escapable structures. 37 | 38 | ### WebAnimations 39 | [https://www.w3.org/TR/web-animations-1/](https://www.w3.org/TR/web-animations-1/) 40 | 41 | While web animations provides good timing and playback support, the lack of a declarative syntax makes it not an option. 42 | 43 | ### WebAnnotations 44 | [https://www.w3.org/annotation/](https://www.w3.org/annotation/) 45 | 46 | Web annotations could represent everything that we need it to, possibly with some customization for nesting, but there's no associated processing model for playback as a sequence of audio clips. 47 | 48 | ### Custom 49 | None of the candidates above that have browser support can adequately express what we require at minimum, so we can consider creating our own format that is developer-friendly, compact, and easily supports the features we want. We can learn from [Readium2 experiments](drafts/readium2.md) in representing Media Overlays in JSON. We can also tie our work into the Audio TF work, because we really don't want audio-only vs audio + text books to be done in wildly different ways. 50 | -------------------------------------------------------------------------------- /drafts/use-cases.md: -------------------------------------------------------------------------------- 1 | # In-scope use cases 2 | 3 | ## Text supplemented with pre-recorded audio narration 4 | * Use an authored navigation interface (e.g. TOC) to reach a point in a content document, and start audio playback from there 5 | * Move around the text content and have the audio playback keep up. E.g. 6 | * Previous/next paragraph 7 | * Previous/next page 8 | * Start playback by clicking a point mid-page 9 | * Speed up/slow down the audio while not affecting the pitch (timescale modification) 10 | * Navigate by time chunk (e.g. skip forward/back 10s) 11 | * Text chunks are highlighted in sync with the audio playback, at the authored granularity 12 | * Text display keeps up with audio. E.g. start audio playback and have the pages turn automatically. 13 | * Selectively filter out content based on semantics ("skippability" in the DAISY world). Note: might there be overlap here with WP personalization? 14 | * Jump out of complex structures and back into the reading flow ("escapability" in the DAISY world) 15 | * Support multiple granularities of navigation, switchable on the fly by the user. E.g. 
16 | * Previous/next paragraph 17 | * Previous/next sentence 18 | * Previous/next word or phrase 19 | 20 | ## Pre-recorded audio narration with ToC but no other text contents 21 | 22 | * Most points from above apply here 23 | * Opportunity to collaborate with the (Audio TF)[https://www.w3.org/TR/wpub/#audiobook] 24 | 25 | ## Integrating audio and text editions of a book 26 | 27 | A publisher has made parallel editions of a publication: one is audio, and the other is text. Adding a synchronized media "glue" layer gives the user a playback experience like what's described in the text+audio example above. 28 | 29 | # Out of scope use cases 30 | 31 | ## Video supplemented with text transcript, standalone screenplay 32 | * Select a point in a text transcript to jump to that part of the video 33 | * Search the text of the transcript 34 | * Highlight text in transcript as video plays 35 | 36 | 37 | ## Text supplemented with sign-language video 38 | 39 | * Synchronize the video to the text in a way that permits user navigation (as above for audio) 40 | * Make the video full screen and overlay the text as a caption 41 | * Make the text the primary visual item and the video is reduced in size but maintains a constant position even as user scrolls/navigates 42 | * Highlight the text in sync with the video playback 43 | * Play/pause the video 44 | 45 | ## Video supplemented with descriptive audio 46 | 47 | * Users may navigate the content via an authored navigation interface (e.g. TOC) 48 | * Author may choose to have audio description and video played simultaneously for some parts, and for other parts, instruct the reading system to automatically pause the video and play audio description, and then resume the video playback 49 | * User should be able to independently control volume and speed of both the video and audio description 50 | -------------------------------------------------------------------------------- /older-experiments/player-bkup/audio.js: -------------------------------------------------------------------------------- 1 | import * as Events from './events.js'; 2 | 3 | /* Audio events: 4 | Play 5 | Pause 6 | PositionChange 7 | ClipDone 8 | */ 9 | 10 | let settings = { 11 | volume: 0.8, 12 | rate: 1.0 13 | }; 14 | let clip = { 15 | start: 0, 16 | end: 0, 17 | file: '', 18 | isLastClip: false, 19 | autoplay: true 20 | }; 21 | let audio = null; 22 | let waitForSeek = false; 23 | 24 | function loadFile(file) { 25 | log.debug("Audio Player: file = ", file); 26 | clip.file = file; 27 | let wasMuted = false; 28 | if (audio) { 29 | audio.pause(); 30 | wasMuted = audio.muted; 31 | } 32 | audio = new Audio(file); 33 | audio.currentTime = 0; 34 | audio.muted = wasMuted; 35 | audio.volume = settings.volume; 36 | audio.playbackRate = settings.rate; 37 | audio.addEventListener('progress', e => { onAudioProgress(e) }); 38 | audio.addEventListener('timeupdate', e => { onAudioTimeUpdate(e) }); 39 | } 40 | 41 | function playClip(file, autoplay, start = 0, end = -1, isLastClip = false) { 42 | clip.start = parseFloat(start); 43 | clip.end = parseFloat(end); 44 | clip.isLastClip = isLastClip; 45 | clip.autoplay = autoplay; 46 | if (file != clip.file) { 47 | loadFile(file); 48 | } 49 | else { 50 | waitForSeek = true; 51 | // check that the current time is far enough from the desired start time 52 | // otherwise it stutters due to the coarse granularity of the browser's timeupdate event 53 | if (audio.currentTime < clip.start - .10 || audio.currentTime > clip.start + .10) { 54 | audio.currentTime = 
clip.start; 55 | } 56 | else { 57 | // log.debug("Audio Player: close enough, not resetting"); 58 | } 59 | } 60 | } 61 | 62 | async function pause() { 63 | if (audio) { 64 | Events.trigger('Audio.Pause'); 65 | await audio.pause(); 66 | } 67 | } 68 | 69 | async function resume() { 70 | Events.trigger('Audio.Play'); 71 | await audio.play(); 72 | } 73 | 74 | function isPlaying() { 75 | return !!(audio.currentTime > 0 76 | && !audio.paused 77 | && !audio.ended 78 | && audio.readyState > 2); 79 | } 80 | 81 | 82 | // this event fires when the file downloads/is downloading 83 | async function onAudioProgress(event) { 84 | // if the file is playing while the rest of it is downloading, 85 | // this function will get called a few times 86 | // we don't want it to reset playback so check that current time is zero before proceeding 87 | if (audio.currentTime == 0 && !isPlaying()) { 88 | log.debug("Audio Player: starting playback"); 89 | audio.currentTime = clip.start; 90 | 91 | if (clip.autoplay) { 92 | Events.trigger('Audio.Play'); 93 | await audio.play(); 94 | } 95 | else { 96 | Events.trigger('Audio.Pause'); 97 | } 98 | } 99 | } 100 | 101 | // this event fires when the playback position changes 102 | async function onAudioTimeUpdate(event) { 103 | Events.trigger('Audio.PositionChange', audio.currentTime, audio.duration); 104 | 105 | if (waitForSeek) { 106 | waitForSeek = false; 107 | Events.trigger('Audio.Play'); 108 | await audio.play(); 109 | } 110 | else { 111 | if (clip.end != -1 && audio.currentTime >= clip.end) { 112 | if (clip.isLastClip) { 113 | Events.trigger('Audio.Pause'); 114 | audio.pause(); 115 | } 116 | Events.trigger("Audio.ClipDone", clip.file); 117 | } 118 | else if (audio.currentTime >= audio.duration && audio.ended) { 119 | Events.trigger('Audio.Pause'); 120 | audio.pause(); 121 | log.debug("Audio Player: element ended playback"); 122 | Events.trigger("Audio.ClipDone", clip.file); 123 | } 124 | } 125 | } 126 | 127 | function setRate(val) { 128 | settings.rate = val; 129 | if (audio) { 130 | audio.playbackRate = val; 131 | } 132 | } 133 | 134 | function setPosition(val) { 135 | if (audio) { 136 | if (val < 0){ 137 | audio.currentTime = 0; 138 | } 139 | else if (val > audio.duration) { 140 | audio.currentTime = audio.duration; 141 | } 142 | else { 143 | audio.currentTime = val; 144 | } 145 | } 146 | } 147 | 148 | function setVolume(val) { 149 | settings.volume = val; 150 | if (audio) { 151 | audio.volume = val; 152 | } 153 | } 154 | 155 | function getPosition() { 156 | if (audio) { 157 | return audio.currentTime; 158 | } 159 | else { 160 | return 0; 161 | } 162 | } 163 | 164 | function mute() { 165 | if (audio) { 166 | audio.muted = true; 167 | } 168 | } 169 | 170 | function unmute() { 171 | if (audio) { 172 | audio.muted = false; 173 | } 174 | } 175 | function isMuted() { 176 | if (audio) { 177 | return audio.muted; 178 | } 179 | return false; 180 | } 181 | export { 182 | playClip, 183 | isPlaying, 184 | pause, 185 | resume, 186 | setRate, 187 | setPosition, 188 | getPosition, 189 | setVolume, 190 | mute, 191 | unmute, 192 | isMuted 193 | }; 194 | -------------------------------------------------------------------------------- /older-experiments/player-bkup/controls.js: -------------------------------------------------------------------------------- 1 | import * as Events from './events.js'; 2 | import * as Audio from './audio.js'; 3 | import * as Narrator from './narrator.js'; 4 | import * as Utils from './utils.js'; 5 | 6 | let isPlaying = false; 7 | 8 | function init() { 9 | 
document.querySelector("#current-position").textContent = '--'; 10 | 11 | document.querySelector("#rate").addEventListener("input", 12 | e => setPlaybackRate(e.target.value)); 13 | document.querySelector("#volume").addEventListener("input", 14 | e => setPlaybackVolume(e.target.value)); 15 | 16 | document.querySelector("#reset-rate").addEventListener("click", 17 | e => setPlaybackRate(100)); 18 | document.querySelector("#mute").addEventListener("click", e => toggleMute()); 19 | 20 | document.querySelector("#next").addEventListener("click", e => next()); 21 | document.querySelector("#prev").addEventListener("click", e => prev()); 22 | 23 | document.querySelector("#rate").value = 100; 24 | setPlaybackRate(100); 25 | document.querySelector("#volume").value = 80; 26 | setPlaybackVolume(80); 27 | 28 | Events.on("Audio.PositionChange", onPositionChange); 29 | Events.on("Audio.Play", onPlay); 30 | Events.on("Audio.Pause", onPause); 31 | 32 | document.querySelector("#play-pause").addEventListener("click", e => { 33 | if (isPlaying) { 34 | Audio.pause(); 35 | } 36 | else { 37 | Audio.resume(); 38 | } 39 | }); 40 | } 41 | 42 | function next() { 43 | Narrator.next(); 44 | } 45 | function prev() { 46 | Narrator.prev(); 47 | } 48 | function toggleMute() { 49 | if (Audio.isMuted()) { 50 | document.querySelector("#volume-wrapper").classList.remove("disabled"); 51 | document.querySelector("#volume").disabled = false; 52 | document.querySelector("#mute").setAttribute("title", "Mute"); 53 | document.querySelector("#mute").setAttribute("aria-label", "Mute"); 54 | // make the x disappear on the icon 55 | Array.from(document.querySelectorAll(".mute-x")).map(node => node.classList.remove("muted")); 56 | Audio.unmute(); 57 | } 58 | else { 59 | document.querySelector("#volume-wrapper").classList.add("disabled"); 60 | document.querySelector("#volume").disabled = true; 61 | document.querySelector("#mute").setAttribute("title", "Unmute"); 62 | document.querySelector("#mute").setAttribute("aria-label", "Unmute"); 63 | // make the x appear on the icon 64 | Array.from(document.querySelectorAll(".mute-x")).map(node => node.classList.add("muted")); 65 | Audio.mute(); 66 | } 67 | } 68 | function setPlaybackRate(val) { 69 | document.querySelector("#rate-value").textContent = `${val/100}x`; 70 | if (document.querySelector('#rate').value != val) { 71 | document.querySelector("#rate").value = val; 72 | } 73 | Audio.setRate(val/100); 74 | } 75 | 76 | function setPlaybackVolume(val) { 77 | document.querySelector("#volume-value").textContent = `${val}%`; 78 | Audio.setVolume(val/100); 79 | } 80 | 81 | function onPositionChange(position, fileDuration) { 82 | 83 | let currentPosition = Utils.secondsToHms(position); 84 | let fileLength = '--'; 85 | if (!isNaN(fileDuration)) { 86 | let duration = Utils.secondsToHms(fileDuration); 87 | fileLength = Utils.secondsToHms(fileDuration); 88 | } 89 | // trim the leading zeros 90 | if (currentPosition.indexOf("00:") == 0) { 91 | currentPosition = currentPosition.slice(3); 92 | } 93 | if (fileLength.indexOf("00:") == 0) { 94 | fileLength = fileLength.slice(3); 95 | } 96 | 97 | document.querySelector("#current-position").innerHTML = `${currentPosition} of ${fileLength}`; 98 | } 99 | 100 | function onPlay() { 101 | document.querySelector("#pause").classList.remove("disabled"); 102 | document.querySelector("#play").classList.add("disabled"); 103 | document.querySelector("#play-pause").setAttribute("aria-label", "Pause"); 104 | document.querySelector("#play-pause").setAttribute("title", "Pause"); 
105 | isPlaying = true; 106 | } 107 | 108 | function onPause() { 109 | document.querySelector("#pause").classList.add("disabled"); 110 | document.querySelector("#play").classList.remove("disabled"); 111 | document.querySelector("#play-pause").setAttribute("aria-label", "Play"); 112 | document.querySelector("#play-pause").setAttribute("title", "Play"); 113 | isPlaying = false; 114 | } 115 | 116 | 117 | export { 118 | init 119 | } -------------------------------------------------------------------------------- /older-experiments/player-bkup/css/base.css: -------------------------------------------------------------------------------- 1 | @import '../../common/colors.css'; 2 | 3 | body { 4 | font-family: Arial; 5 | background-color: var(--bk); 6 | color: var(--text); 7 | } 8 | 9 | a, a:visited { 10 | color: var(--text); 11 | padding-left: 1vh; 12 | text-decoration: none; 13 | } 14 | a:hover { 15 | text-decoration: underline !important; 16 | } 17 | .disabled { 18 | display: none; 19 | } 20 | button { 21 | cursor: pointer; 22 | background: transparent; 23 | border: none; 24 | color: var(--text); 25 | } 26 | -------------------------------------------------------------------------------- /older-experiments/player-bkup/css/controls.css: -------------------------------------------------------------------------------- 1 | @import "../../common/colors.css"; 2 | 3 | :root { 4 | --enabled-opacity: 80%; 5 | --disabled-opacity: 30%; 6 | } 7 | 8 | footer { 9 | display: grid; 10 | grid-column-gap: 1rem; 11 | grid-template-columns: 20% auto 20%; 12 | grid-template-rows: 1fr; 13 | grid-template-areas: 14 | "adjust transport status"; 15 | } 16 | #adjust { 17 | grid-area: adjust; 18 | display: grid; 19 | } 20 | #transport { 21 | grid-area: transport; 22 | justify-self: center; 23 | display: flex; 24 | align-self: center; 25 | align-items: center; 26 | } 27 | #chapter-progress { 28 | grid-area: status; 29 | align-self: center; 30 | justify-self: right; 31 | } 32 | footer div.slider { 33 | display: grid; 34 | grid-template-columns: 20% 50% 15% 10%; 35 | grid-column-gap: 0.5rem; 36 | align-items: center; 37 | /*! position: relative; */ 38 | /*! 
left: -48px; */ 39 | } 40 | footer input { 41 | background: var(--workaroundbk); 42 | } 43 | footer input::-moz-range-track{ 44 | background-color: lightgray; 45 | opacity: var(--enabled-opacity); 46 | height: 5px; 47 | border-radius: 5px; 48 | } 49 | footer input::-webkit-slider-runnable-track { 50 | background-color: lightgray; 51 | opacity: var(--enabled-opacity); 52 | height: 5px; 53 | border-radius: 5px; 54 | } 55 | footer input::-ms-track { 56 | background-color: lightgray; 57 | opacity: var(--enabled-opacity); 58 | height: 5px; 59 | border-radius: 5px; 60 | } 61 | footer input::-moz-range-thumb { 62 | height: 20px; 63 | cursor: pointer; 64 | border: none; 65 | } 66 | footer input::-webkit-slider-thumb { 67 | height: 20px; 68 | cursor: pointer; 69 | margin: -4px; /* alignment fix for chrome */ 70 | } 71 | footer input::-ms-thumb { 72 | height: 20px; 73 | cursor: pointer; 74 | } 75 | div.slider.disabled label, div.slider.disabled span { 76 | opacity: var(--disabled-opacity); 77 | } 78 | div.slider.disabled input::-moz-range-track { 79 | opacity: var(--disabled-opacity); 80 | } 81 | div.slider.disabled input::-webkit-slider-runnable-track { 82 | opacity: var(--disabled-opacity); 83 | } 84 | div.slider.disabled input::-ms-track { 85 | opacity: var(--disabled-opacity); 86 | } 87 | div.slider.disabled input::-moz-range-thumb { 88 | opacity: var(--disabled-opacity); 89 | cursor: default; 90 | } 91 | div.slider.disabled input::-webkit-slider-thumb { 92 | opacity: var(--disabled-opacity); 93 | cursor: default; 94 | } 95 | div.slider.disabled input::-ms-thumb { 96 | opacity: var(--disabled-opacity); 97 | cursor: default; 98 | } 99 | 100 | footer button { 101 | width: 4rem; 102 | height: 4rem; 103 | } 104 | 105 | #adjust button { 106 | text-align: left; 107 | height: min-content; 108 | width: 2rem; 109 | height: 2rem; 110 | } 111 | #adjust button svg { 112 | padding: 4px; 113 | } 114 | path { 115 | stroke: white; 116 | stroke-linejoin: round; 117 | stroke-linecap: round; 118 | } 119 | svg { 120 | opacity: 90%; 121 | pointer-events: all; 122 | } 123 | #transport svg:hover { 124 | fill: var(--hover); 125 | stroke: var(--hover); 126 | opacity: 80% !important; 127 | } 128 | #bookmark svg:hover line { 129 | stroke: var(--bk); 130 | } 131 | #mute, #reset-rate { 132 | width: min-content; 133 | padding: 0; 134 | } 135 | line.mute-x:not(.muted) { 136 | display: none; 137 | } 138 | #play-pause { 139 | width: 6rem; 140 | height: 6rem; 141 | } 142 | #chapter-progress span { 143 | display: block; 144 | } 145 | @media (max-width: 768px) { 146 | footer { 147 | display: grid; 148 | grid-template-columns: 1fr; 149 | grid-template-rows: auto; 150 | grid-row-gap: 0.5rem; 151 | grid-template-areas: 152 | "transport" 153 | "adjust" 154 | "status"; 155 | justify-items: center; 156 | } 157 | #transport button:not(#play-pause) { 158 | width: 3rem; 159 | height: 3rem; 160 | } 161 | #play-pause { 162 | width: 4rem; 163 | height: 4rem; 164 | } 165 | #adjust label { 166 | visibility: hidden; 167 | } 168 | #chapter-progress { 169 | margin: 0; 170 | font-size: smaller; 171 | justify-self: center; 172 | } 173 | #chapter-progress .label { 174 | display: none; 175 | } 176 | #volume-wrapper { 177 | display: none; 178 | } 179 | #rate-wrapper { 180 | grid-template-rows: 50% auto auto auto; 181 | } 182 | } 183 | 184 | @media (orientation: landscape) and (max-height: 500px) { 185 | #transport button:not(#play-pause) { 186 | width: 3rem; 187 | height: 3rem; 188 | } 189 | #play-pause { 190 | width: 4rem; 191 | height: 4rem; 192 | } 
193 | #adjust { 194 | position: relative; 195 | left: -3rem; 196 | } 197 | #rate-wrapper { 198 | display: grid; 199 | grid-template-areas: 200 | "label slider value" 201 | "reset reset reset"; 202 | grid-column-gap: 0.5rem; 203 | grid-row-gap: 0; 204 | align-items: center; 205 | } 206 | #rate-wrapper label { 207 | grid-area: label; 208 | } 209 | #rate-wrapper input { 210 | grid-area: slider; 211 | } 212 | #rate-wrapper span { 213 | grid-area: value; 214 | } 215 | #rate-wrapper button { 216 | grid-area: reset; 217 | justify-self: center; 218 | } 219 | #adjust label { 220 | visibility: hidden; 221 | } 222 | #volume-wrapper { 223 | display: none; 224 | } 225 | } -------------------------------------------------------------------------------- /older-experiments/player-bkup/css/player.css: -------------------------------------------------------------------------------- 1 | html { 2 | height: 100%; 3 | } 4 | body { 5 | height: 99%; 6 | display: grid; 7 | grid-template-areas: 8 | "header header header" 9 | "nav main aside" 10 | "footer footer footer"; 11 | grid-template-rows: min-content minmax(auto, 78%) min-content; 12 | grid-template-columns: 15% auto 20%; 13 | grid-gap: 1rem; 14 | overflow: hidden; 15 | } 16 | main { 17 | grid-area: main; 18 | overflow: scroll; 19 | } 20 | aside { 21 | grid-area: aside; 22 | overflow: scroll; 23 | } 24 | body > nav { 25 | grid-area: nav; 26 | } 27 | body > nav details { 28 | height: 100%; 29 | } 30 | body > nav details div { 31 | height: 100%; 32 | } 33 | 34 | body > nav div { 35 | overflow-y: scroll; 36 | } 37 | 38 | header { 39 | border-bottom: 2px solid gray; 40 | align-content: center; 41 | grid-area: header; 42 | display: grid; 43 | grid-template-areas: "title nav"; 44 | grid-template-columns: 80% auto; 45 | } 46 | header h1 { 47 | margin-top: 1rem; 48 | margin-bottom: 1rem; 49 | } 50 | header nav { 51 | grid-area: nav; 52 | display: grid; 53 | } 54 | header div#pub-info { 55 | grid-area: title; 56 | } 57 | header nav ul { 58 | list-style-type: none; 59 | display: flex; 60 | justify-content: space-evenly; 61 | padding: 0; 62 | align-items: center; 63 | } 64 | footer { 65 | grid-area: footer; 66 | } 67 | 68 | div#player-page { 69 | grid-column: 2/3; 70 | justify-self: center; 71 | height: 100%; 72 | } 73 | 74 | #cover-image-container { 75 | display: grid; 76 | } 77 | #cover-image-container img { 78 | justify-self: center; 79 | } 80 | iframe { 81 | height: 100%; 82 | background-color: white; 83 | border-width: 0; 84 | display: block; 85 | } 86 | #player-page iframe { 87 | width: 100%; 88 | } 89 | #bookmarks li { 90 | display: flex; 91 | margin-bottom: 1rem; 92 | } 93 | #bookmarks li button { 94 | margin-left: auto; 95 | } 96 | #bookmarks button { 97 | border: thin white solid; 98 | border-radius: 4px; 99 | padding: 0.25rem; 100 | } 101 | #player-captions { 102 | font-size: 3rem; 103 | text-align: center; 104 | margin-top: 2rem; 105 | color: var(--hltext); 106 | opacity: 0.9; 107 | } 108 | #edit-bookmarks { 109 | margin: auto; 110 | } 111 | #bookmarks nav { 112 | display: grid; 113 | } 114 | @media (max-width: 768px) { 115 | body { 116 | grid-template-areas: 117 | "header" 118 | "main" 119 | "footer" 120 | "nav" 121 | "aside"; 122 | grid-template-rows: min-content auto min-content min-content min-content; 123 | grid-template-columns: 0.9fr; 124 | grid-gap: 0.5rem 125 | 126 | } 127 | header h1 { 128 | font-size: medium; 129 | } 130 | header nav ul { 131 | flex-direction: column; 132 | font-size: medium; 133 | justify-self: right; 134 | margin-top: 1vh; 135 | 
} 136 | header nav li { 137 | line-height: 1.5; 138 | } 139 | #player-toc { 140 | margin-top: 1rem; 141 | margin-bottom: 1rem; 142 | font-weight: bold; 143 | } 144 | #edit-bookmarks { 145 | margin: 0; 146 | width: 3rem; 147 | } 148 | } 149 | 150 | @media (orientation: landscape) and (max-height: 500px) { 151 | header h1 { 152 | font-size: medium; 153 | } 154 | header nav { 155 | font-size: medium; 156 | } 157 | #player-toc iframe { 158 | width: 99%; 159 | } 160 | } -------------------------------------------------------------------------------- /older-experiments/player-bkup/css/pub-default.css: -------------------------------------------------------------------------------- 1 | @import 'base.css'; 2 | @import '../../common/colors.css'; 3 | 4 | .-active-element { 5 | color: var(--hltext); 6 | opacity: 0.9; 7 | transition: color .4s; 8 | } 9 | 10 | .-document-playing { 11 | 12 | } 13 | main { 14 | line-height: 1.5; 15 | } 16 | a, a:visited { 17 | color: var(--text); 18 | padding-left: 1vh; 19 | text-decoration: none; 20 | } 21 | a:hover { 22 | text-decoration: underline !important; 23 | } 24 | nav[role=doc-toc] ol { 25 | /* list-style-type: none; */ 26 | line-height: 5vh; 27 | padding-left: 0; 28 | color: var(--text); 29 | grid-column: 1; 30 | } 31 | 32 | nav[role=doc-toc] a.current, nav[role=doc-toc] a.current:visited { 33 | color: var(--hltext); 34 | opacity: 0.9; 35 | transition: color 1s; 36 | } 37 | nav[role=doc-toc] li { 38 | margin-top: 1rem; 39 | margin-bottom: 1rem; 40 | } -------------------------------------------------------------------------------- /older-experiments/player-bkup/events.js: -------------------------------------------------------------------------------- 1 | // global event manager 2 | let eventHandlers = {}; 3 | 4 | function on(eventName, handler) { 5 | if (!eventHandlers[eventName]) { 6 | eventHandlers[eventName] = []; 7 | } 8 | eventHandlers[eventName].push(handler); 9 | } 10 | 11 | function off(eventName, handler) { 12 | let handlers = eventHandlers[eventName]; 13 | if (!handlers) return; 14 | for (let i = 0; i < handlers.length; i++) { 15 | if (handlers[i] === handler) { 16 | handlers.splice(i--, 1); 17 | } 18 | } 19 | } 20 | 21 | function trigger(eventName, ...args) { 22 | if (!eventHandlers[eventName]) { 23 | return; // no handlers for that event name 24 | } 25 | 26 | // call the handlers 27 | eventHandlers[eventName].forEach(handler => handler.apply(this, args)); 28 | } 29 | 30 | export { 31 | on, off, trigger 32 | }; -------------------------------------------------------------------------------- /older-experiments/player-bkup/iframe.js: -------------------------------------------------------------------------------- 1 | import * as Events from './events.js'; 2 | 3 | async function openUrl(url, parentSelector) { 4 | return new Promise((resolve, reject) => { 5 | let content = document.querySelector(parentSelector); 6 | content.innerHTML = ''; 7 | // disable the iframe parent element while we change the content and apply a stylesheet 8 | // but if it's already disabled, don't re-enable it at the end of this function 9 | // because it means we're in captions mode and we want it to stay disabled 10 | let wasAlreadyDisabled = content.classList.contains('disabled') && 11 | !document.querySelector("#player-captions").classList.contains('disabled'); 12 | if (!wasAlreadyDisabled) { 13 | content.classList.add('disabled'); 14 | } 15 | let iframe = document.createElement('iframe'); 16 | iframe.onload = () => { 17 | log.debug(`iframe loaded ${url}`); 18 | if 
(iframe.contentDocument) { 19 | if (localStorage.getItem("fontsize")) { 20 | iframe.contentDocument.querySelector("body").style.fontSize = localStorage.getItem("fontsize"); 21 | } 22 | 23 | // a bit hacky but ensures we are only listening for clicks in the main text area 24 | // and not the TOC 25 | if (parentSelector.indexOf("player-page") != -1) { 26 | let allSyncedElms = Array.from(iframe.contentDocument.querySelectorAll("*[id]")); 27 | allSyncedElms.map(elm => { 28 | elm.addEventListener("click", e => { 29 | Events.trigger("Document.Click", e.target.getAttribute("id")); 30 | }); 31 | }); 32 | } 33 | 34 | resolve(iframe.contentDocument); 35 | } 36 | else { 37 | log.warn("Can't access iframe content doc"); 38 | resolve(null); 39 | } 40 | // a short delay prevents the screen from flashing as it becomes un-disabled 41 | setTimeout(() => { 42 | if (!wasAlreadyDisabled) { 43 | content.classList.remove('disabled') 44 | } 45 | }, 300); 46 | }; 47 | iframe.setAttribute('src', url); 48 | content.appendChild(iframe); 49 | }); 50 | } 51 | 52 | export { openUrl }; -------------------------------------------------------------------------------- /older-experiments/player-bkup/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Sync Player 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
20 | 21 | 90 | 91 | -------------------------------------------------------------------------------- /older-experiments/player-bkup/index.js: -------------------------------------------------------------------------------- 1 | import { Sync } from './sync/index.js'; 2 | import * as Audio from './audio.js'; 3 | import * as Controls from './controls.js'; 4 | import * as Events from './events.js'; 5 | 6 | var syncgraph; 7 | 8 | document.addEventListener("DOMContentLoaded", async () => { 9 | Events.on("Audio.PositionChange", onAudioPositionChange); 10 | Events.on("Narrator.Highlight", onNarratorHighlight); 11 | 12 | let urlSearchParams = new URLSearchParams(document.location.search); 13 | if (urlSearchParams.has("q")) { 14 | open(urlSearchParams.get("q")); 15 | } 16 | }); 17 | 18 | async function open(url) { 19 | let sync = new Sync(); 20 | await sync.loadUrl(url); 21 | if (sync.errors.length > 0) { 22 | console.log("Loading error(s)"); 23 | return; 24 | } 25 | else { 26 | syncgraph = sync.graph; 27 | Controls.init(); 28 | } 29 | } 30 | 31 | 32 | function onNarratorHighlight(ids, innerHTML) { 33 | 34 | } 35 | 36 | // event callback 37 | // async function chapterPlaybackDone(src) { 38 | // log.debug("Player: end of chapter", src); 39 | // // narrator sends empty strings for src values 40 | // // we really just need to check it against the manifest for audio-only chapters 41 | // if (src == '' || src == manifest.getCurrentReadingOrderItem().url) { 42 | // let readingOrderItem = manifest.gotoNextReadingOrderItem(); 43 | // if (readingOrderItem) { 44 | // await loadChapter(readingOrderItem.url); 45 | // } 46 | // else { 47 | // log.debug("Player: end of book"); 48 | // } 49 | // } 50 | // // else ignore it, sometimes the audio element generates multiple end events 51 | // } 52 | 53 | 54 | // add offset data to the last read position 55 | function onAudioPositionChange(position) { 56 | 57 | } 58 | 59 | -------------------------------------------------------------------------------- /older-experiments/player-bkup/narrator.js: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/w3c/sync-media-pub/82863347f74086aab02e31c33a69de8b104209f4/older-experiments/player-bkup/narrator.js -------------------------------------------------------------------------------- /older-experiments/player-bkup/utils.js: -------------------------------------------------------------------------------- 1 | async function fetchFile(file) { 2 | let data = await fetch(file); 3 | let text = await data.text(); 4 | return text; 5 | } 6 | 7 | function isImage(encodingFormat) { 8 | return [ 9 | 'image/jpeg', 10 | 'image/png', 11 | 'image/svg+xml', 12 | 'image/gif' 13 | ].includes(encodingFormat); 14 | } 15 | 16 | function isAudio(encodingFormat) { 17 | return [ 18 | 'audio/mpeg', 19 | 'audio/ogg', 20 | 'audio/mp4' 21 | ].includes(encodingFormat); 22 | } 23 | function isText() { 24 | return true; 25 | } 26 | function isInViewport(elm, doc) { 27 | let bounding = elm.getBoundingClientRect(); 28 | return ( 29 | bounding.top >= 0 && 30 | bounding.left >= 0 && 31 | bounding.bottom <= (doc.defaultView.innerHeight || doc.documentElement.clientHeight) && 32 | bounding.right <= (doc.defaultView.innerWidth || doc.documentElement.clientWidth) 33 | ); 34 | } 35 | const secondsToHms = seconds => moment.utc(seconds * 1000).format('HH:mm:ss'); 36 | 37 | export { fetchFile, isImage, isAudio, isText, isInViewport, secondsToHms };
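The `events.js` module above implements a small publish/subscribe bus: `on` registers a handler under an event name, `trigger` calls every registered handler with whatever arguments the publisher passes, and `off` removes a handler by reference. A minimal usage sketch (illustrative only; `Audio.PositionChange` is one of the event names `index.js` above subscribes to):

```
import * as Events from './events.js';

// subscribe: the handler receives the arguments passed to trigger()
function onPositionChange(position) {
    console.log(`audio position: ${position}s`);
}
Events.on("Audio.PositionChange", onPositionChange);

// publish: fans out to every handler registered under this name
Events.trigger("Audio.PositionChange", 12.5); // logs "audio position: 12.5s"

// unsubscribe: requires the same function reference that was passed to on()
Events.off("Audio.PositionChange", onPositionChange);
```

Because `off` compares handlers by reference, a handler registered as an inline anonymous function cannot be removed later, which is why registering named functions (as `index.js` does) is the safer pattern.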
-------------------------------------------------------------------------------- /older-experiments/synclib/build.sh: -------------------------------------------------------------------------------- 1 | npx rollup -c 2 | wait 3 | cp build/synclib.js ../player/src -------------------------------------------------------------------------------- /older-experiments/synclib/rollup.config.js: -------------------------------------------------------------------------------- 1 | export default { 2 | input: './src/index.js', 3 | output: { 4 | file: './build/synclib.js', 5 | format: 'esm', 6 | name: 'synclib' 7 | } 8 | } -------------------------------------------------------------------------------- /older-experiments/synclib/src/index.js: -------------------------------------------------------------------------------- 1 | import {SyncMedia} from './syncMedia.js'; 2 | import * as Utils from './utils.js'; 3 | const VERSION = '0.1.0'; 4 | 5 | export { 6 | SyncMedia, 7 | Utils, 8 | VERSION 9 | }; -------------------------------------------------------------------------------- /older-experiments/synclib/src/ingestXml.js: -------------------------------------------------------------------------------- 1 | // input: SyncMedia XML string 2 | // output: JSON data model 3 | import * as Util from './utils.js'; 4 | 5 | function parse(xml) { 6 | let model = {}; 7 | let domparser = new DOMParser(); 8 | let doc = domparser.parseFromString(xml, "application/xml"); 9 | let headElm = doc.documentElement.getElementsByTagName("head"); 10 | if (headElm.length > 0) { 11 | model.head = parseHead(headElm[0]); 12 | } 13 | let bodyElm = doc.documentElement.getElementsByTagName("body"); 14 | if (bodyElm.length > 0) { 15 | model.body = parseNode(bodyElm[0], model); 16 | } 17 | return model; 18 | } 19 | 20 | function parseHead(node) { 21 | let obj = {}; 22 | let tracks = Array.from(node.getElementsByTagName("sync:track")); 23 | obj.tracks = tracks.map(track => { 24 | let paramElms = Array.from(track.getElementsByTagName("param")); 25 | let trackObj = { 26 | label: track.getAttribute("sync:label"), 27 | defaultSrc: track.getAttribute("sync:defaultSrc"), 28 | defaultFor: track.getAttribute("sync:defaultFor"), 29 | trackType: track.getAttribute("sync:trackType"), 30 | id: track.getAttribute("id"), 31 | param: {} 32 | }; 33 | if (paramElms.length > 0) { 34 | trackObj.params = parseParam(paramElms); 35 | } 36 | return trackObj; 37 | }); 38 | obj.metadata = {}; 39 | let metadataElm = Array.from(node.getElementsByTagName('metadata')); 40 | if (metadataElm.length > 0) { 41 | metadataElm = metadataElm[0]; // there's just one element 42 | Array.from(metadataElm.children).map(metaElm => { 43 | // support 44 | // and 45 | // value e.g. ... 46 | if (metaElm.tagName == 'meta') { 47 | let name = metaElm.getAttribute('name'); 48 | let content = metaElm.getAttribute('content'); 49 | if (name && content) { 50 | obj.metadata[name] = content; 51 | } 52 | else { 53 | // TODO issue a warning 54 | } 55 | } 56 | else { 57 | let name = metaElm.tagName; 58 | let content = metaElm.textContent; 59 | if (name && content) { 60 | obj.metadata[name] = content; 61 | } 62 | } 63 | }); 64 | } 65 | return obj; 66 | } 67 | 68 | function parseNode(node) { 69 | if (node.nodeName == "body" || node.nodeName == "seq" || node.nodeName == "par") { 70 | // body has type "seq" 71 | let type = node.nodeName == "body" || node.nodeName == "seq" ? 
"seq" : "par"; 72 | let obj = { 73 | type, 74 | roles: parseRoles(node.getAttribute("sync:role")), 75 | media: Array.from(node.children).map(n => parseNode(n)), 76 | id: node.getAttribute("id") 77 | }; 78 | return obj; 79 | } 80 | else if (Util.isMedia(node.nodeName)) { 81 | let paramElms = Array.from(node.getElementsByTagName("param")); 82 | let obj = { 83 | type: node.nodeName, 84 | src: node.getAttribute("src"), 85 | track: node.getAttribute("sync:track"), 86 | id: node.getAttribute("id"), 87 | clipBegin: parseClockValue(node.getAttribute("clipBegin")), 88 | clipEnd: parseClockValue(node.getAttribute("clipEnd")), 89 | params: {} 90 | } 91 | if (paramElms.length > 0) { 92 | obj.params = parseParam(paramElms); 93 | } 94 | return obj; 95 | } 96 | } 97 | 98 | function parseParam(params) { 99 | let obj = {}; 100 | params.map(param => { 101 | let name = param.getAttribute("name"); 102 | let value = param.getAttribute("value"); 103 | obj[name] = value; 104 | }); 105 | return obj; 106 | } 107 | 108 | function parseRoles(roles) { 109 | if (roles) { 110 | let roleArray = roles.split(" "); 111 | return roleArray; 112 | } 113 | else { 114 | return []; 115 | } 116 | } 117 | 118 | // parse the timestamp and return the value in seconds 119 | // supports this syntax: https://www.w3.org/publishing/epub/epub-mediaoverlays.html#app-clock-examples 120 | function parseClockValue(value) { 121 | if (!value) { 122 | return null; 123 | } 124 | let hours = 0; 125 | let mins = 0; 126 | let secs = 0; 127 | 128 | if (value.indexOf("min") != -1) { 129 | mins = parseFloat(value.substr(0, value.indexOf("min"))); 130 | } 131 | else if (value.indexOf("ms") != -1) { 132 | var ms = parseFloat(value.substr(0, value.indexOf("ms"))); 133 | secs = ms/1000; 134 | } 135 | else if (value.indexOf("s") != -1) { 136 | secs = parseFloat(value.substr(0, value.indexOf("s"))); 137 | } 138 | else if (value.indexOf("h") != -1) { 139 | hours = parseFloat(value.substr(0, value.indexOf("h"))); 140 | } 141 | else { 142 | // parse as hh:mm:ss.fraction 143 | // this also works for seconds-only, e.g. 12.345 144 | let arr = value.split(":"); 145 | secs = parseFloat(arr.pop()); 146 | if (arr.length > 0) { 147 | mins = parseFloat(arr.pop()); 148 | if (arr.length > 0) { 149 | hours = parseFloat(arr.pop()); 150 | } 151 | } 152 | } 153 | let total = hours * 3600 + mins * 60 + secs; 154 | return total; 155 | } 156 | export {parse}; -------------------------------------------------------------------------------- /older-experiments/synclib/src/notes.txt: -------------------------------------------------------------------------------- 1 | Notes 2 | --- 3 | 4 | DESCRIPTION 5 | 6 | Input: SyncMedia XML file 7 | Output: Timegraph 8 | Methods: Moving around the timegraph 9 | 10 | 11 | USAGE 12 | 13 | import {SyncMedia} from './synclib.js'' 14 | 15 | let sync = new SyncMedia(); 16 | await sync.loadUrl(url); 17 | 18 | 19 | PROPERTIES 20 | 21 | graph: An array of timegraph entries 22 | [ 23 | { 24 | timestamp: Point in the overall timeline where this entry occurs, 25 | events: [ 26 | { 27 | node: { 28 | type: element name 29 | src: full path to the media resource, 30 | selector: fragment selector, 31 | clipBegin, 32 | clipEnd, 33 | id: XML ID, 34 | _id: internal ID, 35 | params:{ 36 | name: value, 37 | ... 38 | }, 39 | roles: [], 40 | track: object, 41 | dur: duration in seconds 42 | }, 43 | type: start | end | mid, 44 | elapsed: presentation time that's elapsed since start of node 45 | }, 46 | ... 47 | ] 48 | }, 49 | ... 
50 | ] 51 | 52 | model: tree model 53 | 54 | errors: empty at the moment 55 | 56 | skips: list of roles being skipped 57 | 58 | roles: all presentation roles 59 | 60 | 61 | METHODS 62 | 63 | All the following return a timegraph entry. 64 | 65 | see syncMedia.js for functions that move around the timegraph 66 | 67 | TODO 68 | 69 | Accommodate multiple roles on a node. 70 | Implement skippability 71 | Implement escapability -------------------------------------------------------------------------------- /older-experiments/synclib/src/timegraph.js: -------------------------------------------------------------------------------- 1 | import * as utils from './utils.js'; 2 | 3 | /* 4 | timegraph: 5 | ordered by timestamp 6 | [ 7 | { 8 | timestamp: 0, 9 | events: [ 10 | { 11 | node: { 12 | src, 13 | selector, 14 | roles, 15 | params, 16 | clipBegin, 17 | clipEnd, 18 | type, 19 | track, 20 | dur 21 | }, 22 | eventType: start | end | mid, 23 | elapsed: presentation time that's elapsed since start of node, 24 | timegraphEntries: all timegraph entries where this node appears in the events list 25 | } 26 | ... 27 | ] 28 | }, 29 | ... 30 | ] 31 | */ 32 | function buildTimegraph(body) { 33 | addDurations(body); 34 | fixZeroDurations(body); 35 | let timegraph = createEvents(body); 36 | let entryId = 0; 37 | // add a property: last 38 | // this indicates that the node ends after this clip or segment is done 39 | timegraph.map((entry, idx) => { 40 | let startAndMidEvents = entry.events.filter(event => event.eventType == "start" || event.eventType == "mid"); 41 | startAndMidEvents.map(event => { 42 | event.last = false; 43 | if (idx < timegraph.length - 1) { 44 | let nextEntry = timegraph[idx + 1]; 45 | let endEvents = nextEntry.events.filter(event => event.eventType == "end"); 46 | if (endEvents.find(endEvent => endEvent.node == event.node)) { 47 | event.last = true; 48 | } 49 | } 50 | }); 51 | entry._id = entryId; 52 | entryId++; 53 | }); 54 | 55 | return timegraph; 56 | } 57 | 58 | function addDurations(node) { 59 | if (node.dur) { 60 | return; 61 | } 62 | if (utils.isMedia(node.type)) { 63 | if (utils.isTimedMedia(node)) { 64 | node.dur = utils.calculateDuration(node); 65 | } 66 | else { 67 | node.dur = 0; 68 | } 69 | } 70 | else if (utils.isSequence(node.type)) { 71 | node.media?.map(item => addDurations(item)); 72 | let sum = node.media?.map(item => item.dur) 73 | .reduce((dur, acc) => acc + dur); 74 | node.dur = sum; 75 | } 76 | else if (utils.isPar(node.type)) { 77 | let most = null; 78 | if (node.media?.length) { 79 | most = node.media[0]; 80 | node.media.map(item => { 81 | addDurations(item); 82 | if (item.dur > most.dur) { 83 | most = item; 84 | } 85 | }); 86 | node.dur = most.dur; 87 | } 88 | else { 89 | node.dur = 0; 90 | } 91 | } 92 | } 93 | 94 | // any node with a duration of zero should get the duration of its parent container 95 | function fixZeroDurations(body) { 96 | utils.visit(body, node => { 97 | if (node.dur == 0) { 98 | if (node.parent) { 99 | node.dur = node.parent.dur; 100 | } 101 | } 102 | }); 103 | } 104 | 105 | function createEvents(node) { 106 | let timegraph = createStartEndEvents(node, 0); 107 | 108 | let started = []; 109 | // fill in the mid events 110 | timegraph.map(entry => { 111 | entry.events.map(event => { 112 | if (event.eventType == "end") { 113 | let idx = started.findIndex(n => n.node == event.node); 114 | started.splice(idx, 1); 115 | } 116 | }); 117 | started.map(info => { 118 | let midEvent = {node: info.node, eventType: "mid", elapsed: entry.timestamp - 
info.timestamp}; 119 | entry.events.push(midEvent); 120 | info.node.timegraphEntries.push(entry); 121 | }); 122 | entry.events.map(event => { 123 | if (event.eventType == "start") { 124 | started.push({node: event.node, timestamp: entry.timestamp}); 125 | } 126 | }); 127 | }); 128 | 129 | return timegraph; 130 | } 131 | 132 | function createStartEndEvents(node, wallclock = 0, timegraph = []) { 133 | let tgStartEntry = timegraph.find(entry => entry.timestamp === wallclock); 134 | let tgEndEntry = timegraph.find(entry => entry.timestamp === wallclock + node.dur); 135 | 136 | if (!tgStartEntry) { 137 | tgStartEntry = {timestamp: wallclock, events: []}; 138 | timegraph.push(tgStartEntry); 139 | } 140 | if (!tgEndEntry) { 141 | if (node.dur == 0) { 142 | tgEndEntry = tgStartEntry; 143 | } 144 | else { 145 | tgEndEntry = {timestamp: wallclock + node.dur, events: []}; 146 | timegraph.push(tgEndEntry); 147 | } 148 | 149 | } 150 | 151 | if (!node.timegraphEntries) { 152 | node.timegraphEntries = [tgStartEntry, tgEndEntry]; 153 | } 154 | 155 | let startEvent = { 156 | node, 157 | eventType: "start", 158 | elapsed: 0 159 | }; 160 | let endEvent = { 161 | node, 162 | eventType: "end", 163 | elapsed: wallclock + node.dur 164 | }; 165 | 166 | tgStartEntry.events.push(startEvent); 167 | tgEndEntry.events.push(endEvent); 168 | 169 | if (utils.isSequence(node.type)) { 170 | let elapsed = 0; 171 | node.media?.map(item => { 172 | createStartEndEvents(item, wallclock + elapsed, timegraph); 173 | elapsed += item.dur; 174 | }); 175 | } 176 | else if (utils.isPar(node.type)) { 177 | node.media?.map(item => createStartEndEvents(item, wallclock, timegraph)); 178 | } 179 | 180 | return timegraph.sort((a,b) => a.timestamp < b.timestamp ? -1 : 1); 181 | } 182 | 183 | export { buildTimegraph } -------------------------------------------------------------------------------- /older-experiments/synclib/src/utils.js: -------------------------------------------------------------------------------- 1 | function fetchXmlFile(url) { 2 | return new Promise(function (resolve, reject) { 3 | var xhr = new XMLHttpRequest(); 4 | xhr.open('get', url, true); 5 | xhr.responseType = 'document'; 6 | xhr.onload = function () { 7 | var status = xhr.status; 8 | if (status == 200) { 9 | resolve(xhr.response); 10 | } else { 11 | reject(status); 12 | } 13 | }; 14 | xhr.send(); 15 | }); 16 | } 17 | 18 | // returns {textData, contentType} 19 | async function fetchFile(url) { 20 | let res = await fetch(url); 21 | if (res && res.ok) { 22 | let contentType = getContentType(res); 23 | let textData = await res.text(); 24 | return {textData, contentType}; 25 | } 26 | else { 27 | throw new Error(`Error fetching ${url}`); 28 | } 29 | } 30 | 31 | function getContentType(res) { 32 | let contentType = res.headers.get("Content-Type"); 33 | if (contentType) { 34 | if (contentType.indexOf(';') != -1) { 35 | return contentType.split(';')[0]; 36 | } 37 | else { 38 | return contentType; 39 | } 40 | } 41 | } 42 | 43 | function calculateDuration(node) { 44 | if (node.hasOwnProperty('clipBegin') && node.hasOwnProperty('clipEnd')) { 45 | return parseFloat(node.clipEnd) - parseFloat(node.clipBegin); 46 | } 47 | // TODO analyze src for duration in the cases where clipBegin/clipEnd is missing 48 | else { 49 | return 0; 50 | } 51 | } 52 | 53 | let isSequence = name => name == "seq" || name == "body"; 54 | let isPar = name => name == "par"; 55 | let isMedia = name => name == "text" || name == "audio" 56 | || name == "ref" || name == "video" 57 | || name == "img"; 58 | 
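// Illustrative values for the helpers above (editorial sketch, not part of the original file):
// calculateDuration({clipBegin: "1.5", clipEnd: "4"}) -> 2.5 (clip length in seconds)
// calculateDuration({src: "audio.mp3"}) -> 0 (no clip bounds; see the TODO above)
// isMedia("audio") -> true; isSequence("body") -> true; isPar("par") -> true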
59 | let isTimedMedia = node => node.clipBegin || node.clipEnd || node.type == "audio" || node.type == "video"; 60 | 61 | let isTimeContainer = name => isSequence(name) || isPar(name); 62 | 63 | // Visit a tree of objects with media children 64 | function visit(node, fn) { 65 | fn(node); 66 | if (node?.media) { 67 | node.media.map(n => visit(n, fn)); 68 | } 69 | } 70 | 71 | const getCircularReplacer = () => { 72 | const seen = new WeakSet(); 73 | return (key, value) => { 74 | if (typeof value === "object" && value !== null) { 75 | if (seen.has(value)) { 76 | return; 77 | } 78 | seen.add(value); 79 | } 80 | return value; 81 | }; 82 | }; 83 | 84 | // stringify the object and don't print circular references 85 | function circularStringify(obj) { 86 | return JSON.stringify(obj, getCircularReplacer(), 2); 87 | } 88 | 89 | function simpleTree(node) { 90 | let n = { 91 | _id: node._id, 92 | type: node.type 93 | }; 94 | if (node.src) { 95 | n.src = node.src.href; 96 | } 97 | if (node.selector) { 98 | n.selector = node.selector; 99 | } 100 | if (node.hasOwnProperty('clipBegin')) { 101 | n.clipBegin = node.clipBegin; 102 | } 103 | if (node.hasOwnProperty('clipEnd')) { 104 | n.clipEnd = node.clipEnd; 105 | } 106 | if (node.media) { 107 | n.media = node.media.map(item => simpleTree(item)); 108 | } 109 | 110 | return n; 111 | } 112 | // return a simple version of the object, good for debugging 113 | function simpleTimegraph(graph) { 114 | let simpleGraph = graph.map(g => { 115 | let events = g.events.map(e => { 116 | let event = { 117 | eventType: e.eventType, 118 | elapsed: e.elapsed, 119 | nodeType: e.node.type, 120 | nodeId: e.node._id, 121 | nodeSrc: e.node.src, 122 | nodeSelector: e.node.selector, 123 | dur: e.node.dur 124 | }; 125 | if (e.node.hasOwnProperty('clipBegin')) { 126 | event.clipBegin = e.node.clipBegin; 127 | } 128 | if (e.node.hasOwnProperty('clipEnd')) { 129 | event.clipEnd = e.node.clipEnd; 130 | } 131 | return event; 132 | }); 133 | 134 | return { 135 | timestamp: g.timestamp, 136 | events 137 | }; 138 | }); 139 | return simpleGraph; 140 | } 141 | 142 | export { 143 | calculateDuration, 144 | fetchXmlFile, 145 | isSequence, 146 | isPar, 147 | isMedia, 148 | isTimeContainer, 149 | isTimedMedia, 150 | visit, 151 | fetchFile, 152 | circularStringify, 153 | simpleTimegraph, 154 | simpleTree 155 | }; -------------------------------------------------------------------------------- /older-experiments/synclib/tests/files/standalone/complex.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | This file is supposed to challenge the parser. It's not a realistic example of sync media use. 5 | 6 | 7 | 8 | 9 | 10 | 11 | 22 | 24 | 25 | 26 | 27 | 28 | 30 | 31 | 32 | 34 | 35 | 36 | 38 | 39 | 40 | 41 | 42 | 43 | 45 | 46 | 47 | 49 | 50 | 51 | 52 | -------------------------------------------------------------------------------- /older-experiments/synclib/tests/files/standalone/longer-video-clips.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | The video clips in each par have the longest duration. The <par> is not done until all its children are done. 
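The note above (from longer-video-clips.xml) states the SMIL timing rule that `addDurations` in timegraph.js implements: a `seq` lasts as long as the sum of its children, while a `par` lasts as long as its longest child. A simplified sketch of that rule over plain objects (illustrative, not part of the library):

```
// Duration of a node in the parsed model (simplified from addDurations above).
function containerDuration(node) {
    if (node.type === "seq") {
        // <seq>: children play one after another, so durations add up
        return node.media.reduce((acc, item) => acc + containerDuration(item), 0);
    }
    if (node.type === "par") {
        // <par>: children play simultaneously; the container ends with its longest child
        return Math.max(0, ...node.media.map(item => containerDuration(item)));
    }
    return node.dur || 0; // leaf media object with a precomputed duration
}

// A 3s audio clip in parallel with a 10s video clip: the par lasts 10s.
containerDuration({
    type: "par",
    media: [
        { type: "audio", dur: 3 },
        { type: "video", dur: 10 }
    ]
}); // -> 10
```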
5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 18 | 19 | 23 | 24 | 28 | 29 | -------------------------------------------------------------------------------- /older-experiments/synclib/tests/files/standalone/no-media.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | This file is supposed to challenge the parser. It's not a realistic example of sync media use. 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | -------------------------------------------------------------------------------- /older-experiments/synclib/tests/files/standalone/partial-no-media.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | This file is supposed to challenge the parser. It's not a realistic example of sync media use. 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 20 | 21 | -------------------------------------------------------------------------------- /older-experiments/synclib/tests/files/standalone/roles.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | This file is to test the parser. these aren't real roles. 5 | 6 | 7 | 8 | 9 | 14 | 15 | 20 | 21 | 26 | 27 | -------------------------------------------------------------------------------- /older-experiments/synclib/tests/files/standalone/simple-plus-tracks.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | This file is for testing tracks support in the parser. 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 16 | 17 | 20 | 21 | 24 | 25 | -------------------------------------------------------------------------------- /older-experiments/synclib/tests/files/standalone/simple.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | This is a very basic SyncMedia example of HTML and audio synchronization. 5 | 6 | 7 | 8 | 9 | 14 | 15 | 20 | 21 | 26 | 27 | -------------------------------------------------------------------------------- /older-experiments/synclib/tests/run-tests.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | SyncMedia-js Tests 6 | 7 | 8 | 9 | 10 |
11 | 12 | 13 | 14 | 21 | 22 | 23 | 24 | 27 | 28 | 29 | -------------------------------------------------------------------------------- /older-experiments/visualizer/runner.js: -------------------------------------------------------------------------------- 1 | var processor; 2 | var exampleFile; 3 | var libversion; 4 | var onClearAllFn; 5 | var onDisplayResultsFn; 6 | 7 | // call when page is loaded 8 | async function init(fileProcessor, exampleInput, version, onClearAll=null, onDisplayResults=null) { 9 | processor = fileProcessor; 10 | exampleFile = new URL(exampleInput, document.location.href).href; 11 | onClearAllFn = onClearAll; 12 | onDisplayResultsFn = onDisplayResults; 13 | 14 | libversion = version; 15 | 16 | createInputOutputElements(); 17 | document.querySelector("#input").setAttribute("open", "open"); 18 | document.querySelector("#inputFile").setAttribute('value', exampleFile); 19 | 20 | document.querySelector("#version").textContent = libversion; 21 | 22 | let urlSearchParams = new URLSearchParams(document.location.search); 23 | if (urlSearchParams.has("q") && urlSearchParams.get("q") != '') { 24 | document.querySelector("#inputFile").setAttribute("value", urlSearchParams.get("q")); 25 | await loadByUrl(urlSearchParams.get("q")); 26 | } 27 | 28 | document.querySelector("#loadFile").addEventListener('click', async e => { 29 | if (document.querySelector("#inputFile").value == '') { 30 | alert("Please enter a URL"); 31 | return; 32 | } 33 | await loadByUrl(document.querySelector("#inputFile").value); 34 | }); 35 | 36 | document.querySelector("#loadText").addEventListener('click', async e => { 37 | if (document.querySelector("#text").value == '') { 38 | alert("Please enter JSON data"); 39 | return; 40 | } 41 | if (document.querySelector("#baseUrl").value == '') { 42 | alert("Please enter a base URL"); 43 | return; 44 | } 45 | await loadJson(document.querySelector("#text").value, 46 | document.querySelector("#baseUrl").value); 47 | }); 48 | 49 | document.querySelector("#copyErrorsToClipboard").addEventListener('click', e => { 50 | navigator.clipboard.writeText(document.querySelector("#status").textContent); 51 | }); 52 | 53 | document.querySelector("#copyDataToClipboard").addEventListener('click', e => { 54 | navigator.clipboard.writeText(document.querySelector("#data").textContent); 55 | }); 56 | 57 | } 58 | 59 | async function loadByUrl(url) { 60 | clearAll(); 61 | document.querySelector("#output h2 span").textContent = `for 62 | ${new URL(url).pathname.split('/').reverse()[0]}`; 63 | await processor.loadUrl(url); 64 | displayResults(); 65 | } 66 | 67 | async function loadJson(text, base) { 68 | clearAll(); 69 | await processor.loadJson(JSON.parse(text), base); 70 | displayResults(); 71 | } 72 | 73 | function displayResults() { 74 | document.querySelector("#input").removeAttribute("open"); 75 | document.querySelector("#output").setAttribute("open", "open"); 76 | let errorText = "No errors"; 77 | if (processor.errors.length > 0) { 78 | errorText = JSON.stringify(processor.errors, null, 2); 79 | document.querySelector("#status").classList.add("errors"); 80 | } 81 | else { 82 | document.querySelector("#status").classList.remove("errors"); 83 | } 84 | 85 | document.querySelector("#status").innerHTML = errorText; 86 | document.querySelector("#data").innerHTML = JSON.stringify(processor.data, null, 2); 87 | 88 | document.querySelectorAll('pre code').forEach((block) => { 89 | hljs.highlightBlock(block); 90 | }); 91 | 92 | if (onDisplayResultsFn) { 93 | onDisplayResultsFn(); 94 | } 95 | 
} 96 | function clearAll() { 97 | document.querySelector("#output h2 span").textContent = ''; 98 | document.querySelector("#status").innerHTML = ''; 99 | document.querySelector("#data").innerHTML = ''; 100 | 101 | if (onClearAllFn) { 102 | onClearAllFn(); 103 | } 104 | } 105 | 106 | function createInputOutputElements() { 107 | document.querySelector("#io").innerHTML = ` 108 |
109 |
Input
[form markup stripped in extraction: URL field with "Load File" button, "OR", JSON textarea with base URL field and "Load Text" button; the Errors and Data panes each include a copy-to-clipboard button]
Output
Errors:
Data:
137 |
138 | `; 139 | } -------------------------------------------------------------------------------- /other-work/json/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | Synchronized Media for Publications: Overview 7 | 8 | 9 | 10 |

Editors: Marisa DeMeglio (DAISY Consortium), Daniel Weck (DAISY Consortium)

11 |

Last updated: July 2020

12 | 13 |
14 |

Introduction

15 |

The Synchronized Media for Publications Community 16 | Group was formed to recommend the best way to synchronize media with document formats being developed by the 17 | Publishing Working Group, in order to make 18 | publications accessible to people with different types of reading requirements.

19 |
20 | 21 |
22 |

Documents

23 | 24 |

The following are the currently available draft documents:

25 | 26 | 32 |
33 |
34 |

Status

35 |

Changes in the approach described in the documents above are currently being considered by the group. When ready, revisions will be made available here.

36 |
37 | 38 | 39 | 40 | -------------------------------------------------------------------------------- /other-work/json/usecases.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
10 |

These use cases provide context and expected user agent behavior for Synchronized Narration.

11 |
12 | 13 |
14 |

This draft is still under consideration within the Synchronized Media for Publications Community Group and is subject to change. The most prominent issues will be referenced in the document with links provided.

15 |
16 | 17 |
18 |

Synchronized Narration Use Cases

19 |
20 |

Text supplemented with pre-recorded audio narration

21 |
    22 |
  • Use an authored navigation interface (e.g. TOC) to reach a point in a content document, and start audio playback from there
  • 23 |
  • Move around the text content and have the audio playback keep up. E.g. 24 |
      25 |
    • Previous/next paragraph
    • 26 |
    • Previous/next page
    • 27 |
    28 |
  • 29 |
  • Start playback by clicking a point mid-page
  • 30 |
  • Speed up/slow down the audio while not affecting the pitch (timescale modification)
  • 31 |
  • Navigate by time chunk (e.g. skip forward/back 10s)
  • 32 |
  • Text chunks are highlighted in sync with the audio playback, at the authored granularity
  • 33 |
  • Text display keeps up with audio. E.g. start audio playback and have the pages turn automatically.
  • 34 |
  • Selectively filter out content based on semantics ("skippability" in the DAISY world). Note: might there be overlap here with WP personalization?
  • 35 |
  • Jump out of complex structures and back into the reading flow ("escapability" in the DAISY world)
  • 36 |
  • Support multiple granularities of navigation, switchable on the fly by the user. E.g.
      37 |
    • Previous/next paragraph
    • 38 |
    • Previous/next sentence
    • 39 |
    • Previous/next word or phrase
    • 40 |
    41 |
  • 42 |
43 | 44 |
45 | 46 |
47 |

Audio + TOC book

48 |

An audio book needs to include paragraph-level navigation in what otherwise has only chapters marked (in the TOC). Once inside a chapter, users navigate paragraph-by-paragraph.

49 |
50 | 51 |
52 |

Integrating audio and text editions of a book

53 |

A publisher has made parallel editions of a publication: one is audio, and the other is text. Adding a synchronized media "glue" layer gives the user a playback experience like what's described in the text+audio example above.

54 |
55 | 56 |
57 | 58 | 59 | 60 | 61 | -------------------------------------------------------------------------------- /other-work/smil-sources/design-principles.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Design Principles 3 | --- 4 | This document is a work in progress {.wip} 5 | 6 | ## Storing and retrieving sync documents 7 | 8 | * Package sync documents independently of the publication they correspond to (for side-loading) 9 | 10 | ## Addressibility 11 | 12 | * Reference a position in the sync document by 13 | * ID 14 | * Other selector? 15 | * Reference the sync document itself by URL 16 | 17 | ## Processing requirements 18 | 19 | * Serialization format(s) must work with off-the-shelf parsers 20 | 21 | -------------------------------------------------------------------------------- /other-work/smil-sources/examples.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: SyncMedia Examples 3 | --- 4 | This document is a work in progress {.wip} 5 | 6 | 7 | ## HTML document with audio narration 8 | 9 | This is a typical example of a structured document with audio narration. It features: 10 | 11 | * Text fragments highlighted as the audio plays. 12 | * Semantic information: 13 | * There is a page number (often given in ebooks as print page equivalents), indicated via `role`. This allows Sync Media Players to offer users an option to skip page number announcements 14 | * The document contains a table, also indicated via `role`, allowing players to offer users an option to escape and return to the main reading flow. 15 | * Nested highlights: When active, the table is outlined with a yellow border, and individual rows are highlighted as they are read. 16 | 17 | 18 | ### SyncMedia presentation 19 | 20 | ``` 21 | 22 | 23 | 25 | 26 | 27 | 29 | 30 | 31 | 32 | 35 | 36 | 39 | 40 | 43 | 44 | 47 | 48 | 51 | 52 | 55 | 56 | 57 | 58 | 59 | 62 | 63 | 66 | 67 | 70 | 71 | 74 | 75 | 76 | 77 | 80 | 81 | 82 | 83 | ``` 84 | 85 | ### HTML document 86 | 87 | This is the corresponding HTML document for the above SyncMedia presentation. 88 | 89 | ``` 90 | 91 | 92 | 93 | 123 | 124 | 125 |
126 |

Los Angeles, California

127 |

Anim anim ex deserunt laboris voluptate non exercitation ad consequat tempor et.

128 |

Officia cillum commodo qui amet exercitation veniam.

129 | 4 130 |

Aliqua mollit officia commodo nulla sunt excepteur in ex nostrud dolore dolor do in.

131 |

Average Summer Temperatures in Los Angeles

132 | 133 | 134 | 135 | 136 | 137 | 138 | 139 | 140 | 141 | 142 | 143 | 144 | 145 | 146 | 147 | 148 | 149 | 150 | 151 | 152 | 153 | 154 | 155 | 156 | 157 |
Month | High | Low
June | 79 | 62
July | 83 | 65
August | 85 | 66
158 |

Proident est veniam eu ea est culpa amet.

159 |
160 | 161 | 162 | ``` 163 | 164 | ### Audio-only presentation 165 | 166 | * Nested structures 167 | * Semantics 168 | 169 | ### Presentation with secondary audio 170 | 171 | * Sound effects 172 | * Background music 173 | 174 | ### Video with text transcript 175 | 176 | * Synchronized highlight for the transcript 177 | 178 | ### EPUB with separate audio overlay 179 | 180 | * EPUB chunks referenced with CFI 181 | * Overlay side-loaded 182 | 183 | ### SVG comic with audio narration 184 | 185 | * Zoom in on each comic panel 186 | -------------------------------------------------------------------------------- /other-work/smil-sources/explainer.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: SyncMedia Explainer 3 | --- 4 | This document is a work in progress {.wip} 5 | 6 | ## Background 7 | 8 | The formal historical precedent for the concept of SyncMedia is the [EPUB3 Media Overlays specification](http://www.idpf.org/epub/31/spec/epub-mediaoverlays.html) (digital publications with synchronized text-audio playback). 9 | 10 | EPUB3 Media Overlays itself can be seen as the "mainstream publishing" alternative to the [DAISY Digital Talking Book format](http://www.daisy.org/daisypedia/daisy-digital-talking-book), which is an accessible book format for people with print disabilities. 11 | 12 | SyncMedia is the evolution of these concepts, optimized for the open web platform, and expanded to incorporate additional media types. 13 | 14 | ## Concepts 15 | 16 | A SyncMedia presentation is a linear timeline of external media objects. The timeline is arranged into parallel and sequential groupings of media references. Groupings carry semantic inflection via a `sync:role` property. 17 | 18 | Examples of SyncMedia use cases are: 19 | * HTML document synchronized with audio narration 20 | * Audio-only presentation, structured with SyncMedia to provide phrase-level navigation 21 | * SVG synchronized with audio 22 | * Video synchronized with a transcript 23 | 24 | In each of these use cases, the presentation is composed of external media objects, organized into fragments, and synchronized on a timeline. 25 | 26 | However, straight-through beginning-to-end playback is not the only way that the timeline will be consumed. Users may start in at a mid-point. They may escape out of complex structures (e.g. tables or asides). They may navigate through the presentation via an authored granularity (e.g. previous/next sentence). In addition, they may control other aspects of the presentation: lower the volume of background music, turn off sound effects, change the highlight color of text, or slow down a video. Therefore, the format must allow a standard way to expose this information to a user agent. 27 | 28 | Just like EPUB Media Overlays, Sync Media is based on [SMIL 3.0](https://www.w3.org/TR/REC-smil/smil30.html). It is designed to offer a lossless upgrade path for existing Media Overlays documents. 29 | 30 | 31 | ## Technology Selection 32 | 33 | See [technology candidates](technology-candidates.html) for an overview of the languages that were evaluated for suitability. 34 | 35 | The technology selection is to __extend SMIL 3.0__ with customizations. Given the success of SMIL with EPUB Media Overlays, it makes sense to continue down this path. And given that SMIL has not had a refresh for the modern web platform, we anticipate extending it allows these gaps to be filled. 36 | 37 | Choosing a serialization format (e.g. 
XML or JSON) was not part of this selection process, as the Synchronized Media for Publications CG felt [it is more desirable to define a model first](https://lists.w3.org/Archives/Public/public-sync-media-pub/2020Jul/0005.html) before deciding on one or multiple serializations. 38 | 39 | ## Relationship to SMIL3 and EPUB Media Overlays 40 | 41 | SyncMedia is, like EPUB3 Media Overlays, a subset of SMIL3 plus custom extensions. SyncMedia puts fewer restrictions on the use of SMIL3 than EPUB3 Media Overlays does, and, additionally, it incorporates more elements from SMIL3. The custom extensions in EPUB Media Overlays have been replaced in SyncMedia with more generic mechanisms. 42 | 43 | The following table compares SyncMedia features with the closest point of comparison in EPUB3 Media Overlays. 44 | 45 | | Purpose | SyncMedia feature | EPUB3 Media Overlays feature | 46 | |---------|-------------------|-----------------------------| 47 | | Semantics | `sync:role` plus [DPUB-ARIA](https://www.w3.org/TR/dpub-aria-1.0/), [WAI-ARIA Document Structure Roles](https://www.w3.org/TR/wai-aria/#document_structure_roles) and [landmark roles](https://www.w3.org/TR/wai-aria-1.1/#landmark_roles) | [`epub:type`](https://www.w3.org/publishing/epub/epub-contentdocs.html#attrdef-epub-type) plus [EPUB SSV](https://idpf.github.io/epub-vocabs/structure/) | 48 | | Nested text structures | Unrestricted use of `par` and `seq` nesting | [`epub:textref`](https://www.w3.org/publishing/epub/epub-mediaoverlays.html#attrdef-body-textref) | 49 | | Styling | `param` elements | Metadata in the EPUB Package Document | 50 | | Parallel timed media, e.g. background music | Unrestricted use of `par` and `seq` nesting | None | 51 | | Reference embedded media | Dereference src of appropriate media element (`text`, `video`, `image`, `audio`, `ref`), e.g. `