9 | {{ content | safe }} 10 |├── .gitattributes ├── .gitignore ├── CONTRIBUTING.md ├── LICENSE.md ├── README.md ├── docs-src ├── .eleventy.js ├── _layout │ ├── htmleton.njk │ ├── page.njk │ └── spec.njk ├── convert-smil │ ├── index.html │ └── parse-smil.js ├── css │ ├── page.css │ └── spec-extra.css ├── demos │ └── raven │ │ ├── .gitattributes │ │ ├── index.html │ │ ├── line.vtt │ │ ├── raven.m4a │ │ ├── readme.md │ │ ├── stanza.vtt │ │ ├── style.css │ │ ├── sync.js │ │ └── word.vtt ├── markdown.js ├── package-lock.json ├── package.json └── pages │ ├── caveats.md │ ├── explainer.md │ ├── index.md │ ├── pages.json │ ├── sync-media-lite.md │ └── use-cases.md ├── docs ├── caveats.html ├── convert-smil │ ├── index.html │ └── parse-smil.js ├── css │ ├── page.css │ └── spec-extra.css ├── explainer.html ├── index.html ├── sync-media-lite.html └── use-cases.html ├── drafts ├── addl-reqs.md ├── functional-requirements.md ├── manifest-exts-multi-granular.md ├── manifest-exts.md ├── readium2.md ├── schema │ ├── README.md │ ├── sync-media-narration.sample.json │ └── sync-media-narration.schema.json ├── sync-narr-ideas.md ├── technologies.md ├── technology-candidates.md ├── technology-selection.md ├── use-cases.md ├── web-proposal.md └── xml-json.html ├── older-experiments ├── player-bkup │ ├── audio.js │ ├── controls.js │ ├── css │ │ ├── base.css │ │ ├── controls.css │ │ ├── player.css │ │ └── pub-default.css │ ├── events.js │ ├── iframe.js │ ├── index.html │ ├── index.js │ ├── narrator.js │ └── utils.js ├── synclib │ ├── build.sh │ ├── build │ │ └── synclib.js │ ├── rollup.config.js │ ├── src │ │ ├── index.js │ │ ├── ingestXml.js │ │ ├── notes.txt │ │ ├── process.js │ │ ├── syncMedia.js │ │ ├── timegraph.js │ │ └── utils.js │ └── tests │ │ ├── files │ │ └── standalone │ │ │ ├── complex.xml │ │ │ ├── longer-video-clips.xml │ │ │ ├── no-media.xml │ │ │ ├── partial-no-media.xml │ │ │ ├── roles.xml │ │ │ ├── simple-plus-tracks.xml │ │ │ └── simple.xml │ │ ├── run-tests.html │ │ 
└── test │ │ └── test.syncmedia.parsing.js └── visualizer │ ├── index.html │ └── runner.js ├── other-work ├── json │ ├── explainer.html │ ├── incorporating-synchronized-narration.html │ ├── index.html │ ├── synchronized-narration.html │ └── usecases.html ├── smil-sources │ ├── design-principles.md │ ├── examples.md │ ├── explainer.md │ ├── including-in-html.md │ ├── incorporating-into-pubmanifest.md │ ├── index.md │ ├── smil.json │ ├── standalone-packaging.md │ ├── sync-media-with-json.md │ ├── sync-media.md │ └── use-cases.md └── smil │ ├── design-principles.html │ ├── examples.html │ ├── explainer.html │ ├── including-in-html.html │ ├── incorporating-into-pubmanifest.html │ ├── index.html │ ├── standalone-packaging.html │ ├── sync-media-with-json.html │ ├── sync-media.html │ └── use-cases.html └── w3c.json /.gitattributes: -------------------------------------------------------------------------------- 1 | docs-src/demos/raven/raven.mp3 filter=lfs diff=lfs merge=lfs -text 2 | examples/emily-dickinson-poems-series-1/poems_series1_0_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 3 | examples/emily-dickinson-poems-series-1/poems_series1_2_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 4 | older-experiments/player/public/fairytale/out.mp3 filter=lfs diff=lfs merge=lfs -text 5 | docs/demos/raven/raven.mp3 filter=lfs diff=lfs merge=lfs -text 6 | examples/emily-dickinson-poems-series-1/poems_series1_1_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 7 | examples/emily-dickinson-poems-series-1/poems_series1_3_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 8 | examples/emily-dickinson-poems-series-1/poems_series1_4_dickinson.mp3 filter=lfs diff=lfs merge=lfs -text 9 | older-experiments/player/public/fairytale/Harp.mp3 filter=lfs diff=lfs merge=lfs -text 10 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | node_modules 3 
| demo-src/player-bkup 4 | notes 5 | .DS_Store 6 | examples 7 | .vscode/launch.json 8 | player 9 | localhost* 10 | notes-other.md 11 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # W3C Synchronized Multimedia for Publications Community Group 2 | 3 | This repository is being used for work in the W3C Synchronized Multimedia for Publications Community Group, governed by the [W3C Community License Agreement (CLA)](https://www.w3.org/community/about/agreements/cla/). To make substantive contributions, you must join the CG. 4 | 5 | If you are not the sole contributor to a contribution (pull request), please identify all 6 | contributors in the pull request comment. 7 | 8 | To add a contributor (other than yourself, that's automatic), mark them one per line as follows: 9 | 10 | ``` 11 | +@github_username 12 | ``` 13 | 14 | If you added a contributor by mistake, you can remove them in a comment with: 15 | 16 | ``` 17 | -@github_username 18 | ``` 19 | 20 | If you are making a pull request on behalf of someone else but you had no part in designing the 21 | feature, you can remove yourself with the above syntax. -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | All Reports in this Repository are licensed by Contributors 2 | under the 3 | [W3C Software and Document License](https://www.w3.org/Consortium/Legal/2015/copyright-software-and-document). 4 | 5 | Contributions to Specifications are made under the 6 | [W3C CLA](https://www.w3.org/community/about/agreements/cla/). 
7 | 8 | Contributions to Test Suites are made under the 9 | [W3C 3-clause BSD License](https://www.w3.org/Consortium/Legal/2008/03-bsd-license.html) -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Synchronized Multimedia for Publications 2 | 3 | This is the repository for ongoing work initiated by the [W3C Synchronized Multimedia for Publications Community Group](https://www.w3.org/community/sync-media-pub/), and 4 | then transferred to the [Publication Maintenance Working Group](https://www.w3.org/groups/wg/pm/) for further 5 | incubation. 6 | 7 | The goal is to explore and develop media synchronization techniques compatible with publishing formats on the web, 8 | including [Audiobooks](https://www.w3.org/TR/audiobooks/), [EPUB](https://www.w3.org/publishing/groups/epub-wg/), and standalone [HTML](https://www.w3.org/html/). 9 | 10 | ## What's currently available 11 | 12 | * [Documents homepage](https://w3c.github.io/sync-media-pub) 13 | 14 | Updated December 2024. -------------------------------------------------------------------------------- /docs-src/.eleventy.js: -------------------------------------------------------------------------------- 1 | import { AllHtmlEntities } from 'html-entities'; 2 | import prettier from "prettier"; 3 | import path from "path"; 4 | import { default as markdown } from './markdown.js'; 5 | 6 | function init(eleventyConfig) { 7 | eleventyConfig.setLibrary("md", markdown()); 8 | eleventyConfig.addPassthroughCopy({"css": "css"}); 9 | eleventyConfig.addPassthroughCopy({"convert-smil": "convert-smil"}); 10 | 11 | eleventyConfig.addFilter('dump', obj => { 12 | return util.inspect(obj) 13 | }); 14 | eleventyConfig.addFilter('json', value => { 15 | const jsonString = JSON.stringify(value, null, 4).replace(/ { 20 | const entities = new AllHtmlEntities(); 21 | return `
${entities.encode(content)}`; 22 | }); 23 | 24 | eleventyConfig.addTransform("prettier", function (content, outputPath) { 25 | const extname = path.extname(outputPath); 26 | switch (extname) { 27 | case ".html": 28 | case ".json": 29 | // Strip leading period from extension and use as the Prettier parser. 30 | const parser = extname.replace(/^./, ""); 31 | return prettier.format(content, { parser, tabWidth: 4 }); 32 | 33 | default: 34 | return content; 35 | } 36 | }); 37 | if (process.env.HTTPS) { 38 | eleventyConfig.setServerOptions({ 39 | https: { 40 | key: "./localhost.key", 41 | cert: "./localhost.cert", 42 | }, 43 | showVersion: true, 44 | }); 45 | } 46 | 47 | 48 | return { 49 | dir: { 50 | input: "pages", 51 | output: "../docs", 52 | includes: "../_layout" 53 | } 54 | }; 55 | }; 56 | 57 | export default init; -------------------------------------------------------------------------------- /docs-src/_layout/htmleton.njk: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 |
Last updated {{ page.date.toDateString() }}
9 | {{ content | safe }} 10 |Paste SMIL below and press
52 | 53 | 54 | 55 | 56 |Contents:
` 17 | }) 18 | .use(markdownItDeflist) 19 | .use(markdownItDiv); 20 | return markdownLib; 21 | }; 22 | 23 | export default markdown; -------------------------------------------------------------------------------- /docs-src/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "drafts-src", 3 | "version": "1.0.0", 4 | "description": "", 5 | "main": ".eleventy.js", 6 | "type": "module", 7 | "scripts": { 8 | "serve": "cross-env DEBUG=Eleventy* eleventy --serve", 9 | "serve-https": "cross-env HTTPS=true DEBUG=Eleventy* eleventy --serve", 10 | "build": "npm run clean && eleventy", 11 | "buildwatch": "eleventy --watch", 12 | "clean": "rimraf docs", 13 | "http-serve": "cd ../docs && http-serve", 14 | "dev": "npm run clean && npm run serve" 15 | }, 16 | "keywords": [], 17 | "author": "", 18 | "license": "ISC", 19 | "devDependencies": { 20 | "@11ty/eleventy": "^3.0.0-beta.1", 21 | "cross-env": "^7.0.3", 22 | "fs-extra": "^11.2.0", 23 | "html-entities": "^1.3.1", 24 | "markdown-it": "^14.1.0", 25 | "markdown-it-anchor": "^9.0.1", 26 | "markdown-it-attrs": "^4.1.6", 27 | "markdown-it-deflist": "^2.0.3", 28 | "markdown-it-div": "^1.1.0", 29 | "markdown-it-header-sections": "^1.0.0", 30 | "markdown-it-table-of-contents": "^0.4.4", 31 | "npm-run-all": "^4.1.5", 32 | "nunjucks": "^3.2.2", 33 | "prettier": "^3.3.2", 34 | "rimraf": "^5.0.7", 35 | "sharp": "^0.33.4", 36 | "slugify": "^1.4.5" 37 | } 38 | } 39 | -------------------------------------------------------------------------------- /docs-src/pages/caveats.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: SyncMediaLite caveats 3 | --- 4 | 5 | # Caveats when going from EPUB Media Overlays to SyncMediaLite 6 | 7 | When adopting a more modern synchronization strategy, as described in [SyncMediaLite](sync-media-lite), some adaptation is required. 
It may be that existing `.smil` files have to be transformed into `.vtt` files before being distributed to a system that expects `.vtt`. Or the user agent may be loading `.smil` files and internally transforming them to `TextTrackCues` so as to avoid writing a SMIL engine. 8 | 9 | In any case where EPUB Media Overlays need to be transformed to work in a WebVTT-based playback scenario, there are some differences to be aware of. 10 | 11 | ## Multiple audio files. 12 | 13 | It's theoretically permitted in EPUB Media Overlays to have sync points in the same SMIL file referencing different audio files (in practice this isn't common). 14 | 15 | ## Non-contiguous audio segments. 16 | 17 | Say we have an audio file of someone saying "Three one two". Our HTML text, though, says "1 2 3". Theoretically, SMIL can handle this, though it's worth mentioning that this type of content is not commonly found: 18 | 19 | {% example "SMIL markup of non-contiguous audio segments" %} 20 |Last updated Tue Oct 01 2024
15 |
23 | When adopting a more modern synchronization strategy, as 24 | described in SyncMediaLite, 25 | some adaptation is required. It may be that existing 26 | .smil files have to be transformed into 27 | .vtt files before being distributed to a system 28 | that expects .vtt. Or the user agent may be 29 | loading .smil files and internally transforming 30 | them to TextTrackCues so as to avoid writing a 31 | SMIL engine. 32 |
34 | In any case where EPUB Media Overlays need to be transformed 35 | to work in a WebVTT-based playback scenario, there are some 36 | differences to be aware of. 37 |
38 |41 | It's theoretically permitted in EPUB Media Overlays to 42 | have sync points in the same SMIL file referencing 43 | different audio files (in practice this isn't common). 44 |
45 |49 | Say we have an audio file of someone saying "Three 50 | one two". Our HTML text, though, says "1 2 51 | 3". Theoretically, SMIL can handle this, though 52 | it's worth mentioning that this type of content is not 53 | commonly found: 54 |
55 |59 | <par> 60 | <audio src="audio.mp3" clipBegin="1s" clipEnd="2s"/> 61 | <text src="file.html#one"/> 62 | </par> 63 | <par> 64 | <audio src="audio.mp3" clipBegin="2s" clipEnd="3s"/> 65 | <text src="file.html#two"/> 66 | </par> 67 | <par> 68 | <audio src="audio.mp3" clipBegin="0s" clipEnd="1s"/> 69 | <text src="file.html#three"/> 70 | </par> 71 |73 |
74 | You would see the highlight and audio start with 75 | "1" and proceed to "2", then 76 | "3", since each 77 | <par> indicates what portion of audio 78 | to render. 79 |
81 | Now if you try to represent this in WebVTT, you would 82 | get: 83 |
84 |85 | 10 86 | 00:00:01.000 --> 00:00:02.000 87 | {"selector":{"type": "FragmentSelector", "value": "one"}} 88 | 89 | 20 90 | 00:00:02.000 --> 00:00:03.000 91 | {"selector":{"type": "FragmentSelector", "value": "two"}} 92 | 93 | 30 94 | 00:00:00.000 --> 00:00:01.000 95 | {"selector":{"type": "FragmentSelector", "value": "three"}} 96 | 97 |99 |
100 | But you would hear and see highlighted "3", 101 | followed by "1", then "2", since the 102 | audio playback is only based on the audio file, from 103 | start to end. 104 |
105 |109 | In both cases, resolving the difference requires either 110 | additional special handling by the user agent, or audio 111 | file reformulation by the producer. 112 |
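The non-contiguous case above can be detected while converting. The following is a minimal sketch, not part of this repository: it assumes the SMIL `<par>` elements have already been parsed into plain objects with clip times in seconds, and the names `toVttTimestamp`, `parsToVtt`, and `audioFollowsReadingOrder` are hypothetical. It emits cues in the same shape as the WebVTT example above and reports whether the audio clips run in the same order as the text.

```javascript
// Sketch: turn SMIL <par> clips into WebVTT cue text and report whether the
// audio order matches the reading order. All names here are hypothetical.

// Format a number of seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function toVttTimestamp(seconds) {
    const totalMs = Math.round(seconds * 1000);
    const ms = totalMs % 1000;
    const totalSec = Math.floor(totalMs / 1000);
    const s = totalSec % 60;
    const m = Math.floor(totalSec / 60) % 60;
    const h = Math.floor(totalSec / 3600);
    const pad = (n, width) => String(n).padStart(width, "0");
    return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// pars: [{ clipBegin, clipEnd, fragment }] in SMIL document (reading) order,
// with clip times already converted to seconds.
function parsToVtt(pars) {
    // If any clip starts before the previous clip ends, plain start-to-end
    // audio playback will not follow the reading order.
    let audioFollowsReadingOrder = true;
    for (let i = 1; i < pars.length; i++) {
        if (pars[i].clipBegin < pars[i - 1].clipEnd) {
            audioFollowsReadingOrder = false;
        }
    }
    const cues = pars.map((par, i) => {
        const payload = JSON.stringify({
            selector: { type: "FragmentSelector", value: par.fragment },
        });
        const timing = `${toVttTimestamp(par.clipBegin)} --> ${toVttTimestamp(par.clipEnd)}`;
        return `${(i + 1) * 10}\n${timing}\n${payload}`;
    });
    return { vtt: `WEBVTT\n\n${cues.join("\n\n")}\n`, audioFollowsReadingOrder };
}
```

A converter could use the `audioFollowsReadingOrder` flag to warn the producer that the audio file needs reformulating, rather than silently emitting cues that play back in a different order than the text.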
113 |Paste SMIL below and press
52 | 53 | 54 | 55 | 56 |Last updated Sun Oct 06 2024
15 |18 | The use case of reading with narration and synchronized 19 | highlight has long been a part of electronic publishing, and 20 | is already supported by existing standards (DAISY, 24 | EPUB Media Overlays). Under the hood, these standards use 27 | SMIL to 28 | synchronize an audio file with an 29 | HTML 30 | file, by pairing timestamps with phrase IDs. 31 |
32 |39 | Its roots are from the web's early days, before HTML 40 | supported native audio and video; and the full SMIL language 41 | is indeed quite complex. However, the usage of SMIL in EPUB 42 | Media Overlays is minimal and, with a few more restrictions, 43 | could be translated into a more modern format and be more 44 | easily implemented. 45 |
46 |52 | Production of audio narrated text is a lot of work and hence 53 | not as common as standalone text or audio books. Now with 54 | more powerful speech and language processing tools, 55 | automated synchronization is becoming feasible. However, 56 | it's not fast enough to do on the client side (yet), so book 57 | producers are still going to have to create pre-synchronized 58 | contents. But advances in their own tools are going to make 59 | it easier for them to do this. 60 |
61 |65 | The same user experience is achieved with a more modern 66 | approach that is easier to implement. This is what is 67 | described in SyncMediaLite. 68 |
69 |72 | Today, the HTMLMediaElement has built-in cue 73 | synchronization. When loaded with a series of 74 | TextTrackCues, the MediaElement will automatically fire 75 | off cue events at the right times, so unlike SMIL, it 76 | does not require hand-coding a timing engine. 77 |
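As an illustration of the built-in cue synchronization described above, here is a minimal sketch. It is not taken from this repository; `addSyncCues`, `entries`, `onEnter`, and `onExit` are hypothetical names, and the `CueCtor` parameter exists only so the function can be exercised outside a browser. The calls it relies on (`addTextTrack`, `VTTCue`, `addCue`, and the cue `onenter`/`onexit` handlers) are standard media element and WebVTT APIs.

```javascript
// Sketch only: load timing data into a media element's text track and react
// to cue events, so no hand-written timing loop is needed.
function addSyncCues(media, entries, onEnter, onExit, CueCtor = globalThis.VTTCue) {
    // A "metadata" track carries data for scripts rather than visible captions.
    const track = media.addTextTrack("metadata");
    // Cues on a disabled track never fire; "hidden" keeps the track active
    // without rendering anything.
    track.mode = "hidden";
    for (const entry of entries) {
        const cue = new CueCtor(entry.start, entry.end, entry.payload);
        cue.onenter = () => onEnter(cue); // playback reached this cue
        cue.onexit = () => onExit(cue);   // playback left this cue
        track.addCue(cue);
    }
    return track;
}
```

In a browser, `media` would be an `<audio>` element and the callbacks would apply and clear the text highlight. Loading a `.vtt` file through a `<track>` element is an alternative to building cues in script.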
78 |82 | The CSS Highlight API makes it easy to register 83 | highlights, which are then available for styling as 84 | pseudo-elements. There is then no need to add and remove 85 | class attributes throughout the DOM. 86 |
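A minimal sketch of registering a highlight follows. It is not taken from this repository: the function name `highlightElement`, the highlight name `sync-media-active`, and the `env` parameter (present so the function can be exercised outside a browser) are all assumptions. `Highlight`, `CSS.highlights`, and `Range.selectNodeContents` are the standard APIs involved.

```javascript
// Sketch only: move a named highlight onto one element using the CSS Custom
// Highlight API, without touching class attributes in the DOM.
function highlightElement(element, name = "sync-media-active", env = globalThis) {
    if (!env.CSS || !env.CSS.highlights || !env.Highlight) {
        return false; // API unavailable; a caller could fall back to classes
    }
    const range = env.document.createRange();
    range.selectNodeContents(element);
    // Registering under the same name replaces the previous highlight, so the
    // highlight moves from one element to the next as playback advances.
    env.CSS.highlights.set(name, new env.Highlight(range));
    return true;
}
```

A stylesheet would then style the registered name through the `::highlight(sync-media-active)` pseudo-element.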
87 |91 | In EPUB Media Overlays, this is done with fragment 92 | identifiers. By expanding this to include the use of 93 | selectors, we have a more flexible way to reference text, 97 | without requiring IDs on all the text, and can even go 98 | to the character level. 99 |
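To make the selector idea concrete, the sketch below resolves a cue payload to a DOM node. It assumes the payload is JSON of the shape `{"selector": {"type": ..., "value": ...}}`, as in the WebVTT example in the caveats page. The function name `resolveCueTarget` is hypothetical, and support for a `CssSelector` type is an assumption borrowed from the Web Annotation selector vocabulary rather than something this document confirms.

```javascript
// Sketch only: map a cue's JSON payload to the element it refers to.
// Returns null when the payload is malformed or the selector type is unknown.
function resolveCueTarget(cueText, doc) {
    let payload;
    try {
        payload = JSON.parse(cueText);
    } catch {
        return null; // not JSON
    }
    const selector = payload && payload.selector;
    if (!selector || typeof selector.value !== "string") {
        return null;
    }
    switch (selector.type) {
        case "FragmentSelector":
            // Equivalent to the fragment identifiers used by EPUB Media Overlays.
            return doc.getElementById(selector.value);
        case "CssSelector":
            // Assumed type: lets content be referenced without adding IDs.
            return doc.querySelector(selector.value);
        default:
            return null;
    }
}
```

Character-level references would need a further selector type that yields a `Range` rather than an element; that is left out of this sketch.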
100 |105 | EPUB Media Overlays could be replaced with SyncMediaLite, 106 | with the following modifications: 107 |
108 |122 | See caveats related to going from EPUB 123 | Media Overlays to SyncMediaLite. 124 |
125 |Last updated Tue Oct 01 2024
13 |16 | A library has a lot of narrated talking book content. They 17 | want to convert it from DAISY and/or EPUB with Media 18 | Overlays to something more modern like WebVTT. They need to 19 | port over the SMIL-based audio clip timing information plus 20 | HTML document selectors. 21 |
22 |31 | A content producer does not want to destructively mark up 32 | their beautiful HTML document by putting ID values on every 33 | element that is to be synchronized. They wonder why they 34 | can't use CSS selectors instead. 35 |
36 |40 | Sam has a web browser and wants to use it to listen to a 41 | narrated document that is located at a URL. Sam does not 42 | want to install an app. 43 |
44 |48 | The book publisher wants a paragraph to have a pink 49 | background while it's playing, and each word in it should be 50 | green as it plays. 51 |
52 |64 | Bertha is listening to important information in a language 65 | not native to her; she wants to slow the rate to improve her 66 | comprehension. 67 |
68 |72 | Gregor is near-sighted and needs to enlarge text in order to 73 | read it. He prefers that the audio playback and text 74 | highlight work seamlessly after he uses the browser text 75 | enlargement feature. 76 |
77 |18 |
19 |Errors:
134 |
135 | Data:
136 |
137 | Editors: Marisa DeMeglio (DAISY Consortium), Daniel Weck (DAISY Consortium)
11 |Last updated: July 2020
12 | 13 |The Synchronized Media for Publications Community 16 | Group was formed to recommend the best way to synchronize media with document formats being developed by the 17 | Publishing Working Group, in order to make 18 | publications accessible to people with different types of reading requirements.
19 |The following are the currently available draft documents:
25 | 26 | 32 |Changes in the approach described in the documents above are currently being considered by the group. When ready, revisions will be made available here.
36 |These use cases provide context and expected user agent behavior for Synchronized Narration.
11 |This draft is still under consideration within the Synchronized Media for Publications Community Group and is subject to change. The most prominent issues will be referenced in the document with links provided.
15 |An audio book needs to include paragraph-level navigation in what otherwise has only chapters marked (in the TOC). Once inside a chapter, users navigate paragraph-by-paragraph.
49 |A publisher has made parallel editions of a publication: one is audio, and the other is text. Adding a synchronized media "glue" layer gives the user a playback experience like what's described in the text+audio example above.
54 |Anim anim ex deserunt laboris voluptate non exercitation ad consequat tempor et.
128 |Officia cillum commodo qui amet exercitation veniam.
129 | 4 130 |Aliqua mollit officia commodo nulla sunt excepteur in ex nostrud dolore dolor do in.
| Month | High | Low |
| --- | --- | --- |
| June | 79 | 62 |
| July | 83 | 65 |
| August | 85 | 66 |
Proident est veniam eu ea est culpa amet.
159 |