├── .github └── workflows │ └── pr-push.yml ├── .gitignore ├── .pr-preview.json ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE.md ├── README.md ├── index-zh-cn.bs ├── index.bs ├── text.bs └── w3c.json /.github/workflows/pr-push.yml: -------------------------------------------------------------------------------- 1 | name: CI 2 | on: 3 | pull_request: {} 4 | push: 5 | branches: [main] 6 | jobs: 7 | main: 8 | name: Build, Validate and Deploy 9 | runs-on: ubuntu-latest 10 | steps: 11 | - uses: actions/checkout@v2 12 | - name: index.bs 13 | uses: w3c/spec-prod@v2 14 | with: 15 | SOURCE: index.bs 16 | DESTINATION: index.html 17 | TOOLCHAIN: bikeshed 18 | BUILD_FAIL_ON: fatal 19 | GH_PAGES_BRANCH: gh-pages 20 | - name: text.bs 21 | uses: w3c/spec-prod@v2 22 | with: 23 | SOURCE: text.bs 24 | DESTINATION: text.html 25 | TOOLCHAIN: bikeshed 26 | BUILD_FAIL_ON: fatal 27 | GH_PAGES_BRANCH: gh-pages 28 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Logs 2 | logs 3 | *.log 4 | npm-debug.log* 5 | 6 | # Runtime data 7 | pids 8 | *.pid 9 | *.seed 10 | 11 | # Directory for instrumented libs generated by jscoverage/JSCover 12 | lib-cov 13 | 14 | # Coverage directory used by tools like istanbul 15 | coverage 16 | 17 | # nyc test coverage 18 | .nyc_output 19 | 20 | # Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files) 21 | .grunt 22 | 23 | # node-waf configuration 24 | .lock-wscript 25 | 26 | # Compiled binary addons (http://nodejs.org/api/addons.html) 27 | build/Release 28 | 29 | # Dependency directories 30 | node_modules 31 | jspm_packages 32 | 33 | # Optional npm cache directory 34 | .npm 35 | 36 | # Optional REPL history 37 | .node_repl_history 38 | -------------------------------------------------------------------------------- /.pr-preview.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "src_file": "index.bs", 3 | "type": "bikeshed", 4 | "params": { 5 | "force": 1 6 | } 7 | } 8 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct 2 | 3 | All documentation, code and communication under this repository are covered by the [W3C Code of Ethics and Professional Conduct](https://www.w3.org/Consortium/cepc/). 4 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Web Platform Incubator Community Group 2 | 3 | This repository is being used for work in the Web Platform Incubator Community Group, governed by the [W3C Community License 4 | Agreement (CLA)](http://www.w3.org/community/about/agreements/cla/). To contribute, you must join 5 | the CG. 6 | 7 | If you are not the sole contributor to a contribution (pull request), please identify all 8 | contributors in the pull request's body or in subsequent comments. 9 | 10 | To add a contributor (other than yourself, that's automatic), mark them one per line as follows: 11 | 12 | ``` 13 | +@github_username 14 | ``` 15 | 16 | If you added a contributor by mistake, you can remove them in a comment with: 17 | 18 | ``` 19 | -@github_username 20 | ``` 21 | 22 | If you are making a pull request on behalf of someone else but you had no part in designing the 23 | feature, you can remove yourself with the above syntax. 
24 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | All Reports in this Repository are licensed by Contributors under the 2 | [W3C Software and Document 3 | License](http://www.w3.org/Consortium/Legal/2015/copyright-software-and-document). Contributions to 4 | Specifications are made under the [W3C CLA](https://www.w3.org/community/about/agreements/cla/). 5 | 6 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # Shape Detection API Specification _:stars:_:movie_camera: 3 | 4 | This is the repository for `shape-detection-api`, an experimental API for detecting Shapes (e.g. Faces, Barcodes, Text) in live or still images on the Web by **using accelerated hardware/OS resources**. 5 | 6 | You're welcome to contribute! Let's make the Web rock our socks off! 7 | 8 | ## [Introduction](https://wicg.github.io/shape-detection-api/#introduction) :blue_book: 9 | 10 | Photos and images constitute the largest chunk of the Web, and many include recognisable features, such as human faces, text or QR codes. Detecting these features is computationally expensive, but would lead to interesting use cases e.g. face tagging or detection of high saliency areas. Users interacting with WebCams or other Video Capture Devices have become accustomed to camera-like features such as the ability to focus directly on human faces on the screen of their devices. This is particularly true in the case of mobile devices, where hardware manufacturers have long been supporting these features. Unfortunately, Web Apps do not yet have access to these hardware capabilities, which makes the use of computationally demanding libraries necessary. 
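To give a flavour of the API being incubated here, below is a sketch of detecting faces with the proposed `FaceDetector` (interface and member names follow the spec draft; the `logFaces` helper is illustrative, and availability varies per platform):

```javascript
// Sketch: detect faces in an ImageBitmapSource (e.g. an <img> element)
// and report their bounding boxes. FaceDetector is the interface this
// repository proposes; it is not available on all platforms yet.
async function logFaces(image) {
  const faceDetector = new FaceDetector({ fastMode: true, maxDetectedFaces: 10 });
  const faces = await faceDetector.detect(image);
  for (const face of faces) {
    const { x, y, width, height } = face.boundingBox;
    console.log(`Face at (${x}, ${y}), size ${width}x${height}`);
  }
  return faces.length;
}
```

See the Examples and demos section below for complete, runnable examples.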
11 | 12 | ## Use cases :camera: 13 | 14 | QR/barcode/text detection can be used for: 15 | * user identification/registration, e.g. for [voting purposes](https://twitter.com/RegistertoVote/status/733123511128981508); 16 | * eCommerce, e.g. [Walmart Pay](https://www.slashgear.com/awalmart-announces-walmart-pay-for-qr-code-based-mobile-payments-10417912/); 17 | * Augmented Reality overlay, e.g. [here](http://www.multidots.com/augmented-reality/); 18 | * Driving online-to-offline engagement, fighting fakes [etc](https://www.clickz.com/why-have-qr-codes-taken-off-in-china/23662/). 19 | 20 | Face detection can be used for: 21 | * producing fun effects, e.g. [Snapchat Lenses](https://support.snapchat.com/en-US/a/lenses1); 22 | * giving hints to encoders or auto focus routines; 23 | * user name tagging; 24 | * enhancing accessibility by e.g. making objects appear larger as the user gets closer, like [HeadTrackr](https://www.auduno.com/headtrackr/examples/targets.html); 25 | * speeding up Face Recognition by indicating the areas of the image where faces are present. 26 | 27 | 28 | ## Current Related Efforts and Workarounds :wrench: 29 | 30 | Some Web Apps -gasp- run Detection in JavaScript. A performance comparison of some such libraries can be found [here](https://github.com/mtschirs/js-objectdetect#performance) (note that this performance evaluation does not include e.g. WebCam image acquisition and/or canvas interactions). 31 | 32 | Samsung Browser [has a private API](https://developer.samsung.com/internet) (click to unfold "Overview for Android", then search for "QR code reader"). 33 | 34 | **TODO**: compare a few JS/native libraries in terms of size and performance. A performance and detection comparison of some popular JS QR code scanners can be found [here](https://github.com/danimoh/qr-scanner-benchmark). `zxingjs2` has [a list of some additional JS libraries](https://github.com/ghybs/zxingjs2#other-barcode-image-processing-libraries-related-to-javascript).
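For comparison, with the API proposed here the core of a barcode-scanning workaround library boils down to a few lines. This is a sketch: the `readBarcodes` helper and its injectable `scope` parameter are illustrative, and `BarcodeDetector` support varies per platform.

```javascript
// Sketch: read all barcode payloads from an ImageBitmapSource without
// shipping a JS decoding library. The detection scope (window, a worker's
// self, ...) is passed in explicitly so the helper is easy to test.
async function readBarcodes(image, scope = globalThis) {
  if (typeof scope.BarcodeDetector !== 'function') {
    throw new Error('Barcode Detection not supported on this platform');
  }
  const detector = new scope.BarcodeDetector();
  const barcodes = await detector.detect(image);
  return barcodes.map(barcode => barcode.rawValue);
}
```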
35 | 36 | Android Native Apps usually integrate [ZXing](https://github.com/zxing/zxing) (which amounts to adding ~560KB when counting [core.jar](http://repo1.maven.org/maven2/com/google/zxing/core/3.3.0/), [android-core.jar](http://repo1.maven.org/maven2/com/google/zxing/android-core/3.3.0/) and [android-integration.jar](http://repo1.maven.org/maven2/com/google/zxing/android-integration/3.3.0/)). 37 | 38 | OCR readers in JavaScript are north of 1MB in size. 39 | 40 | ## Potential for misuse :money_with_wings: 41 | 42 | Face Detection is an expensive operation due to the algorithmic complexity. Many requests, or demanding systems like a live stream feed with a certain frame rate, could slow down the whole system or greatly increase power consumption. 43 | 44 | ## Platform specific implementation notes :computer: 45 | 46 | ### Overview 47 | 48 | What platforms support what detector? 49 | 50 | Detector | Mac| Android | Win10 | Linux | ChromeOS | 51 | --------- |:--:| :------:| :---: | :------:| :------: | 52 | Face | sw | hw/sw | sw | ✘| ✘ | 53 | QR/Barcode| sw | sw |✘| ✘| ✘ | 54 | Text | sw | sw | sw | ✘| ✘ | 55 | 56 | 57 | ### Android 58 | 59 | Android provides both a standalone software face detector and an interface to the hardware ones. 60 | 61 | | API | uses...
| Release notes | 62 | | ------------- |:-------------:| -----:| 63 | | [FaceDetector](https://developer.android.com/reference/android/media/FaceDetector.html)| Software based using the [Neven face detector](https://android.googlesource.com/platform/external/neven)| API Level 1, 2008| 64 | | [Vision.Face](https://developers.google.com/android/reference/com/google/android/gms/vision/face/Face)| Software based | Google Play services 7.2, Aug 2015| 65 | | [Camera2](https://developer.android.com/reference/android/hardware/camera2/CaptureRequest.html#STATISTICS_FACE_DETECT_MODE)| Hardware | API Level 21/Lollipop, 2014 | 66 | | [Camera.Face](https://developer.android.com/reference/android/hardware/Camera.Face.html) (old)| Hardware | API Level 14/Ice Cream Sandwich, 2011 | 67 | 68 | The availability of hardware detection depends on the actual chipset; according to market share in [1H 2016](http://www.antutu.com/en/view.shtml?id=8256), Qualcomm, MediaTek, Samsung and HiSilicon are the largest individual chipset vendors, and they all support Face Detection (all of the top-10 phones are covered as well): 69 | * The [Qualcomm Snapdragon](https://developer.qualcomm.com/software/snapdragon-sdk-android/facial-recognition) chipset family has supported it since ~2013 as part of their ISP. 70 | * MediaTek as part of [CorePilot 2.0](http://cdn-cw.mediatek.com/White%20Papers/MediaTek_CorePilot%202.0_Final.pdf) (introduced in 2015). 71 | * [Samsung Exynos](http://www.samsung.com/semiconductor/minisite/Exynos/data/Benefits_of_Exynos_5420_ISP_for_Enhanced_Imaging_Experience.pdf) (at least 2013). 72 | * Huawei HiSilicon [Kirin950](http://www.androidauthority.com/huawei-hisilicon-kirin-950-official-653811) since 2015 (this fabless manufacturer is relatively new). 73 | * It is worth noting that ARM [acquired Apical in 2016](https://www.arm.com/products/graphics-and-multimedia/computer-vision) for its computer vision expertise.
74 | 75 | Barcode/QR and Text detection is available via Google Play Services [barcode](https://developers.google.com/android/reference/com/google/android/gms/vision/barcode/package-summary) and [text](https://developers.google.com/android/reference/com/google/android/gms/vision/text/package-summary), respectively. 76 | 77 | ### Mac OS X / iOS 78 | 79 | Mac OS X/iOS provides `CIDetector` and `Vision Framework` for Face, QR, Text and Rectangle detection in software or hardware. 80 | 81 | | API | uses... | Release notes | 82 | | ------------- |:-------------: | -----:| 83 | | [Vision Framework, Mac OS X](https://developer.apple.com/documentation/vision)| Software and Hardware | OS X v10.13, 2017 | 84 | | [Vision Framework, iOS](https://developer.apple.com/documentation/vision)| Software and Hardware | iOS v11.0, 2017 | 85 | | [CIDetector, Mac OS X](https://developer.apple.com/library/mac/documentation/CoreImage/Reference/CIDetector_Ref/)| Software | OS X v10.7, 2011 | 86 | | [CIDetector, iOS](https://developer.apple.com/library/ios/documentation/CoreImage/Reference/CIDetector_Ref/) | Software | iOS v5.0, 2011 | 87 | | [AVFoundation](https://developer.apple.com/reference/avfoundation/avcapturemetadataoutput?language=objc)| Hardware | iOS 6.0, 2012 | 88 | 89 | Apple has supported Face Detection in hardware since the [Apple A5 processor](https://en.wikipedia.org/wiki/Apple_A5) introduced in 2011. 90 | 91 | ### Windows 92 | 93 | Windows 10 has a [FaceDetector](https://msdn.microsoft.com/library/windows/apps/dn974129) class and support for Text Detection via [OCR](https://msdn.microsoft.com/en-us/library/windows/apps/windows.media.ocr.aspx). 94 | 95 | ## Rendered URL :bookmark_tabs: 96 | 97 | The rendered version of this site can be found at https://wicg.github.io/shape-detection-api (if that's not alive for some reason try the [rawgit rendering](https://rawgit.com/WICG/shape-detection-api/gh-pages/index.html)).
98 | 99 | ## Examples and demos 100 | 101 | https://wicg.github.io/shape-detection-api/#examples 102 | 103 | ## Notes on bikeshedding :bicyclist: 104 | 105 | To compile, run: 106 | 107 | ``` 108 | curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F force=1 > index.html 109 | ``` 110 | 111 | if the produced file has a strange size (i.e. zero), then something went terribly wrong; run instead 112 | 113 | ``` 114 | curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F output=err 115 | ``` 116 | and try to figure out why `bikeshed` did not like the `.bs` :'( 117 | -------------------------------------------------------------------------------- /index-zh-cn.bs: -------------------------------------------------------------------------------- 1 |
2 | Title: 加速的图形识别 3 | Repository: wicg/shape-detection-api 4 | Status: w3c/CG-DRAFT 5 | ED: https://wicg.github.io/shape-detection-api 6 | Shortname: shape-detection-api 7 | Level: 1 8 | Editor: Miguel Casas-Sanchez, w3cid 82825, Google Inc., mcasas@google.com 9 | Abstract: 本文档描述了一套针对静态和/或动态图像的加速图形识别(如:人脸识别)API。 10 | Group: wicg 11 | !Participate: Join the W3C Community Group 12 | !Participate: Fix the text through GitHub 13 |
14 | 15 | 32 | 33 | # 简介 # {#introduction} 34 | 35 | 照片和图像是互联网构成中最大的部分,其中相当一部分包含了可识别的特征,比如人脸,二维码或者文本。可想而知,识别这些特征的计算开销非常大,但能带来一些很有趣的场景,比如在照片中自动标记人脸,或者根据图像中的URL进行重定向。硬件厂商从很久以前就已经开始支持这些特性,但Web应用迟迟未能很好地利用上这些硬件特性,必须借助一些难用的程序库才能达到目的。 36 | 37 | ## 图形识别的场景 ## {#use-cases} 38 | 39 | 请参考代码库中自述/解释的文档。 40 | 41 | # 图形识别API # {#api} 42 | 43 | 某些特定的浏览器可能会提供识别器来标示当前硬件是否提供加速功能。 44 | 45 | ## 用于识别的图像源 ## {#image-sources-for-detection} 46 | 47 |
48 | 本节的灵感来自 [[canvas2dcontext#image-sources-for-2d-rendering-contexts]]。 49 |
50 | 51 | {{ImageBitmapSource}} 允许多种图形接口的实现对象作为图像源,进行识别处理。 52 | 53 | 54 | * 当{{ImageBitmapSource}}对象代表{{HTMLImageElement}}的时候,该元素的图像必须用作源图像。而在特定情况下,当{{ImageBitmapSource}}对象代表{{HTMLImageElement}}中的动画图像的时候,用户代理程序(User Agent)必须显示这个动画图像的默认图像(该默认图像指的是,在动画图像被禁用或不支持动画的环境下,需要展现的图像),或者没有默认图像的话,就显示该动画图像的第一帧。 55 | 56 | * 当{{ImageBitmapSource}}对象代表{{HTMLVideoElement}}的时候,该视频播放的当前帧必须用作源图像,同时,该源图像的尺寸必须是视频源的固有维数(intrinsic dimensions),换句话说,就是视频源经过任意比例的调整后的大小。 57 | 58 | 59 | * 当{{ImageBitmapSource}}对象代表{{HTMLCanvasElement}}的时候,该元素的位图必须用作源图像。 60 | 61 | 当用户代理程序(User Agent)被要求用某种既有的{{ImageBitmapSource}}作为识别器的detect()
方法的输入参数的时候,必须执行以下步骤:
62 |
63 | * 如果{{ImageBitmapSource}}所含的有效脚本源([[HTML#concept-origin]])和当前文档的有效脚本源不同,就拒绝对应的Promise对象,并附上一个名为{{SecurityError}}的新建{{DOMException}}对象。
64 |
65 | * 如果一个{{ImageBitmapSource}}是一个处于|broken|状态的{{HTMLImageElement}}对象的话,就拒绝对应的Promise对象,并附上一个名为{{InvalidStateError}}的新建{{DOMException}}对象,同时停止之后的所有步骤。
66 |
67 |
68 | * 如果{{ImageBitmapSource}}是一个不能完整解码的{{HTMLImageElement}}对象的话,就拒绝对应的Promise对象,并附上一个名为{{InvalidStateError}}的新建{{DOMException}}对象,同时停止之后的所有步骤。
69 |
70 | * 如果一个{{ImageBitmapSource}}是一个{{HTMLVideoElement}}对象,且其|readyState|属性为|HAVE_NOTHING| 或 |HAVE_METADATA|的话,就拒绝对应的Promise对象,并附上一个名为{{InvalidStateError}}的新建{{DOMException}}对象,同时停止之后的所有步骤。
71 |
72 |
73 | * 如果一个{{ImageBitmapSource}}是一个{{HTMLCanvasElement}}对象,且其位图的|origin-clean| ([[HTML#concept-canvas-origin-clean]])标识为false的话,就拒绝对应的Promise对象,并附上一个名为{{SecurityError}}的新建{{DOMException}}对象,同时停止之后的所有步骤。
74 |
75 |
76 | 请注意,如果一个{{ImageBitmapSource}}的水平尺寸或垂直尺寸等于0,那么对应的Promise对象就会被简单地当作一个空的已检测对象序列来处理。
77 |
78 |
79 | ## 人脸识别API ## {#face-detection-api}
80 |
81 | {{FaceDetector}}代表一个针对图像中的人脸进行识别的底层加速平台组件。创建时可以选择一个{{FaceDetectorOptions}}的Dictionary对象作为入参。它提供了一个单独的 {{FaceDetector/detect()}}方法操作{{ImageBitmapSource}}对象,并返回Promise对象。在[[#image-sources-for-detection]]中所述的情况下,该方法必须拒绝该Promise对象;否则,它可以把一个利用操作系统或平台资源的任务加入队列,以一个{{DetectedFace}}序列来解决(resolve)该Promise,其中每个{{DetectedFace}}实质上由一个{{DetectedFace/boundingBox}}组成并被其界定。
82 |
83 | 84 | dictionary FaceDetectorOptions { 85 | unsigned short maxDetectedFaces; 86 | boolean fastMode; 87 | }; 88 |89 | 90 |
maxDetectedFaces
fastMode
98 | [Exposed=(Window,Worker), Constructor(optional FaceDetectorOptions faceDetectorOptions)] 99 | interface FaceDetector { 100 | Promise<sequence<DetectedFace>> detect(ImageBitmapSource image); 101 | }; 102 |103 | 104 |
FaceDetector(optional FaceDetectorOptions faceDetectorOptions)
detect()
112 | interface DetectedFace { 113 | [SameObject] readonly attribute DOMRectReadOnly boundingBox; 114 | }; 115 |116 | 117 |
boundingBox
129 | [SameObject] readonly attribute unsigned long id; 130 | [SameObject] readonly attribute FrozenArray<Landmark>? landmarks; 131 |132 | to {{DetectedFace}}. 133 |
141 | [Exposed=(Window,Worker), Constructor()] 142 | interface BarcodeDetector { 143 | Promise<sequence<DetectedBarcode>> detect(ImageBitmapSource image); 144 | }; 145 |146 | 147 |
detect(ImageBitmapSource image)
153 | interface DetectedBarcode { 154 | [SameObject] readonly attribute DOMRectReadOnly boundingBox; 155 | [SameObject] readonly attribute DOMString rawValue; 156 | [SameObject] readonly attribute FrozenArray<Point2D> cornerPoints; 157 | }; 158 |159 | 160 |
boundingBox
rawValue
cornerPoints
182 | [ 183 | Constructor, 184 | Exposed=(Window,Worker), 185 | ] interface TextDetector { 186 | Promise<sequence<DetectedText>> detect(ImageBitmapSource image); 187 | }; 188 |189 | 190 |
detect(ImageBitmapSource image)
196 | [ 197 | Constructor, 198 | ] interface DetectedText { 199 | [SameObject] readonly attribute DOMRect boundingBox; 200 | [SameObject] readonly attribute DOMString rawValue; 201 | }; 202 |203 | 204 |
boundingBox
rawValue
219 | 以下示例的微调或扩展版本,以及更多示例请参考这个codepen集合。 220 |
221 | 222 | ## 图形识别器的平台支持 ## {#platform-support-for-a-given-detector} 223 | 224 |231 | if (window.FaceDetector == undefined) { 232 | console.error('Face Detection not supported on this platform'); 233 | } 234 | if (window.BarcodeDetector == undefined) { 235 | console.error('Barcode Detection not supported on this platform'); 236 | } 237 | if (window.TextDetector == undefined) { 238 | console.error('Text Detection not supported on this platform'); 239 | } 240 |241 |
251 | let faceDetector = new FaceDetector({fastMode: true, maxDetectedFaces: 1}); 252 | // Assuming |theImage| is e.g. an <img> content, or a Blob. 253 | 254 | faceDetector.detect(theImage) 255 | .then(detectedFaces => { 256 | for (const face of detectedFaces) { 257 | console.log(` Face @ (${face.boundingBox.x}, ${face.boundingBox.y}),` + 258 | ` size ${face.boundingBox.width}x${face.boundingBox.height}`); 259 | } 260 | }).catch(() => { 261 | console.error("Face Detection failed, boo."); 262 | }) 263 |
264 |
274 | let barcodeDetector = new BarcodeDetector(); 275 | // Assuming |theImage| is e.g. an <img> content, or a Blob. 276 | 277 | barcodeDetector.detect(theImage) 278 | .then(detectedCodes => { 279 | for (const barcode of detectedCodes) { 280 | console.log(` Barcode ${barcode.rawValue}` + 281 | ` @ (${barcode.boundingBox.x}, ${barcode.boundingBox.y}) with size` + 282 | ` ${barcode.boundingBox.width}x${barcode.boundingBox.height}`); 283 | } 284 | }).catch(() => { 285 | console.error("Barcode Detection failed, boo."); 286 | }) 287 |
288 |
298 | let textDetector = new TextDetector(); 299 | // Assuming |theImage| is e.g. an <img> content, or a Blob. 300 | 301 | textDetector.detect(theImage) 302 | .then(detectedTextBlocks => { 303 | for (const textBlock of detectedTextBlocks) { 304 | console.log( 305 | `text @ (${textBlock.boundingBox.x}, ${textBlock.boundingBox.y}), ` + 306 | `size ${textBlock.boundingBox.width}x${textBlock.boundingBox.height}`); 307 | } 308 | }).catch(() => { 309 | console.error("Text Detection failed, boo."); 310 | }) 311 |
312 |
316 | spec: ECMAScript; urlPrefix: https://tc39.github.io/ecma262/# 317 | type: interface 318 | text: Array; url: sec-array-objects 319 | text: Promise; url:sec-promise-objects 320 | text: TypeError; url: sec-native-error-types-used-in-this-standard-typeerror 321 |322 | 323 |
324 | type: interface; text: Point2D; url: https://w3c.github.io/mediacapture-image/#Point2D; 325 |326 | 327 |
328 | type: interface; text: DOMString; url: https://heycam.github.io/webidl/#idl-DOMString; spec: webidl 329 |330 | 331 |
332 | spec: html 333 | type: dfn 334 | text: allowed to show a popup 335 | text: in parallel 336 | text: incumbent settings object 337 |338 | 339 |
340 | { 341 | "wikipedia": { 342 | "href": "https://en.wikipedia.org/wiki/Object-class_detection", 343 | "title": "Object-class Detection Wikipedia Entry", 344 | "publisher": "Wikipedia", 345 | "date": "14 September 2016" 346 | }, 347 | "canvas2dcontext": { 348 | "authors": [ "Rik Cabanier", "Jatinder Mann", "Jay Munro", "Tom Wiltzius", 349 | "Ian Hickson"], 350 | "href": "https://www.w3.org/TR/2dcontext/", 351 | "title": "HTML Canvas 2D Context", 352 | "status": "REC" 353 | } 354 | } 355 |356 | 357 | -------------------------------------------------------------------------------- /index.bs: -------------------------------------------------------------------------------- 1 |
2 | Title: Accelerated Shape Detection in Images 3 | Repository: wicg/shape-detection-api 4 | Status: CG-DRAFT 5 | ED: https://wicg.github.io/shape-detection-api 6 | Shortname: shape-detection-api 7 | Level: 1 8 | Editor: Miguel Casas-Sanchez 82825, Google LLC https://www.google.com, mcasas@google.com 9 | Editor: Reilly Grant 83788, Google LLC https://www.google.com, reillyg@google.com 10 | Abstract: This document describes an API providing access to accelerated shape detectors (e.g. human faces) for still images and/or live image feeds. 11 | Translation: zh-CN https://wicg.github.io/shape-detection-api/index-zh-cn.html 12 | Group: wicg 13 | Markup Shorthands: markdown yes 14 | !Participate: Join the W3C Community Group 15 | !Participate: Fix the text through GitHub 16 |
17 | 18 | 35 | 36 | # Introduction # {#introduction} 37 | 38 | Photos and images constitute the largest chunk of the Web, and many include recognisable features, such as human faces or barcodes/QR codes. Detecting these features is computationally expensive, but would lead to interesting use cases, e.g. face tagging or web URL redirection. While hardware manufacturers have been supporting these features for a long time, Web Apps do not yet have access to these hardware capabilities, which makes the use of computationally demanding libraries necessary. 39 | 40 |
41 | Text Detection, despite being an interesting field, is not considered stable enough across computing platforms or character sets to be standardized in the context of this document. For reference, a sister informative specification is kept in [[TEXT-DETECTION-API]]. 42 |
43 | 44 | ## Shape detection use cases ## {#use-cases} 45 | 46 | Please see the Readme/Explainer in the repository. 47 | 48 | # Shape Detection API # {#api} 49 | 50 | Individual browsers MAY provide Detectors indicating the availability of hardware providing accelerated operation. 51 | 52 | Detecting features in an image occurs asynchronously, potentially communicating with acceleration hardware independent of the browser. Completion events use the shape detection task source. 53 | 54 | ## Image sources for detection ## {#image-sources-for-detection} 55 | 56 |57 | This section is inspired by [[2dcontext#image-sources-for-2d-rendering-contexts]]. 58 |
59 | 60 | {{ImageBitmapSource}} allows objects implementing any of a number of interfaces to be used as image sources for the detection process. 61 | 62 | * When an {{ImageBitmapSource}} object represents an {{HTMLImageElement}}, the element's image must be used as the source image. Specifically, when an {{ImageBitmapSource}} object represents an animated image in an {{HTMLImageElement}}, the user agent must use the default image of the animation (the one that the format defines is to be used when animation is not supported or is disabled), or, if there is no such image, the first frame of the animation. 63 | 64 | * When an {{ImageBitmapSource}} object represents an {{HTMLVideoElement}}, then the frame at the current playback position when the method with the argument is invoked must be used as the source image when processing the image, and the source image's dimensions must be the intrinsic dimensions of the media resource (i.e. after any aspect-ratio correction has been applied). 65 | 66 | * When an {{ImageBitmapSource}} object represents an {{HTMLCanvasElement}}, the element's bitmap must be used as the source image. 67 | 68 | When the UA is required to use a given type of {{ImageBitmapSource}} as input argument for the `detect()` method of any given detector, it MUST run these steps: 69 | 70 | * If the {{ImageBitmapSource}} has an effective script origin ([=origin=]) which is not the same as the Document's effective script origin, then reject the Promise with a new {{DOMException}} whose name is {{SecurityError}}. 71 | 72 | * If the {{ImageBitmapSource}} is an {{HTMLImageElement}} object that is in the `Broken` (HTML Standard §img-error) state, then reject the Promise with a new {{DOMException}} whose name is {{InvalidStateError}}, and abort any further steps.
73 | 74 | * If the {{ImageBitmapSource}} is an {{HTMLImageElement}} object that is not fully decodable then reject the Promise with a new {{DOMException}} whose name is {{InvalidStateError}}, and abort any further steps. 75 | 76 | * If the {{ImageBitmapSource}} is an {{HTMLVideoElement}} object whose {{HTMLMediaElement/readyState}} attribute is either {{HAVE_NOTHING}} or {{HAVE_METADATA}} then reject the Promise with a new {{DOMException}} whose name is {{InvalidStateError}}, and abort any further steps. 77 | 78 | * If the {{ImageBitmapSource}} argument is an {{HTMLCanvasElement}} whose bitmap's `origin-clean` (HTML Standard §concept-canvas-origin-clean) flag is false, then reject the Promise with a new {{DOMException}} whose name is {{SecurityError}}, and abort any further steps. 79 | 80 | Note that if the {{ImageBitmapSource}} is an object with either a horizontal dimension or a vertical dimension equal to zero, then the Promise will be simply resolved with an empty sequence of detected objects. 81 | 82 | ## Face Detection API ## {#face-detection-api} 83 | 84 | {{FaceDetector}} represents an underlying accelerated platform's component for detection of human faces in images. It can be created with an optional Dictionary of {{FaceDetectorOptions}}. It provides a single {{FaceDetector/detect()}} operation on an {{ImageBitmapSource}} whose result is a Promise. This method MUST reject this Promise in the cases detailed in [[#image-sources-for-detection]]; otherwise it MAY queue a task that utilizes the OS/Platform resources to resolve the Promise with a sequence of {{DetectedFace}}s, each one essentially consisting of and delimited by a {{DetectedFace/boundingBox}}. 85 | 86 |<dl>FaceDetector(optional FaceDetectorOptions |faceDetectorOptions|)
detect(ImageBitmapSource |image|)
176 | [SameObject] readonly attribute unsigned long id; 177 |178 | to {{DetectedFace}}. 179 |
BarcodeDetector(optional BarcodeDetectorOptions |barcodeDetectorOptions|)
detect(ImageBitmapSource |image|)
361 | Slightly modified/extended versions of these examples (and more) can be found in 362 | e.g. this codepen collection. 363 |
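The snippets in the following subsections can also be combined; a sketch of feature-detecting the available detectors and running each over the same frame (the `detectShapes` helper name is illustrative, not part of this specification):

```javascript
// Sketch: feature-detect the available shape detectors and run each one
// over the same ImageBitmapSource, collecting all results in one object.
async function detectShapes(image) {
  const results = {};
  if (typeof FaceDetector === 'function') {
    results.faces = await new FaceDetector({ fastMode: true }).detect(image);
  }
  if (typeof BarcodeDetector === 'function') {
    results.barcodes = await new BarcodeDetector().detect(image);
  }
  return results;
}
```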
364 | 365 | ## Platform support for a given detector ## {#example-feature-detection} 366 | 367 |435 | spec: html 436 | type: dfn 437 | text: allowed to show a popup 438 | text: in parallel 439 | text: incumbent settings object 440 | for: / 441 | text: origin 442 |443 | 444 |
445 | { 446 | "iso15417": { 447 | "href": "https://www.iso.org/standard/43896.html", 448 | "title": "Information technology -- Automatic identification and data capture techniques -- Code 128 bar code symbology specification", 449 | "publisher": "ISO/IEC", 450 | "date": "June 2007" 451 | }, 452 | "iso15420": { 453 | "href": "https://www.iso.org/standard/46143.html", 454 | "title": "Information technology -- Automatic identification and data capture techniques -- EAN/UPC bar code symbology specification", 455 | "publisher": "ISO/IEC", 456 | "date": "December 2009" 457 | }, 458 | "iso15438": { 459 | "href": "https://www.iso.org/standard/65502.html", 460 | "title": "Information technology -- Automatic identification and data capture techniques -- PDF417 bar code symbology specification", 461 | "publisher": "ISO/IEC", 462 | "date": "September 2015" 463 | }, 464 | "iso16022": { 465 | "href": "https://www.iso.org/standard/44230.html", 466 | "title": "Information technology -- Automatic identification and data capture techniques -- Data Matrix bar code symbology specification", 467 | "publisher": "ISO/IEC", 468 | "date": "September 2009" 469 | }, 470 | "iso16388": { 471 | "href": "https://www.iso.org/standard/43897.html", 472 | "title": "Information technology -- Automatic identification and data capture techniques -- Code 39 bar code symbology specification", 473 | "publisher": "ISO/IEC", 474 | "date": "May 2007" 475 | }, 476 | "iso18004": { 477 | "href": "https://www.iso.org/standard/62021.html", 478 | "title": "Information technology -- Automatic identification and data capture techniques -- QR Code bar code symbology specification", 479 | "publisher": "ISO/IEC", 480 | "date": "February 2015" 481 | }, 482 | "iso24778": { 483 | "href": "https://www.iso.org/standard/62021.html", 484 | "title": "Information technology -- Automatic identification and data capture techniques -- Aztec Code bar code symbology specification", 485 | "publisher": "ISO/IEC", 486 | "date":
"February 2008" 487 | }, 488 | "bc2" :{ 489 | "title": "ANSI/AIM-BC2, Uniform Symbol Specification - Interleaved 2 of 5", 490 | "publisher": "ANSI", 491 | "date": "1995" 492 | }, 493 | "bc5" :{ 494 | "title": "ANSI/AIM-BC5, Uniform Symbol Specification - Code 93", 495 | "publisher": "ANSI", 496 | "date": "1995" 497 | } 498 | } 499 |500 | -------------------------------------------------------------------------------- /text.bs: -------------------------------------------------------------------------------- 1 |
2 | Title: Accelerated Text Detection in Images 3 | Repository: wicg/shape-detection-api 4 | Status: CG-DRAFT 5 | ED: https://wicg.github.io/shape-detection-api 6 | Shortname: text-detection-api 7 | Level: 1 8 | Editor: Miguel Casas-Sanchez 82825, Google LLC https://www.google.com, mcasas@google.com 9 | Editor: Reilly Grant 83788, Google LLC https://www.google.com, reillyg@google.com 10 | Abstract: This document describes an API providing access to accelerated text detectors for still images and/or live image feeds. 11 | Group: wicg 12 | Markup Shorthands: markdown yes 13 | !Participate: Join the W3C Community Group 14 | !Participate: Fix the text through GitHub 15 |16 | 17 | 34 | 35 | # Introduction # {#introduction} 36 | 37 | Photos and images constitute the largest chunk of the Web, and many include recognisable features, such as human faces, QR codes or text. Detecting these features is computationally expensive, but would lead to interesting use cases e.g. face tagging, or web URL redirection. This document deals with text detection whereas the sister document [[SHAPE-DETECTION-API]] specifies the Face and Barcode detection cases and APIs. 38 | 39 | ## Text detection use cases ## {#use-cases} 40 | 41 | Please see the Readme/Explainer in the repository. 42 | 43 | # Text Detection API # {#api} 44 | 45 | Individual browsers MAY provide Detectors indicating the availability of hardware providing accelerated operation. 46 | 47 | ## Image sources for detection ## {#image-sources-for-detection} 48 | 49 | Please refer to [[SHAPE-DETECTION-API#image-sources-for-detection]] 50 | 51 | ## Text Detection API ## {#text-detection-api} 52 | 53 | {{TextDetector}} represents an underlying accelerated platform's component for detection in images of Latin-1 text as defined in [[iso8859-1]]. It provides a single {{TextDetector/detect()}} operation on an {{ImageBitmapSource}} of which the result is a Promise. 
This method must reject this Promise in the cases detailed in [[#image-sources-for-detection]]; otherwise it may queue a task using the OS/Platform resources to resolve the Promise with a sequence of {{DetectedText}}s, each one essentially consisting of a {{DetectedText/rawValue}} and delimited by a {{DetectedText/boundingBox}} and a series of {{Point2D}}s. 54 | 55 |<pre class="idl">
detect(ImageBitmapSource |image|)
106 | Slightly modified/extended versions of these examples (and more) can be found in 107 | e.g. this codepen collection. 108 |
109 | 110 | ## Platform support for a text detector ## {#example-feature-detection} 111 | 112 |152 | spec: html 153 | type: dfn 154 | text: allowed to show a popup 155 | text: in parallel 156 | text: incumbent settings object 157 |158 | 159 |
160 | { 161 | "iso8859-1": { 162 | "href": "https://www.iso.org/standard/28245.html", 163 | "title": "Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1", 164 | "publisher": "ISO/IEC", 165 | "date": "April 1998" 166 | } 167 | } 168 |169 | -------------------------------------------------------------------------------- /w3c.json: -------------------------------------------------------------------------------- 1 | { 2 | "group": [80485] 3 | , "contacts": ["marcoscaceres"] 4 | , "repo-type": "cg-report" 5 | } 6 | --------------------------------------------------------------------------------