-
Notifications
You must be signed in to change notification settings - Fork 36
/
Copy pathtext.bs
168 lines (137 loc) · 6.28 KB
/
text.bs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
<pre class="metadata">
Title: Accelerated Text Detection in Images
Repository: wicg/shape-detection-api
Status: CG-DRAFT
ED: https://wicg.github.io/shape-detection-api
Shortname: text-detection-api
Level: 1
Editor: Miguel Casas-Sanchez 82825, Google LLC https://www.google.com, [email protected]
Editor: Reilly Grant 83788, Google LLC https://www.google.com, [email protected]
Abstract: This document describes an API providing access to accelerated text detectors for still images and/or live image feeds.
Group: wicg
Markup Shorthands: markdown yes
!Participate: <a href="https://www.w3.org/community/wicg/">Join the W3C Community Group</a>
!Participate: <a href="https://github.com/WICG/shape-detection-api">Fix the text through GitHub</a>
</pre>
<style>
table {
border-collapse: collapse;
border-left-style: hidden;
border-right-style: hidden;
text-align: left;
}
table caption {
font-weight: bold;
padding: 3px;
text-align: left;
}
table td, table th {
border: 1px solid black;
padding: 3px;
}
</style>
# Introduction # {#introduction}
Photos and images constitute the largest chunk of the Web, and many include recognisable features, such as human faces, QR codes or text. Detecting these features is computationally expensive, but would lead to interesting use cases e.g. face tagging, or web URL redirection. This document deals with text detection whereas the sister document [[SHAPE-DETECTION-API]] specifies the Face and Barcode detection cases and APIs.
## Text detection use cases ## {#use-cases}
Please see the <a href="https://github.com/WICG/shape-detection-api/blob/gh-pages/README.md">Readme/Explainer</a> in the repository.
# Text Detection API # {#api}
Individual browsers MAY provide Detectors indicating the availability of hardware providing accelerated operation.
## Image sources for detection ## {#image-sources-for-detection}
Please refer to [[SHAPE-DETECTION-API#image-sources-for-detection]]
## Text Detection API ## {#text-detection-api}
{{TextDetector}} represents an underlying accelerated platform's component for detection in images of Latin-1 text as defined in [[iso8859-1]]. It provides a single {{TextDetector/detect()}} operation on an {{ImageBitmapSource}} of which the result is a Promise. This method must reject this Promise in the cases detailed in [[#image-sources-for-detection]]; otherwise it may queue a task using the OS/Platform resources to resolve the Promise with a sequence of {{DetectedText}}s, each one essentially consisting on a {{DetectedText/rawValue}} and delimited by a {{DetectedText/boundingBox}} and a series of {{Point2D}}s.
<div class="example">
Example implementations of Text code detection are e.g. <a href="https://developers.google.com/android/reference/com/google/android/gms/vision/text/package-summary">Google Play Services</a>, <a href="https://developer.apple.com/reference/coreimage/cidetectortypetext">Apple's CIDetector</a> (bounding box only, no OCR) or <a href="https://msdn.microsoft.com/en-us/library/windows/apps/windows.media.ocr.aspx">Windows 10 <abbr title="Optical Character Recognition">OCR</abbr> API</a>.
</div>
<xmp class="idl">
[
Exposed=(Window,Worker),
SecureContext
] interface TextDetector {
constructor();
Promise<sequence<DetectedText>> detect(ImageBitmapSource image);
};
</xmp>
<dl class="domintro">
<dt><dfn constructor for="TextDetector">`TextDetector()`</dfn></dt>
<dd>
<div class="note">
Detectors may potentially allocate and hold significant resources. Where possible, reuse the same {{TextDetector}} for several detections.
</div>
</dd>
<dt><dfn method for="TextDetector"><code>detect(ImageBitmapSource |image|)</code></dfn></dt>
<dd>Tries to detect text blocks in the {{ImageBitmapSource}} |image|.</dd>
</dl>
### {{DetectedText}} ### {#detectedtext-section}
<xmp class="idl">
dictionary DetectedText {
required DOMRectReadOnly boundingBox;
required DOMString rawValue;
required sequence<Point2D> cornerPoints;
};
</xmp>
<dl class="domintro">
<dt><dfn dict-member for="DetectedText">`boundingBox`</dfn></dt>
<dd>A rectangle indicating the position and extent of a detected feature aligned to the image</dd>
<dt><dfn dict-member for="DetectedText">`rawValue`</dfn></dt>
<dd>Raw string detected from the image, where characters are drawn from [[iso8859-1]].</dd>
<dt><dfn dict-member for="DetectedText">`cornerPoints`</dfn></dt>
<dd>A <a>sequence</a> of corner points of the detected feature, in clockwise direction and starting with top-left. This is not necessarily a square due to possible perspective distortions.</dd>
</dl>
# Examples # {#examples}
<i>This section is non-normative.</i>
<p class="note">
Slightly modified/extended versions of these examples (and more) can be found in
e.g. <a href="https://codepen.io/collection/DwWVJj/">this codepen collection</a>.
</p>
## Platform support for a text detector ## {#example-feature-detection}
<div class="note">
The following example can also be found in e.g. <a
href="https://codepen.io/miguelao/pen/PbYpMv?editors=0010">this codepen</a>
with minimal modifications.
</div>
<div class="example">
```js
if (window.TextDetector == undefined) {
console.error('Text Detection not supported on this platform');
}
```
</div>
## Text Detection ## {#example-text-detection}
<div class="note">
The following example can also be found in e.g.
<a href="https://codepen.io/miguelao/pen/ygxVqg">this codepen</a>.
</div>
<div class="example">
```js
let textDetector = new TextDetector();
// Assuming |theImage| is e.g. a <img> content, or a Blob.
textDetector.detect(theImage)
.then(detectedTextBlocks => {
for (const textBlock of detectedTextBlocks) {
console.log(
'text @ (${textBlock.boundingBox.x}, ${textBlock.boundingBox.y}), ' +
'size ${textBlock.boundingBox.width}x${textBlock.boundingBox.height}');
}
}).catch(() => {
console.error("Text Detection failed, boo.");
})
```
</div>
<pre class="link-defaults">
spec: html
type: dfn
text: allowed to show a popup
text: in parallel
text: incumbent settings object
</pre>
<pre class="biblio">
{
"iso8859-1": {
"href": "https://www.iso.org/standard/28245.html",
"title": "Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1",
"publisher": "ISO/IEC",
"date": "April 1998"
}
}
</pre>