diff --git a/README.md b/README.md index 6c3df87..25fbf60 100644 --- a/README.md +++ b/README.md @@ -47,7 +47,7 @@ See [below](#language-tag-handling) for more on the details of how language tags Here is the basic usage of the translator API, with no error handling: ```js -const translator = await ai.translator.create({ +const translator = await Translator.create({ sourceLanguage: "en", targetLanguage: "ja" }); @@ -65,7 +65,7 @@ Note that the `create()` method call here might cause the download of a translat A similar simplified example of the language detector API: ```js -const detector = await ai.languageDetector.create(); +const detector = await LanguageDetector.create(); const results = await detector.detect(someUserText); for (const result of results) { @@ -84,7 +84,7 @@ For more details on the ways low-confidence results are excluded, see [the speci If there are certain languages you need to be able to detect for your use case, you can include them in the `expectedInputLanguages` option when creating a language detector: ```js -const detector = await ai.languageDetector.create({ expectedInputLanguages: ["en", "ja"] }); +const detector = await LanguageDetector.create({ expectedInputLanguages: ["en", "ja"] }); ``` This will allow the implementation to download additional resources like language detection models if necessary, and will ensure that the promise is rejected with a `"NotSupportedError"` `DOMException` if the browser is unable to detect the given input languages. @@ -102,7 +102,7 @@ Here is an example that adds capability checking to log more information and fal ```js async function translateUnknownCustomerInput(textToTranslate, targetLanguage) { - const detectorAvailability = await ai.languageDetector.availability(); + const detectorAvailability = await LanguageDetector.availability(); // If there is no language detector, then assume the source language is the // same as the document language. 
@@ -114,7 +114,7 @@ async function translateUnknownCustomerInput(textToTranslate, targetLanguage) { console.log("Language detection is available, but something will have to be downloaded. Hold tight!"); } - const detector = await ai.languageDetector.create(); + const detector = await LanguageDetector.create(); const [bestResult] = await detector.detect(textToTranslate); if (bestResult.detectedLanguage === "und" || bestResult.confidence < 0.4) { @@ -126,7 +126,7 @@ async function translateUnknownCustomerInput(textToTranslate, targetLanguage) { } // Now we've figured out the source language. Let's translate it! - const translatorAvailability = await ai.translator.availability({ sourceLanguage, targetLanguage }); + const translatorAvailability = await Translator.availability({ sourceLanguage, targetLanguage }); if (translatorAvailability === "unavailable") { console.warn("Translation is not available. Falling back to cloud API."); return await useSomeCloudAPIToTranslate(textToTranslate, { sourceLanguage, targetLanguage }); @@ -136,7 +136,7 @@ async function translateUnknownCustomerInput(textToTranslate, targetLanguage) { console.log("Translation is available, but something will have to be downloaded. Hold tight!"); } - const translator = await ai.translator.create({ sourceLanguage, targetLanguage }); + const translator = await Translator.create({ sourceLanguage, targetLanguage }); return await translator.translate(textToTranslate); } ``` @@ -146,7 +146,7 @@ async function translateUnknownCustomerInput(textToTranslate, targetLanguage) { For cases where using the API is only possible after a download, you can monitor the download progress (e.g. 
in order to show your users a progress bar) using code such as the following: ```js -const translator = await ai.translator.create({ +const translator = await Translator.create({ sourceLanguage, targetLanguage, monitor(m) { @@ -189,7 +189,7 @@ The "usage" concept is specific to the implementation, and could be something li This allows detecting failures due to overlarge inputs and giving clear feedback to the user, with code such as the following: ```js -const detector = await ai.languageDetector.create(); +const detector = await LanguageDetector.create(); try { console.log(await detector.detect(potentiallyLargeInput)); @@ -206,7 +206,7 @@ try { In some cases, instead of providing errors after the fact, the developer needs to be able to communicate to the user how close they are to the limit. For this, they can use the `inputQuota` property and the `measureInputUsage()` method on the translator or language detector objects: ```js -const translator = await ai.translator.create({ +const translator = await Translator.create({ sourceLanguage: "en", targetLanguage: "jp" }); @@ -247,7 +247,7 @@ The API comes equipped with a couple of `signal` options that accept `AbortSigna const controller = new AbortController(); stopButton.onclick = () => controller.abort(); -const languageDetector = await ai.languageDetector.create({ signal: controller.signal }); +const languageDetector = await LanguageDetector.create({ signal: controller.signal }); await languageDetector.detect(document.body.textContent, { signal: controller.signal }); ``` @@ -281,7 +281,7 @@ A future option might be to instead have the API return back the splitting of th The current design envisions that `availability()` methods will _not_ cause downloads of language packs or other material like a language detection model. Whereas, the `create()` methods _can_ cause downloads. In all cases, whether or not creation will initiate a download can be detected beforehand by the corresponding `availability()` method. 
-After a developer has a `AITranslator` or `AILanguageDetector` object, further calls are not expected to cause any downloads. (Although they might require internet access, if the implementation is not entirely on-device.) +After a developer has a `Translator` or `LanguageDetector` object, further calls are not expected to cause any downloads. (Although they might require internet access, if the implementation is not entirely on-device.) This design means that the implementation must have all information about the capabilities of its translation and language detection models available beforehand, i.e. "shipped with the browser". (Either as part of the browser binary, or through some out-of-band update mechanism that eagerly pushes updates.) @@ -297,7 +297,7 @@ Some sort of mitigation may be necessary here. We believe this is adjacent to ot * Partitioning download status by top-level site, introducing a fake download (which takes time but does not actually download anything) for the second-onward site to download a language pack. * Only exposing a fixed set of languages to this API, e.g. based on the user's locale or the document's main language. -As a first step, we require that detecting the availability of translation/detection be done via individual calls to `ai.translator.availability()` and `ai.languageDetector.availability()`. This allows browsers to implement possible mitigation techniques, such as detecting excessive calls to these methods and starting to return `"unavailable"`. +As a first step, we require that detecting the availability of translation/detection be done via individual calls to `Translator.availability()` and `LanguageDetector.availability()`. This allows browsers to implement possible mitigation techniques, such as detecting excessive calls to these methods and starting to return `"unavailable"`. 
Another way in which this API might enhance the web's fingerprinting surface is if translation and language detection models are updated separately from browser versions. In that case, differing results from different versions of the model provide additional fingerprinting bits beyond those already provided by the browser's major version number. Mandating that older browser versions not receive updates or be able to download models from too far into the future might be a possible remediation for this. @@ -320,10 +320,10 @@ That said, we are aware of [research](https://arxiv.org/abs/2005.08595) on trans The current design requires multiple async steps to do useful things: ```js -const translator = await ai.translator.create(options); +const translator = await Translator.create(options); const text = await translator.translate(sourceText); -const detector = await ai.languageDetector.create(); +const detector = await LanguageDetector.create(); const results = await detector.detect(sourceText); ``` diff --git a/index.bs b/index.bs index 8bb7502..cd354e8 100644 --- a/index.bs +++ b/index.bs @@ -23,6 +23,9 @@ urlPrefix: https://tc39.es/ecma402/; spec: ECMA-402 text: Unicode canonicalized locale identifier; url: sec-language-tags type: abstract-op text: LookupMatchingLocaleByBestFit; url: sec-lookupmatchinglocalebybestfit +urlPrefix: https://tc39.es/ecma262/; spec: ECMA-262 + type: dfn + text: current realm; url: current-realm urlPrefix: https://whatpr.org/webidl/1465.html; spec: WEBIDL type: interface text: QuotaExceededError; url: quotaexceedederror @@ -38,25 +41,18 @@ For now, see the [explainer](https://github.com/webmachinelearning/translation-a

The translator API

-partial interface AI { - readonly attribute AITranslatorFactory translator; -}; - [Exposed=(Window,Worker), SecureContext] -interface AITranslatorFactory { - Promise<AITranslator> create(AITranslatorCreateOptions options); - Promise<AIAvailability> availability(AITranslatorCreateCoreOptions options); -}; +interface Translator { + static Promise<Translator> create(TranslatorCreateOptions options); + static Promise<Availability> availability(TranslatorCreateCoreOptions options); -[Exposed=(Window,Worker), SecureContext] -interface AITranslator { Promise<DOMString> translate( DOMString input, - optional AITranslatorTranslateOptions options = {} + optional TranslatorTranslateOptions options = {} ); ReadableStream translateStreaming( DOMString input, - optional AITranslatorTranslateOptions options = {} + optional TranslatorTranslateOptions options = {} ); readonly attribute DOMString sourceLanguage; @@ -64,61 +60,49 @@ interface AITranslator { Promise<double> measureInputUsage( DOMString input, - optional AITranslatorTranslateOptions options = {} + optional TranslatorTranslateOptions options = {} ); readonly attribute unrestricted double inputQuota; }; -AITranslator includes AIDestroyable; +Translator includes DestroyableModel; -dictionary AITranslatorCreateCoreOptions { +dictionary TranslatorCreateCoreOptions { required DOMString sourceLanguage; required DOMString targetLanguage; }; -dictionary AITranslatorCreateOptions : AITranslatorCreateCoreOptions { +dictionary TranslatorCreateOptions : TranslatorCreateCoreOptions { AbortSignal signal; - AICreateMonitorCallback monitor; + CreateMonitorCallback monitor; }; -dictionary AITranslatorTranslateOptions { +dictionary TranslatorTranslateOptions { AbortSignal signal; }; -Every {{AI}} has a translator factory, an {{AITranslatorFactory}} object. Upon creation of the {{AI}} object, its [=AI/translator factory=] must be set to a [=new=] {{AITranslatorFactory}} object created in the {{AI}} object's [=relevant realm=]. 
- -The translator getter steps are to return [=this=]'s [=AI/translator factory=]. -

Creation

- The create(|options|) method steps are: - - 1. If [=this=]'s [=relevant global object=] is a {{Window}} whose [=associated Document=] is not [=Document/fully active=], then return [=a promise rejected with=] an "{{InvalidStateError}}" {{DOMException}}. - - 1. If |options|["{{AITranslatorCreateOptions/signal}}"] [=map/exists=] and is [=AbortSignal/aborted=], then return [=a promise rejected with=] |options|["{{AITranslatorCreateOptions/signal}}"]'s [=AbortSignal/abort reason=]. - - 1. [=Validate and canonicalize translator options=] given |options|. - -

This can mutate |options|. + The static create(|options|) method steps are: - 1. Return the result of [=creating an AI model object=] given [=this=]'s [=relevant realm=], |options|, [=compute translator options availability=], [=download the translation model=], [=initialize the translation model=], and [=create a translator object=]. + 1. Return the result of [=creating an AI model object=] given |options|, [=validate and canonicalize translator options=], [=compute translator options availability=], [=download the translation model=], [=initialize the translation model=], and [=create the translator object=].

- To validate and canonicalize translator options given an {{AITranslatorCreateCoreOptions}} |options|, perform the following steps. They mutate |options| in place to canonicalize language tags, and throw a {{TypeError}} if any are invalid. + To validate and canonicalize translator options given a {{TranslatorCreateCoreOptions}} |options|, perform the following steps. They mutate |options| in place to canonicalize language tags, and throw a {{TypeError}} if any are invalid. - 1. [=Validate and canonicalize language tags=] given |options| and "{{AITranslatorCreateCoreOptions/sourceLanguage}}". + 1. [=Validate and canonicalize language tags=] given |options| and "{{TranslatorCreateCoreOptions/sourceLanguage}}". - 1. [=Validate and canonicalize language tags=] given |options| and "{{AITranslatorCreateCoreOptions/targetLanguage}}". + 1. [=Validate and canonicalize language tags=] given |options| and "{{TranslatorCreateCoreOptions/targetLanguage}}".
- To download the translation model, given an {{AITranslatorCreateCoreOptions}} |options|: + To download the translation model, given a {{TranslatorCreateCoreOptions}} |options|: 1. [=Assert=]: these steps are running [=in parallel=]. - 1. Initiate the download process for everything the user agent needs to translate text from |options|["{{AITranslatorCreateCoreOptions/sourceLanguage}}"] to |options|["{{AITranslatorCreateCoreOptions/targetLanguage}}"]. + 1. Initiate the download process for everything the user agent needs to translate text from |options|["{{TranslatorCreateCoreOptions/sourceLanguage}}"] to |options|["{{TranslatorCreateCoreOptions/targetLanguage}}"]. This could include both a base translation model and specific language arc material, or perhaps material for multiple language arcs if an intermediate language is used. @@ -128,11 +112,11 @@ The translator getter steps are to return [=this=]
- To initialize the translation model, given an {{AITranslatorCreateCoreOptions}} |options|: + To initialize the translation model, given a {{TranslatorCreateCoreOptions}} |options|: 1. [=Assert=]: these steps are running [=in parallel=]. - 1. Perform any necessary initialization operations for the AI model backing the user agent's capabilities for translating from |options|["{{AITranslatorCreateCoreOptions/sourceLanguage}}"] to |options|["{{AITranslatorCreateCoreOptions/targetLanguage}}"]. + 1. Perform any necessary initialization operations for the AI model backing the user agent's capabilities for translating from |options|["{{TranslatorCreateCoreOptions/sourceLanguage}}"] to |options|["{{TranslatorCreateCoreOptions/targetLanguage}}"]. This could include loading the model into memory, or loading any fine-tunings necessary to support the specific options in question. @@ -142,51 +126,36 @@ The translator getter steps are to return [=this=]
- To create a translator object, given a [=ECMAScript/realm=] |realm| and an {{AITranslatorCreateCoreOptions}} |options|: + To create the translator object, given a [=ECMAScript/realm=] |realm| and a {{TranslatorCreateCoreOptions}} |options|: 1. [=Assert=]: these steps are running on |realm|'s [=ECMAScript/surrounding agent=]'s [=agent/event loop=]. 1. Let |inputQuota| be the amount of input quota that is available to the user agent for future [=translate|translation=] operations. (This value is [=implementation-defined=], and may be +∞ if there are no specific limits beyond, e.g., the user's memory, or the limits of JavaScript strings.) - 1. Return a new {{AITranslator}} object, created in |realm|, with + 1. Return a new {{Translator}} object, created in |realm|, with
- : [=AITranslator/source language=] - :: |options|["{{AITranslatorCreateCoreOptions/sourceLanguage}}"] + : [=Translator/source language=] + :: |options|["{{TranslatorCreateCoreOptions/sourceLanguage}}"] - : [=AITranslator/target language=] - :: |options|["{{AITranslatorCreateCoreOptions/targetLanguage}}"] + : [=Translator/target language=] + :: |options|["{{TranslatorCreateCoreOptions/targetLanguage}}"] - : [=AITranslator/input quota=] + : [=Translator/input quota=] :: |inputQuota|

Availability

-
- The availability(|options|) method steps are: - - 1. If [=this=]'s [=relevant global object=] is a {{Window}} whose [=associated Document=] is not [=Document/fully active=], then return [=a promise rejected with=] an "{{InvalidStateError}}" {{DOMException}}. - - 1. [=Validate and canonicalize translator options=] given |options|. - - 1. Let |promise| be [=a new promise=] created in [=this=]'s [=relevant realm=]. - - 1. [=In parallel=]: - - 1. Let |availability| be the result of [=computing translator options availability=] given |options|. - - 1. [=Queue a global task=] on the [=AI task source=] given [=this=]'s [=relevant global object=] to perform the following steps: - - 1. If |availability| is null, then [=reject=] |promise| with an "{{UnknownError}}" {{DOMException}}. + The static availability(|options|) method steps are: - 1. Otherwise, [=resolve=] |promise| with |availability|. + 1. Return the result of [=computing AI model availability=] given |options|, [=validate and canonicalize translator options=], and [=compute translator options availability=].
- To compute translator options availability given an {{AITranslatorCreateCoreOptions}} |options|, perform the following steps. They return either an {{AIAvailability}} value or null, and they mutate |options| in place to update language tags to their best-fit matches. + To compute translator options availability given a {{TranslatorCreateCoreOptions}} |options|, perform the following steps. They return either an {{Availability}} value or null, and they mutate |options| in place to update language tags to their best-fit matches. 1. [=Assert=]: this algorithm is running [=in parallel=]. @@ -196,49 +165,49 @@ The translator getter steps are to return [=this=] 1. [=map/For each=] |languageArc| → |availability| in |availabilities|: - 1. Let |sourceLanguageBestFit| be [$LookupMatchingLocaleByBestFit$](« |languageArc|'s [=language arc/source language=] », « |options|["{{AITranslatorCreateCoreOptions/sourceLanguage}}"] »). + 1. Let |sourceLanguageBestFit| be [$LookupMatchingLocaleByBestFit$](« |languageArc|'s [=language arc/source language=] », « |options|["{{TranslatorCreateCoreOptions/sourceLanguage}}"] »). - 1. Let |targetLanguageBestFit| be [$LookupMatchingLocaleByBestFit$](« |languageArc|'s [=language arc/target language=] », « |options|["{{AITranslatorCreateCoreOptions/targetLanguage}}"] »). + 1. Let |targetLanguageBestFit| be [$LookupMatchingLocaleByBestFit$](« |languageArc|'s [=language arc/target language=] », « |options|["{{TranslatorCreateCoreOptions/targetLanguage}}"] »). 1. If |sourceLanguageBestFit| and |targetLanguageBestFit| are both not undefined, then: - 1. Set |options|["{{AITranslatorCreateCoreOptions/sourceLanguage}}"] to |sourceLanguageBestFit|.\[[locale]]. + 1. Set |options|["{{TranslatorCreateCoreOptions/sourceLanguage}}"] to |sourceLanguageBestFit|.\[[locale]]. - 1. Set |options|["{{AITranslatorCreateCoreOptions/targetLanguage}}"] to |targetLanguageBestFit|.\[[locale]]. + 1. 
Set |options|["{{TranslatorCreateCoreOptions/targetLanguage}}"] to |targetLanguageBestFit|.\[[locale]]. 1. Return |availability|. - 1. If (|options|["{{AITranslatorCreateCoreOptions/sourceLanguage}}"], |options|["{{AITranslatorCreateCoreOptions/targetLanguage}}"]) [=language arc/can be fulfilled by the identity translation=], then return "{{AIAvailability/available}}". + 1. If (|options|["{{TranslatorCreateCoreOptions/sourceLanguage}}"], |options|["{{TranslatorCreateCoreOptions/targetLanguage}}"]) [=language arc/can be fulfilled by the identity translation=], then return "{{Availability/available}}". -

Such cases could also return "{{AIAvailability/downloadable}}", "{{AIAvailability/downloading}}", or "{{AIAvailability/available}}" because of the above steps, if the user agent has specific entries in its [=translator language arc availabilities=] for the given language arc. However, the identity translation is always available, so this step ensures that we never return "{{AIAvailability/unavailable}}" for such cases. +

Such cases could also return "{{Availability/downloadable}}", "{{Availability/downloading}}", or "{{Availability/available}}" because of the above steps, if the user agent has specific entries in its [=translator language arc availabilities=] for the given language arc. However, the identity translation is always available, so this step ensures that we never return "{{Availability/unavailable}}" for such cases.

One [=language arc=] that [=language arc/can be fulfilled by the identity translation=] is (`"en-US"`, `"en-GB"`). It is conceivable that an implementation might support a specialized model for this translation, which would show up in the [=translator language arc availabilities=].

On the other hand, it's pretty unlikely that an implementation has any specialized model for the [=language arc=] ("`en-x-asdf`", "`en-x-xyzw`"). In such a case, this step takes over, and later calls to the [=translate=] algorithm will use the identity translation. -

Note that when this step takes over, |options|["{{AITranslatorCreateCoreOptions/sourceLanguage}}"] and |options|["{{AITranslatorCreateCoreOptions/targetLanguage}}"] are not modified, so if this algorithm is being called from {{AITranslatorFactory/create()}}, that means the resulting {{AITranslator}} object's {{AITranslator/sourceLanguage}} and {{AITranslator/targetLanguage}} properties will return the original inputs, and not some canonicalized form. +

Note that when this step takes over, |options|["{{TranslatorCreateCoreOptions/sourceLanguage}}"] and |options|["{{TranslatorCreateCoreOptions/targetLanguage}}"] are not modified, so if this algorithm is being called from {{Translator/create()}}, that means the resulting {{Translator}} object's {{Translator/sourceLanguage}} and {{Translator/targetLanguage}} properties will return the original inputs, and not some canonicalized form.

- 1. Return "{{AIAvailability/unavailable}}". + 1. Return "{{Availability/unavailable}}".
A language arc is a [=tuple=] of two strings, a source language and a target language. Each item is a [=Unicode canonicalized locale identifier=].
- The translator language arc availabilities are given by the following steps. They return a [=map=] from [=language arcs=] to {{AIAvailability}} values, or null. + The translator language arc availabilities are given by the following steps. They return a [=map=] from [=language arcs=] to {{Availability}} values, or null. 1. [=Assert=]: this algorithm is running [=in parallel=]. 1. If there is some error attempting to determine what language arcs the user agent supports translating text between, which the user agent believes to be transient (such that re-querying the [=translator language arc availabilities=] could stop producing such an error), then return null. - 1. Return a [=map=] from [=language arcs=] to {{AIAvailability}} values, where each key is a [=language arc=] that the user agent supports translating text between, filled according to the following constraints: + 1. Return a [=map=] from [=language arcs=] to {{Availability}} values, where each key is a [=language arc=] that the user agent supports translating text between, filled according to the following constraints: - * If the user agent supports translating text from the [=language arc/source language=] to the [=language arc/target language=] of the [=language arc=] without performing any downloading operations, then the map must contain an [=map/entry=] whose [=map/key=] is that [=language arc=] and whose [=map/value=] is "{{AIAvailability/available}}". + * If the user agent supports translating text from the [=language arc/source language=] to the [=language arc/target language=] of the [=language arc=] without performing any downloading operations, then the map must contain an [=map/entry=] whose [=map/key=] is that [=language arc=] and whose [=map/value=] is "{{Availability/available}}". 
- * If the user agent supports translating text from the [=language arc/source language=] to the [=language arc/target language=] of the [=language arc=], but only after finishing a currently-ongoing download, then the map must contain an [=map/entry=] whose [=map/key=] is that [=language arc=] and whose [=map/value=] is "{{AIAvailability/downloading}}". + * If the user agent supports translating text from the [=language arc/source language=] to the [=language arc/target language=] of the [=language arc=], but only after finishing a currently-ongoing download, then the map must contain an [=map/entry=] whose [=map/key=] is that [=language arc=] and whose [=map/value=] is "{{Availability/downloading}}". - * If the user agent supports translating text from the [=language arc/source language=] to the [=language arc/target language=] of the [=language arc=], but only after performing a not-currently ongoing download, then the map must contain an [=map/entry=] whose [=map/key=] is that [=language arc=] and whose [=map/value=] is "{{AIAvailability/downloadable}}". + * If the user agent supports translating text from the [=language arc/source language=] to the [=language arc/target language=] of the [=language arc=], but only after performing a not-currently ongoing download, then the map must contain an [=map/entry=] whose [=map/key=] is that [=language arc=] and whose [=map/value=] is "{{Availability/downloadable}}". * The [=map/keys=] must not include any [=language arcs=] that [=language arc/overlap=] with the other [=map/keys=].
@@ -246,10 +215,10 @@ A language arc is a [=tuple=] of two strings, a Let's suppose that the user agent's [=translator language arc availabilities=] are as follows: - * ("`en`", "`zh-Hans`") → "{{AIAvailability/available}}" - * ("`en`", "`zh-Hant`") → "{{AIAvailability/downloadable}}" + * ("`en`", "`zh-Hans`") → "{{Availability/available}}" + * ("`en`", "`zh-Hant`") → "{{Availability/downloadable}}" - The use of [$LookupMatchingLocaleByBestFit$] means that {{AITranslatorFactory/availability()}} will probably give the following answers: + The use of [$LookupMatchingLocaleByBestFit$] means that {{Translator/availability()}} will probably give the following answers: function a(sourceLanguage, targetLanguage) { @@ -306,44 +275,44 @@ A <dfn>language arc</dfn> is a [=tuple=] of two strings, a <dfn for="language ar 1. Return false. </div> -<h3 id="the-aitranslator-class">The {{AITranslator}} class</h3> +<h3 id="the-Translator-class">The {{Translator}} class</h3> -Every {{AITranslator}} has a <dfn for="AITranslator">source language</dfn>, a [=string=], set during creation. +Every {{Translator}} has a <dfn for="Translator">source language</dfn>, a [=string=], set during creation. -Every {{AITranslator}} has a <dfn for="AITranslator">target language</dfn>, a [=string=], set during creation. +Every {{Translator}} has a <dfn for="Translator">target language</dfn>, a [=string=], set during creation. -Every {{AITranslator}} has an <dfn for="AITranslator">input quota</dfn>, a [=number=], set during creation. +Every {{Translator}} has an <dfn for="Translator">input quota</dfn>, a number, set during creation. <hr> -The <dfn attribute for="AITranslator">sourceLanguage</dfn> getter steps are to return [=this=]'s [=AITranslator/source language=]. +The <dfn attribute for="Translator">sourceLanguage</dfn> getter steps are to return [=this=]'s [=Translator/source language=]. 
-The <dfn attribute for="AITranslator">targetLanguage</dfn> getter steps are to return [=this=]'s [=AITranslator/target language=]. +The <dfn attribute for="Translator">targetLanguage</dfn> getter steps are to return [=this=]'s [=Translator/target language=]. -The <dfn attribute for="AITranslator">inputQuota</dfn> getter steps are to return [=this=]'s [=AITranslator/input quota=]. +The <dfn attribute for="Translator">inputQuota</dfn> getter steps are to return [=this=]'s [=Translator/input quota=]. <hr> <div algorithm> - The <dfn method for="AITranslator">translate(|input|, |options|)</dfn> method steps are: + The <dfn method for="Translator">translate(|input|, |options|)</dfn> method steps are: - 1. Let |operation| be an algorithm step which takes arguments |chunkProduced|, |done|, |error|, and |stopProducing|, and [=translates=] |input| given [=this=]'s [=AITranslator/source language=], [=this=]'s [=AITranslator/target language=], [=this=]'s [=AITranslator/input quota=], |chunkProduced|, |done|, |error|, and |stopProducing|. + 1. Let |operation| be an algorithm step which takes arguments |chunkProduced|, |done|, |error|, and |stopProducing|, and [=translates=] |input| given [=this=]'s [=Translator/source language=], [=this=]'s [=Translator/target language=], [=this=]'s [=Translator/input quota=], |chunkProduced|, |done|, |error|, and |stopProducing|. 1. Return the result of [=getting an aggregated AI model result=] given [=this=], |options|, and |operation|. </div> <div algorithm> - The <dfn method for="AITranslator">translateStreaming(|input|, |options|)</dfn> method steps are: + The <dfn method for="Translator">translateStreaming(|input|, |options|)</dfn> method steps are: - 1. 
Let |operation| be an algorithm step which takes arguments |chunkProduced|, |done|, |error|, and |stopProducing|, and [=translates=] |input| given [=this=]'s [=AITranslator/source language=], [=this=]'s [=AITranslator/target language=], [=this=]'s [=AITranslator/input quota=], |chunkProduced|, |done|, |error|, and |stopProducing|. + 1. Let |operation| be an algorithm step which takes arguments |chunkProduced|, |done|, |error|, and |stopProducing|, and [=translates=] |input| given [=this=]'s [=Translator/source language=], [=this=]'s [=Translator/target language=], [=this=]'s [=Translator/input quota=], |chunkProduced|, |done|, |error|, and |stopProducing|. 1. Return the result of [=getting a streaming AI model result=] given [=this=], |options|, and |operation|. </div> <div algorithm> - The <dfn method for="AITranslator">measureInputUsage(|input|, |options|)</dfn> method steps are: + The <dfn method for="Translator">measureInputUsage(|input|, |options|)</dfn> method steps are: - 1. Let |measureUsage| be an algorithm step which takes argument |stopMeasuring|, and returns the result of [=measuring translator input usage=] given |input|, [=this=]'s [=AITranslator/source language=], [=this=]'s [=AITranslator/target language=], and |stopMeasuring|. + 1. Let |measureUsage| be an algorithm step which takes argument |stopMeasuring|, and returns the result of [=measuring translator input usage=] given |input|, [=this=]'s [=Translator/source language=], [=this=]'s [=Translator/target language=], and |stopMeasuring|. 1. Return the result of [=measuring AI model input usage=] given [=this=], |options|, and |measureUsage|. 
</div> @@ -358,7 +327,7 @@ The <dfn attribute for="AITranslator">inputQuota</dfn> getter steps are to retur * a [=string=] |input|, * a [=Unicode canonicalized locale identifier=] |sourceLanguage|, * a [=Unicode canonicalized locale identifier=] |targetLanguage|, - * a [=number=] |inputQuota|, + * a number |inputQuota|, * an algorithm |chunkProduced| that takes a string and returns nothing, * an algorithm |done| that takes no arguments and returns nothing, * an algorithm |error| that takes [=error information=] and returns nothing, and @@ -447,7 +416,7 @@ The <dfn attribute for="AITranslator">inputQuota</dfn> getter steps are to retur 1. Return the amount of input usage needed to represent |inputToModel| when given to the underlying model. The exact calculation procedure is [=implementation-defined=], subject to the following constraints. - The returned input usage must be nonnegative and finite. It must be 0, if there are no usage quotas for the translation process (i.e., if the [=AITranslator/input quota=] is +∞). Otherwise, it must be positive and should be roughly proportional to the [=string/length=] of |inputToModel|. + The returned input usage must be nonnegative and finite. It must be 0, if there are no usage quotas for the translation process (i.e., if the [=Translator/input quota=] is +∞). Otherwise, it must be positive and should be roughly proportional to the [=string/length=] of |inputToModel|. <p class="note" id="note-translator-input-usage">This might be the number of tokens needed to represent |input| in a <a href="https://arxiv.org/abs/2404.08335">language model tokenization scheme</a>, or it might be |input|'s [=string/length=]. It could also be some variation of these which also counts the usage of any prefixes or suffixes necessary to give to the model. 
@@ -485,47 +454,40 @@ When translation fails, the following possible reasons may be surfaced to the we <h2 id="language-detector-api">The language detector API</h2> <xmp class="idl"> -partial interface AI { - readonly attribute AILanguageDetectorFactory languageDetector; -}; - [Exposed=(Window,Worker), SecureContext] -interface AILanguageDetectorFactory { - Promise<AILanguageDetector> create( - optional AILanguageDetectorCreateOptions options = {} +interface LanguageDetector { + Promise<LanguageDetector> create( + optional LanguageDetectorCreateOptions options = {} ); - Promise<AIAvailability> availability( - optional AILanguageDetectorCreateCoreOptions options = {} + Promise<Availability> availability( + optional LanguageDetectorCreateCoreOptions options = {} ); -}; -[Exposed=(Window,Worker), SecureContext] -interface AILanguageDetector { Promise<sequence<LanguageDetectionResult>> detect( DOMString input, - optional AILanguageDetectorDetectOptions options = {} + optional LanguageDetectorDetectOptions options = {} ); readonly attribute FrozenArray<DOMString>? 
expectedInputLanguages; Promise<double> measureInputUsage( DOMString input, - optional AITranslatorTranslateOptions options = {} + optional TranslatorTranslateOptions options = {} ); readonly attribute unrestricted double inputQuota; }; -AILanguageDetector includes AIDestroyable; +LanguageDetector includes DestroyableModel; -dictionary AILanguageDetectorCreateCoreOptions { +dictionary LanguageDetectorCreateCoreOptions { sequence<DOMString> expectedInputLanguages; }; -dictionary AILanguageDetectorCreateOptions : AILanguageDetectorCreateCoreOptions { +dictionary LanguageDetectorCreateOptions : LanguageDetectorCreateCoreOptions { AbortSignal signal; - AICreateMonitorCallback monitor; + CreateMonitorCallback monitor; }; -dictionary AILanguageDetectorDetectOptions { +dictionary LanguageDetectorDetectOptions { AbortSignal signal; }; @@ -535,40 +497,28 @@ dictionary LanguageDetectionResult { }; -Every {{AI}} has a language detector factory, an {{AILanguageDetector}} object. Upon creation of the {{AI}} object, its [=AI/language detector factory=] must be set to a [=new=] {{AILanguageDetectorFactory}} object created in the {{AI}} object's [=relevant realm=]. - -The languageDetector getter steps are to return [=this=]'s [=AI/language detector factory=]. -

Creation

- The create(|options|) method steps are: + The static create(|options|) method steps are: - 1. If [=this=]'s [=relevant global object=] is a {{Window}} whose [=associated Document=] is not [=Document/fully active=], then return [=a promise rejected with=] an "{{InvalidStateError}}" {{DOMException}}. - - 1. If |options|["{{AILanguageDetectorCreateOptions/signal}}"] [=map/exists=] and is [=AbortSignal/aborted=], then return [=a promise rejected with=] |options|["{{AILanguageDetectorCreateOptions/signal}}"]'s [=AbortSignal/abort reason=]. - - 1. [=Validate and canonicalize language detector options=] given |options|. - -

This can mutate |options|. - - 1. Return the result of [=creating an AI model object=] given [=this=]'s [=relevant realm=], |options|, [=compute language detector options availability=], [=download the language detector model=], [=initialize the language detector model=], and [=create the language detector object=]. + 1. Return the result of [=creating an AI model object=] given |options|, [=validate and canonicalize language detector options=], [=compute language detector options availability=], [=download the language detector model=], [=initialize the language detector model=], and [=create the language detector object=].

- To validate and canonicalize language detector options given an {{AILanguageDetectorCreateCoreOptions}} |options|, perform the following steps. They mutate |options| in place to canonicalize language tags, and throw a {{TypeError}} if any are invalid. + To validate and canonicalize language detector options given a {{LanguageDetectorCreateCoreOptions}} |options|, perform the following steps. They mutate |options| in place to canonicalize language tags, and throw a {{TypeError}} if any are invalid. - 1. [=Validate and canonicalize language tags=] given |options| and "{{AILanguageDetectorCreateCoreOptions/expectedInputLanguages}}". + 1. [=Validate and canonicalize language tags=] given |options| and "{{LanguageDetectorCreateCoreOptions/expectedInputLanguages}}".
- To download the language detector model, given an {{AILanguageDetectorCreateCoreOptions}} |options|: + To download the language detector model, given a {{LanguageDetectorCreateCoreOptions}} |options|: 1. [=Assert=]: these steps are running [=in parallel=]. - 1. Initiate the download process for everything the user agent needs to detect the languages of input text, including all the languages in |options|["{{AILanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. + 1. Initiate the download process for everything the user agent needs to detect the languages of input text, including all the languages in |options|["{{LanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. - This could include both a base language detection model, and specific fine-tunings or other material to help with the languages identified in |options|["{{AILanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. + This could include both a base language detection model, and specific fine-tunings or other material to help with the languages identified in |options|["{{LanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. 1. If the download process cannot be started for any reason, then return false. @@ -576,13 +526,13 @@ The languageDetector getter steps are to return [=
- To initialize the language detector model, given an {{AILanguageDetectorCreateCoreOptions}} |options|: + To initialize the language detector model, given a {{LanguageDetectorCreateCoreOptions}} |options|: 1. [=Assert=]: these steps are running [=in parallel=]. 1. Perform any necessary initialization operations for the AI model backing the user agent's capabilities for detecting the languages of input text. - This could include loading the model into memory, or loading any fine-tunings necessary to support the languages identified in |options|["{{AILanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. + This could include loading the model into memory, or loading any fine-tunings necessary to support the languages identified in |options|["{{LanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. 1. If initialization failed for any reason, then return a [=DOMException error information=] whose [=DOMException error information/name=] is "{{OperationError}}" and whose [=DOMException error information/details=] contain appropriate detail. @@ -590,49 +540,34 @@ The languageDetector getter steps are to return [=
- To create the language detector object, given a [=ECMAScript/realm=] |realm| and an {{AILanguageDetectorCreateCoreOptions}} |options|: + To create the language detector object, given a [=ECMAScript/realm=] |realm| and a {{LanguageDetectorCreateCoreOptions}} |options|: 1. [=Assert=]: these steps are running on |realm|'s [=ECMAScript/surrounding agent=]'s [=agent/event loop=]. 1. Let |inputQuota| be the amount of input quota that is available to the user agent for future [=detect languages|language detection=] operations. (This value is [=implementation-defined=], and may be +∞ if there are no specific limits beyond, e.g., the user's memory, or the limits of JavaScript strings.) - 1. Return a new {{AILanguageDetector}} object, created in |realm|, with + 1. Return a new {{LanguageDetector}} object, created in |realm|, with
- : [=AILanguageDetector/expected input languages=] - :: the result of [=creating a frozen array=] given |options|["{{AILanguageDetectorCreateCoreOptions/expectedInputLanguages}}"] if it [=set/is empty|is not empty=]; otherwise null + : [=LanguageDetector/expected input languages=] + :: the result of [=creating a frozen array=] given |options|["{{LanguageDetectorCreateCoreOptions/expectedInputLanguages}}"] if it [=set/is empty|is not empty=]; otherwise null - : [=AILanguageDetector/input quota=] + : [=LanguageDetector/input quota=] :: |inputQuota|

Availability

-
- The availability(|options|) method steps are: + The static availability(|options|) method steps are: - 1. If [=this=]'s [=relevant global object=] is a {{Window}} whose [=associated Document=] is not [=Document/fully active=], then return [=a promise rejected with=] an "{{InvalidStateError}}" {{DOMException}}. - - 1. [=Validate and canonicalize language detector options=] given |options|. - - 1. Let |promise| be [=a new promise=] created in [=this=]'s [=relevant realm=]. - - 1. [=In parallel=]: - - 1. Let |availability| be the result of [=computing language detector options availability=] given |options|. - - 1. [=Queue a global task=] on the [=AI task source=] given [=this=]'s [=relevant global object=] to perform the following steps: - - 1. If |availability| is null, then [=reject=] |promise| with an "{{UnknownError}}" {{DOMException}}. - - 1. Otherwise, [=resolve=] |promise| with |availability|. + 1. Return the result of [=computing AI model availability=] given |options|, [=validate and canonicalize language detector options=], and [=compute language detector options availability=].
- To compute language detector options availability given an {{AILanguageDetectorCreateCoreOptions}} |options|, perform the following steps. They return either an {{AIAvailability}} value or null, and they mutate |options| in place to update language tags to their best-fit matches. + To compute language detector options availability given a {{LanguageDetectorCreateCoreOptions}} |options|, perform the following steps. They return either an {{Availability}} value or null, and they mutate |options| in place to update language tags to their best-fit matches. 1. [=Assert=]: this algorithm is running [=in parallel=]. @@ -640,11 +575,11 @@ The languageDetector getter steps are to return [= 1. Let |availabilities| be the result of [=getting language availabilities=] given the purpose of detecting text written in that language. - 1. Let |availability| be "{{AIAvailability/available}}". + 1. Let |availability| be "{{Availability/available}}". - 1. [=set/For each=] |language| in |options|["{{AILanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]: + 1. [=set/For each=] |language| in |options|["{{LanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]: - 1. [=list/For each=] |availabilityToCheck| in « "{{AIAvailability/available}}", "{{AIAvailability/downloading}}", "{{AIAvailability/downloadable}}" »: + 1. [=list/For each=] |availabilityToCheck| in « "{{Availability/available}}", "{{Availability/downloading}}", "{{Availability/downloadable}}" »: 1. Let |languagesWithThisAvailability| be |availabilities|[|availabilityToCheck|]. @@ -652,38 +587,38 @@ The languageDetector getter steps are to return [= 1. If |bestMatch| is not undefined, then: - 1. [=list/Replace=] |language| with |bestMatch|.\[[locale]] in |options|["{{AILanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. + 1. [=list/Replace=] |language| with |bestMatch|.\[[locale]] in |options|["{{LanguageDetectorCreateCoreOptions/expectedInputLanguages}}"]. - 1.
Set |availability| to the [=AIAvailability/minimum availability=] given |availability| and |availabilityToCheck|. + 1. Set |availability| to the [=Availability/minimum availability=] given |availability| and |availabilityToCheck|. 1. [=iteration/Break=]. - 1. Return "{{AIAvailability/unavailable}}". + 1. Return "{{Availability/unavailable}}". 1. Return |availability|.
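
The availability computation above can be sketched in plain JavaScript. This is a minimal illustration, not the specified algorithm: the function names are hypothetical, the best-fit lookup (which this hunk elides) is replaced by a simple exact-or-primary-subtag match, and the ordering used for "minimum availability" (unavailable, then downloadable, then downloading, then available) is an assumption.

```js
// Assumed ordering for "minimum availability": a lower index means less available.
const AVAILABILITY_ORDER = ["unavailable", "downloadable", "downloading", "available"];

function minimumAvailability(a, b) {
  const index = Math.min(AVAILABILITY_ORDER.indexOf(a), AVAILABILITY_ORDER.indexOf(b));
  return AVAILABILITY_ORDER[index];
}

// Hypothetical stand-in for the best-fit locale lookup: try an exact match,
// then fall back to any candidate sharing the primary language subtag.
function findBestMatch(candidates, language) {
  if (candidates.includes(language)) {
    return language;
  }
  const primary = language.split("-")[0];
  return candidates.find((candidate) => candidate.split("-")[0] === primary);
}

// `availabilities` maps each availability value to a list of language tags.
// `expectedInputLanguages` is updated in place with the matched tags,
// mirroring how the steps above mutate |options|.
function computeDetectorAvailability(availabilities, expectedInputLanguages) {
  let availability = "available";
  for (let i = 0; i < expectedInputLanguages.length; i++) {
    let matched = false;
    for (const availabilityToCheck of ["available", "downloading", "downloadable"]) {
      const candidates = availabilities[availabilityToCheck] ?? [];
      const bestMatch = findBestMatch(candidates, expectedInputLanguages[i]);
      if (bestMatch !== undefined) {
        expectedInputLanguages[i] = bestMatch;
        availability = minimumAvailability(availability, availabilityToCheck);
        matched = true;
        break;
      }
    }
    // No availability bucket contains a match for this language.
    if (!matched) {
      return "unavailable";
    }
  }
  return availability;
}
```

Because the overall result is the minimum across all requested languages, a single language that still needs downloading downgrades the whole answer, and a single unsupported language makes it `"unavailable"`.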
-

The {{AILanguageDetector}} class

+

The {{LanguageDetector}} class

-Every {{AILanguageDetector}} has an expected input languages, a {{FrozenArray}}<{{DOMString}}> or null, set during creation. +Every {{LanguageDetector}} has an expected input languages, a {{FrozenArray}}<{{DOMString}}> or null, set during creation. -Every {{AILanguageDetector}} has an input quota, a [=number=], set during creation. +Every {{LanguageDetector}} has an input quota, a number, set during creation.
-The expectedInputLanguages getter steps are to return [=this=]'s [=AILanguageDetector/expected input languages=]. +The expectedInputLanguages getter steps are to return [=this=]'s [=LanguageDetector/expected input languages=]. -The inputQuota getter steps are to return [=this=]'s [=AILanguageDetector/input quota=]. +The inputQuota getter steps are to return [=this=]'s [=LanguageDetector/input quota=].
- The detect(|input|, |options|) method steps are: + The detect(|input|, |options|) method steps are: 1. If [=this=]'s [=relevant global object=] is a {{Window}} whose [=associated Document=] is not [=Document/fully active=], then return [=a promise rejected with=] an "{{InvalidStateError}}" {{DOMException}}. - 1. Let |signals| be « [=this=]'s [=AIDestroyable/destruction abort controller=]'s [=AbortController/signal=] ». + 1. Let |signals| be « [=this=]'s [=DestroyableModel/destruction abort controller=]'s [=AbortController/signal=] ». 1. If |options|["`signal`"] [=map/exists=], then [=set/append=] it to |signals|. @@ -707,7 +642,7 @@ The inputQuota getter steps are to 1. Return |abortedDuringOperation|. - 1. Let |result| be the result of [=detecting languages=] given |input|, [=this=]'s [=AILanguageDetector/input quota=], and |stopProducing|. + 1. Let |result| be the result of [=detecting languages=] given |input|, [=this=]'s [=LanguageDetector/input quota=], and |stopProducing|. 1. [=Queue a global task=] on the [=AI task source=] given [=this=]'s [=relevant global object=] to perform the following steps: @@ -723,17 +658,19 @@ The inputQuota getter steps are to
- The measureInputUsage(|input|, |options|) method steps are: + The measureInputUsage(|input|, |options|) method steps are: 1. Let |measureUsage| be an algorithm step which takes argument |stopMeasuring|, and returns the result of [=measuring language detector input usage=] given |input| and |stopMeasuring|. 1. Return the result of [=measuring AI model input usage=] given [=this=], |options|, and |measureUsage|.
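
As a rough illustration of how a page might combine `measureInputUsage()`, `inputQuota`, and `detect()` from the IDL above, here is a small sketch. The helper name `detectWithinQuota` and the choice to return `null` when the input is over quota are assumptions of this example, not part of the specification. The helper accepts any object with the same shape as a `LanguageDetector`.

```js
// Hypothetical helper: only call detect() when the measured usage fits in the quota.
async function detectWithinQuota(detector, input, options = {}) {
  const usage = await detector.measureInputUsage(input, options);
  // inputQuota is an unrestricted double, so it can be Infinity when no quota applies.
  if (usage > detector.inputQuota) {
    return null;
  }
  return detector.detect(input, options);
}
```

Measuring first lets a caller decide what to do with oversized input (for example, truncating it) before asking for a detection result.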
+

Language detection

+

The algorithm

- To detect languages given a [=string=] |input|, a [=number=] |inputQuota|, and an algorithm |stopProducing| that takes no arguments and returns a boolean, perform the following steps. They will return either null, an [=error information=], or a [=list=] of {{LanguageDetectionResult}} dictionaries. + To detect languages given a [=string=] |input|, a number |inputQuota|, and an algorithm |stopProducing| that takes no arguments and returns a boolean, perform the following steps. They will return either null, an [=error information=], or a [=list=] of {{LanguageDetectionResult}} dictionaries. 1. [=Assert=]: this algorithm is running [=in parallel=]. @@ -749,7 +686,7 @@ The inputQuota getter steps are to 1. Let |availabilities| be the result of [=getting language availabilities=] given the purpose of detecting text written in that language. - 1. Let |currentlyAvailableLanguages| be |availabilities|["{{AIAvailability/available}}"]. + 1. Let |currentlyAvailableLanguages| be |availabilities|["{{Availability/available}}"]. 1. In an [=implementation-defined=] manner, subject to the following guidelines, let |rawResult| and |unknown| be the result of detecting the languages of |input|. @@ -817,7 +754,7 @@ The inputQuota getter steps are to 1. Return the amount of input usage needed to represent |inputToModel| when given to the underlying model. The exact calculation procedure is [=implementation-defined=], subject to the following constraints. - The returned input usage must be nonnegative and finite. It must be 0, if there are no usage quotas for the translation process (i.e., if the [=AILanguageDetector/input quota=] is +∞). Otherwise, it must be positive and should be roughly proportional to the [=string/length=] of |inputToModel|. + The returned input usage must be nonnegative and finite. It must be 0, if there are no usage quotas for the translation process (i.e., if the [=LanguageDetector/input quota=] is +∞). 
Otherwise, it must be positive and should be roughly proportional to the [=string/length=] of |inputToModel|.

This might be the number of tokens needed to represent |input| in a language model tokenization scheme, or it might be |input|'s [=string/length=]. It could also be some variation of these which also counts the usage of any prefixes or suffixes necessary to give to the model. diff --git a/security-privacy-questionnaire.md b/security-privacy-questionnaire.md index def2de3..6e99167 100644 --- a/security-privacy-questionnaire.md +++ b/security-privacy-questionnaire.md @@ -94,9 +94,9 @@ No. > (instead of getting destroyed) after navigation, and potentially gets reused > on future navigations back to the document? -Ideally, nothing special should happen. In particular, `AITranslator` and `AILanguageDetector` objects should still be usable without interruption after navigating back. We'll need to add web platform tests to confirm this, as it's easy to imagine implementation architectures in which keeping these objects alive while the `Document` is in the back/forward cache is difficult. +Ideally, nothing special should happen. In particular, `Translator` and `LanguageDetector` objects should still be usable without interruption after navigating back. We'll need to add web platform tests to confirm this, as it's easy to imagine implementation architectures in which keeping these objects alive while the `Document` is in the back/forward cache is difficult. -(For such implementations, failing to bfcache `Document`s with active `AITranslator` or `AILanguageDetector` objects would a simple way of being spec-compliant.) +(For such implementations, failing to bfcache `Document`s with active `Translator` or `LanguageDetector` objects would be a simple way of being spec-compliant.) > 18. What happens when a document that uses your feature gets disconnected?