Localization Proposal
The main goal is to make it possible to present MathJax's user interface elements in languages other than English. This includes things like the MathJax menu, the About MathJax dialog, the loading messages, and the various error messages produced by the input jax. This document describes a proposal for the underlying code and data structures for implementing this in MathJax.
The code must be able to handle the following:
- expressions with substitution values (e.g., "file xxx not found")
- plural forms (e.g. "loaded xx file" versus "loaded xx files")
- multiple forms for a word (e.g., "Post" as a verb versus "Post" as a noun)
- HTML-snippets as defined in MathJax (since many dialogs are constructed from these)
- fallback to English when translations are not available
- translations for dynamically loaded components
- components that may not all come from the same location
- third-party translations
The mechanism for specifying the selected language has yet to be determined, but the page author should be able to give a default language, and users should be able to override that if they choose.
A new Localization object will be added to the MathJax variable to handle localization functions. This will include the data needed for the translations into the selected language, the methods to be called for obtaining those translations, and the methods needed for loading and registering translations.
Currently all messages used in MathJax are in English, and the text of these messages is usually hard-coded as literal strings at the locations where the messages are used. (Some messages are constructed on the fly from smaller pieces. These messages may need to be handled differently to allow for easier translation.) This is convenient since it is easy to see what message will be produced at any particular point, but in order to allow MathJax to be localized, these strings will need to be replaced by function calls that obtain the translation appropriate for the selected language.
One approach would be to use these message strings as the keys for looking up the translations, but this would make it harder to modify the English messages if rewording were required, or if spelling errors were found. Instead, each message will have an ID string that will be used to identify the phrase, so that the English can be changed without requiring all the translation files to be modified to reflect the change. This also has the advantage that the same word or phrase, when used in different ways, can have different identifiers, so "Post" as a verb and "Post" as a noun can be translated differently, if necessary.
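As a rough illustration of the difference, the sketch below contrasts a table keyed by the English text with one keyed by ID. The identifiers PostVerb and PostNoun and the German strings are made up for this example and are not part of the proposal.

```javascript
// Keyed by the English text: "Post" can hold only one translation,
// and fixing a typo in the English would change the key.
var byEnglish = {
  "Post": "Veröffentlichen"
};

// Keyed by ID (hypothetical ids): the verb and the noun get separate
// entries, and the English wording can change without touching the keys.
var byId = {
  PostVerb: "Veröffentlichen",
  PostNoun: "Beitrag"
};

console.log(byId.PostVerb, "/", byId.PostNoun);
```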
The basic means of obtaining the string to use for a message to display to the user is to call the _() method of the MathJax.Localization object, passing the string id and the English phrase. For example,
MathJax.Message.Set("Typesetting complete");
could be replaced by
MathJax.Message.Set(_("TC","Typesetting complete"));
where "TC" is the identifier for the message "Typesetting complete", provided you have defined
var _ = function () {return MathJax.Localization._.apply(MathJax.Localization,arguments)};
earlier. (Since most of MathJax is defined within a function closure, making such function shortcuts is straightforward.)
The advantage of having both the identifier and the English string together is that
- You still can see the actual English message at the location in the code where it is used.
- The English version is available to use as a fallback if the phrase has not been translated into the selected language.
- The English translation doesn't need to be loaded separately (i.e., you don't need to load two language files, the selected one, plus English for fallback, and English users won't need to download any language files at all).
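The fallback behavior described in the points above can be sketched as follows. This is only a minimal illustration, not the proposed implementation: the Localization object here is a stand-in for MathJax.Localization, and the locale and strings properties, their layout, and the French text are all assumptions made for the example.

```javascript
// Hypothetical stand-in for the proposed MathJax.Localization object.
var Localization = {
  locale: "fr",
  // Assumed layout: strings[locale][id] -> translated phrase.
  strings: {
    fr: { TC: "Composition terminée" }
  },
  // Return the translation for id, or the English phrase if none exists.
  _: function (id, english) {
    var table = this.strings[this.locale] || {};
    return Object.prototype.hasOwnProperty.call(table, id) ? table[id] : english;
  }
};

// Local shortcut; note that it must return the looked-up string.
var _ = function () {
  return Localization._.apply(Localization, arguments);
};

console.log(_("TC", "Typesetting complete"));   // prints "Composition terminée"
console.log(_("XX", "Some untranslated text")); // prints "Some untranslated text"
```

Because the English phrase travels with every call, no English data file is needed: when the selected language has no entry for an id, the second argument is returned unchanged.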
Using short identifiers can lead to collisions if not handled carefully. To help avoid this, we introduce identifier domains that are used to isolate collections of identifiers for one component of MathJax from those for another component. For example, each input jax could have its own domain, as could each extension. This means you only have to worry about collisions within your own domain, and so can more easily manage the uniqueness of the ids in use.
To use a domain with your id, pass _() an array consisting of the domain and the id in place of the id. For example, the TeX input jax could use
TEX.Error(_(["TeX","mb"],"Missing Close Brace"));
to get the message with id "mb" in the domain "TeX". Note that the local definition for _() within the TeX input jax could be
var _ = function (id) {return MathJax.Localization._.apply(MathJax.Localization,[ ["TeX",id] ].concat([].slice.call(arguments,1)))};
in which case the message above could become
TEX.Error(_("mb","Missing Close Brace"));
This lets you avoid having to repeat the domain within every call to _() in the input jax. (It would also be possible for TEX.Error() to call _() for you, but see below for information about obtaining the translation data.)
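One way the domain lookup might work is sketched below. As before, Localization is a hypothetical stand-in for MathJax.Localization. The strings layout, the default domain name "_", and the French text are assumptions for illustration only.

```javascript
// Hypothetical stand-in for the proposed MathJax.Localization object.
var Localization = {
  locale: "fr",
  // Assumed layout: strings[locale][domain][id] -> translated phrase.
  strings: {
    fr: {
      "_":   { TC: "Composition terminée" },
      "TeX": { mb: "Accolade fermante manquante" }
    }
  },
  // id is either a plain string (default domain) or a [domain, id] pair.
  _: function (id, english) {
    var domain = "_";                      // assumed name for the default domain
    if (Array.isArray(id)) { domain = id[0]; id = id[1]; }
    var byLocale = this.strings[this.locale] || {};
    var table = byLocale[domain] || {};
    return Object.prototype.hasOwnProperty.call(table, id) ? table[id] : english;
  }
};

// Local shortcut as it might appear inside the TeX input jax:
// it prepends the "TeX" domain and passes any other arguments along.
var _ = function (id) {
  return Localization._.apply(
    Localization,
    [["TeX", id]].concat([].slice.call(arguments, 1))
  );
};

console.log(_("mb", "Missing Close Brace"));   // prints "Accolade fermante manquante"
console.log(_("TC", "Typesetting complete"));  // prints "Typesetting complete" (no "TC" in the TeX domain)
```

Here the same short id can exist in several domains without conflict, since each lookup is confined to the table for its own domain.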
...