deeprl

Access the DeepL translation engine through a quick and reliable interface. We aim to provide the full suite of tools DeepL offers. See the DeepL API docs for detailed resources.

Note:

This crate uses a blocking HTTP client, so it is only suitable for synchronous (blocking) applications. If you intend to use the library functions in an async app, there is a crate for that.

Quickstart

Create a new client with a valid API token to access the associated methods. For instance, you may wish to translate a simple text string to some target language.

use deeprl::{DeepL, Language, TextOptions};

let key = std::env::var("DEEPL_API_KEY").unwrap();
let dl = DeepL::new(&key);

// Translate 'good morning' to German
let text = vec![
    "good morning".to_string(),
];

let opt = TextOptions::new(Language::De).text(text);

let result = dl.translate(opt).unwrap();
assert!(!result.translations.is_empty());

let translation = &result.translations[0];
assert_eq!(translation.text, "Guten Morgen");

As a helpful sanity check, make sure you're able to return account usage statistics.

use deeprl::DeepL;

let dl = DeepL::new(
    &std::env::var("DEEPL_API_KEY").unwrap()
);

let usage = dl.usage().unwrap();
assert!(usage.character_limit > 0);

let count = usage.character_count;
let limit = usage.character_limit;
println!("Used: {count}/{limit}");
// Used: 42/500000

Walkthrough

Configuration

The library supports a number of configuration options, one of which is the ability to swap out the default client for a new instance of reqwest::blocking::Client.

As before, we create a new instance of DeepL, but this time we declare it mut. Then we call client on the object and pass in our custom client.

let mut dl = DeepL::new(
    &std::env::var("DEEPL_API_KEY").unwrap()
);

let client = reqwest::blocking::Client::builder()
    .timeout(std::time::Duration::from_secs(21))
    .build()
    .unwrap();

dl.client(client);

We support sending a custom user agent along with requests. For instance, if you're using this library in another application, say My App v1.2.3, you can set the app name and version using set_app_info.

dl.set_app_info(
    "my-app/1.2.3".to_string()
);

Errors

Errors are encapsulated in the Error enum, which has the following variants:

  • Client: A generic client-side error
  • Server: An error sent by the server
  • Deserialize: An error occurred while deserializing the response
  • InvalidRequest: Error sending an HTTP request
  • InvalidResponse: Error parsing the response
  • InvalidLanguage: Error matching a user-supplied string to a Language

The library functions we'll look at below are all methods on the DeepL client, and many return a Result type (enum) that either resolves to a value of the type we expect, or one of the above Errors. So for some type T, the return type of a function returning a result is Result<T, Error>. While in the examples we unwrap the Result to pull out a value, it's common to implement more robust error handling in production code.
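
As a rough illustration of what more robust handling could look like, here is a minimal sketch that matches on a Result instead of unwrapping it. The Error enum below is a local stand-in, not the crate's own type: the variant names follow the list above, but the String payloads and the describe helper are assumptions made for this example.

```rust
// Hypothetical stand-in for the crate's `Error` enum. Variant names follow
// the list above; the `String` payloads are an assumption of this sketch.
#[allow(dead_code)]
#[derive(Debug)]
enum Error {
    Client(String),
    Server(String),
    Deserialize(String),
    InvalidRequest(String),
    InvalidResponse(String),
    InvalidLanguage(String),
}

/// Turn a result into a short message instead of panicking on failure.
fn describe(result: &Result<String, Error>) -> String {
    match result {
        Ok(text) => format!("translated: {text}"),
        // Handle the cases we care about explicitly...
        Err(Error::Server(msg)) => format!("server error: {msg}"),
        Err(Error::InvalidLanguage(code)) => format!("unknown language: {code}"),
        // ...and fall back to the Debug representation for the rest.
        Err(other) => format!("request failed: {other:?}"),
    }
}

fn main() {
    let ok: Result<String, Error> = Ok("Guten Morgen".to_string());
    let bad: Result<String, Error> = Err(Error::InvalidLanguage("xx".to_string()));

    println!("{}", describe(&ok));
    println!("{}", describe(&bad));
}
```

In application code the same match would sit around a call on the client, or the error could be passed up to the caller with the ? operator.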

Get languages

Getting available languages requires specifying LanguageType as either Source or Target and returns a Result whose success value is a Vec<LanguageInfo>.

All instances of LanguageInfo contain language and name attributes. Target languages contain a third field, supports_formality, which is true or false.

let source_langs = dl.languages(LanguageType::Source).unwrap();

for lang in source_langs {
    let code = lang.language;
    let name = lang.name;
    
    println!("{code} {name}");
    // BG Bulgarian
}
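
A common follow-up is to pick out the target languages that support formality. The sketch below shows one way to do that with a local stand-in for LanguageInfo. Modelling supports_formality as Option<bool> (absent for source languages) is an assumption, as are the formality_codes helper and the sample values.

```rust
// Hypothetical stand-in for `LanguageInfo`. Treating `supports_formality`
// as `Option<bool>` is an assumption of this sketch.
#[derive(Debug)]
struct LanguageInfo {
    language: String,
    name: String,
    supports_formality: Option<bool>,
}

/// Return the codes of the languages that report formality support.
fn formality_codes(langs: &[LanguageInfo]) -> Vec<&str> {
    langs
        .iter()
        .filter(|l| l.supports_formality == Some(true))
        .map(|l| l.language.as_str())
        .collect()
}

fn main() {
    // Illustrative sample data, not a real server response.
    let langs = vec![
        LanguageInfo {
            language: "DE".to_string(),
            name: "German".to_string(),
            supports_formality: Some(true),
        },
        LanguageInfo {
            language: "BG".to_string(),
            name: "Bulgarian".to_string(),
            supports_formality: Some(false),
        },
    ];

    for lang in &langs {
        println!("{} {}", lang.language, lang.name);
    }
    println!("{:?}", formality_codes(&langs));
}
```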

Translate text options

Translating text allows setting a number of options through the TextOptions builder, only one of which is required, namely the target_lang for the translation.

The list of options for text translation is as follows:

  • target_lang: The target Language (required)
  • source_lang: The source Language
  • split_sentences: Decide how to split sentences from the input text. Can be one of
    • SplitSentences::None Do not split sentences.
    • SplitSentences::Default Split on punctuation and newlines (default).
    • SplitSentences::NoNewLines Split on punctuation only.
  • preserve_formatting: Whether the translator should preserve the original format of the text. (default false)
  • formality: The desired formality in the target language. Not all target languages support formality. Options include:
    • Formality::Default
    • Formality::More
    • Formality::Less
    • Formality::PreferMore
    • Formality::PreferLess
  • glossary_id: The glossary id String to use for translation

Tag handling

The following are translation options related to tag handling:

  • tag_handling: Enable handling tags in the input text. Can be one of:
    • TagHandling::Xml
    • TagHandling::Html
  • outline_detection: Whether the translator should automatically detect the outline (default true)
  • splitting_tags: List of tags used to split sentences, Vec<String>
  • non_splitting_tags: List of tags which do not split sentences, Vec<String>
  • ignore_tags: List of tags not to translate, Vec<String>

Below is a more complex translation where we specify a source language and set a desired formality. The other options listed above, such as split_sentences, preserve_formatting, and glossary_id, are set through the same builder. If you use a custom glossary, make sure it matches both the source and target language of the translation.

The translate method expects a single argument: a TextOptions object whose text method has been given a Vec<String> containing one or more texts to be translated. It returns a Result whose Ok value is a TranslateTextResult with a single field, translations, that holds a Vec<Translation>.

// Translate 'you are nice' to French 
let text = vec![
    "you are nice".to_string(),
];
let opt = TextOptions::new(Language::Fr) // note `new` expects the required target lang
    .source_lang(Language::En)
    .formality(Formality::PreferLess)
    .text(text);

let result = dl.translate(opt).unwrap();

let translation = &result.translations[0];
println!("{}", translation.text);
// tu es gentille

A Translation has two attributes: text containing the translated text string, and detected_source_language, a string containing the language code of the source language detected by the server.

Here's an example where the input contains XML and where we only want to translate content inside the <p> tags.

let xml = r"
<xml>
    <head>
        <title>My English title</title>
    </head>
    <body>
        <p>The red crab</p>
        <p>Do you speak French?</p>
    </body>
</xml>"
    .to_string();

let text = vec![xml];
let split = vec!["p".to_string()]; // split on <p> tags
let ignore = vec!["title".to_string()]; // ignore <title> tags

let opt = TextOptions::new(Language::Fr)
    .source_lang(Language::En)
    .tag_handling(TagHandling::Xml)
    .outline_detection(false)
    .splitting_tags(split)
    .ignore_tags(ignore)
    .text(text);

let result = dl.translate(opt).unwrap();

let text = &result.translations[0].text;
assert!(text.contains("My English title"));
assert!(text.contains("Le crabe rouge"));

Translate documents

Translating a document consists of three steps: 1) uploading a document, 2) polling the status of a translation in progress, and 3) requesting download of the translated document.

First, we create an instance of DocumentOptions which requires we know the target language as well as the file path to a document stored locally and in a supported format. The list of document options is as follows:

  • target_lang: The target Language (required)
  • file_path: Path to the source file as PathBuf (required)
  • source_lang: The source Language
  • filename: Name of the file, String
  • formality: Formality preference, can be one of:
    • Formality::Default
    • Formality::More
    • Formality::Less
    • Formality::PreferMore
    • Formality::PreferLess
  • glossary_id: The id of the glossary to use for translation, String

// Upload a file in the current directory called 'test.txt'
let target_lang = Language::De;
let file_path = std::path::PathBuf::from("test.txt");
let opt = DocumentOptions::new(target_lang, file_path);

let doc = dl.document_upload(opt).unwrap();

println!("Document Id: {}", doc.document_id);
println!("Document Key: {}", doc.document_key);

document_upload expects an instance of DocumentOptions and returns a Result whose Ok value is a Document handle with two fields: document_id and document_key as strings.

Before we can download a finished document, we need to check the status of the translation process. We do so by calling document_status on the client and passing in a reference to the Document handle we received previously. The method returns a Result<DocumentStatus> where DocumentStatus is a struct with the following fields:

  • document_id: The unique document id String
  • status: An enum, DocState in one of the following states:
    • DocState::Queued
    • DocState::Translating
    • DocState::Done
    • DocState::Error
  • seconds_remaining: Estimated time until translation is complete, Option<u64>
  • billed_characters: Number of characters billed, Option<u64>
  • error_message: Message from the server in case of error, Option<String>

When translation is complete, status will be in a state of DocState::Done, and calling is_done on our DocumentStatus object returns true. We may then proceed with download.

// Get the status of a document translation in progress
let status = dl.document_status(&doc).unwrap();

if status.is_done() {
    // Download translation result
    let out_file = std::path::PathBuf::from("test-translated.txt");
    let _ = dl.document_download(doc, Some(out_file.clone())).unwrap();
    let content = std::fs::read_to_string(out_file).unwrap();
    assert!(!content.is_empty());
}

document_download takes as arguments the same Document handle we received after uploading as well as an optional PathBuf denoting the path to the file where the finished document will be saved. The function returns Result<PathBuf> where PathBuf is the path to the newly translated document.

If the user-supplied file path for the outgoing file is None, a file will be created in the current directory whose name contains the unique document_id.
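
The example above checks the status only once. In practice a translation may still be queued or in progress, so we would usually poll until it finishes. Below is a minimal sketch of such a loop. DocState here is a local stand-in for the crate's enum, and poll_until_finished is a hypothetical helper, not part of the library. In real use the closure would presumably wrap a call to document_status and return the status field.

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical stand-in for the crate's `DocState` enum.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum DocState {
    Queued,
    Translating,
    Done,
    Error,
}

/// Call `check` until it reports `Done` or `Error`, waiting `delay` between
/// attempts. Returns `None` if `max_attempts` is used up first.
fn poll_until_finished<F>(mut check: F, max_attempts: u32, delay: Duration) -> Option<DocState>
where
    F: FnMut() -> DocState,
{
    for attempt in 0..max_attempts {
        let state = check();
        if matches!(state, DocState::Done | DocState::Error) {
            return Some(state);
        }
        // No need to wait after the final attempt.
        if attempt + 1 < max_attempts {
            sleep(delay);
        }
    }
    None
}

fn main() {
    // Simulated status sequence; a real caller would query the server here.
    let mut states = vec![DocState::Queued, DocState::Translating, DocState::Done].into_iter();

    let result = poll_until_finished(
        || states.next().unwrap_or(DocState::Error),
        5,
        Duration::from_millis(10),
    );
    println!("{result:?}");
    // Some(Done)
}
```

Capping the number of attempts keeps the loop from running forever if the translation stalls. A longer delay, or one based on seconds_remaining, may be a better fit for large documents.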

Glossaries

DeepL supports creating custom glossaries for several language pairs allowing the user to specify an exact translation to use for a given word in the source text. To demonstrate, first we'll query the list of supported glossary language pairs.

The glossary_languages method takes no arguments and returns a Result<GlossaryLanguagePairsResult> whose Ok value has a single field, supported_languages holding a Vec<GlossaryLanguagePair>.

A GlossaryLanguagePair contains two fields: source_lang and target_lang as strings.

// Get supported glossary language pairs
let result = dl.glossary_languages().unwrap();

let lang_pairs = result.supported_languages;
assert!(!lang_pairs.is_empty());

for pair in lang_pairs {
    println!("{} -> {}", pair.source_lang, pair.target_lang);
    // EN -> IT
}

Now let's create a glossary with source language English and target language Italian. To do so, we'll create a file called my_glossary.csv to hold the glossary entries. The entries are formatted as a comma-separated list with two columns (source,target) with one entry per line. Thus, our CSV file with two glossary entries looks like this:

my_glossary.csv

hello,ciao
goodbye,ciao

Back in Rust, we'll read the contents of the file to a string called entries and pass it to glossary_new together with the following parameters (note, DeepL accepts glossary entries as tab-separated values as well):

  • name: String
  • source_lang: Language
  • target_lang: Language
  • entries: String
  • fmt: The format of our entries, must be one of:
    • GlossaryEntriesFormat::Csv
    • GlossaryEntriesFormat::Tsv
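
If the entries live in code rather than in a file, the entries string can also be assembled directly. The following is a minimal sketch: entries_to_csv is a hypothetical helper, not part of the library, and it simply rejects terms that are empty or contain commas or newlines rather than attempting any CSV escaping.

```rust
/// Join (source, target) pairs into two-column CSV text, one entry per line.
/// Terms that are empty or contain a comma or newline are rejected, because
/// this sketch does no quoting or escaping.
fn entries_to_csv(pairs: &[(&str, &str)]) -> Result<String, String> {
    let mut lines = Vec::with_capacity(pairs.len());
    for (source, target) in pairs {
        for term in [source, target] {
            if term.is_empty() || term.contains(',') || term.contains('\n') {
                return Err(format!("unsupported term: {term:?}"));
            }
        }
        lines.push(format!("{source},{target}"));
    }
    Ok(lines.join("\n"))
}

fn main() {
    let entries = entries_to_csv(&[("hello", "ciao"), ("goodbye", "ciao")]).unwrap();
    println!("{entries}");
    // hello,ciao
    // goodbye,ciao
}
```

The resulting string has the same shape as the contents of my_glossary.csv, so it could be passed as entries with the Csv format.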

glossary_new returns a Result<Glossary> where Glossary is a struct with the following fields:

  • glossary_id: String
  • ready: bool
  • name: String
  • source_lang: String
  • target_lang: String
  • creation_time: String
  • entry_count: u64

// Create a new glossary
let name = "my_glossary".to_string();
let src = Language::En;
let trg = Language::It;
let entries = std::fs::read_to_string("my_glossary.csv").unwrap();
let fmt = GlossaryEntriesFormat::Csv;

let glossary = dl.glossary_new(name, src, trg, entries, fmt).unwrap();
assert_eq!(glossary.entry_count, 2);

let glos_id = glossary.glossary_id; // remember this!

// List glossaries
let result = dl.glossaries().unwrap();
let glossaries = result.glossaries;
assert!(!glossaries.is_empty());

Listing available glossaries returns a Result<GlossariesResult> whose inner value has an attribute glossaries that holds a Vec<Glossary>.

We can get information from a glossary in different ways. Calling glossary_info with a valid glossary_id returns a Result<Glossary> containing the attributes mentioned above. This method returns glossary metadata only.

To retrieve the actual entries, we use the glossary_entries method with a valid glossary_id. The function returns a Result<HashMap<String, String>> where the Ok value is a collection mapping a unique source word to its target translation.

// Get glossary info
// recall `glos_id` is the glossary id we obtained earlier
let glossary = dl.glossary_info(&glos_id).unwrap();

println!("{}", glossary.name);
// my_glossary

// Get entries from a glossary
let entries = dl.glossary_entries(&glos_id).unwrap();

for (key, value) in entries {
    println!("{key} {value}");
    
    /*
    hello ciao
    goodbye ciao
    */
}

// Remove an unwanted glossary
let result = dl.glossary_delete(&glos_id);
assert!(result.is_ok());

To remove a glossary, call the glossary_delete method passing a reference to the glossary_id. The function returns Result<()> where the success value is an empty tuple.
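
One note on iterating over glossary entries: a HashMap does not guarantee any particular iteration order, so the entries may print in a different order from run to run. If stable output matters, we can sort the pairs first. The sketch below uses a plain HashMap built by hand, and sorted_entries is a hypothetical helper, not part of the library.

```rust
use std::collections::HashMap;

/// Collect entries into a Vec sorted by source term (then target), so the
/// output order is the same on every run.
fn sorted_entries(entries: &HashMap<String, String>) -> Vec<(&str, &str)> {
    let mut pairs: Vec<(&str, &str)> = entries
        .iter()
        .map(|(source, target)| (source.as_str(), target.as_str()))
        .collect();
    pairs.sort();
    pairs
}

fn main() {
    // Built by hand here; in practice this map would come from the client.
    let mut entries = HashMap::new();
    entries.insert("hello".to_string(), "ciao".to_string());
    entries.insert("goodbye".to_string(), "ciao".to_string());

    for (source, target) in sorted_entries(&entries) {
        println!("{source} {target}");
    }
    // goodbye ciao
    // hello ciao
}
```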