6 releases (3 breaking)
0.4.0 | Sep 3, 2024 |
---|---|
0.3.0 | Apr 21, 2024 |
0.2.1 | Sep 13, 2023 |
0.2.0 | Jul 20, 2023 |
0.1.1 | Jul 13, 2023 |
#109 in HTTP client
59KB
912 lines
deeprl
Access the DeepL translation engine through a quick and reliable interface. We aim to provide the full suite of tools DeepL offers. See the DeepL API docs for detailed resources.
Note:
This crate uses a blocking http client, and as such is only suitable for use in synchronous (blocking) applications. If you intend to use the library functions in an async app, there is a crate for that.
Quickstart
Create a new client with a valid API token to access the associated methods. For instance, you may wish to translate a simple text string to some target language.
use deeprl::{DeepL, Language, TextOptions};
let key = std::env::var("DEEPL_API_KEY").unwrap();
let dl = DeepL::new(&key);
// Translate 'good morning' to German
let text = vec![
"good morning".to_string(),
];
let opt = TextOptions::new(Language::De).text(text);
let result = dl.translate(opt).unwrap();
assert!(!result.translations.is_empty());
let translation = &result.translations[0];
assert_eq!(translation.text, "Guten Morgen");
As a helpful sanity check, make sure you're able to return account usage statistics.
use deeprl::DeepL;
let dl = DeepL::new(
&std::env::var("DEEPL_API_KEY").unwrap()
);
let usage = dl.usage().unwrap();
assert!(usage.character_limit > 0);
let count = usage.character_count;
let limit = usage.character_limit;
println!("Used: {count}/{limit}");
// Used: 42/500000
Walkthrough
Contents:
Configuration
The library supports a number of configuration options, one of which is the ability to swap out the default client for a new instance of reqwest::blocking::Client
.
As before, we create a new instance of DeepL
, but this time we declare it mut
. Then we call client
on the object and pass in our custom client.
let mut dl = DeepL::new(
&std::env::var("DEEPL_API_KEY").unwrap()
);
let client = reqwest::blocking::Client::builder()
.timeout(std::time::Duration::from_secs(21))
.build()
.unwrap();
dl.client(client);
We support sending a custom user agent along with requests. So for instance if you're using this library in another application, say My App v1.2.3, you can set the app name and version using set_app_info
.
dl.set_app_info(
"my-app/1.2.3".to_string()
);
Errors
Errors are encapsulated in the Error
enum whose variants may be one of:
Client
: A generic client-side errorServer
: An error sent by the serverDeserialize
: An error occurred while deserializing the responseInvalidRequest
: Error sending an http requestInvalidResponse
: Error parsing the responseInvalidLanguage
: Error matching a user-supplied string to aLanguage
The library functions we'll look at below are all methods on the DeepL
client, and many return a Result
type (enum) that either resolves to a value of the type we expect, or one of the above Error
s. So for some type T
, the return type of a function returning a result is Result<T, Error>
. While in the examples we unwrap
the Result
to pull out a value, it's common to implement more robust error handling in production code.
Get languages
Getting available languages
requires specifying LanguageType
as either Source
or Target
and returns a Result
whose success value is a Vec<LanguageInfo>
.
All instances of LanguageInfo
contain language
and name
attributes. Target languages contain a third field, supports_formality
which is true
or false
.
let source_langs = dl.languages(LanguageType::Source).unwrap();
for lang in source_langs {
let code = lang.language;
let name = lang.name;
println!("{code} {name}");
// BG Bulgarian
}
Translate text options
Translating text allows setting a number of options through the TextOptions
builder, only one of which is required, namely the target_lang
for the translation.
The list of options for text translation is as follows:
target_lang
: The targetLanguage
(required)source_lang
: The sourceLanguage
split_sentences
: Decide how to split sentences from the input text. Can be one ofSplitSentences::None
Do not split sentences.SplitSentences::Default
Split on punctuation and newlines (default).SplitSentences::NoNewLines
Split on punctuation only.
preserve_formatting
: Whether the translator should preserve the original format of the text. (defaultfalse
)formality
: The desired formality in the target language. Not all target languages support formality. Options include:Formality::Default
Formality::More
Formality::Less
Formality::PreferMore
Formality::PreferLess
glossary_id
: The glossary idString
to use for translation
Tag handling
The following are translation options related to tag handling
tag_handling
: Enable handling tags in the input text. Can be one of:TagHandling::Xml
TagHandling::Html
outline_detection
: Whether the translator should automatically detect the outline (defaulttrue
)splitting_tags
: List of tags used to split sentences,Vec<String>
non_splitting_tags
: List of tags which do not split sentences,Vec<String>
ignore_tags
: List of tags not to translate,Vec<String>
Below is a more complex translation where we want to specify a source language, ignore newlines in the input, preserve formatting, and set a desired formality. We'll also use a custom glossary, ensuring the given glossary matches both the source and target language of this translation.
The function translate
expects two arguments: a TextOptions
object, and a Vec<String>
containing one or more texts to be translated. It returns a Result
whose Ok
value is a TranslateTextResult
with a single field, translations
that holds a Vec<Translation>
.
// Translate 'you are nice' to French
let text = vec![
"you are nice".to_string(),
];
let opt = TextOptions::new(Language::Fr) // note `new` expects the required target lang
.source_lang(Language::En)
.formality(Formality::PreferLess)
.text(text);
let result = dl.translate(opt).unwrap();
let translation = &result.translations[0];
println!("{}", translation.text);
// tu es gentille
A Translation
has two attributes: text
containing the translated text string, and detected_source_language
, a string containing the language code of the source language detected by the server.
Here's an example where the input contains xml and where we only want to translate content inside the <p>
tags.
let xml = r"
<xml>
<head>
<title>My English title</title>
</head>
<body>
<p>The red crab</p>
<p>Do you speak French?</p>
</body>
</xml>"
.to_string();
let text = vec![xml];
let split = vec!["p".to_string()]; // split on <p> tags
let ignore = vec!["title".to_string()]; // ignore <title> tags
let opt = TextOptions::new(Language::Fr)
.source_lang(Language::En)
.tag_handling(TagHandling::Xml)
.outline_detection(false)
.splitting_tags(split)
.ignore_tags(ignore)
.text(text);
let result = dl.translate(opt).unwrap();
let text = &result.translations[0].text;
assert!(text.contains("My English title"));
assert!(text.contains("Le crabe rouge"));
Translate documents
Translating a document consists of three steps: 1) uploading a document, 2) polling the status of a translation in progress, and 3) requesting download of the translated document.
First, we create an instance of DocumentOptions
which requires we know the target language as well as the file path to a document stored locally and in a supported format. The list of document options is as follows:
target_lang
: The targetLanguage
(required)file_path
: Path to the source file asPathBuf
(required)source_lang
: The sourceLanguage
filename
: Name of the file,String
formality
: Formality preference, can be one of:Formality::Default
Formality::More
Formality::Less
Formality::PreferMore
Formality::PreferLess
glossary_id
: The id of the glossary to use for translation,String
// Upload a file in the current directory called 'test.txt'
let target_lang = Language::De;
let file_path = std::path::PathBuf::from("test.txt");
let opt = DocumentOptions::new(target_lang, file_path);
let doc = dl.document_upload(opt).unwrap();
println!("Document Id: {}", doc.document_id);
println!("Document Key: {}", doc.document_key);
document_upload
expects an instance of DocumentOptions
and returns a Result
whose Ok
value is a Document
handle with two fields: document_id
and document_key
as strings.
Before we can download a finished document, we need to check the status of the translation process. We do so by calling document_status
on the client and passing in a reference to the Document
handle we received previously. The method returns a Result<DocumentStatus>
where DocumentStatus
is a struct with the following fields:
document_id
: The unique document idString
status
: An enum,DocState
in one of the following states:DocState::Queued
DocState::Translating
DocState::Done
DocState::Error
seconds_remaining
: Estimated time until translation is completeOption<u64>
billed_characters
: Number of characters billedOption<u64>
error_message
: Message from the server in case of errorOption<String>
When translation is complete, status
will be in a state of DocState::Done
, and calling is_done
on our DocumentStatus
object returns true. We may then proceed with download.
// Get the status of a document translation in progress
let status = dl.document_status(&doc).unwrap();
if status.is_done() {
// Download translation result
let out_file = PathBuf::from("test-translated.txt");
let _ = dl.document_download(doc, Some(out_file.clone())).unwrap();
let content = std::fs::read_to_string(out_file).unwrap();
assert(!content.is_empty());
}
document_download
takes as arguments the same Document
handle we received after uploading as well as an optional PathBuf
denoting the path to the file where the finished document will be saved. The function returns Result<PathBuf>
where PathBuf
is the path to the newly translated document.
If the user-supplied file path for the outgoing file is None
, a file will be created in the current directory whose name contains the unique document_id
.
Glossaries
DeepL supports creating custom glossaries for several language pairs allowing the user to specify an exact translation to use for a given word in the source text. To demonstrate, first we'll query the list of supported glossary language pairs.
The glossary_languages
method takes no arguments and returns a Result<GlossaryLanguagePairsResult>
whose Ok
value has a single field, supported_languages
holding a Vec<GlossaryLanguagePair>
.
A GlossaryLanguagePair
contains two fields: source_lang
and target_lang
as strings.
// Get supported glossary language pairs
let result = dl.glossary_languages().unwrap();
let lang_pairs = result.supported_languages;
assert!(!lang_pairs.is_empty());
for pair in lang_pairs {
println!("{} -> {}", pair.source_lang, pair.target_lang);
// EN -> IT
}
Now let's create a glossary with source language English and target language Italian. To do so, we'll create a file called my_glossary.csv to hold the glossary entries. The entries are formatted as a comma-separated list with two columns (source,target) with one entry per line. Thus, our csv file with two glossary entries looks like this:
my_glossary.csv
hello,ciao
goodbye,ciao
Back in rust, we'll read the contents of the file to a string called entries
and pass it to glossary_new
together with the following parameters (note, DeepL accepts glossary entries as tab-separated values as well):
name: String
source_lang: Language
target_lang: Language
entries: String
fmt
: The format of our entries, must be one of:GlossaryEntriesFormat::Csv
GlossaryEntriesFormat::Tsv
glossary_new
returns a Result<Glossary>
where Glossary
is a struct with the following fields:
glossary_id: String
ready: bool
name: String
source_lang: String
target_lang: String
creation_time: String
entry_count: u64
// Create a new glossary
let name = "my_glossary".to_string();
let src = Language::En;
let trg = Language::It;
let entries = std::fs::read_to_string("my_glossary.csv").unwrap();
let fmt = GlossaryEntriesFormat::Csv;
let glossary = dl.glossary_new(name, src, trg, entries, fmt).unwrap();
assert_eq!(glossary.entry_count, 2);
let glos_id = glossary.glossary_id; // remember this!
// List glossaries
let result = dl.glossaries().unwrap();
let glossaries = result.glossaries;
assert!(!glossaries.is_empty());
Listing available glossaries
returns a Result<GlossariesResult>
whose inner value has an attribute glossaries
that holds a Vec<Glossary>
.
We can get information from a glossary in different ways. Calling glossary_info
with a valid glossary_id
returns a Result<Glossary>
containing the attributes mentioned above. This method returns glossary metadata only.
To retrieve the actual entries, we use the glossary_entries
method with a valid glossary_id
. The function returns a Result<HashMap<String, String>>
where the Ok
value is a collection mapping a unique source word to its target translation.
// Get glossary info
// recall `glos_id` is the glossary id we obtained earlier
let glossary = dl.glossary_info(&glos_id).unwrap();
println!("{}", glossary.name);
// my_glossary
// Get entries from a glossary
let entries = dl.glossary_entries(&glos_id).unwrap();
for (key, value) in entries {
println!("{key} {value}");
/*
hello ciao
goodbye ciao
*/
}
// Remove an unwanted glossary
let result = dl.glossary_delete(&glos_id);
assert!(result.is_ok());
To remove a glossary, call the glossary_delete
method passing a reference to the glossary_id
. The function returns Result<()>
where the success value is an empty tuple.
Dependencies
~4–17MB
~224K SLoC