21 releases (10 stable)
Version | Release date |
---|---|
2.4.0 | Apr 4, 2025 |
2.3.0 | Mar 30, 2025 |
1.0.1 | Mar 7, 2025 |
0.3.2 | Sep 3, 2024 |
0.1.8 | Jul 22, 2024 |
RustAutoGUI
RustAutoGUI is a Rust crate modeled after Al Sweigart's PyAutoGUI library for Python.
RustAutoGUI lets you control the mouse and keyboard to automate interactions with other applications. The crate works on Windows, Linux and macOS.
Main functions:
- capture screen
- find image on screen
- move mouse to pixel coordinate
- click mouse buttons
- input keyboard string
- input keyboard command
- keyboard multiple key press
- find image on screen and move mouse to it
- detect cursor position
- and more
Achievable speed
Unlike PyAutoGUI, this library does not use OpenCV for template matching. Instead, it uses a custom multithreaded implementation without GPU acceleration. While OpenCV's template matching is highly optimized, RustAutoGUI aims for faster overall performance by optimizing the whole pipeline of screen capture, processing, and image matching. In tests so far, it appears to be roughly 5x faster than its Python counterpart. Speed also varies between operating systems; Windows outperforms Linux, for instance.
Gif presentation (intentionally captured with phone camera):
Why not OpenCV?
OpenCV requires complex dependencies and a lengthy setup process in Rust. To keep installation simple and spare users from spending hours on dependency setup, RustAutoGUI ships a fully custom template matching algorithm that minimizes computation while achieving high accuracy.
Segmented template matching algorithm
Since version 1.0.0, the RustAutoGUI crate includes a second template matching algorithm based on Segmented Normalized Cross-Correlation. More information: https://arxiv.org/pdf/2502.01286
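For readers new to template matching, the quantity both match modes approximate is normalized cross-correlation (NCC). Below is a minimal, unoptimized sketch of plain zero-mean NCC with an exhaustive search. It is not the crate's Segmented or FFT implementation; the `Gray` type and the function names are hypothetical and exist only for illustration.

```rust
/// Grayscale image stored row-major as f32 intensities (illustrative type).
pub struct Gray {
    pub width: usize,
    pub height: usize,
    pub data: Vec<f32>,
}

impl Gray {
    fn at(&self, x: usize, y: usize) -> f32 {
        self.data[y * self.width + x]
    }
}

/// Zero-mean normalized cross-correlation of `tpl` against the window of
/// `img` whose top-left corner is (ox, oy). Returns a value in [-1, 1],
/// or 0.0 when the window or the template has no variance.
pub fn ncc_at(img: &Gray, tpl: &Gray, ox: usize, oy: usize) -> f32 {
    let n = (tpl.width * tpl.height) as f32;
    let (mut sum_i, mut sum_t) = (0.0f32, 0.0f32);
    for y in 0..tpl.height {
        for x in 0..tpl.width {
            sum_i += img.at(ox + x, oy + y);
            sum_t += tpl.at(x, y);
        }
    }
    let (mean_i, mean_t) = (sum_i / n, sum_t / n);

    let (mut num, mut den_i, mut den_t) = (0.0f32, 0.0f32, 0.0f32);
    for y in 0..tpl.height {
        for x in 0..tpl.width {
            let di = img.at(ox + x, oy + y) - mean_i;
            let dt = tpl.at(x, y) - mean_t;
            num += di * dt;
            den_i += di * di;
            den_t += dt * dt;
        }
    }
    let den = (den_i * den_t).sqrt();
    if den == 0.0 { 0.0 } else { num / den }
}

/// Exhaustive search: the (x, y, correlation) of the best-scoring offset,
/// or None when the template is empty or larger than the image.
pub fn find_best_offset(img: &Gray, tpl: &Gray) -> Option<(usize, usize, f32)> {
    if tpl.width == 0 || tpl.height == 0 || tpl.width > img.width || tpl.height > img.height {
        return None;
    }
    let mut best: Option<(usize, usize, f32)> = None;
    for oy in 0..=(img.height - tpl.height) {
        for ox in 0..=(img.width - tpl.width) {
            let c = ncc_at(img, tpl, ox, oy);
            if best.map_or(true, |(_, _, b)| c > b) {
                best = Some((ox, oy, c));
            }
        }
    }
    best
}
```

The cost of this naive form grows with image area times template area, which is why precalculation and smaller search regions matter so much in practice.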
Installation
Either run
cargo add rustautogui
or add the crate in your Cargo.toml:
rustautogui = "2.4.0"
For Linux additionally run:
sudo apt-get update
sudo apt-get install libx11-dev libxtst-dev
For macOS: grant necessary permissions in your settings.
Usage:
Since version 2.2.0, RustAutoGUI supports loading multiple images in memory and searching them. It also allows loading images from memory instead of only from disk.
Loading an image performs the precalculations required for template matching up front, so each search runs faster and needs fewer computations.
Import and Initialize RustAutoGui
use rustautogui;
let mut rustautogui = rustautogui::RustAutoGui::new(false).unwrap(); // arg: debug
Loading single image into memory
From file. This is equivalent to load_and_prepare_template, which will be deprecated
rustautogui.prepare_template_from_file( // returns Result<(), String>
"template.png", // template_path: &str path to the image file on disk
Some((0,0,1000,1000)), // region: Option<(u32, u32, u32, u32)> region of monitor to search (x, y, width, height)
rustautogui::MatchMode::Segmented, // match_mode: rustautogui::MatchMode search mode (Segmented or FFT)
).unwrap();
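Smaller search regions mean less work per search, so it can help to derive the region tuple from a point of interest instead of hard-coding it. A minimal sketch, assuming the screen dimensions are already known (for example from get_screen_size()); `region_around` is a hypothetical helper and not part of the crate:

```rust
/// Builds an (x, y, width, height) region of size `w` x `h` centered on
/// (cx, cy). The size is capped at the screen size and the origin is
/// shifted so the whole region stays on screen.
pub fn region_around(
    cx: u32,
    cy: u32,
    w: u32,
    h: u32,
    screen_w: u32,
    screen_h: u32,
) -> (u32, u32, u32, u32) {
    let w = w.min(screen_w);
    let h = h.min(screen_h);
    // saturating_sub avoids underflow near the left/top edges;
    // min(...) pulls the origin back near the right/bottom edges.
    let x = cx.saturating_sub(w / 2).min(screen_w - w);
    let y = cy.saturating_sub(h / 2).min(screen_h - h);
    (x, y, w, h)
}
```

The returned tuple has the same (x, y, width, height) layout as the region argument shown above, so it can be wrapped in Some(...) and passed directly.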
From ImageBuffer<RGB/RGBA/Luma>
rustautogui.prepare_template_from_imagebuffer( // returns Result<(), String>
img_buffer, // image: ImageBuffer<P, Vec<T>> -- Accepts RGB/RGBA/Luma(black and white)
None, // region: Option<(u32, u32, u32, u32)> searches whole screen when None
rustautogui::MatchMode::FFT, // match_mode: rustautogui::MatchMode search mode (Segmented or FFT)
).unwrap();
From raw bytes of encoded image
rustautogui.prepare_template_from_raw_encoded( // returns Result<(), String>
img_raw, // img_raw: &[u8] - encoded raw bytes
None, // region: Option<(u32, u32, u32, u32)>
rustautogui::MatchMode::FFT, // match_mode: rustautogui::MatchMode search mode (Segmented or FFT)
).unwrap();
Segmented vs FFT matching
There is no universally correct answer for when to use which algorithm. FFT is mostly consistent, with no large variance in speed. Segmented, on the other hand, varies heavily: it can be up to 10x faster than FFT, but also slower by a factor of up to thousands. The best approach is to test both modes on your own templates and decide per case. As general advice, use Segmented for smaller templates and for templates that are less visually complex. Visual complexity here means the randomness of pixels in an image: a half-white, half-black image is simple, while random noise is complex. FFT is probably the better choice when matching large templates over a large region, and also when the template size approaches the size of the search region.
Generally, if you maximize speed by using the smallest possible template images and the smallest possible screen regions, Segmented will outperform FFT in most cases.
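Since the practical advice is to measure both modes, a small timing harness is often enough. The sketch below uses only the standard library; `time_mean` and `faster` are hypothetical helpers, and the closure passed in would wrap whatever search call you are comparing.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `runs` times and returns the last result together with the
/// mean wall-clock time per run.
pub fn time_mean<T>(runs: u32, mut f: impl FnMut() -> T) -> (T, Duration) {
    assert!(runs > 0, "need at least one run");
    let start = Instant::now();
    let mut last = f();
    for _ in 1..runs {
        last = f();
    }
    (last, start.elapsed() / runs)
}

/// Label of the faster of two timings (ties go to the first).
pub fn faster<'a>(a: (&'a str, Duration), b: (&'a str, Duration)) -> &'a str {
    if b.1 < a.1 { b.0 } else { a.0 }
}
```

A typical use would be to prepare the template once per match mode, time a handful of searches for each with `time_mean`, and keep the mode that `faster` reports. Averaging over several runs matters here because Segmented timings can vary a lot between screens.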
Matchmodes enum:
pub enum MatchMode {
Segmented,
FFT,
}
Loading multiple images into memory
These functions work the same as the single-image loaders, with an additional alias parameter for the image.
Load from file
rustautogui.store_template_from_file( // returns Result<(), String>
"template.png", // template_path: &str path to the image file on disk
Some((0,0,1000,1000)), // region: Option<(u32, u32, u32, u32)> region of monitor to search (x, y, width, height)
rustautogui::MatchMode::Segmented, // match_mode: rustautogui::MatchMode search mode (Segmented or FFT)
"button_image" // alias: &str. Keyword used to select which image to search for
).unwrap();
Load from Imagebuffer
rustautogui.store_template_from_imagebuffer( // returns Result<(), String>
img_buffer, // image: ImageBuffer<P, Vec<T>> -- Accepts RGB/RGBA/Luma(black and white)
None, // region: Option<(u32, u32, u32, u32)>
rustautogui::MatchMode::Segmented, // match_mode: rustautogui::MatchMode search mode (Segmented or FFT)
"button_image" // alias: &str. Keyword used to select which image to search for
).unwrap();
Load from encoded raw bytes
rustautogui.store_template_from_raw_encoded( // returns Result<(), String>
img_raw, // img_raw: &[u8] encoded raw bytes
None, // region: Option<(u32, u32, u32, u32)>
rustautogui::MatchMode::Segmented, // match_mode: rustautogui::MatchMode search mode (Segmented or FFT)
"button_image" // alias: &str. Keyword used to select which image to search for
).unwrap();
Single loaded template search
Find image and get pixel coordinates
let found_locations: Option<Vec<(u32, u32, f64)>> = rustautogui.find_image_on_screen(0.9).unwrap(); // arg: precision
// returns pixel locations for prepared template that have correlation higher than precision, ordered from highest correlation to lowest
// Must have prepared template before
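The search result is an Option wrapping a list of (x, y, correlation) tuples, so callers usually want to unpack it defensively. A minimal sketch of picking the strongest match; `Match` and `top_match` are hypothetical names, and the helper deliberately does not rely on the list already being sorted:

```rust
/// One match as described above: (x, y, correlation).
pub type Match = (u32, u32, f64);

/// Returns the match with the highest correlation, or None when the
/// search found nothing (either None or an empty list).
pub fn top_match(found: Option<Vec<Match>>) -> Option<Match> {
    found?
        .into_iter()
        .max_by(|a, b| a.2.total_cmp(&b.2))
}
```

With this in place, `if let Some((x, y, corr)) = top_match(found_locations) { ... }` handles the "not found" and "found" cases in one place.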
Find image, get pixel coordinates and move mouse to location
let found_locations: Option<Vec<(u32, u32, f64)>> = rustautogui.find_image_on_screen_and_move_mouse(0.9, 1.0).unwrap();
// args: precision , moving_time
// executes find_image_on_screen() and moves mouse to the center of the highest correlation location
IMPORTANT: Multi-monitor behavior differs between Linux and Windows/macOS. On Windows and macOS, the template search can run only on the main monitor. On Linux, searches can cover all monitors if several are used, with (0,0) starting at the top-left monitor.
Loop search with timeout. Searches until the image is found or the timeout (in seconds) is hit.
Warning: a timeout of 0 starts an infinite loop
rustautogui
.loop_find_image_on_screen(0.95, 15) // args: precision, timeout
.unwrap();
rustautogui
.loop_find_image_on_screen_and_move_mouse(0.95, 1.0, 15) // args: precision, moving_time and timeout
.unwrap();
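The loop semantics described above (retry until found or until the timeout expires, with 0 meaning no limit) can be sketched as a generic polling helper. This is an illustration of the pattern only, not the crate's internal loop; `poll_until` and its `pause` parameter are assumptions made for the example.

```rust
use std::time::{Duration, Instant};

/// Calls `attempt` until it yields Some, or until `timeout_secs` have
/// elapsed. A timeout of 0 means "never give up", mirroring the warning
/// above. `pause` is the delay between attempts.
pub fn poll_until<T>(
    timeout_secs: u64,
    pause: Duration,
    mut attempt: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = if timeout_secs == 0 {
        None // no deadline: loop until success
    } else {
        Some(Instant::now() + Duration::from_secs(timeout_secs))
    };
    loop {
        if let Some(value) = attempt() {
            return Some(value);
        }
        if let Some(d) = deadline {
            if Instant::now() >= d {
                return None;
            }
        }
        std::thread::sleep(pause);
    }
}
```

Writing the loop yourself like this is mainly useful when you need extra behavior between attempts, such as logging or checking a cancellation flag, which a built-in loop cannot offer.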
Multiple stored templates search
Again, the functions are the same, with an added alias argument
rustautogui
.find_stored_image_on_screen(0.9, "test2") // precision, alias
.unwrap();
With mouse movement to location
rustautogui
.find_stored_image_on_screen_and_move_mouse(0.9, 1.0, "test2") // precision, moving_time, alias (&str)
.unwrap();
Loop search
Warning: timeout of 0 initiates infinite loop
rustautogui
.loop_find_stored_image_on_screen(0.95, 15, "stars") // precision, timeout, alias
.unwrap();
rustautogui
.loop_find_stored_image_on_screen_and_move_mouse(0.95, 1.0, 15, "stars") // precision, moving_time, timeout, alias
.unwrap();
MacOS retina display issues:
The macOS Retina display works by digitally doubling the number of displayed pixels. If the screen size registered by the OS is, for instance, 1400x800, the Retina display doubles it to 2800x1600. A screen grab provided by the user is therefore saved with double the pixel count, and it fails to match because the screen image provided by the OS API is not doubled. The crate also cannot know whether a template comes from a screen grab or from some other source. For that reason, every template is stored both in its original size and resized by half. The search first tries the resized template and, if that fails, tries the original. As a result, users on macOS will see slower search times than users on other operating systems.
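To make the "resized by half" step concrete, here is a minimal sketch of halving a grayscale buffer by averaging each 2x2 block. The resampling method the crate actually uses is not stated here, so treat the box filter and the `halve` helper as assumptions for illustration.

```rust
/// Halves a row-major grayscale image by averaging each 2x2 block.
/// Odd trailing rows and columns are dropped.
pub fn halve(width: usize, height: usize, pixels: &[u8]) -> (usize, usize, Vec<u8>) {
    assert_eq!(pixels.len(), width * height, "buffer size must match dimensions");
    let (w2, h2) = (width / 2, height / 2);
    let mut out = Vec::with_capacity(w2 * h2);
    for y in 0..h2 {
        for x in 0..w2 {
            // Index of the top-left pixel of the 2x2 source block.
            let i = 2 * y * width + 2 * x;
            let sum = pixels[i] as u16
                + pixels[i + 1] as u16
                + pixels[i + width] as u16
                + pixels[i + width + 1] as u16;
            out.push((sum / 4) as u8);
        }
    }
    (w2, h2, out)
}
```

A 2800x1600 Retina screen grab reduced this way ends up at 1400x800, the size the OS reports in the example above.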
General functions
Debug mode prints the number of segments in the segmented picture and the time taken by each algorithm run, and it saves the segmented images. It also creates a debug folder in the code root, where those images are saved.
Warnings give useful information and shouldn't pop up frequently
rustautogui.get_screen_size(); // returns (x, y) size of display
rustautogui.change_debug_state(true); // change debugging
rustautogui.set_suppress_warnings(true); // turn off warnings
rustautogui.save_screenshot("test.png").unwrap(); //saves screen screenshot
Mouse functions
MouseClick enum used in some functions
pub enum MouseClick {
LEFT,
RIGHT,
MIDDLE,
}
Get current mouse position
rustautogui.get_mouse_position().unwrap(); // returns (x,y) coordinate of mouse
Mouse click functions. mouse_down and mouse_up work only on Windows / Linux.
rustautogui.click(MouseClick::LEFT).unwrap(); // args: button, choose click button MouseClick::{LEFT, RIGHT, MIDDLE}
rustautogui.left_click().unwrap(); // left mouse click
rustautogui.right_click().unwrap(); // right mouse click
rustautogui.double_click().unwrap(); // double left click
rustautogui.middle_click().unwrap(); // middle mouse click
// mouse up and mouse down work only on Windows and Linux
rustautogui.mouse_down(MouseClick::RIGHT).unwrap(); // args: button, click button down, MouseClick::{LEFT, RIGHT, MIDDLE}
rustautogui.mouse_up(MouseClick::RIGHT).unwrap(); // args: button, click button up MouseClick::{LEFT, RIGHT, MIDDLE}
Mouse scrolls functions
rustautogui.scroll_up().unwrap();
rustautogui.scroll_down().unwrap();
rustautogui.scroll_left().unwrap();
rustautogui.scroll_right().unwrap();
Mouse movements functions
rustautogui.move_mouse_to_pos(1920, 1080, 1.0).unwrap(); // args: x, y, moving_time. Moves mouse to position for certain time
rustautogui.move_mouse_to(Some(500), None, 1.0).unwrap(); // args: x, y, moving_time. Moves mouse to position, but accepts Option
// None Value keeps same position
rustautogui.move_mouse(-50, 120, 1.0).unwrap(); // args: x, y, moving_time. Moves mouse relative to its current position.
// -x left, +x right, -y up, +y down. 0 maintain position
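The difference between the absolute, Option-based and relative variants comes down to how the target coordinate is computed. A minimal sketch of that arithmetic; both helpers are hypothetical, and whether the crate clamps out-of-range targets or returns an error is not stated here, so the clamping below is an assumption:

```rust
/// Resolves Option-style arguments: None keeps the current coordinate.
pub fn resolve_target(current: (i32, i32), x: Option<i32>, y: Option<i32>) -> (i32, i32) {
    (x.unwrap_or(current.0), y.unwrap_or(current.1))
}

/// Applies a relative move (dx, dy) to the current position and clamps
/// the result to the screen bounds (0..width-1, 0..height-1).
pub fn relative_target(current: (i32, i32), delta: (i32, i32), screen: (i32, i32)) -> (i32, i32) {
    let max_x = (screen.0 - 1).max(0);
    let max_y = (screen.1 - 1).max(0);
    let x = (current.0 + delta.0).clamp(0, max_x);
    let y = (current.1 + delta.1).clamp(0, max_y);
    (x, y)
}
```

For example, starting at (100, 100), a relative move of (-50, 120) targets (50, 220): negative x moves left and positive y moves down, as described above.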
Mouse drag functions.
For all mouse drag commands, use a moving time above 0.2, or even higher depending on distance. This is especially important on macOS.
In version 2.4.0, drag_mouse() was renamed to drag_mouse_to_pos(). The new drag_mouse() drags relative to the current position.
A drag action is: left button down, move mouse to position, left button up, as when moving icons.
rustautogui.drag_mouse_to_pos(150, 980, 2.0).unwrap(); // args: x, y, moving_time.
rustautogui.drag_mouse_to(Some(200), Some(400), 1.2).unwrap(); // args: x, y, moving_time. Accepts option. None value keeps current pos.
rustautogui.drag_mouse(500, -500, 1.0).unwrap(); // args: x, y, moving_time. Drags mouse relative to its current position.
// Same rules as in move_mouse
Below is a helper function that reports on-screen coordinates, which is useful during development when determining a region or a mouse move target.
- Before 0.3.0 this function opened a window; now it just prints. This was changed to reduce dependencies.
use rustautogui::print_mouse_position;
fn main() {
print_mouse_position().unwrap();
}
Keyboard functions
Currently, only the US keyboard layout is implemented. If a different layout is active, many characters will not work correctly
rustautogui.keyboard_input("test!@#24").unwrap(); // input string, or better say, do the sequence of key presses
rustautogui.keyboard_command("backspace").unwrap(); // press a keyboard button
rustautogui.keyboard_multi_key("shift", "control", Some("t")).unwrap(); // Presses multiple keys at the same time. Third argument is optional
rustautogui.key_down("backspace").unwrap(); // press a keyboard button down only
rustautogui.key_up("backspace").unwrap(); // release a keyboard button only
For all the keyboard commands check Keyboard_commands.md, a table of possible keyboard inputs/commands for each OS. If you find some keyboard commands missing that you need, please open an issue in order to get it added in next versions.
Warnings options:
RustAutoGUI may display some warnings. To turn them off, either set the environment variable:
Windows powershell:
$env:RUSTAUTOGUI_SUPPRESS_WARNINGS="1" #to turn off warnings
$env:RUSTAUTOGUI_SUPPRESS_WARNINGS="0" #to activate warnings
Windows CMD:
REM turn off warnings
set RUSTAUTOGUI_SUPPRESS_WARNINGS=1
REM activate warnings
set RUSTAUTOGUI_SUPPRESS_WARNINGS=0
Linux/MacOS:
export RUSTAUTOGUI_SUPPRESS_WARNINGS=1 #to turn off warnings
export RUSTAUTOGUI_SUPPRESS_WARNINGS=0 #to activate warnings
or in code:
let mut rustautogui = RustAutoGui::new(false).unwrap();
rustautogui.set_suppress_warnings(true);
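If your own application wants to follow the same switch, for example to silence its own diagnostics together with the crate's, the variable can be read with the standard library. This is a sketch of one possible interpretation ("1" suppresses, anything else does not); how the crate itself parses the value is an assumption here, and both function names are hypothetical.

```rust
/// Interprets a raw value of RUSTAUTOGUI_SUPPRESS_WARNINGS:
/// "1" (ignoring surrounding whitespace) suppresses warnings;
/// any other value, or an unset variable, does not.
pub fn suppress_from_value(value: Option<&str>) -> bool {
    matches!(value.map(str::trim), Some("1"))
}

/// Reads the variable from the current process environment.
pub fn suppress_from_env() -> bool {
    let raw = std::env::var("RUSTAUTOGUI_SUPPRESS_WARNINGS").ok();
    suppress_from_value(raw.as_deref())
}
```

Keeping the parsing in a pure function separate from the environment lookup makes the rule easy to unit test without touching process state.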
How the crate works:
- On Windows, RustAutoGUI interacts with winapi
- On Linux, it uses x11; Wayland is not supported
- On macOS, it uses the core-graphics crate
Major changes:
For more details, check CHANGELOG.md
- 1.0.0 - introduces segmented match mode
- 2.0.0 - removed most panics and crashes
- 2.1.0 - keyboard fixes; some method arguments and return types changed, which will break existing code.
- 2.2.0 - loading multiple images, loading images from memory
- 2.3.0 - rework and improvement on Segmented match mode
- 2.4.0 - many additional functions for mouse and keyboard
Additional notes
Data stored in prepared template data
pub enum PreparedData {
Segmented(
(
Vec<(u32, u32, u32, u32, f32)>, // template_segments_fast
Vec<(u32, u32, u32, u32, f32)>, // template_segments_slow
u32, // template_width
u32, // template_height
f32, // segment_sum_squared_deviations_fast
f32, // segment_sum_squared_deviations_slow
f32, // expected_corr_fast
f32, // expected_corr_slow
f32, // segments_mean_fast
f32, // segments_mean_slow
),
),
FFT(
(
Vec<Complex<f32>>, // template_conj_freq
f32, // template_sum_squared_deviations
u32, // template_width
u32, // template_height
u32, // padded_size
),
),
None,
}