2 releases

0.1.1 Aug 14, 2024
0.1.0 Aug 14, 2024

#531 in Asynchronous

GPL-3.0-or-later

35KB
312 lines

async-blocking-stdio

Async versions of the handles returned by std::io::stdout, std::io::stderr, and std::io::stdin, which can be locked asynchronously so that operations on these streams in an async context are not interleaved (with some caveats when the streams are also being used via the synchronous APIs). This crate also provides cleanup behaviour on program exit similar to that of the standard library.

Quick Example

This example locks stdout and prints some text. Note how we're holding the lock across the await point.

stdout and stderr are flushed on program exit, just like the normal standard outputs. However, you can also flush manually if you want to control when the data becomes visible. Regardless of flushing, any data sent while the handle is locked will not be interleaved with other output, barring the caveats around synchronous users.

futures_lite::future::block_on(async {
    use async_blocking_stdio as astdio;
    use futures_lite::io::AsyncWriteExt as _;
    let mut stdout_handle = astdio::stdout().lock().await;
    stdout_handle.write_all(b"Hello world!\n").await?;
    stdout_handle.write_all(b"Hello jay\n").await?;
    stdout_handle.write_all(b"Hiiiiiiii\n").await
}).unwrap()

There is also a convenient extension trait to get the handles directly from the std::io handles:

futures_lite::future::block_on(async {
    use futures_lite::io::AsyncWriteExt as _;
    // you could do this: use async_blocking_stdio::StdioExt as _;
    // but there is also a prelude:
    use async_blocking_stdio::prelude::*;
    let mut stdout_handle = std::io::Stdout::async_handle().lock().await;
    stdout_handle.write_all(b"Hello world!\n").await?;
    stdout_handle.write_all(b"Hello jay\n").await?;
    stdout_handle.write_all(b"Hiiiiiiii\n").await
}).unwrap()

Key Functionality

This crate, in many ways, mirrors the functionality of the synchronous versions in the standard library.

You get access to the various handles with the following functions:

  • stdout
  • stderr
  • stdin

Unlike the standard library handles, you cannot perform IO operations on these directly with auto-locking: because of the structure of the async versions of the io traits, doing so would require special state.

Instead, you only get access to a "duplicate" of the Standard Library's ability to make locked versions of the stdio streams that can be operated on. This lock can be held across await points (it is an async-compatible synchronisation primitive), and it can be used with the various async io trait operations defined by futures_lite::io.

Unlike the standard library's stdio locks, the locks produced by this library are not re-entrant. This means that holding the lock across an await of a future that itself attempts to lock the same stream will cause a deadlock. If that future uses the synchronous standard library handles instead, it will not deadlock, because the standard library locks are re-entrant.

Motivation

Currently, the main way to use the Stdio handles in an async context is either to just read from or write to them synchronously (trusting that they will have sufficiently low latency to avoid choking the async executor), or use blocking::Unblock to wrap the streams.

The former lets you lock the stdio streams to ensure that a collection of io operations do not interleave, though you can't hold the lock on a standard IO stream across an await point without causing issues. The latter does not let you do locking, but does let you do async io.

The key reason the latter doesn't let you prevent interleaving via a lock is that you can't use a locked handle to a standard stream in the construction of an Unblock, as the locked handle isn't Send. Instead, you need to use the on-demand-locking handles (Stdout, Stderr, and Stdin).
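The Send restriction can be checked at compile time with a small helper. This is a sketch using only the standard library; is_send is a hypothetical helper, and the statement that Unblock needs a Send value is taken from the paragraph above rather than verified here.

```rust
use std::io;

/// Compiles only when `T` is `Send + 'static`, the kind of bound a value
/// must satisfy to be moved onto another thread.
fn is_send<T: Send + 'static>(_value: &T) -> bool {
    true
}

/// The on-demand-locking handles pass the check.
fn on_demand_handles_are_send() -> bool {
    is_send(&io::stdout()) && is_send(&io::stderr()) && is_send(&io::stdin())
}

// The locked handle does not pass. Uncommenting the function below is
// expected to fail to compile, because `StdoutLock` is not `Send`:
//
// fn locked_handle_is_send() -> bool {
//     is_send(&io::stdout().lock())
// }
```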

Even if you could use the locked handle, it would be extremely inefficient due to creating new piping resources - such as pretty large buffers - for every single operation on the stdio streams, due to creating a new blocking::Unblock each time.

Furthermore, it would cause massive data loss on standard input due to input buffering, as each unblocked structure has its own internal input buffer. In addition, every time you finished using such a handle for output, you'd need to wait for the output to be flushed before destroying the ephemeral async stream (which would end up being much less efficient than just using the global sync interfaces).
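The input-side data loss can be reproduced with standard library types only. In this sketch, a Cursor stands in for standard input and a temporary std::io::BufReader stands in for the buffer inside an ephemeral wrapper. Both substitutions, and the function names, are assumptions made for illustration.

```rust
use std::io::{BufRead, BufReader, Cursor, Read};

/// Reads one line through a buffer that only lives for this call.
/// Whatever else the buffer pulled from `source` is dropped with it.
fn read_line_with_temp_buffer<R: Read>(source: &mut R) -> String {
    let mut line = String::new();
    let mut temp = BufReader::new(source);
    temp.read_line(&mut line).unwrap();
    line
}

/// Returns the first line, plus whatever is still readable from the
/// source once the temporary buffer has been destroyed.
fn remaining_after_temp_buffer() -> (String, String) {
    let mut source = Cursor::new(b"first\nsecond\n".to_vec());
    let first = read_line_with_temp_buffer(&mut source);
    let mut rest = String::new();
    source.read_to_string(&mut rest).unwrap();
    (first, rest)
}
```

The temporary buffer reads both lines from the source in one go but only hands back the first, so the second line is gone by the time the source is read again. A single long-lived buffer, as used by a shared global instance, avoids this.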

This crate is primarily intended for fairly complex use cases. For simple use cases, you might want to consider just using the standard io streams (even if they are sync), as they intrinsically carry some internal buffering themselves and may meet your needs if you aren't dealing with complex asynchronous tasks. The primary utilities of this crate are:

  • It sets up a single, global instance of each standard io stream, that can be accessed and locked asynchronously. This means that, as long as you use this crate and account for the caveats with other users not going through this crate, all asynchronous accesses avoid interleaving.
    This is different than if each library that wanted to use stdio asynchronously were to set up its own blocking::Unblock using the global std::io handles, as each of these would have its own buffers.
    Each function that wrote to different overall asynchronous handles would then sporadically interleave with other ones written asynchronously due to occasionally needing to unlock and relock the underlying, unblocked std::io handles. This would even apply in the case of flushing, if the sent amounts were large enough.
    This is for the reason described in the caveats section.
  • The shared global instance also allows global buffering of standard IO, without all the issues described above of using temporary buffering (w.r.t data loss and flushing).
  • It sets up asynchronous versions of the streams that attempt to match the behaviour of the synchronous streams very closely in terms of flushing on program exit. This avoids the need to manually flush from inside the program (unless you want to ensure data visibility at a specific point).
  • It allows you to use async-friendly locked versions of the streams across await points, which is useful when you want to read or write to stdio in coherent, atomic chunks, even when a subtask may need to be awaited.

Solution

This crate solves this problem by providing a unified interface to global statics containing blocking::Unblock-wrapped versions of the std::io::Stdout, std::io::Stderr, and std::io::Stdin handles - the statics themselves being synchronised by async-safe mutexes, with the handles obtainable with crate::stdout, crate::stderr, and crate::stdin.

It also mirrors some features of the Standard Library implementation - such as trying to flush any buffers on program exit (which is exceptionally important due to using blocking::Unblock under the hood - this has fairly large internal buffers).

The concrete implementation details may differ in future (for instance, this crate doesn't care about fair locking, only that outputs are not broken up, and the back-end is not always guaranteed to remain blocking::Unblock), but the semantics remain similar.

The solution used - using the "single-operation-auto-locking" versions of the standard library std-stream handles - has some caveats on how it interacts with non-async users of the standard IO streams, or users that construct their own blocking::Unblock-wrappers from the standard library handles directly.

Caveats

Because the underlying blocking implementation needs to use the non-explicitly-synchronised versions of std::io::Stdout, std::io::Stderr, and std::io::Stdin, if it has to perform multiple underlying io operations, then a synchronous user of the same stream may interleave its IO calls with those of asynchronous users of the library.

This is essentially unavoidable, but you can also perform synchronous access via methods described below, if you want to ensure that there is no interleaving.
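The interleaving caveat comes down to per-operation locking versus holding one lock for a whole message. The following is a toy model using threads and the standard library only: a Mutex<Vec<u8>> stands in for the stream, and every name in it is hypothetical rather than part of this crate.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

const CHUNKS: usize = 8;
const MESSAGES: usize = 100;

/// Locks once per byte, like an auto-locking handle doing several
/// separate IO operations: another thread may write in between.
fn write_per_operation(sink: &Mutex<Vec<u8>>, tag: u8) {
    for _ in 0..CHUNKS {
        sink.lock().unwrap().push(tag);
    }
    sink.lock().unwrap().push(b'\n');
}

/// Holds the lock for the whole message, so it lands contiguously.
fn write_with_held_lock(sink: &Mutex<Vec<u8>>, tag: u8) {
    let mut guard = sink.lock().unwrap();
    for _ in 0..CHUNKS {
        guard.push(tag);
    }
    guard.push(b'\n');
}

/// Runs two writer threads using `write` and returns everything written.
fn run_writers(write: fn(&Mutex<Vec<u8>>, u8)) -> Vec<u8> {
    let sink = Arc::new(Mutex::new(Vec::<u8>::new()));
    let workers: Vec<_> = vec![b'a', b'b']
        .into_iter()
        .map(|tag| {
            let sink = Arc::clone(&sink);
            thread::spawn(move || {
                for _ in 0..MESSAGES {
                    write(&sink, tag);
                }
            })
        })
        .collect();
    for worker in workers {
        worker.join().unwrap();
    }
    // Bind the clone first so the guard is released before `sink` is dropped.
    let written = sink.lock().unwrap().clone();
    written
}

/// True when every line consists of CHUNKS copies of a single tag.
fn lines_are_intact(output: &[u8]) -> bool {
    output.split(|&byte| byte == b'\n').all(|line| {
        line.is_empty() || (line.len() == CHUNKS && line.iter().all(|&byte| byte == line[0]))
    })
}
```

With write_with_held_lock every line stays intact. With write_per_operation lines from the two threads may or may not be mixed on any given run, which is the kind of sporadic interleaving a synchronous user of the same stream can introduce.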

For stdin, this means you always want to ensure that any input is done after locking it (either synchronously or asynchronously), and due to the caveats of interleaved operations, you should strongly consider only using one top-level handle to manage input from stdin.

Sync Access

You might want to do some synchronous operations on the streams, even if most of the program is async. To do this, there are several useful things to know:

  • The handles provide the ability to use asynchronous locking, but they also provide methods to lock them in a synchronous context (lock vs lock_blocking; there's also try_lock). You should never use synchronous locking in an async context, though.
  • You can use futures_lite::io::BlockOn to get synchronous streams from the asynchronous streams. The documentation warns about the need to flush things, but this doesn't actually need to occur while holding the synchronous wrapper type (you can leave it to async code or, for this library, to the cleanup infrastructure, if it's reliable enough for what you need).
    There are also utility functions to obtain wrapped handles directly at the same time as locking, which can be found on the handle types you get from crate::stdout, crate::stderr, and crate::stdin.

Dependencies

~1MB
~17K SLoC