The virtio-queue crate provides a virtio device implementation for a virtio
queue, a virtio descriptor and a chain of such descriptors.
Two formats of virtio queues are defined in the specification: split virtqueues
and packed virtqueues. The virtio-queue crate offers support only for the
split virtqueues format.
The purpose of the virtio-queue API is to be consumed by virtio device
implementations (such as the block device or vsock device).
The main abstraction is the `Queue`. The crate also defines a state object
for the queue, i.e. `QueueState`.
Let’s take a concrete example of how a device would work with a queue, using the MMIO bus.
First, it is important to mention that the mandatory parts of the virtio interface are the following:
- the device status field → provides an indication of the completed steps of the device initialization routine,
- the feature bits → the features the driver/device understand(s),
- one or more virtqueues → the mechanism for data transport between the driver and device.
Each virtqueue consists of three parts:
- Descriptor Table,
- Available Ring,
- Used Ring.
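To give a feel for the three parts, here is a minimal sketch that computes how many bytes each part occupies for a given queue size, using the split virtqueue layout from the virtio specification. The names `SplitQueueLayout` and `split_queue_layout` are hypothetical and are not part of the virtio-queue API.

```rust
/// Byte sizes of the three parts of a split virtqueue.
/// Hypothetical helper type, not part of the virtio-queue crate.
#[derive(Debug, PartialEq)]
pub struct SplitQueueLayout {
    pub desc_table: usize,
    pub avail_ring: usize,
    pub used_ring: usize,
}

/// Compute the size of each part for a queue with `queue_size` entries.
pub fn split_queue_layout(queue_size: u16) -> SplitQueueLayout {
    let n = queue_size as usize;
    SplitQueueLayout {
        // Each descriptor is 16 bytes: addr (8), len (4), flags (2), next (2).
        desc_table: 16 * n,
        // flags (2) + idx (2) + n ring entries of 2 bytes + used_event (2).
        avail_ring: 6 + 2 * n,
        // flags (2) + idx (2) + n ring entries of 8 bytes + avail_event (2).
        used_ring: 6 + 8 * n,
    }
}
```

For a queue of 256 entries this gives 4096 bytes for the descriptor table, 518 bytes for the available ring and 2054 bytes for the used ring.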
Before booting the virtual machine (VM), the VMM performs the following setup:
- it initializes an array of Queues using the Queue constructor;
- it registers the device on the MMIO bus, so that the driver can later send read/write requests from/to the MMIO space; some of those requests also set up the queues’ state;
- it performs other pre-boot configuration, such as registering an fd for the interrupt assigned to the device; the device will later use this fd to inform the driver that it has information to communicate.
After the VM boots, the driver starts sending read/write requests to configure things like:
- the supported features;
- queue parameters. The following setters are used for the queue setup:
  - `set_size` → sets the size of the queue.
  - `set_ready` → puts the queue into the `ready for processing` state.
  - `set_desc_table_address`, `set_avail_ring_address`, `set_used_ring_address` → configure the guest addresses of the constituent parts of the queue.
  - `set_event_idx` → called as part of the feature negotiation in the `virtio-device` crate; it enables or disables the VIRTIO_F_RING_EVENT_IDX feature.
- the device activation. As part of this activation, the device can also create a queue handler that can later be used to process the queue.
Once the queues are ready, the device can be used.
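The configuration phase can be pictured with a small stand-in type. This is a sketch under stated assumptions, not the crate's `Queue`: `ModelQueueConfig` and `merge_halves` are hypothetical, the setter names only mirror the ones listed above, and the split of the address into optional low/high 32-bit halves is assumed here because MMIO exposes each queue address as two 32-bit registers.

```rust
/// Hypothetical stand-in for a queue's configuration state.
/// It only models what the driver configures; it is not the crate's `Queue`.
#[derive(Debug, Default)]
pub struct ModelQueueConfig {
    max_size: u16,
    size: u16,
    ready: bool,
    event_idx: bool,
    used_ring: u64,
}

/// Merge optional 32-bit halves into an existing 64-bit guest address.
fn merge_halves(current: u64, low: Option<u32>, high: Option<u32>) -> u64 {
    let low = low.map(u64::from).unwrap_or(current & 0xffff_ffff);
    let high = high.map(u64::from).unwrap_or(current >> 32);
    (high << 32) | low
}

impl ModelQueueConfig {
    pub fn new(max_size: u16) -> Self {
        ModelQueueConfig { max_size, size: max_size, ..Default::default() }
    }

    /// Accept only a non-zero power of two that does not exceed `max_size`,
    /// as required for split virtqueues. Returns whether the size was applied.
    pub fn set_size(&mut self, size: u16) -> bool {
        if size > self.max_size || !size.is_power_of_two() {
            return false;
        }
        self.size = size;
        true
    }

    pub fn set_ready(&mut self, ready: bool) {
        self.ready = ready;
    }

    pub fn set_event_idx(&mut self, enabled: bool) {
        self.event_idx = enabled;
    }

    /// Update one or both halves of the used ring guest address.
    pub fn set_used_ring_address(&mut self, low: Option<u32>, high: Option<u32>) {
        self.used_ring = merge_halves(self.used_ring, low, high);
    }

    pub fn size(&self) -> u16 { self.size }
    pub fn ready(&self) -> bool { self.ready }
    pub fn event_idx(&self) -> bool { self.event_idx }
    pub fn used_ring(&self) -> u64 { self.used_ring }
}
```

In this model, a driver write to the low address register followed by a write to the high register results in two setter calls, each touching only one half of the address.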
The steady state operation of a virtio device follows a model where the driver produces descriptor chains which are consumed by the device, and both parties need to be notified when new elements have been placed on the associated ring to avoid busy polling. The precise notification mechanism is left up to the VMM that incorporates the devices and queues (it usually involves things like MMIO VM exits and interrupt injection into the guest). The queue implementation is agnostic to the notification mechanism in use, and it exposes methods and functionality (such as iterators) that are called from the outside in response to a notification event.
Data transmission using virtqueues
The basic principle of how the queues are used by the device/driver is the following, as also shown in the diagram below:
- when the guest driver has a new request (buffer), it allocates free descriptor(s) for the buffer in the descriptor table, chaining as necessary.
- the driver adds a new entry with the head index of the descriptor chain describing the request, in the available ring entries.
- the driver increments the `idx` by the number of new entries; the diagram shows the simple use case of only one new entry.
- the driver sends an available buffer notification to the device if such notifications are not suppressed.
- the device will at some point consume that request, by first reading the `idx` field from the available ring. This can be achieved directly with `Queue::avail_idx`, but we do not recommend that consumers of the crate call it, because the iterator over all available descriptor chain heads already calls it behind the scenes.
- the device gets the index of the descriptor chain(s) corresponding to the
- the device reads the corresponding descriptor(s) from the descriptor table.
- the device adds a new entry in the used ring by using `Queue::add_used`; the entry is defined in the spec as `virtq_used_elem`, and in virtio-queue as `VirtqUsedElem`. This structure holds both the index of the descriptor chain and the number of bytes that were written to memory as part of serving the request.
- the device increments the `idx` of the used ring; this is done as part of the `Queue::add_used` call mentioned above.
- the device sends a used buffer notification to the driver if such notifications are not suppressed.
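The ring bookkeeping in the steps above can be sketched with a small in-memory model. Everything here is hypothetical (`ModelRings` and its methods are illustration names, not the crate's API), guest memory and the descriptor table are left out, and notifications are not modelled.

```rust
/// Hypothetical in-memory model of the available and used rings.
pub struct ModelRings {
    size: u16,
    avail_ring: Vec<u16>,       // heads of available descriptor chains
    avail_idx: u16,             // written by the driver
    used_ring: Vec<(u32, u32)>, // (chain head index, bytes written)
    used_idx: u16,              // written by the device
    next_avail: u16,            // device-private cursor into the available ring
}

impl ModelRings {
    /// `size` is assumed to be a non-zero power of two.
    pub fn new(size: u16) -> Self {
        assert!(size.is_power_of_two());
        ModelRings {
            size,
            avail_ring: vec![0; size as usize],
            avail_idx: 0,
            used_ring: vec![(0, 0); size as usize],
            used_idx: 0,
            next_avail: 0,
        }
    }

    /// Driver side: publish one chain head, then increment the available `idx`.
    pub fn driver_add_avail(&mut self, head: u16) {
        let slot = (self.avail_idx % self.size) as usize;
        self.avail_ring[slot] = head;
        self.avail_idx = self.avail_idx.wrapping_add(1);
    }

    /// Device side: compare the private cursor with the available `idx` and
    /// return the next chain head, if the driver has published one.
    pub fn device_pop_avail(&mut self) -> Option<u16> {
        if self.next_avail == self.avail_idx {
            return None;
        }
        let slot = (self.next_avail % self.size) as usize;
        self.next_avail = self.next_avail.wrapping_add(1);
        Some(self.avail_ring[slot])
    }

    /// Device side: record a served chain, then increment the used `idx`.
    pub fn device_add_used(&mut self, head: u16, bytes_written: u32) {
        let slot = (self.used_idx % self.size) as usize;
        self.used_ring[slot] = (u32::from(head), bytes_written);
        self.used_idx = self.used_idx.wrapping_add(1);
    }

    pub fn used_idx(&self) -> u16 { self.used_idx }
    pub fn used_entry(&self, slot: usize) -> (u32, u32) { self.used_ring[slot] }
}
```

Both `idx` counters are free-running 16-bit values that wrap around; only the slot computation reduces them modulo the queue size.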
A descriptor stores four fields, with the first two, `addr` and `len`,
pointing to the data in memory to which the descriptor refers, as shown in the
diagram below. The `flags` field is useful for indicating if, for example, the
buffer is device readable or writable, or if we have another descriptor chained
after this one (VIRTQ_DESC_F_NEXT flag set). The `next` field stores the index
of the next descriptor if VIRTQ_DESC_F_NEXT is set.
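As an illustration of the descriptor format, the sketch below decodes the 16-byte little-endian layout defined by the virtio specification and follows the `next` links of a chain. `ModelDescriptor` and `collect_chain` are hypothetical names for this example and do not reproduce the crate's `Descriptor` or `DescriptorChain`; the loop guard is a choice of this sketch.

```rust
pub const VIRTQ_DESC_F_NEXT: u16 = 0x1;
pub const VIRTQ_DESC_F_WRITE: u16 = 0x2;

/// Hypothetical mirror of the four descriptor fields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ModelDescriptor {
    pub addr: u64,
    pub len: u32,
    pub flags: u16,
    pub next: u16,
}

impl ModelDescriptor {
    /// Decode a descriptor from its 16-byte little-endian representation.
    pub fn from_le_bytes(raw: &[u8; 16]) -> Self {
        let mut addr = [0u8; 8];
        addr.copy_from_slice(&raw[0..8]);
        let mut len = [0u8; 4];
        len.copy_from_slice(&raw[8..12]);
        ModelDescriptor {
            addr: u64::from_le_bytes(addr),
            len: u32::from_le_bytes(len),
            flags: u16::from_le_bytes([raw[12], raw[13]]),
            next: u16::from_le_bytes([raw[14], raw[15]]),
        }
    }

    pub fn has_next(&self) -> bool {
        (self.flags & VIRTQ_DESC_F_NEXT) != 0
    }

    /// True if the buffer is device-writable (write-only for the device).
    pub fn is_write_only(&self) -> bool {
        (self.flags & VIRTQ_DESC_F_WRITE) != 0
    }
}

/// Follow `next` links starting at `head`. Returns `None` if an index falls
/// outside the table or if the chain is longer than the table (a loop).
pub fn collect_chain(table: &[ModelDescriptor], head: u16) -> Option<Vec<ModelDescriptor>> {
    let mut chain = Vec::new();
    let mut index = head;
    loop {
        if chain.len() >= table.len() {
            return None;
        }
        let desc = *table.get(index as usize)?;
        chain.push(desc);
        if !desc.has_next() {
            return Some(chain);
        }
        index = desc.next;
    }
}
```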
Requirements for device implementation
- Abstractions from virtio-queue such as `DescriptorChain` can be used to parse descriptors provided by the driver, which represent input or output memory areas for device I/O. A descriptor is essentially an (address, length) pair, which is subsequently used by the device model operation. We do not check the validity of the descriptors, and instead expect any validations to happen when the device implementation is attempting to access the corresponding areas. Early checks can add non-negligible additional costs, and exclusively relying upon them may lead to time-of-check-to-time-of-use race conditions.
- The device should validate before reading/writing to a buffer that it is device-readable/device-writable.
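A device-side write that validates at the moment of access might look like the sketch below. It is only an illustration: `write_to_descriptor` and `AccessError` are hypothetical, and a plain byte slice starting at guest address 0 stands in for guest memory.

```rust
pub const VIRTQ_DESC_F_WRITE: u16 = 0x2;

#[derive(Debug, PartialEq)]
pub enum AccessError {
    NotDeviceWritable,
    OutOfBounds,
}

/// Hypothetical helper: copy `data` into the buffer described by
/// (`addr`, `len`, `flags`), checking writability and bounds when the
/// access actually happens. Returns the number of bytes written.
pub fn write_to_descriptor(
    guest_mem: &mut [u8],
    addr: u64,
    len: u32,
    flags: u16,
    data: &[u8],
) -> Result<usize, AccessError> {
    // The device may only write to buffers marked device-writable.
    if (flags & VIRTQ_DESC_F_WRITE) == 0 {
        return Err(AccessError::NotDeviceWritable);
    }
    if addr > usize::MAX as u64 {
        return Err(AccessError::OutOfBounds);
    }
    // Never write more than the descriptor allows.
    let count = data.len().min(len as usize);
    let start = addr as usize;
    let end = start.checked_add(count).ok_or(AccessError::OutOfBounds)?;
    // The bounds check and the write use the same range, at access time.
    let target = guest_mem.get_mut(start..end).ok_or(AccessError::OutOfBounds)?;
    target.copy_from_slice(&data[..count]);
    Ok(count)
}
```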
`QueueT` is a trait that allows different implementations for a `Queue`
object for single-threaded and multi-threaded contexts. The
implementations provided in virtio-queue are:
- `Queue` → used for the single-threaded context.
- `QueueSync` → used for the multi-threaded context; it is simply a wrapper over an `Arc<Mutex<Queue>>`.
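The relationship between a single-threaded implementation and a thread-safe wrapper can be sketched with the usual shared-mutex pattern. The trait and types below (`ModelQueueT`, `ModelQueue`, `ModelQueueSync`) are hypothetical and far smaller than the crate's real ones; they only show the wrapping idea.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical, heavily reduced version of a queue trait.
pub trait ModelQueueT {
    fn set_size(&mut self, size: u16);
    fn size(&self) -> u16;
}

/// Single-threaded implementation: plain fields, no locking.
#[derive(Debug, Default)]
pub struct ModelQueue {
    size: u16,
}

impl ModelQueueT for ModelQueue {
    fn set_size(&mut self, size: u16) {
        self.size = size;
    }
    fn size(&self) -> u16 {
        self.size
    }
}

/// Multi-threaded implementation: a cloneable handle to a shared,
/// mutex-protected single-threaded queue.
#[derive(Clone, Default)]
pub struct ModelQueueSync {
    inner: Arc<Mutex<ModelQueue>>,
}

impl ModelQueueT for ModelQueueSync {
    fn set_size(&mut self, size: u16) {
        // Every operation takes the lock and delegates to the inner queue.
        self.inner.lock().unwrap().set_size(size);
    }
    fn size(&self) -> u16 {
        self.inner.lock().unwrap().size()
    }
}
```

Cloning a `ModelQueueSync` produces another handle to the same state, so a change made through one clone on another thread is visible through all of them.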
Besides the above abstractions, the virtio-queue crate also provides the
following ones:
- `Descriptor` → mostly offers accessors for the members of the `Descriptor`.
- `DescriptorChain` → provides accessors for the `DescriptorChain`’s members and an `Iterator` implementation for iterating over the `DescriptorChain`; there is also an abstraction for iterators over just the device-readable or just the device-writable descriptors (`DescriptorChainRwIter`).
- `AvailIter` → a consuming iterator over all available descriptor chain heads in the queue.
`Queue` allows saving the state through the `state` function, which returns
a `QueueState`. `Queue` objects can be created from a previously saved state by
using `QueueState::try_from`. The VMM should check for errors when restoring
a `Queue` from a previously saved state.
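The save/restore round trip, including the error check on restore, can be sketched as follows. All names are hypothetical (`ModelQueueState`, `ModelQueue`, `RestoreError`), and both the direction of the `TryFrom` conversion and the specific validation performed are assumptions of this sketch rather than a description of the crate.

```rust
use std::convert::TryFrom;

/// Hypothetical plain-data snapshot of a queue.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelQueueState {
    pub max_size: u16,
    pub size: u16,
    pub ready: bool,
    pub next_avail: u16,
    pub next_used: u16,
}

#[derive(Debug, PartialEq)]
pub enum RestoreError {
    InvalidSize,
}

/// Hypothetical queue that can be saved and restored.
#[derive(Debug)]
pub struct ModelQueue {
    state: ModelQueueState,
}

impl ModelQueue {
    /// Save: return a copy of the current state.
    pub fn state(&self) -> ModelQueueState {
        self.state.clone()
    }
}

impl TryFrom<ModelQueueState> for ModelQueue {
    type Error = RestoreError;

    /// Restore: a saved state may be stale or corrupted, so validate it
    /// before building a queue from it.
    fn try_from(state: ModelQueueState) -> Result<Self, Self::Error> {
        if state.size > state.max_size || !state.size.is_power_of_two() {
            return Err(RestoreError::InvalidSize);
        }
        Ok(ModelQueue { state })
    }
}
```

A VMM using such a type would propagate the `Err` case instead of unwrapping it, since the saved state comes from outside the running process.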
A big part of the
virtio-queue crate consists of the notification suppression
support. As already mentioned, the driver can send an available buffer
notification to the device when there are new entries in the available ring,
and the device can send a used buffer notification to the driver when there are
new entries in the used ring. There might be cases when sending a notification
each time these scenarios happen is not efficient, for example when the driver
is processing the used ring, it would not need to receive another used buffer
notification. The mechanism for suppressing the notifications is detailed in
the following sections from the specification:
The `Queue` abstraction proposes the following sequence of steps for
processing new available ring entries:
- the device first disables the notifications to make the driver aware it is
processing the available ring and does not want interruptions, by using
`Queue::disable_notification`. Notifications are disabled by the device either if VIRTIO_F_EVENT_IDX is not negotiated and VIRTQ_USED_F_NO_NOTIFY is set in the
`flags` field of the used ring, or if VIRTIO_F_EVENT_IDX is negotiated and the
`avail_event` value is not updated, i.e. it remains set to the latest
`idx` value of the available ring that was already notified by the driver.
- the device processes the new entries by using the `AvailIter` iterator.
- the device can enable the notifications now, by using
`Queue::enable_notification`. Notifications are enabled by the device either if VIRTIO_F_EVENT_IDX is not negotiated and 0 is set in the
`flags` field of the used ring, or if VIRTIO_F_EVENT_IDX is negotiated and the
`avail_event` value is set to the smallest
`idx` value of the available ring that was not already notified by the driver. This way the device makes sure that it won’t miss any notification.
The above steps should be done in a loop to also handle the less likely case where the driver added new entries just before we re-enabled notifications.
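The loop described above can be sketched against a toy queue. The model is hypothetical throughout: `ModelQueue` is not the crate's `Queue`, the `racing` field exists only to simulate a driver that adds entries right before notifications are re-enabled, and having `enable_notification` report whether entries are pending is an assumption of this sketch.

```rust
use std::collections::VecDeque;

/// Hypothetical toy queue used to show the processing loop.
pub struct ModelQueue {
    pending: VecDeque<u16>, // chain heads currently available
    racing: VecDeque<u16>,  // heads the "driver" adds just before re-enable
    notifications_enabled: bool,
}

impl ModelQueue {
    pub fn new(pending: &[u16], racing: &[u16]) -> Self {
        ModelQueue {
            pending: pending.iter().copied().collect(),
            racing: racing.iter().copied().collect(),
            notifications_enabled: true,
        }
    }

    pub fn disable_notification(&mut self) {
        self.notifications_enabled = false;
    }

    /// Re-enable notifications and report whether entries are pending.
    pub fn enable_notification(&mut self) -> bool {
        // Simulate the race: entries show up right before re-enabling.
        self.pending.append(&mut self.racing);
        self.notifications_enabled = true;
        !self.pending.is_empty()
    }

    pub fn pop(&mut self) -> Option<u16> {
        self.pending.pop_front()
    }

    pub fn notifications_enabled(&self) -> bool {
        self.notifications_enabled
    }
}

/// Disable notifications, drain the queue, re-enable, and repeat while
/// entries slipped in during the drain. Returns the number of chains handled.
pub fn process_queue(queue: &mut ModelQueue, mut handle: impl FnMut(u16)) -> usize {
    let mut handled = 0;
    loop {
        queue.disable_notification();
        while let Some(head) = queue.pop() {
            handle(head);
            handled += 1;
        }
        if !queue.enable_notification() {
            break;
        }
    }
    handled
}
```

Without the outer loop, the entry that arrived during the race would sit in the queue until the next notification, which might never be sent.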
On the driver side, `Queue` provides the `needs_notification` method, which
should be called each time the device adds a new entry to the used ring.
Depending on the `used_event` value and on the last used value,
`needs_notification` returns true to let the device know it
should send a notification to the guest.
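When the event index feature is negotiated, the virtio specification expresses this decision as a wrapping 16-bit comparison (`vring_need_event`). The sketch below reproduces that comparison; the function name `need_event` is hypothetical, and whether the crate's `needs_notification` is implemented exactly this way is an assumption.

```rust
/// Decide whether a notification is needed after the used `idx` moved from
/// `old_idx` to `new_idx`, given the `used_event` value published by the
/// driver. All arithmetic wraps modulo 2^16, like the ring indices.
pub fn need_event(event_idx: u16, new_idx: u16, old_idx: u16) -> bool {
    new_idx.wrapping_sub(event_idx).wrapping_sub(1) < new_idx.wrapping_sub(old_idx)
}
```

For example, if the used `idx` advances from 5 to 6 and the driver asked to be notified at index 5, a notification is needed; if it asked for index 7, the device can stay silent for now.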
We assume the users of the `Queue` implementation won’t attempt to use the
queue before checking that the `ready` bit is set. This can be verified by
calling `Queue::is_valid` which, besides this, also checks that the three
queue parts are valid memory regions.
We assume consumers will use
AvailIter::go_to_previous_position only in
We assume the users will consume the entries from the available ring in the
way recommended by the documentation, i.e. the device disables the
notifications, processes the available ring entries, and then re-enables
notifications.
This project is licensed under either of