#input #synthesizer #parameters #surge #handle #channel

surge-input

surge synthesizer -- access synthesizer inputs

23 releases

0.2.12-alpha.0 Apr 7, 2023
0.2.11-alpha.0 Dec 19, 2022
0.2.5-alpha.0 Jun 21, 2022
0.2.4-alpha.0 Mar 14, 2022
0.1.42-alpha.0 Oct 27, 2021

#3 in #surge


243 downloads per month
Used in 10 crates (8 directly)

GPL-3.0 license

180KB
3K SLoC

surge-input

This Rust crate provides the necessary structures and functions to handle audio input in the surge synthesizer system.

SynthInput

The SynthInput struct represents an audio input source for the synthesizer. It contains two audio channels (audio_in_left and audio_in_right) and a number of parameter inputs.
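A simplified sketch of the shape such a struct might take (the constants and field types below are assumptions for illustration, not the crate's actual definition):

// Hypothetical, simplified layout for illustration only; the real crate's
// definition differs in types, ownership, and naming details.
pub const N_INPUTS: usize = 2;      // assumed number of parameter inputs
pub const BLOCK_SIZE: usize = 32;   // assumed processing block size

pub struct SynthInput {
    pub audio_in_left:  [f32; BLOCK_SIZE],
    pub audio_in_right: [f32; BLOCK_SIZE],
    pub parameters:     [f32; N_INPUTS],
}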

SynthInputHandle

The SynthInputHandle struct is a handle to a SynthInput object that is managed by an audio backend such as Core Audio or JACK.

audio_in0, audio_in0_ptr, audio_in1, audio_in1_ptr

These constants are pointers to the audio input buffers for the SynthInput object. The _ptr suffix indicates that these are raw pointers to the buffer memory.

audio_in_left, audio_in_right

These constants are mutable references to the left and right audio channels of the SynthInput object.

non_os_audio_in0_ptr, non_os_audio_in1_ptr, non_os_audio_in_left, non_os_audio_in_right

These constants are similar to audio_in0, audio_in1, audio_in_left, and audio_in_right, but are used for non-OS audio backends.

parameter

The parameter module contains types and functions for handling parameter inputs in the SynthInput object.

init

The init function initializes the SynthInput object with default values.

Overall, surge-input provides a convenient and flexible way to handle audio input in the surge synthesizer system. The SynthInput object can be used to process audio input from a variety of sources, and the parameter module allows for easy handling of parameter inputs.

91> where does the audio input come from in the context of a digital audio software synthesizer?

The audio input in the context of a digital audio software synthesizer typically comes from an external source such as a microphone or line input on an audio interface. It can also come from a software program that generates audio output and feeds it into the synthesizer as an input signal. In either case, the audio input is used as a source signal for processing within the synthesizer, often being filtered, modulated, and transformed in various ways before being output as a new audio signal.

92> in which format does the audio information arrive?

In the context of a digital audio software synthesizer, the audio information typically arrives as a stream of digital audio samples. Each sample is a numerical representation of the amplitude of the audio waveform at a particular point in time. The format of the audio samples can vary depending on the audio interface being used, but common formats include 16-bit or 24-bit PCM (pulse-code modulation) audio with a sampling rate of 44.1 kHz or 48 kHz. Other formats, such as floating-point audio, may also be used in some cases.
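As a small illustration of those formats, converting interleaved 16-bit PCM to the floating-point samples most synthesizer code works with is just a scaling step, and splitting a stereo stream into channels is a deinterleaving step (a generic sketch, not tied to any particular crate):

/// Convert 16-bit PCM samples to 32-bit floats in the range [-1.0, 1.0].
fn i16_to_f32(samples: &[i16]) -> Vec<f32> {
    samples.iter().map(|&s| s as f32 / 32768.0).collect()
}

/// Split an interleaved stereo buffer into left and right channels.
fn deinterleave(stereo: &[f32]) -> (Vec<f32>, Vec<f32>) {
    let left = stereo.iter().step_by(2).copied().collect();
    let right = stereo.iter().skip(1).step_by(2).copied().collect();
    (left, right)
}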

93> how would we connect an audio input from the command line in rust?

In Rust, connecting an audio input from the command line can be done using the cpal (Cross-Platform Audio Library) crate, which provides a portable API for audio input and output. Here is a simple example of how you can open an audio input stream with cpal:

use cpal::traits::{DeviceTrait, HostTrait, StreamTrait};
use std::thread;

fn main() {
    // Pick the default host and input device (e.g. the system microphone).
    let host = cpal::default_host();
    let device = host.default_input_device().expect("no input device available");
    let config = device.default_input_config().expect("no default input config available");
    println!("sample rate: {} Hz, channels: {}", config.sample_rate().0, config.channels());

    // Build the input stream. Note: this is the cpal 0.13/0.14 signature;
    // cpal 0.15+ adds a `timeout: Option<Duration>` argument. The callback
    // type assumes the device delivers f32 samples; other sample formats
    // would need a differently typed callback.
    let stream = device.build_input_stream(
        &config.into(),
        move |data: &[f32], _: &cpal::InputCallbackInfo| {
            // Process the audio data. Here we just print the first few samples,
            // guarding against buffers shorter than ten samples.
            let n = data.len().min(10);
            println!("{:?}", &data[..n]);
        },
        |err| {
            eprintln!("an error occurred on the input stream: {}", err);
        },
    ).expect("failed to build input stream");

    stream.play().expect("failed to start input stream");

    // Keep the program running so the callback keeps receiving audio.
    loop {
        thread::park();
    }
}

This code will open the default input device, configure it with the default input configuration, and then build an input stream that will continuously call the provided input callback function to process the incoming audio data.

Once the stream is built, it can be started with the play method, and then the program can continue processing audio input indefinitely using a loop.

94> is this the most performant solution?

There isn't a straightforward answer to this question, as the most performant solution may depend on a number of factors, such as the specifics of the hardware and software being used, the size and complexity of the input data, and the requirements of the application.

That being said, using a ring buffer and a dedicated thread for reading and processing audio input data can often be a very efficient approach for handling real-time audio input in a software synthesizer. By keeping the input data in a buffer that is accessed in a thread-safe manner, and having a dedicated thread for reading and processing this data, you can avoid potential blocking or synchronization issues that might arise if you were to handle the input data directly in the same thread as the main application logic.
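A minimal sketch of that structure, using a bounded channel from the standard library as a stand-in for the ring buffer (a production system would typically use a lock-free single-producer/single-consumer ring buffer so the audio thread never blocks or allocates):

use std::sync::mpsc;
use std::thread;

fn main() {
    // Bounded channel standing in for a ring buffer: the audio callback pushes
    // fixed-size blocks, a dedicated worker thread pulls and processes them.
    let (producer, consumer) = mpsc::sync_channel::<Vec<f32>>(8);

    // Dedicated processing thread: drains blocks as they arrive.
    let worker = thread::spawn(move || {
        for block in consumer {
            let peak = block.iter().fold(0.0f32, |p, s| p.max(s.abs()));
            println!("block of {} samples, peak {:.3}", block.len(), peak);
        }
    });

    // Stand-in for the audio callback: in practice this closure would be
    // handed to the audio backend and called with each incoming buffer.
    // Note that `to_vec` allocates; a real-time callback would reuse buffers.
    let audio_callback = move |data: &[f32]| {
        // try_send: if the consumer falls behind, drop the block rather than
        // block the (real-time) audio thread.
        let _ = producer.try_send(data.to_vec());
    };

    // Simulate a few callbacks with silent buffers.
    let silent = vec![0.0f32; 512];
    for _ in 0..4 {
        audio_callback(&silent);
    }
    drop(audio_callback); // closes the channel so the worker exits
    worker.join().unwrap();
}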

Additionally, you may be able to improve performance by using lower-level libraries such as PortAudio or JACK, which provide APIs designed for real-time audio input and output. These libraries help minimize the overhead of handling audio input in software and give you tighter control over buffer sizes, and therefore over latency.

95> how would we do it without PortAudio, cpal, or Jack?

Without relying on a third-party library like PortAudio, cpal, or Jack, one approach to receiving audio input would be to use the OS-specific APIs for capturing audio. For example, on Linux, one could use ALSA (Advanced Linux Sound Architecture) or PulseAudio. On macOS, one could use Core Audio. On Windows, one could use WASAPI (Windows Audio Session API) or DirectSound.
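As a sketch of the Linux path, the alsa crate's safe bindings over ALSA's C API could be used roughly like this (the device name, rate, and format here are assumptions, and the exact calls depend on the crate version you use):

use alsa::pcm::{Access, Format, HwParams, PCM};
use alsa::{Direction, ValueOr};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open the default capture device.
    let pcm = PCM::new("default", Direction::Capture, false)?;

    // Configure: mono, 44.1 kHz, 16-bit signed, interleaved access.
    {
        let hwp = HwParams::any(&pcm)?;
        hwp.set_channels(1)?;
        hwp.set_rate(44_100, ValueOr::Nearest)?;
        hwp.set_format(Format::s16())?;
        hwp.set_access(Access::RWInterleaved)?;
        pcm.hw_params(&hwp)?;
    }

    // Blocking read of one buffer of samples.
    let io = pcm.io_i16()?;
    let mut buf = [0i16; 1024];
    let frames_read = io.readi(&mut buf)?;
    println!("captured {} frames", frames_read);
    Ok(())
}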

However, using the OS-specific APIs directly can be more complicated and less portable than using a cross-platform audio library like PortAudio or cpal. Additionally, it may require additional knowledge of the audio subsystem on each platform, as well as potentially different implementation details for each API.

96> what if we wanted to avoid these apis too?

If you wanted to avoid using PortAudio, cpal, or Jack, you would need to write your own code to interact with the audio device drivers or system APIs directly. This approach would likely be highly platform-specific and require a good understanding of low-level system programming.

One option could be to use a cross-platform library like SDL (Simple DirectMedia Layer), whose audio subsystem provides an abstraction layer for capturing audio input across a wide range of operating systems. GLFW (Graphics Library Framework) is sometimes mentioned in the same breath, but it only covers windowing, graphics contexts, and input devices and has no audio API of its own, so it would need to be paired with another library for audio. Either way, pulling in such a library might add additional overhead and complexity to your project.

97> so, SDL and GLFW can interact directly with the audio device drivers?

SDL provides a platform-independent way to interact with audio devices through its audio subsystem, but it still relies on the underlying platform's audio drivers to handle the low-level hardware communication. Depending on the platform, its audio subsystem can sit on top of DirectSound or WASAPI, ALSA, PulseAudio, Core Audio, OSS, and more.

GLFW, despite being mentioned above, does not actually expose an audio API; it is focused on window creation, graphics context management, and keyboard/mouse/joystick input, so for audio it must be combined with another library.

So while SDL provides an abstraction over the audio backend, it still relies on the platform's audio driver stack to function properly.

98> how would we do it without relying on these?

If you want to avoid using external libraries for handling audio input, you could use the platform-specific APIs for accessing the audio device directly. For example, on Linux, you could use the ALSA (Advanced Linux Sound Architecture) API. On Windows, you could use the Windows Multimedia API (WinMM) or the Windows Core Audio API. On macOS, you could use the Core Audio framework.

However, using these APIs directly can be more complex than using a higher-level library like PortAudio, cpal, or Jack, and may require more low-level knowledge of the underlying audio device hardware and software. Additionally, using platform-specific APIs can make your code less portable across different operating systems.

99> could we get it to be more performant if we went to an even lower level?

Yes, it's possible to get more performance by going to an even lower level, but it would require more work and would make the code less portable.

One approach is to use the OS-specific audio APIs directly, such as Core Audio on macOS, ALSA on Linux, and WASAPI on Windows. This would require writing more platform-specific code, but it would allow finer control over the audio stream and could reduce latency.

Another approach is to use a lower-level programming language such as C or assembly to write the audio input code, which could provide more direct control over the hardware and potentially reduce latency. However, this would require more expertise in low-level programming and may not be necessary for most use cases.

100> let's go to the lowest level, the C/assembly interface to write the audio input code. how would we do this?

To interact with the audio device at the lowest level, we would need to write platform-specific C or assembly code that directly interacts with the audio device drivers. This would involve understanding the low-level details of the audio device driver interface for the specific operating system and hardware architecture.

Once we have the platform-specific C or assembly code, we could then call it from Rust using the Rust Foreign Function Interface (FFI). This would involve declaring the C or assembly functions in an extern "C" block in Rust and then calling them from unsafe code.
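A minimal sketch of that FFI step follows; the C function name and signature are hypothetical placeholders for whatever the platform-specific capture code actually exposes:

// Hypothetical C function, e.g. implemented in a platform-specific capture.c
// and linked into the build; the name and signature are placeholders.
extern "C" {
    fn capture_audio_block(buffer: *mut f32, frames: usize) -> i32;
}

fn read_block(buffer: &mut [f32]) -> Result<(), i32> {
    // Calling across the FFI boundary is unsafe: Rust cannot verify the C side.
    let status = unsafe { capture_audio_block(buffer.as_mut_ptr(), buffer.len()) };
    if status == 0 { Ok(()) } else { Err(status) }
}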

While this approach would provide the highest level of performance, it also has the highest level of complexity and requires a deep understanding of the operating system and hardware architecture. Additionally, platform-specific code is not portable and would need to be written and maintained separately for each target platform.

101> that sounds okay --let's crank the performance. what do we need to understand about the operating system and the hardware architecture to do this properly?

To achieve the highest possible performance, we need to have a good understanding of the underlying operating system and hardware architecture. Here are some factors to consider:

  1. CPU architecture: The CPU architecture plays a crucial role in determining the performance of audio input processing. It is important to understand the CPU's instruction set and optimize the code accordingly. For example, on x86-based systems, we can use SSE or AVX instructions to perform audio input processing efficiently (see the SSE sketch after this list).

  2. Memory management: Audio input processing requires a significant amount of memory, especially when processing large audio files. It is important to optimize memory usage and minimize data movement between memory and CPU cache.

  3. Interrupt handling: Audio input processing requires precise timing and responsiveness, which can be achieved through interrupt handling. It is important to optimize interrupt handling to minimize the latency between audio input and processing.

  4. Real-time scheduling: Real-time scheduling is critical for audio input processing, as it ensures that the audio processing tasks are executed in a timely manner. It is important to configure the operating system to use a real-time scheduler and assign the highest priority to the audio input processing tasks.

  5. Low-level hardware access: To achieve the highest possible performance, we may need to access hardware resources directly, bypassing the operating system's device drivers. This requires a good understanding of the hardware architecture and low-level programming skills.

Overall, achieving the highest possible performance for audio input processing requires a deep understanding of the underlying hardware and software architecture, as well as the ability to write low-level, optimized code.
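To make point 1 concrete, here is a minimal sketch of applying a gain to a block of samples with SSE intrinsics on x86_64 (the function and block handling are illustrative; SSE is part of the x86_64 baseline, so no runtime feature detection is needed):

#[cfg(target_arch = "x86_64")]
fn apply_gain_sse(samples: &mut [f32], gain: f32) {
    use std::arch::x86_64::*;
    let chunks = samples.len() / 4;
    unsafe {
        let g = _mm_set1_ps(gain);
        for i in 0..chunks {
            // Process four samples at a time.
            let p = samples.as_mut_ptr().add(i * 4);
            let v = _mm_loadu_ps(p);
            _mm_storeu_ps(p, _mm_mul_ps(v, g));
        }
    }
    // Scalar tail for the remaining len % 4 samples.
    for s in &mut samples[chunks * 4..] {
        *s *= gain;
    }
}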

102> how do we perform low-level hardware access?

Performing low-level hardware access involves using system-specific instructions and registers to interact directly with the hardware components. In order to perform low-level hardware access in Rust, you typically need to use inline assembly or a foreign function interface (FFI) to call C functions that have been written to access the hardware.

Inline assembly in Rust is supported using the asm! macro, which allows you to write inline assembly code directly in your Rust source code. However, writing inline assembly can be quite complex and error-prone, so it is usually only recommended for expert users who are familiar with the underlying hardware architecture and system-specific assembly language.
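For instance, a small asm! snippet on AArch64 might read one of the architecture's system registers (here the virtual counter, which is readable from user space); touching an actual audio peripheral would additionally require its registers to be mapped into the address space:

#[cfg(target_arch = "aarch64")]
fn read_virtual_counter() -> u64 {
    let value: u64;
    // CNTVCT_EL0 is the AArch64 virtual counter; `mrs` moves it into a
    // general-purpose register chosen by the compiler.
    unsafe {
        std::arch::asm!("mrs {}, cntvct_el0", out(reg) value);
    }
    value
}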

A more common approach to performing low-level hardware access in Rust is to use a foreign function interface (FFI) to call C functions that have been written to interact with the hardware. Rust supports calling C functions natively through extern "C" blocks, and the libc crate provides ready-made bindings to the C standard library. You can also use Rust's bindgen crate to automatically generate Rust bindings for C headers, which can simplify the process of working with C functions.

When performing low-level hardware access, it is important to have a deep understanding of the hardware architecture and the system-specific instructions and registers that are used to interact with the hardware. You should also be familiar with any system-specific programming interfaces that are required to access the hardware, such as device drivers or system calls. Additionally, low-level hardware access can be dangerous and can potentially cause system instability or data loss, so it is important to be careful and thoroughly test your code before deploying it in a production environment.

103> let's do it with inline assembly on arm architecture. how do we identify the exact hardware we need to talk to?

To identify the exact hardware we need to talk to, we need to know the exact hardware address of the peripheral register that we want to access. In ARM architecture, the peripheral registers are mapped into the processor's memory address space at specific addresses. We can find the memory address of the peripheral register by consulting the datasheet of the microcontroller or microprocessor that we are using.

For example, let's say we want to access the GPIO (General Purpose Input/Output) peripheral on a Raspberry Pi using inline assembly. We can find the memory-mapped address of the GPIO peripheral by looking at the BCM2835 datasheet (the BCM2835 is the system-on-chip used in the original Raspberry Pi; later models use newer SoCs with different base addresses). The datasheet provides a register map that shows the memory address of each peripheral register. We can then use this memory address in our inline assembly code to access the GPIO peripheral.
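In practice, on a Linux-based Raspberry Pi you would first map the GPIO register block into your address space and then perform volatile reads or writes (or inline-assembly loads and stores) against that mapping. Here is a hedged sketch assuming the libc crate and the BCM2835 register layout (GPLEV0 at offset 0x34); both should be verified against your board's datasheet:

use std::ffi::CString;

fn read_gpio_levels() -> std::io::Result<u32> {
    // /dev/gpiomem exposes only the GPIO register block, so root is not needed.
    let path = CString::new("/dev/gpiomem").unwrap();
    unsafe {
        let fd = libc::open(path.as_ptr(), libc::O_RDWR | libc::O_SYNC);
        if fd < 0 {
            return Err(std::io::Error::last_os_error());
        }
        // Map one page of the GPIO register block into our address space.
        let base = libc::mmap(
            std::ptr::null_mut(),
            4096,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_SHARED,
            fd,
            0,
        );
        libc::close(fd);
        if base == libc::MAP_FAILED {
            return Err(std::io::Error::last_os_error());
        }
        // GPLEV0 (offset 0x34 in the BCM2835 map) holds the current level of
        // GPIO pins 0..31; a volatile read prevents the compiler from caching it.
        let gplev0 = (base as *const u8).add(0x34) as *const u32;
        let levels = std::ptr::read_volatile(gplev0);
        libc::munmap(base, 4096);
        Ok(levels)
    }
}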

It's important to note that accessing hardware registers through inline assembly is not always necessary or even recommended. Many modern microcontrollers and microprocessors provide APIs or libraries that abstract the low-level hardware access and provide a safer and more portable way to interact with the hardware. However, if performance is a critical concern or if we need to perform operations that are not supported by the APIs or libraries, inline assembly may be a viable option.

104> which apis do we want to consider for this?

If you are planning to access hardware directly using inline assembly, you will need to consult the documentation provided by the hardware manufacturer to understand how to properly access the hardware. The exact APIs you use will depend on the specific hardware you are working with.

In general, it's important to be very careful when directly accessing hardware at a low level, as it can be easy to introduce bugs or cause unexpected behavior if you're not careful. You'll also need to be careful to write portable code that works across different operating systems and architectures, which can be challenging.

If you're looking for a more high-level way to interact with hardware, you might consider using a library or framework that provides an abstraction layer over the hardware, such as Rust's embedded-hal or the C libraries wiringPi or libgpiod. These libraries provide a more standard interface for interacting with hardware, and may be easier to work with than directly writing inline assembly.

105> interesting. i will come back to this later on. i think it is useful to understand precisely what the various layers are

Certainly, understanding the different layers involved in audio programming can help to gain a more comprehensive understanding of the system and identify potential bottlenecks or areas for optimization. Let me know if you have any more questions on the topic!

Dependencies

~11–21MB
~298K SLoC